
EXTRACT_TABLES_PDF

Attempts to extract tabular data from a PDF by detecting aligned columns. Returns structured rows per page. Best with clean, structured PDFs like invoices and reports.

Parameters

Name         Type     Required  Description
path         string   yes       Absolute path to the PDF file
page_number  integer  no        Specific page to extract tables from (1-indexed; default: all pages)

How to use it

You normally trigger this by describing what you want in chat — the agent selects EXTRACT_TABLES_PDF automatically. For example:

Try saying
“use pdf handler to extract …”

In a workflow

As a step in a multi-step workflow DAG:

json
{
  "id": "s1",
  "agent": "pdf_handler",
  "action": "EXTRACT_TABLES_PDF",
  "args": {
    "path": "/Users/me/Documents/file.pdf",
    "page_number": 1
  },
  "depends_on": [],
  "outputs": []
}

Direct call

For scripting, call it directly via POST /execute_tool. Every tool returns { success, message, data }.

bash
curl -X POST http://127.0.0.1:8000/execute_tool \
  -H "Content-Type: application/json" \
  -d '{"tool_name":"EXTRACT_TABLES_PDF","args":{"path":"/Users/me/Documents/file.pdf","page_number":1}}'
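
The same direct call can be scripted from Python using only the standard library. This is a sketch under stated assumptions: the URL and payload are taken from the curl example above, the `{ success, message, data }` envelope is as described, and the function names `build_request` and `extract_tables` are hypothetical. The exact shape of `data` is not documented here, so it is returned unmodified.

```python
import json
import urllib.request

# Endpoint taken from the curl example; adjust if your server runs elsewhere.
EXECUTE_URL = "http://127.0.0.1:8000/execute_tool"


def build_request(path, page_number=None, url=EXECUTE_URL):
    """Build the POST request for EXTRACT_TABLES_PDF without sending it."""
    args = {"path": path}
    if page_number is not None:
        args["page_number"] = page_number
    body = json.dumps({"tool_name": "EXTRACT_TABLES_PDF", "args": args})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_tables(path, page_number=None, timeout=30.0):
    """Send the request and return the `data` field of the response."""
    request = build_request(path, page_number)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.load(response)
    # Assumption: `success` is falsy on failure and `message` explains why.
    if not result.get("success"):
        raise RuntimeError(result.get("message", "EXTRACT_TABLES_PDF failed"))
    return result.get("data")
```

With the server running, `extract_tables("/Users/me/Documents/file.pdf", page_number=1)` would return whatever the tool places in `data` for page 1.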

Part of the pdf_handler plugin. Browse the full Plugin & Tool Catalog or the relevant feature guide.