EXTRACT_TABLES_PDF
Attempts to extract tabular data from a PDF by detecting aligned columns. Returns structured rows per page. Best with clean, structured PDFs like invoices and reports.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| path | string | yes | Absolute path to the PDF file |
| page_number | integer | no | Specific page to extract tables from (1-indexed; default: all pages) |
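
The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the plugin: `build_extract_tables_args` is a hypothetical helper name, and rejecting page numbers below 1 is an assumption drawn from the "1-indexed" note rather than documented server behavior.

```python
import os
from typing import Optional


def build_extract_tables_args(path: str, page_number: Optional[int] = None) -> dict:
    """Build the args object for EXTRACT_TABLES_PDF (hypothetical helper)."""
    # The parameter table asks for an absolute path.
    if not os.path.isabs(path):
        raise ValueError(f"path must be absolute, got {path!r}")
    args = {"path": path}
    # page_number is optional; leaving it out requests all pages.
    if page_number is not None:
        # Assumption: pages are 1-indexed, so anything below 1 is treated as invalid.
        if isinstance(page_number, bool) or not isinstance(page_number, int) or page_number < 1:
            raise ValueError("page_number must be an integer >= 1")
        args["page_number"] = page_number
    return args
```

Calling the helper with only a path yields an args object without `page_number`, which matches the documented default of extracting from every page.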
How to use it
You normally trigger this by describing what you want in chat; the agent then selects EXTRACT_TABLES_PDF automatically. For example:
Try saying
“use pdf handler to extract …”
In a workflow
As a step in a multi-step workflow DAG:
```json
{
  "id": "s1",
  "agent": "pdf_handler",
  "action": "EXTRACT_TABLES_PDF",
  "args": {
    "path": "/Users/me/Documents/file.pdf",
    "page_number": 1
  },
  "depends_on": [],
  "outputs": []
}
```
Direct call
For scripting, call it directly via POST /execute_tool. Every tool returns { success, message, data }.
```bash
curl -X POST http://127.0.0.1:8000/execute_tool \
  -H "Content-Type: application/json" \
  -d '{"tool_name":"EXTRACT_TABLES_PDF","args":{"path":"/Users/me/Documents/file.pdf","page_number":1}}'
```
Part of the pdf_handler plugin. Browse the full Plugin & Tool Catalog or the relevant feature guide.
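
For scripts that would rather not shell out to curl, the direct call can be sketched in Python with only the standard library. This is an illustration under assumptions, not the plugin's own client: the function names are hypothetical, the base URL is copied from the curl example, and treating a falsy `success` as an error is a guess at how the `{ success, message, data }` envelope signals failure. The shape of `data` is not specified here, so it is returned untouched.

```python
import json
import urllib.request

# Base URL taken from the curl example; adjust if your server listens elsewhere.
BASE_URL = "http://127.0.0.1:8000"


def build_request(path, page_number=None):
    """Build the POST /execute_tool request for EXTRACT_TABLES_PDF."""
    args = {"path": path}
    if page_number is not None:
        args["page_number"] = page_number
    body = json.dumps({"tool_name": "EXTRACT_TABLES_PDF", "args": args}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/execute_tool",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def unwrap(envelope):
    """Return `data` from a { success, message, data } envelope, or raise."""
    # Assumption: a falsy `success` means the tool call failed.
    if not envelope.get("success"):
        raise RuntimeError(envelope.get("message") or "tool call failed")
    return envelope.get("data")


def extract_tables(path, page_number=None, timeout=60):
    """Send the request and return the tool's data (needs a running server)."""
    with urllib.request.urlopen(build_request(path, page_number), timeout=timeout) as resp:
        return unwrap(json.load(resp))
```

Keeping request construction and envelope handling separate from the network call means both can be exercised without a running server; only `extract_tables` touches the network.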