Document Parser
Document Parser automatically parses PDF, JPG, and PNG documents to extract fields, tables, values, and barcodes from invoices, statements, orders, and other PDF and scanned documents.
Built-in document parser templates:
- General Invoice Template: parses invoices (English only) and extracts the invoice ID, invoice date, total, tax, and line items. Set the templateId parameter to 1 to use this template.
How to classify incoming documents before parsing them?
Use the /pdf/classifier endpoint to automatically detect the class of a document, either with AI or with custom keyword-based rules.
For example, you can define rules that identify which vendor issued a document and then apply the matching template. See Document Classifier for more details.
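As a rough sketch of the routing idea above: once a document has been classified, its class name can be mapped to a template ID. The class names, the function name, and every ID other than 1 are hypothetical. The shape of the /pdf/classifier response is not shown in this document, so the sketch starts from a class name that has already been extracted.

```python
# Hypothetical mapping from a detected document class to a template ID.
# Only ID "1" (General Invoice Template) comes from this document.
TEMPLATE_BY_CLASS = {
    "invoice": "1",
    "acme-statement": "40",  # placeholder for a user-defined template
}

DEFAULT_TEMPLATE_ID = "1"


def pick_template_id(document_class: str) -> str:
    """Return the template ID to use for a detected document class."""
    key = document_class.strip().lower()
    return TEMPLATE_BY_CLASS.get(key, DEFAULT_TEMPLATE_ID)


print(pick_template_id("Invoice"))
```

The returned value would then be passed as templateId to /pdf/documentparser.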
Additional Information and Tools
- Document Parser Template Editor (a standalone version is also available)
- Document Parser Template Objects Guide
Available Methods
- [POST] /pdf/documentparser (output as JSON)
- [POST] /pdf/documentparser (output as XML)
- [POST] /pdf/documentparser (output as CSV)
- [POST] /pdf/documentparser (output as JSON, custom template code)
- [GET] /pdf/documentparser/templates
- [GET] /pdf/documentparser/templates/:id
[POST] /pdf/documentparser (output as JSON)
Description: This API method extracts data from documents based on a document parser extraction template. With this API method you may extract data from custom areas, by search, form fields, tables, multiple pages and more!
Attributes
Hint: pass attributes as JSON in the body of the POST request:
{
"url": "url-input-link"
}
Attribute | Required | Description |
---|---|---|
url | required | URL to the source file. Supports links from Google Drive, Dropbox, and the built-in PDF.co files storage. To upload files via the API, see the Files Upload section. If you intermittently get a Too Many Requests or Access Denied error for your input URL, try adding the cache: prefix to the URL to enable built-in URL caching. You can also encrypt output files and decrypt input files with user-controlled data encryption. |
httpusername | optional | HTTP auth user name, if required to access the source URL. |
httppassword | optional | HTTP auth password, if required to access the source URL. |
templateId | required | ID of the document parser template to use. View and manage your templates at https://app.pdf.co/document-parser |
template | optional | Code of the document parser template, passed directly as a string. |
inline | optional | Set to true to return results inside the response. Otherwise, the endpoint returns a link to the generated output file. |
outputFormat | optional | Default is JSON. Set to CSV or XML to generate CSV or XML output instead. |
password | optional | Password of the PDF file. Must be a string. |
async | optional | Set to true to run processing asynchronously. The response then contains a JobId that you can use with /job/check to check the state of the background job (possible states: working, failed, aborted, and success). Must be true or false. |
name | optional | File name for the generated output. Must be a string. |
expiration | optional | Output link expiration in minutes. The default is 60 (1 hour). After this period, generated output files (if any) are automatically removed from PDF.co temporary files storage. The maximum expiration period depends on your subscription plan. To store input files permanently (e.g. reusable images, PDFs, documents), use PDF.co built-in Files Storage instead. |
profiles | optional | Additional configuration for fine-tuning and extra options. See the PDF.co knowledge base for profile examples. Must be a string. |
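The async attribute returns a JobId that is meant to be checked with /job/check. A minimal polling sketch might look like the following. The state names come from the attribute description; check_job is a hypothetical callable standing in for the HTTP call to /job/check, whose response format is not shown in this document.

```python
import time
from typing import Callable

# Job states listed in the async attribute description.
FINAL_STATES = {"success", "failed", "aborted"}


def wait_for_job(
    job_id: str,
    check_job: Callable[[str], str],
    interval_seconds: float = 2.0,
    max_checks: int = 30,
) -> str:
    """Poll until the job leaves the "working" state or checks run out.

    check_job is a hypothetical callable that asks /job/check about
    job_id and returns the state as a string.
    """
    for attempt in range(max_checks):
        state = check_job(job_id)
        if state in FINAL_STATES:
            return state
        # Sleep between checks, but not after the last one.
        if attempt < max_checks - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} not finished after {max_checks} checks")


# Usage with a fake checker that finishes on the third call.
_states = iter(["working", "working", "success"])
print(wait_for_job("demo-job", lambda _id: next(_states), interval_seconds=0))
```

Passing the checker in as a parameter keeps the loop independent of any particular HTTP client and makes it easy to exercise without network access.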
- Method: POST
- URL: /v1/pdf/documentparser
Query parameters
No query parameters accepted.
Body payload
{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
"outputFormat": "JSON",
"templateId": "1",
"async": false,
"inline": "true",
"password": "",
"profiles": ""
}
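A body like the one above can be assembled programmatically. The helper below is an illustrative sketch, not part of any PDF.co SDK: the function name and its checks are assumptions. In particular, it assumes that either templateId or template must be present (based on the custom template example, which passes only template), and it sends inline as a JSON boolean, whereas the sample payload passes the string "true".

```python
import json
from typing import Optional


def build_parser_payload(
    url: str,
    template_id: Optional[str] = None,
    template: Optional[str] = None,
    output_format: str = "JSON",
    inline: bool = True,
    run_async: bool = False,
) -> dict:
    """Assemble the JSON body for POST /v1/pdf/documentparser."""
    if output_format not in {"JSON", "XML", "CSV"}:
        raise ValueError("outputFormat must be JSON, XML, or CSV")
    # Assumption: one of the two template attributes has to be supplied.
    if template_id is None and template is None:
        raise ValueError("pass either template_id or template")
    payload = {
        "url": url,
        "outputFormat": output_format,
        "inline": inline,
        "async": run_async,
    }
    if template_id is not None:
        payload["templateId"] = template_id
    if template is not None:
        payload["template"] = template
    return payload


payload = build_parser_payload(
    "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
    template_id="1",
)
print(json.dumps(payload, indent=2))
```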
Example responses
/pdf/documentparser (output as JSON)
{
"body": {
"objects": [
{
"name": "companyName",
"objectType": "field",
"value": "Amazon Web Services, Inc",
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "companyName2",
"objectType": "field",
"value": "Amazon Web Services, Inc",
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "invoiceId",
"objectType": "field",
"value": "123456789",
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "dateIssued",
"objectType": "field",
"value": "2018-04-03T00:00:00",
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "dateDue",
"objectType": "field",
"value": "2018-04-03T00:00:00",
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "bankAccount",
"objectType": "field",
"value": "123456789012",
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "total",
"objectType": "field",
"value": 6.58,
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "subTotal",
"objectType": "field",
"value": ""
},
{
"name": "tax",
"objectType": "field",
"value": 1.01,
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"objectType": "table",
"name": "table",
"rows": []
}
],
"templateName": "Generic Invoice [en]",
"templateVersion": "4",
"timestamp": "2020-08-21T19:23:31"
},
"pageCount": 1,
"error": false,
"status": 200,
"name": "sample-invoice.json",
"remainingCredits": 60803
}
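To work with a response like this, it is often convenient to flatten the objects array into a name-to-value mapping. The following is a small sketch under the assumption that the response has already been decoded into a Python dict; fields_by_name is a hypothetical helper name.

```python
from typing import Any, Dict


def fields_by_name(response: dict) -> Dict[str, Any]:
    """Collect {name: value} for every object whose objectType is "field"."""
    objects = response.get("body", {}).get("objects", [])
    return {
        obj["name"]: obj.get("value")
        for obj in objects
        if obj.get("objectType") == "field"
    }


# Trimmed copy of the example response above.
sample = {
    "body": {
        "objects": [
            {"name": "invoiceId", "objectType": "field", "value": "123456789"},
            {"name": "total", "objectType": "field", "value": 6.58},
            {"name": "subTotal", "objectType": "field", "value": ""},
            {"objectType": "table", "name": "table", "rows": []},
        ]
    },
    "error": False,
    "status": 200,
}

fields = fields_by_name(sample)
print(fields["invoiceId"], fields["total"])
```

Note that a field the template could not find (subTotal here) comes back as an empty string, so callers may want to treat "" as missing.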
Code Snippet
CURL
curl --location --request POST 'https://api.pdf.co/v1/pdf/documentparser' \
--header 'Content-Type: application/json' \
--header 'x-api-key: ' \
--data-raw '{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
"outputFormat": "JSON",
"templateId": "1",
"async": false,
"inline": "true",
"password": "",
"profiles": ""
}'
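For readers who prefer Python over cURL, the same request can be sketched with the standard library alone. The function name and the YOUR_API_KEY placeholder are assumptions; the block builds the request without sending it, and the commented lines show how it could be sent.

```python
import json
import urllib.request

API_URL = "https://api.pdf.co/v1/pdf/documentparser"


def make_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL snippet."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


request = make_request(
    "YOUR_API_KEY",  # placeholder, not a real key
    {
        "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
        "outputFormat": "JSON",
        "templateId": "1",
        "async": False,
        "inline": "true",
    },
)
print(request.get_method(), request.full_url)

# To actually send it:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```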
[POST] /pdf/documentparser (output as XML)
Description: Extracts data from PDF and scanned documents using a data extraction template (called a Document Parser Template). With this API method, you may extract data from custom areas, by search, form fields, tables, multiple pages, and more!
Attributes
Hint: pass attributes as JSON in the body of the POST request:
{
"url": "url-input-link"
}
Attribute | Required | Description |
---|---|---|
url | required | URL to the source file. Supports links from Google Drive, Dropbox, and the built-in PDF.co files storage. To upload files via the API, see the Files Upload section. If you intermittently get a Too Many Requests or Access Denied error for your input URL, try adding the cache: prefix to the URL to enable built-in URL caching. You can also encrypt output files and decrypt input files with user-controlled data encryption. |
httpusername | optional | HTTP auth user name, if required to access the source URL. |
httppassword | optional | HTTP auth password, if required to access the source URL. |
templateId | required | ID of the document parser template to use. View and manage your templates at https://app.pdf.co/document-parser |
template | optional | Code of the document parser template, passed directly as a string. |
inline | optional | Set to true to return results inside the response. Otherwise, the endpoint returns a link to the generated output file. |
outputFormat | optional | Default is JSON. Set to CSV or XML to generate CSV or XML output instead. |
password | optional | Password of the PDF file. Must be a string. |
async | optional | Set to true to run processing asynchronously. The response then contains a JobId that you can use with /job/check to check the state of the background job (possible states: working, failed, aborted, and success). Must be true or false. |
name | optional | File name for the generated output. Must be a string. |
expiration | optional | Output link expiration in minutes. The default is 60 (1 hour). After this period, generated output files (if any) are automatically removed from PDF.co temporary files storage. The maximum expiration period depends on your subscription plan. To store input files permanently (e.g. reusable images, PDFs, documents), use PDF.co built-in Files Storage instead. |
profiles | optional | Additional configuration for fine-tuning and extra options. See the PDF.co knowledge base for profile examples. Must be a string. |
- Method: POST
- URL: /v1/pdf/documentparser
Query parameters
No query parameters accepted.
Body payload
{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
"outputFormat": "XML",
"templateId": "1",
"async": false,
"inline": "true",
"password": "",
"profiles": ""
}
Example responses
/pdf/documentparser (output as XML)
{
"body": "<?xml version=\"1.0\" encoding=\"utf-16\"?>\r\n<parsingResult>\r\n <objects>\r\n <name>companyName</name>\r\n <objectType>field</objectType>\r\n <value>ACME Inc.</value>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n </objects>\r\n <objects>\r\n <name>companyName2</name>\r\n <objectType>field</objectType>\r\n <value>Lanny Lane Ltd.</value>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n </objects>\r\n <objects>\r\n <name>invoiceId</name>\r\n <objectType>field</objectType>\r\n <value>67893566</value>\r\n <pageIndex>0</pageIndex>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n </objects>\r\n <objects>\r\n <name>dateIssued</name>\r\n <objectType>field</objectType>\r\n <value>2019-01-05T00:00:00</value>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n </objects>\r\n <objects>\r\n <name>dateDue</name>\r\n <objectType>field</objectType>\r\n <value>2019-01-05T00:00:00</value>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n </objects>\r\n <objects>\r\n <name>bankAccount</name>\r\n <objectType>field</objectType>\r\n <value>\r\n </value>\r\n </objects>\r\n <objects>\r\n <name>total</name>\r\n <objectType>field</objectType>\r\n <value>1272.35</value>\r\n <pageIndex>0</pageIndex>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n </objects>\r\n <objects>\r\n <name>subTotal</name>\r\n <objectType>field</objectType>\r\n <value>1262.35</value>\r\n <pageIndex>0</pageIndex>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n </objects>\r\n <objects>\r\n <name>tax</name>\r\n 
<objectType>field</objectType>\r\n <value>10</value>\r\n <pageIndex>0</pageIndex>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n <rectangle>0</rectangle>\r\n </objects>\r\n <objects>\r\n <objectType>table</objectType>\r\n <name>table</name>\r\n <rows>\r\n <column1>\r\n <pageIndex>0</pageIndex>\r\n <value>2</value>\r\n </column1>\r\n <column2>\r\n <pageIndex>0</pageIndex>\r\n <value>Item 1</value>\r\n </column2>\r\n <column3>\r\n <pageIndex>0</pageIndex>\r\n <value>9.95</value>\r\n </column3>\r\n <column4>\r\n <pageIndex>0</pageIndex>\r\n <value>19.90</value>\r\n </column4>\r\n </rows>\r\n <rows>\r\n <column1>\r\n <pageIndex>0</pageIndex>\r\n <value>5</value>\r\n </column1>\r\n <column2>\r\n <pageIndex>0</pageIndex>\r\n <value>Item 2</value>\r\n </column2>\r\n <column3>\r\n <pageIndex>0</pageIndex>\r\n <value>20.00</value>\r\n </column3>\r\n <column4>\r\n <pageIndex>0</pageIndex>\r\n <value>100.00</value>\r\n </column4>\r\n </rows>\r\n <rows>\r\n <column1>\r\n <pageIndex>0</pageIndex>\r\n <value>1</value>\r\n </column1>\r\n <column2>\r\n <pageIndex>0</pageIndex>\r\n <value>Item 3</value>\r\n </column2>\r\n <column3>\r\n <pageIndex>0</pageIndex>\r\n <value>19.95</value>\r\n </column3>\r\n <column4>\r\n <pageIndex>0</pageIndex>\r\n <value>19.95</value>\r\n </column4>\r\n </rows>\r\n <rows>\r\n <column1>\r\n <pageIndex>0</pageIndex>\r\n <value>1</value>\r\n </column1>\r\n <column2>\r\n <pageIndex>0</pageIndex>\r\n <value>Item 4</value>\r\n </column2>\r\n <column3>\r\n <pageIndex>0</pageIndex>\r\n <value>123.00</value>\r\n </column3>\r\n <column4>\r\n <pageIndex>0</pageIndex>\r\n <value>123.00</value>\r\n </column4>\r\n </rows>\r\n <rows>\r\n <column1>\r\n <pageIndex>0</pageIndex>\r\n <value>10</value>\r\n </column1>\r\n <column2>\r\n <pageIndex>0</pageIndex>\r\n <value>Item 5</value>\r\n </column2>\r\n <column3>\r\n <pageIndex>0</pageIndex>\r\n <value>99.95</value>\r\n </column3>\r\n <column4>\r\n <pageIndex>0</pageIndex>\r\n 
<value>999.50</value>\r\n </column4>\r\n </rows>\r\n </objects>\r\n <elapsed>0.320434</elapsed>\r\n <templateName>Generic Invoice [en]</templateName>\r\n <templateVersion>4</templateVersion>\r\n <timestamp>2021-12-31T14:54:31</timestamp>\r\n</parsingResult>\r\n",
"pageCount": 1,
"error": false,
"status": 200,
"name": "sample-invoice.xml",
"remainingCredits": 99046120,
"credits": 42
}
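With outputFormat set to XML, the body is an XML document delivered as a JSON string. One possible way to read it in Python is sketched below, based only on the element names visible in the example (parsingResult, objects, rows, columnN); parse_xml_body is a hypothetical helper name.

```python
import re
import xml.etree.ElementTree as ET


def parse_xml_body(body: str) -> tuple:
    """Return ({field name: text}, [table rows]) from a parsingResult document."""
    # The body is already a decoded string, so drop the XML declaration
    # (which claims utf-16) before handing the text to the parser.
    xml_text = re.sub(r"^\s*<\?xml[^>]*\?>", "", body).strip()
    root = ET.fromstring(xml_text)

    fields, rows = {}, []
    for obj in root.findall("objects"):
        kind = obj.findtext("objectType")
        if kind == "field":
            fields[obj.findtext("name")] = (obj.findtext("value") or "").strip()
        elif kind == "table":
            for row in obj.findall("rows"):
                # Each child (column1, column2, ...) wraps a <value> element.
                rows.append([(col.findtext("value") or "").strip() for col in row])
    return fields, rows


# Shortened version of the example body above.
body = (
    '<?xml version="1.0" encoding="utf-16"?>\r\n<parsingResult>\r\n'
    "  <objects><name>total</name><objectType>field</objectType>"
    "<value>1272.35</value></objects>\r\n"
    "  <objects><objectType>table</objectType><name>table</name>"
    "<rows><column1><value>2</value></column1>"
    "<column2><value>Item 1</value></column2></rows></objects>\r\n"
    "</parsingResult>\r\n"
)
fields, rows = parse_xml_body(body)
print(fields, rows)
```

All values arrive as text in XML, so numeric fields such as total need an explicit conversion if numbers are required.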
Code Snippet
CURL
curl --location --request POST 'https://api.pdf.co/v1/pdf/documentparser' \
--header 'Content-Type: application/json' \
--header 'x-api-key: ' \
--data-raw '{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
"outputFormat": "XML",
"templateId": "1",
"async": false,
"inline": "true",
"password": "",
"profiles": ""
}'
[POST] /pdf/documentparser (output as CSV)
Description: Gets data from documents using a data extraction template. With this API method you may extract data from custom areas, by search, form fields, tables, multiple pages and more!
Attributes
Hint: pass attributes as JSON in the body of the POST request:
{
"url": "url-input-link"
}
Attribute | Required | Description |
---|---|---|
url | required | URL to the source file. Supports links from Google Drive, Dropbox, and the built-in PDF.co files storage. To upload files via the API, see the Files Upload section. If you intermittently get a Too Many Requests or Access Denied error for your input URL, try adding the cache: prefix to the URL to enable built-in URL caching. You can also encrypt output files and decrypt input files with user-controlled data encryption. |
httpusername | optional | HTTP auth user name, if required to access the source URL. |
httppassword | optional | HTTP auth password, if required to access the source URL. |
templateId | required | ID of the document parser template to use. View and manage your templates at https://app.pdf.co/document-parser |
template | optional | Code of the document parser template, passed directly as a string. |
inline | optional | Set to true to return results inside the response. Otherwise, the endpoint returns a link to the generated output file. |
outputFormat | optional | Default is JSON. Set to CSV or XML to generate CSV or XML output instead. |
password | optional | Password of the PDF file. Must be a string. |
async | optional | Set to true to run processing asynchronously. The response then contains a JobId that you can use with /job/check to check the state of the background job (possible states: working, failed, aborted, and success). Must be true or false. |
name | optional | File name for the generated output. Must be a string. |
expiration | optional | Output link expiration in minutes. The default is 60 (1 hour). After this period, generated output files (if any) are automatically removed from PDF.co temporary files storage. The maximum expiration period depends on your subscription plan. To store input files permanently (e.g. reusable images, PDFs, documents), use PDF.co built-in Files Storage instead. |
profiles | optional | Additional configuration for fine-tuning and extra options. See the PDF.co knowledge base for profile examples. Must be a string. |
- Method: POST
- URL: /v1/pdf/documentparser
Query parameters
No query parameters accepted.
Body payload
{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
"templateId": "1",
"outputFormat": "CSV",
"generateCsvHeaders": true,
"async": false,
"inline": "true",
"password": ""
}
Example responses
/pdf/documentparser (output as CSV)
{
"body": "companyName,companyName2,invoiceId,dateIssued,dateDue,bankAccount,total,subTotal,tax,tableNames,tables\r\n\"Amazon Web Services, Inc\",\"Amazon Web Services, Inc\",123456789,2018-04-03T00:00:00,2018-04-03T00:00:00,123456789012,6.58,,1.01,table,\r\n\r\n",
"pageCount": 1,
"error": false,
"status": 200,
"name": "sample-invoice.csv",
"remainingCredits": 60804
}
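The CSV body above starts with a header row (the example request sets generateCsvHeaders to true), which makes it straightforward to read into dictionaries. A minimal sketch using Python's csv module follows; parse_csv_body is a hypothetical helper name.

```python
import csv
import io


def parse_csv_body(body: str) -> list:
    """Turn the CSV body (with a header row) into a list of dicts."""
    # newline="" lets the csv module handle the \r\n line endings itself.
    reader = csv.DictReader(io.StringIO(body, newline=""))
    return list(reader)


# The "body" value from the example response above.
body = (
    "companyName,companyName2,invoiceId,dateIssued,dateDue,bankAccount,"
    "total,subTotal,tax,tableNames,tables\r\n"
    '"Amazon Web Services, Inc","Amazon Web Services, Inc",123456789,'
    "2018-04-03T00:00:00,2018-04-03T00:00:00,123456789012,6.58,,1.01,table,\r\n\r\n"
)
records = parse_csv_body(body)
print(records[0]["companyName"], records[0]["total"])
```

The quoted company name contains a comma, which is why a real CSV parser is preferable to splitting on commas by hand.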
Code Snippet
CURL
curl --location --request POST 'https://api.pdf.co/v1/pdf/documentparser' \
--header 'Content-Type: application/json' \
--header 'x-api-key: ' \
--data-raw '{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/sample-invoice.pdf",
"templateId": "1",
"outputFormat": "CSV",
"generateCsvHeaders": true,
"async": false,
"inline": "true",
"password": ""
}'
[POST] /pdf/documentparser (output as JSON, custom template code)
Description: Parses and gets data from documents using previously prepared custom data extraction templates. With this API method you may extract data from custom areas, by search, form fields, tables, multiple pages and more!
Attributes
Hint: pass attributes as JSON in the body of the POST request:
{
"url": "url-input-link"
}
Attribute | Required | Description |
---|---|---|
url | required | URL to the source file. Supports links from Google Drive, Dropbox, and the built-in PDF.co files storage. To upload files via the API, see the Files Upload section. If you intermittently get a Too Many Requests or Access Denied error for your input URL, try adding the cache: prefix to the URL to enable built-in URL caching. You can also encrypt output files and decrypt input files with user-controlled data encryption. |
httpusername | optional | HTTP auth user name, if required to access the source URL. |
httppassword | optional | HTTP auth password, if required to access the source URL. |
templateId | required | ID of the document parser template to use. View and manage your templates at https://app.pdf.co/document-parser. The example payload in this section omits it and passes template instead. |
template | optional | Code of the document parser template, passed directly as a string. |
inline | optional | Set to true to return results inside the response. Otherwise, the endpoint returns a link to the generated output file. |
outputFormat | optional | Default is JSON. Set to CSV or XML to generate CSV or XML output instead. |
password | optional | Password of the PDF file. Must be a string. |
async | optional | Set to true to run processing asynchronously. The response then contains a JobId that you can use with /job/check to check the state of the background job (possible states: working, failed, aborted, and success). Must be true or false. |
name | optional | File name for the generated output. Must be a string. |
expiration | optional | Output link expiration in minutes. The default is 60 (1 hour). After this period, generated output files (if any) are automatically removed from PDF.co temporary files storage. The maximum expiration period depends on your subscription plan. To store input files permanently (e.g. reusable images, PDFs, documents), use PDF.co built-in Files Storage instead. |
profiles | optional | Additional configuration for fine-tuning and extra options. See the PDF.co knowledge base for profile examples. Must be a string. |
- Method: POST
- URL: /v1/pdf/documentparser
Query parameters
No query parameters accepted.
Body payload
{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/MultiPageTable.pdf",
"template": "{\r\n \"templateVersion\": 3,\r\n \"templatePriority\": 0,\r\n \"sourceId\": \"Multipage Table Test\",\r\n \"detectionRules\": {\r\n \"keywords\": [\r\n \"Sample document with multi-page table\"\r\n ]\r\n },\r\n \"fields\": {\r\n \"total\": {\r\n \"type\": \"regex\",\r\n \"expression\": \"TOTAL \",\r\n \"dataType\": \"decimal\"\r\n }\r\n },\r\n \"tables\": [\r\n {\r\n \"name\": \"table1\",\r\n \"start\": {\r\n \"expression\": \"Item\\\\s+Description\\\\s+Price\\\\s+Qty\\\\s+Extended Price\"\r\n },\r\n \"end\": {\r\n \"expression\": \"TOTAL\\\\s+\\\\d+\\\\.\\\\d\\\\d\"\r\n },\r\n \"row\": {\r\n \"expression\": \"^\\\\s*(?<itemNo>\\\\d+)\\\\s+(?<description>.+?)\\\\s+(?<price>\\\\d+\\\\.\\\\d\\\\d)\\\\s+(?<qty>\\\\d+)\\\\s+(?<extPrice>\\\\d+\\\\.\\\\d\\\\d)\"\r\n },\r\n \"columns\": [\r\n {\r\n \"name\": \"itemNo\",\r\n \"type\": \"integer\"\r\n },\r\n {\r\n \"name\": \"description\",\r\n \"type\": \"string\"\r\n },\r\n {\r\n \"name\": \"price\",\r\n \"type\": \"decimal\"\r\n },\r\n {\r\n \"name\": \"qty\",\r\n \"type\": \"integer\"\r\n },\r\n {\r\n \"name\": \"extPrice\",\r\n \"type\": \"decimal\"\r\n }\r\n ],\r\n \"multipage\": true\r\n }\r\n ]\r\n}",
"outputFormat": "JSON",
"async": false,
"inline": "true",
"profiles": "",
"password": ""
}
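The template attribute above is a JSON document serialized into a JSON string, which is why every backslash in its regular expressions appears four times. Writing that escaping by hand is error-prone; one way to avoid it is to build the template as a Python dict and serialize it twice. The sketch below reproduces the table part of the template from the payload and leaves out the fields section for brevity.

```python
import json

# Table definition copied from the payload above; raw strings hold the
# regular expressions exactly as the parser should see them.
template = {
    "templateVersion": 3,
    "templatePriority": 0,
    "sourceId": "Multipage Table Test",
    "detectionRules": {"keywords": ["Sample document with multi-page table"]},
    "tables": [
        {
            "name": "table1",
            "start": {"expression": r"Item\s+Description\s+Price\s+Qty\s+Extended Price"},
            "end": {"expression": r"TOTAL\s+\d+\.\d\d"},
            "row": {
                "expression": (
                    r"^\s*(?<itemNo>\d+)\s+(?<description>.+?)\s+"
                    r"(?<price>\d+\.\d\d)\s+(?<qty>\d+)\s+(?<extPrice>\d+\.\d\d)"
                )
            },
            "columns": [
                {"name": "itemNo", "type": "integer"},
                {"name": "description", "type": "string"},
                {"name": "price", "type": "decimal"},
                {"name": "qty", "type": "integer"},
                {"name": "extPrice", "type": "decimal"},
            ],
            "multipage": True,
        }
    ],
}

payload = {
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/MultiPageTable.pdf",
    "template": json.dumps(template),  # the template travels as a JSON string
    "outputFormat": "JSON",
    "async": False,
    "inline": "true",
}

# Second serialization: this is the text that goes into the request body.
body = json.dumps(payload)
print(body[:120])
```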
Example responses
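/pdf/documentparser (output as JSON, custom template code)
Note: the sample below appears to come from the built-in invoice template (see templateName and name) rather than from the custom multi-page table template in the payload above.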
{
"body": {
"objects": [
{
"name": "companyName",
"objectType": "field",
"value": "Amazon Web Services, Inc",
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "companyName2",
"objectType": "field",
"value": "Amazon Web Services, Inc",
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "invoiceId",
"objectType": "field",
"value": "123456789",
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "dateIssued",
"objectType": "field",
"value": "2018-04-03T00:00:00",
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "dateDue",
"objectType": "field",
"value": "2018-04-03T00:00:00",
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "total",
"objectType": "field",
"value": 6.58,
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"name": "subTotal",
"objectType": "field",
"value": ""
},
{
"name": "tax",
"objectType": "field",
"value": 1.01,
"pageIndex": 0,
"rectangle": [
0,
0,
0,
0
]
},
{
"objectType": "table",
"name": "table",
"rows": []
}
],
"templateName": "Generic Invoice [en]",
"templateVersion": "4",
"timestamp": "2020-07-16T22:04:25"
},
"pageCount": 1,
"error": false,
"status": 200,
"name": "sample-invoice.json",
"remainingCredits": 77731
}
Code Snippet
CURL
curl --location --request POST 'https://api.pdf.co/v1/pdf/documentparser' \
--header 'Content-Type: application/json' \
--header 'x-api-key: ' \
--data-raw '{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/document-parser/MultiPageTable.pdf",
"template": "{\r\n \"templateVersion\": 3,\r\n \"templatePriority\": 0,\r\n \"sourceId\": \"Multipage Table Test\",\r\n \"detectionRules\": {\r\n \"keywords\": [\r\n \"Sample document with multi-page table\"\r\n ]\r\n },\r\n \"fields\": {\r\n \"total\": {\r\n \"type\": \"regex\",\r\n \"expression\": \"TOTAL \",\r\n \"dataType\": \"decimal\"\r\n }\r\n },\r\n \"tables\": [\r\n {\r\n \"name\": \"table1\",\r\n \"start\": {\r\n \"expression\": \"Item\\\\s+Description\\\\s+Price\\\\s+Qty\\\\s+Extended Price\"\r\n },\r\n \"end\": {\r\n \"expression\": \"TOTAL\\\\s+\\\\d+\\\\.\\\\d\\\\d\"\r\n },\r\n \"row\": {\r\n \"expression\": \"^\\\\s*(?<itemNo>\\\\d+)\\\\s+(?<description>.+?)\\\\s+(?<price>\\\\d+\\\\.\\\\d\\\\d)\\\\s+(?<qty>\\\\d+)\\\\s+(?<extPrice>\\\\d+\\\\.\\\\d\\\\d)\"\r\n },\r\n \"columns\": [\r\n {\r\n \"name\": \"itemNo\",\r\n \"type\": \"integer\"\r\n },\r\n {\r\n \"name\": \"description\",\r\n \"type\": \"string\"\r\n },\r\n {\r\n \"name\": \"price\",\r\n \"type\": \"decimal\"\r\n },\r\n {\r\n \"name\": \"qty\",\r\n \"type\": \"integer\"\r\n },\r\n {\r\n \"name\": \"extPrice\",\r\n \"type\": \"decimal\"\r\n }\r\n ],\r\n \"multipage\": true\r\n }\r\n ]\r\n}",
"outputFormat": "JSON",
"async": false,
"inline": "true",
"profiles": "",
"password": ""
}'
[GET] /pdf/documentparser/templates
Returns all Document Parser data extraction templates for the current user. Use a GET request.
Manage your Document Parser templates at https://app.pdf.co/document-parser/templates
- Method: GET
- URL: /v1/pdf/documentparser/templates
Query parameters
No query parameters accepted.
Body payload
No body parameters accepted.
Example responses
/pdf/documentparser/templates
{
"templates": [
{
"id": 40,
"type": "user",
"title": "Untitled",
"description": "Untitled"
},
{
"id": 1,
"type": "system",
"title": "Invoice Parser",
"description": "Parses invoices and extracts invoice number, company name, due date, amount, tax"
}
],
"remainingCredits": 94229
}
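The template list can be filtered client-side, for example to separate built-in templates from your own. This is a small sketch assuming the response has been decoded into a Python dict; find_templates is a hypothetical helper name, and the type values are the two that appear in the example response.

```python
def find_templates(response: dict, template_type: str) -> list:
    """Return (id, title) pairs for templates of the given type."""
    return [
        (t["id"], t["title"])
        for t in response.get("templates", [])
        if t.get("type") == template_type
    ]


# The example response above, without remainingCredits details changed.
sample = {
    "templates": [
        {"id": 40, "type": "user", "title": "Untitled", "description": "Untitled"},
        {
            "id": 1,
            "type": "system",
            "title": "Invoice Parser",
            "description": "Parses invoices and extracts invoice number, company name, due date, amount, tax",
        },
    ],
    "remainingCredits": 94229,
}

print(find_templates(sample, "system"))
print(find_templates(sample, "user"))
```

An id from this list can be passed as templateId to /pdf/documentparser or appended to /pdf/documentparser/templates/ to fetch the template details.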
Code Snippet
CURL
curl --location --request GET 'https://api.pdf.co/v1/pdf/documentparser/templates' \
--header 'Content-Type: application/json' \
--header 'x-api-key: '
[GET] /pdf/documentparser/templates/:id
Returns detailed information for a document parser template, identified by the template's ID. Use a GET request.
Manage your Document Parser templates at https://app.pdf.co/document-parser/templates
- Method: GET
- URL: /v1/pdf/documentparser/templates/:id
Query parameters
No query parameters accepted.
Body payload
No body parameters accepted.
Example responses
No example responses saved.
Code Snippet
CURL
curl --location --request GET 'https://api.pdf.co/v1/pdf/documentparser/templates/1' \
--header 'Content-Type: application/json' \
--header 'x-api-key: '
Samples
- C# - Blood Test Results to JSON
- C# - Census table from life and annuity quote request pdf
- C# - Create Custom Template
- C# - Extract line items from tables on multiple pages
- C# - Parse From URL
- C# - Parse From URL Asynchronously
- C# - Parse Multipage Table
- C# - Parse Simple Document
- C# - Parse Uploaded File
- C# - Parse Uploaded File Asynchronously
- C# - Parse Uploaded File Asynchronously (Using TemplateId)
- C# - Parse and Generate HL7 Output
- C# - Parse with OCR
- C# - Parsing and reading data from Airline Tickets
- GoogleAppScript - Convert PDF Invoice to Google Sheet
- Java - Blood Test Results to JSON
- Java - Create Custom Template
- Java - Extract line items from tables on multiple pages
- Java - Parse From URL Asynchronously
- Java - Parse From Url
- Java - Parse Multipage Table
- Java - Parse Simple Document
- Java - Parse Uploaded File
- Java - Parse Uploaded File Asynchronously
- Java - Parse with OCR
- Java - Parsing and reading data from Airline Tickets
- JavaScript - Parse From Url (Node.js)
- JavaScript - Parse Uploaded File (Node.js)
- PHP - Blood Test Results to JSON
- PHP - Create Custom Template
- PHP - Extract line items from tables on multiple pages
- PHP - Parse From URL Asynchronously
- PHP - Parse Invoice and Fill Database (SQL Server)
- PHP - Parse Invoice and Save Table Data to mySql Database
- PHP - Parse Multipage Table
- PHP - Parse Simple Document
- PHP - Parse Uploaded File Asynchronously
- PHP - Parse with OCR
- PHP - Parsing and reading data from Airline Tickets
- Powershell - Parse From Uploaded File
- Powershell - Parse From Url
- Python - Parse From Uploaded File
- Python - Parse From Url
- Python - Parse PDF Invoice
- Salesforce - Document Parser Demo
- Salesforce - Parse Document and Get CSV Output
- SharePoint - Parse Invoice Information
- TEMPLATES-SAMPLES - Form IRS Form 1040
- TEMPLATES-SAMPLES - Form IRS Form 1099-DIV
- TEMPLATES-SAMPLES - Form IRS Form 1099-K
- TEMPLATES-SAMPLES - Form IRS Form W2
- TEMPLATES-SAMPLES - Invoice Get Email Address
- TEMPLATES-SAMPLES - Invoice Get Total And Tax
- TEMPLATES-SAMPLES - Invoice Simple Invoice
- TEMPLATES-SAMPLES - Invoice from Amazon AWS
- TEMPLATES-SAMPLES - Invoice from Digital Ocean Scanned
- TEMPLATES-SAMPLES - Invoice from Digital Ocean
- TEMPLATES-SAMPLES - Invoice from Google
- TEMPLATES-SAMPLES - Invoice from ManyChat
- TEMPLATES-SAMPLES - Invoice from PandaDoc
- TEMPLATES-SAMPLES - Invoice table with empty columns
- TEMPLATES-SAMPLES - Invoice with Hanging Rows
- TEMPLATES-SAMPLES - Invoice with Tax and Line Items
- TEMPLATES-SAMPLES - Invoice with line items in EUR
- TEMPLATES-SAMPLES - Invoice with line items in bordered table
- TEMPLATES-SAMPLES - Order form with line items and total
- TEMPLATES-SAMPLES - Report - Blood Test Results
- TEMPLATES-SAMPLES - Report Echocardiogram - Key Value Fields
- TEMPLATES-SAMPLES - Report HL7
- TEMPLATES-SAMPLES - Shipment Label from Amazon
- TEMPLATES-SAMPLES - Shipping Label from USPS
- TEMPLATES-SAMPLES - Statement from Bank of America
- TEMPLATES-SAMPLES - Statement from JPMorgan Chase
- TEMPLATES-SAMPLES - Statement from Wells Fargo
- TEMPLATES-SAMPLES - Statement of Assets
- TEMPLATES-SAMPLES - Table Auto Detection
- TEMPLATES-SAMPLES - Table Multiline Items Without Borders
- TEMPLATES-SAMPLES - Table Multiple Pages - Approach 2 - Define Column Coordinates
- TEMPLATES-SAMPLES - Table Multiple pages - Approach 1 - Detect Columns Automatically
- TEMPLATES-SAMPLES - Table Read From columns 2 and 3
- TEMPLATES-SAMPLES - Table Without Borders Auto Detection
- TEMPLATES-SAMPLES - Table from census table life and annuity quote request pdf
- TEMPLATES-SAMPLES - Table with Multiple Subitems
- TEMPLATES-SAMPLES - Text Extraction from Foldable Brochure Booklet
- TEMPLATES-SAMPLES - Ticket Airline
- VB.NET - Blood Test Results to JSON
- VB.NET - Census table from life and annuity quote request pdf
- VB.NET - Create Custom Template
- VB.NET - Extract line items from tables on multiple pages
- VB.NET - Parse From Url
- VB.NET - Parse Multipage Table
- VB.NET - Parse Simple Document
- VB.NET - Parse Uploaded File
- VB.NET - Parse Uploaded File Asynchronously
- VB.NET - Parse with OCR
- VB.NET - Parsing and reading data from Airline Tickets
- cURL - Document Parser Custom Template Code
- cURL - Document Parser Output as CSV
- cURL - Document Parser Output as JSON
- cURL - Document Parser Results
Copyright © 2016 - 2023 PDF.co