Document Parser

Document Parser can automatically parse your PDF, JPG, and PNG documents to extract fields, tables, and values from invoices, statements, orders, and other PDF and scanned documents.

Built-in document parser templates:

  • Invoices (currently, English language only) can be parsed by the built-in template 1 (set the templateId param to 1).

Auto Classification of Incoming Documents

Use the /pdf/classifier endpoint (see below) to automatically detect the class of a document based on keyword rules. For example, you can define rules that identify which vendor issued the document and then apply the matching template.
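
The idea of keyword rules can be illustrated locally. The following is a minimal sketch of keyword-based classification, not the endpoint's actual rule format or implementation: the rule table, the class names, and the template ID "42" are hypothetical, and the text is assumed to be already extracted from the document.

```python
# Hypothetical keyword rules: class name -> keywords that must all appear.
RULES = {
    "amazon-invoice": ["amazon web services", "invoice"],
    "acme-statement": ["acme corp", "statement"],
}

# Hypothetical mapping from document class to a Document Parser template ID.
TEMPLATE_IDS = {"amazon-invoice": "1", "acme-statement": "42"}


def classify(text, rules=RULES):
    """Return the first class whose keywords all occur in the text, else None."""
    lowered = text.lower()
    for doc_class, keywords in rules.items():
        if all(keyword in lowered for keyword in keywords):
            return doc_class
    return None


def pick_template(text):
    """Map extracted document text to a template ID, or None if unclassified."""
    return TEMPLATE_IDS.get(classify(text))


print(pick_template("Amazon Web Services, Inc\nINVOICE #123456789"))
```

The template ID returned here would then be passed as the templateId param of /pdf/documentparser.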

Some Sample Templates

Simple templates (extracting data from fixed coordinates) can be easily created with the Document Parser template editor.

Complex templates are also supported. You can edit them in the editor or manually.

Sample PDF document: PDF document with tables and multiple pages

The template below demonstrates parsing a multi-page table using only regular expressions for the table start, end, and rows. If a regular expression cannot match every table row (for example, if the table contains empty cells), try the second template below (MultiPageTable-template2), which showcases another approach.

MultipageTable-template1.json:

{
  "templateVersion": 3,
  "templatePriority": 0,
  "sourceId": "Multipage Table Test",
  "detectionRules": {
    "keywords": []
  },
  "fields": {
    "total": {
      "type": "regex",
      "expression": "TOTAL ",
      "dataType": "decimal"
    }
  },
  "tables": [
    {
      "name": "table1",
      "start": {
        "expression": "Item\\s+Description\\s+Price\\s+Qty\\s+Extended Price"
      },
      "end": {
        "expression": "TOTAL\\s+\\d+\\.\\d\\d"
      },
      "row": {
        "expression": "^\\s*(?<itemNo>\\d+)\\s+(?<description>.+?)\\s+(?<price>\\d+\\.\\d\\d)\\s+(?<qty>\\d+)\\s+(?<extPrice>\\d+\\.\\d\\d)"
      },
      "columns": [
        {
          "name": "itemNo",
          "type": "integer"
        },
        {
          "name": "description",
          "type": "string"
        },
        {
          "name": "price",
          "type": "decimal"
        },
        {
          "name": "qty",
          "type": "integer"
        },
        {
          "name": "extPrice",
          "type": "decimal"
        }
      ],
      "multipage": true
    }
  ]
}

NOTE: pass this template code as a string in the template param. Alternatively, add this template to your templates and reference it with the templateId param.
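
As a rough local check, the row expression can be tried against a sample line before the template is sent, and the template can then be serialized into the template param. This is a sketch under assumptions: the sample row and the example.com URL are made up, the template is trimmed for brevity, and the API's regex engine may not behave exactly like Python's re module (Python needs the named groups rewritten as (?P<name>...)).

```python
import json
import re

# Row expression from MultipageTable-template1.json, after JSON unescaping.
ROW_EXPRESSION = (
    r"^\s*(?<itemNo>\d+)\s+(?<description>.+?)\s+(?<price>\d+\.\d\d)"
    r"\s+(?<qty>\d+)\s+(?<extPrice>\d+\.\d\d)"
)

# Python's re module spells named groups (?P<name>...), so convert first.
row_re = re.compile(re.sub(r"\(\?<(?=[A-Za-z])", "(?P<", ROW_EXPRESSION))

# Made-up table row, for illustration only.
match = row_re.match("  1  Blue widget, large  12.50  3  37.50")
row = match.groupdict() if match else None


def build_payload(source_url, template):
    """Build a request body that carries the template inline as a string."""
    return {"url": source_url, "template": json.dumps(template)}


# Trimmed stand-in for the full template above; the URL is a placeholder.
trimmed_template = {
    "templateVersion": 3,
    "tables": [
        {"name": "table1", "row": {"expression": ROW_EXPRESSION}, "multipage": True}
    ],
}
payload = build_payload("https://example.com/multipage.pdf", trimmed_template)

print(row)
print(payload["template"][:40])
```

Serializing with json.dumps keeps the backslashes in the expressions escaped correctly, which is easy to get wrong when the template string is assembled by hand.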

Available methods

[POST] /pdf/documentparser (output as JSON)

Description: Gets data from documents using a data extraction template. With this API method you may extract data from custom areas, by search, form fields, tables, multiple pages and more!

Additional tools and guides:

  • Document Parser Template Editor
  • Template Manual Coding Guide

Parameters

  • url. required. URL to the source file. Supports links from Google Drive, Dropbox, and the built-in PDF.co file storage. To upload files via the API, see the Files Upload section. If you intermittently get Too Many Requests or Access Denied errors for your input URL, try adding cache: to the URL to enable built-in URL caching.
  • templateId. required. ID of the document parser template to use. View and manage your templates at https://app.pdf.co/document-parser
  • template. optional. Code of the document parser template to use, passed directly.
  • inline. optional. Set to true to return results inside the response. Otherwise the endpoint returns a link to the generated output file.
  • outputFormat. optional. Default is JSON. Set to CSV or XML to generate CSV or XML output instead.
  • storeResult. optional. Set to true to store the generated results in PDF.co. You can view stored data extraction results at https://app.pdf.co/document-parser
  • password. optional. Password of the PDF file. Must be a String.
  • async. optional. Runs processing asynchronously. Returns a JobId that you can use with /job/check to check the state of the processing (possible states: working, failed, aborted and success). Must be one of: true, false.
  • encrypt. optional. Enables encryption for the output file. Must be one of: true, false.
  • name. optional. File name for the generated output. Must be a String.
  • profiles. optional. Must be a String. Sets additional options for custom configuration. See profiles samples for examples.
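
The async flow described in the list above could be driven by a small polling loop. This is a sketch, assuming the /job/check response decodes to a dict whose "status" key holds one of the listed states; that key name is an assumption, and check_job is a hypothetical callable you supply (for example, a thin wrapper around your HTTP client).

```python
import time


def wait_for_job(job_id, check_job, interval=2.0, max_checks=30, sleep=time.sleep):
    """Poll until the job reaches a final state and return the last response.

    check_job takes a job ID and returns the decoded /job/check response
    as a dict. The "status" key is an assumption, not a documented name.
    """
    for _ in range(max_checks):
        result = check_job(job_id)
        status = result.get("status")
        if status == "success":
            return result
        if status in ("failed", "aborted"):
            raise RuntimeError(f"job {job_id} ended with status {status!r}")
        # "working" (or anything unrecognized): wait and check again.
        sleep(interval)
    raise TimeoutError(f"job {job_id} still working after {max_checks} checks")


# Usage with a fake checker, so the sketch runs without network access.
fake_states = iter([{"status": "working"}, {"status": "success"}])
final = wait_for_job("demo-job", lambda job_id: next(fake_states), sleep=lambda s: None)
print(final["status"])
```

Injecting check_job and sleep keeps the loop testable and leaves the choice of HTTP client to the caller.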

Description

  • Method: POST
  • URL: /v1/pdf/documentparser

Query parameters

No query parameters accepted.

Body payload

{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
    "outputFormat": "JSON",
    "templateId": "1",
    "async": false,
    "encrypt": "false",
    "inline": "true",
    "password": "",
    "profiles": "",
    "storeResult": false
}

Example responses

/pdf/documentparser (output as JSON)
{
    "body": {
        "objects": [
            {
                "name": "companyName",
                "objectType": "field",
                "value": "Amazon Web Services, Inc",
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "companyName2",
                "objectType": "field",
                "value": "Amazon Web Services, Inc",
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "invoiceId",
                "objectType": "field",
                "value": "123456789",
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "dateIssued",
                "objectType": "field",
                "value": "2018-04-03T00:00:00",
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "dateDue",
                "objectType": "field",
                "value": "2018-04-03T00:00:00",
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "bankAccount",
                "objectType": "field",
                "value": "123456789012",
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "total",
                "objectType": "field",
                "value": 6.58,
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "subTotal",
                "objectType": "field",
                "value": ""
            },
            {
                "name": "tax",
                "objectType": "field",
                "value": 1.01,
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "objectType": "table",
                "name": "table",
                "rows": []
            }
        ],
        "templateName": "Generic Invoice [en]",
        "templateVersion": "4",
        "timestamp": "2020-08-21T19:23:31"
    },
    "pageCount": 1,
    "error": false,
    "status": 200,
    "name": "sample-invoice.json",
    "remainingCredits": 60803
}
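
A response like the one above can be reduced to plain dictionaries for downstream use. This is a minimal sketch, assuming an inline JSON response where "body" is an object with an "objects" list; the helper names are hypothetical and the sample is a trimmed copy of the example response.

```python
def fields_from_response(response):
    """Collect {field name: value} from the objects of a parser response."""
    objects = response.get("body", {}).get("objects", [])
    return {
        obj["name"]: obj.get("value")
        for obj in objects
        if obj.get("objectType") == "field"
    }


def tables_from_response(response):
    """Collect {table name: rows} from the objects of a parser response."""
    objects = response.get("body", {}).get("objects", [])
    return {
        obj["name"]: obj.get("rows", [])
        for obj in objects
        if obj.get("objectType") == "table"
    }


# Trimmed copy of the example response above.
sample = {
    "body": {
        "objects": [
            {"name": "invoiceId", "objectType": "field", "value": "123456789"},
            {"name": "total", "objectType": "field", "value": 6.58},
            {"name": "subTotal", "objectType": "field", "value": ""},
            {"objectType": "table", "name": "table", "rows": []},
        ]
    },
    "error": False,
    "status": 200,
}

print(fields_from_response(sample))
print(tables_from_response(sample))
```

Fields the template could not find come back as empty strings (see subTotal), so callers may want to treat "" as missing.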

Code Snippets

CURL
curl --location --request POST 'https://api.pdf.co/v1/pdf/documentparser' \
--header 'Content-Type: application/json' \
--header 'x-api-key: ' \
--data-raw '{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
    "outputFormat": "JSON",
    "templateId": "1",
    "async": false,
    "encrypt": "false",
    "inline": "true",
    "password": "",
    "profiles": "",
    "storeResult": false
}'
JavaScript
var myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");
myHeaders.append("x-api-key", "");

var raw = JSON.stringify({
 "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
 "outputFormat": "JSON",
 "templateId": "1",
 "async": false,
 "encrypt": "false",
 "inline": "true",
 "password": "",
 "profiles": "",
 "storeResult": false
});

var requestOptions = {
	method: 'POST',
	headers: myHeaders,
	body: raw,
	redirect: 'follow'
};

fetch("https://api.pdf.co/v1/pdf/documentparser", requestOptions)
	.then(response => response.text())
	.then(result => console.log(result))
	.catch(error => console.log('error', error));
NodeJs
var request = require('request');
var options = {
	'method': 'POST',
	'url': 'https://api.pdf.co/v1/pdf/documentparser',
	'headers': {
		'Content-Type': 'application/json',
		'x-api-key': ''
	},
	body: JSON.stringify({
	 "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
	 "outputFormat": "JSON",
	 "templateId": "1",
	 "async": false,
	 "encrypt": "false",
	 "inline": "true",
	 "password": "",
	 "profiles": "",
	 "storeResult": false
	})

};
request(options, function (error, response) {
	if (error) throw new Error(error);
	console.log(response.body);
});

PHP
<?php

$curl = curl_init();

curl_setopt_array($curl, array(
	CURLOPT_URL => 'https://api.pdf.co/v1/pdf/documentparser',
	CURLOPT_RETURNTRANSFER => true,
	CURLOPT_ENCODING => '',
	CURLOPT_MAXREDIRS => 10,
	CURLOPT_TIMEOUT => 0,
	CURLOPT_FOLLOWLOCATION => true,
	CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
	CURLOPT_CUSTOMREQUEST => 'POST',
	CURLOPT_POSTFIELDS =>'{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
    "outputFormat": "JSON",
    "templateId": "1",
    "async": false,
    "encrypt": "false",
    "inline": "true",
    "password": "",
    "profiles": "",
    "storeResult": false
}',
	CURLOPT_HTTPHEADER => array(
		'Content-Type: application/json',
		'x-api-key: '
	),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;

Java
import java.io.*;
import okhttp3.*;
public class main {
	public static void main(String []args) throws IOException{
		OkHttpClient client = new OkHttpClient().newBuilder()
			.build();
		MediaType mediaType = MediaType.parse("application/json");
		RequestBody body = RequestBody.create(mediaType, "{\n    \"url\": \"https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf\",\n    \"outputFormat\": \"JSON\",\n    \"templateId\": \"1\",\n    \"async\": false,\n    \"encrypt\": \"false\",\n    \"inline\": \"true\",\n    \"password\": \"\",\n    \"profiles\": \"\",\n    \"storeResult\": false\n}");
		Request request = new Request.Builder()
			.url("https://api.pdf.co/v1/pdf/documentparser")
			.method("POST", body)
			.addHeader("Content-Type", "application/json")
			.addHeader("x-api-key", "")
			.build();
		Response response = client.newCall(request).execute();
		System.out.println(response.body().string());
	}
}

C#
using System;
using RestSharp;
namespace HelloWorldApplication {
	class HelloWorld {
		static void Main(string[] args) {
			var client = new RestClient("https://api.pdf.co/v1/pdf/documentparser");
			client.Timeout = -1;
			var request = new RestRequest(Method.POST);
			request.AddHeader("Content-Type", "application/json");
			request.AddHeader("x-api-key", "");
			var body = @"{" + "\n" +
			@"    ""url"": ""https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf""," + "\n" +
			@"    ""outputFormat"": ""JSON""," + "\n" +
			@"    ""templateId"": ""1""," + "\n" +
			@"    ""async"": false," + "\n" +
			@"    ""encrypt"": ""false""," + "\n" +
			@"    ""inline"": ""true""," + "\n" +
			@"    ""password"": """"," + "\n" +
			@"    ""profiles"": """"," + "\n" +
			@"    ""storeResult"": false" + "\n" +
			@"}";
			request.AddParameter("application/json", body,  ParameterType.RequestBody);
			IRestResponse response = client.Execute(request);
			Console.WriteLine(response.Content);
		}
	}
}

Python
import requests
import json

url = "https://api.pdf.co/v1/pdf/documentparser"

payload = json.dumps({
 "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
 "outputFormat": "JSON",
 "templateId": "1",
 "async": False,
 "encrypt": "false",
 "inline": "true",
 "password": "",
 "profiles": "",
 "storeResult": False
})
headers = {
	'Content-Type': 'application/json',
	'x-api-key': ''
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)

Powershell
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Content-Type", "application/json")
$headers.Add("x-api-key", "")

$body = "{`n    `"url`": `"https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf`",`n    `"outputFormat`": `"JSON`",`n    `"templateId`": `"1`",`n    `"async`": false,`n    `"encrypt`": `"false`",`n    `"inline`": `"true`",`n    `"password`": `"`",`n    `"profiles`": `"`",`n    `"storeResult`": false`n}"

$response = Invoke-RestMethod 'https://api.pdf.co/v1/pdf/documentparser' -Method 'POST' -Headers $headers -Body $body
$response | ConvertTo-Json

[POST] /pdf/documentparser (output as XML)

Description: Gets data from documents using a data extraction template. With this API method you may extract data from custom areas, by search, form fields, tables, multiple pages and more!

Additional tools and guides:

  • Document Parser Template Editor
  • Template Manual Coding Guide

Parameters

  • url. required. URL to the source file. Supports links from Google Drive, Dropbox, and the built-in PDF.co file storage. To upload files via the API, see the Files Upload section. If you intermittently get Too Many Requests or Access Denied errors for your input URL, try adding cache: to the URL to enable built-in URL caching.
  • templateId. required. ID of the document parser template to use. View and manage your templates at https://app.pdf.co/document-parser
  • template. optional. Code of the document parser template to use, passed directly.
  • inline. optional. Set to true to return results inside the response. Otherwise the endpoint returns a link to the generated output file.
  • outputFormat. optional. Default is JSON. Set to CSV or XML to generate CSV or XML output instead.
  • storeResult. optional. Set to true to store the generated results in PDF.co. You can view stored data extraction results at https://app.pdf.co/document-parser
  • password. optional. Password of the PDF file. Must be a String.
  • async. optional. Runs processing asynchronously. Returns a JobId that you can use with /job/check to check the state of the processing (possible states: working, failed, aborted and success). Must be one of: true, false.
  • encrypt. optional. Enables encryption for the output file. Must be one of: true, false.
  • name. optional. File name for the generated output. Must be a String.
  • profiles. optional. Must be a String. Sets additional options for custom configuration. See profiles samples for examples.
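
The caching hint for the url param could be applied with a small helper. This is a sketch that assumes cache: is prepended directly to the source URL; the helper name is hypothetical and the exact prefix format should be checked against the PDF.co documentation.

```python
def with_cache_prefix(source_url):
    """Return the URL with the cache: prefix, adding it only if missing."""
    if source_url.startswith("cache:"):
        return source_url
    return "cache:" + source_url


# Placeholder URL, for illustration only.
print(with_cache_prefix("https://example.com/file1.pdf"))
```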

Description

  • Method: POST
  • URL: /v1/pdf/documentparser

Query parameters

No query parameters accepted.

Body payload

{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
    "outputFormat": "XML",
    "templateId": "1",
    "async": false,
    "encrypt": "false",
    "inline": "true",
    "password": "",
    "profiles": "",
    "storeResult": false
}

Example responses

/pdf/documentparser (output as JSON)
{
    "body": {
        "objects": [
            {
                "name": "companyName",
                "objectType": "field",
                "value": "Amazon Web Services, Inc",
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "companyName2",
                "objectType": "field",
                "value": "Amazon Web Services, Inc",
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "invoiceId",
                "objectType": "field",
                "value": "123456789",
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "dateIssued",
                "objectType": "field",
                "value": "2018-04-03T00:00:00",
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "dateDue",
                "objectType": "field",
                "value": "2018-04-03T00:00:00",
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "bankAccount",
                "objectType": "field",
                "value": "123456789012",
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "total",
                "objectType": "field",
                "value": 6.58,
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "subTotal",
                "objectType": "field",
                "value": ""
            },
            {
                "name": "tax",
                "objectType": "field",
                "value": 1.01,
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "objectType": "table",
                "name": "table",
                "rows": []
            }
        ],
        "templateName": "Generic Invoice [en]",
        "templateVersion": "4",
        "timestamp": "2020-08-21T19:23:31"
    },
    "pageCount": 1,
    "error": false,
    "status": 200,
    "name": "sample-invoice.json",
    "remainingCredits": 60803
}

Code Snippets

CURL
curl --location --request POST 'https://api.pdf.co/v1/pdf/documentparser' \
--header 'Content-Type: application/json' \
--header 'x-api-key: ' \
--data-raw '{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
    "outputFormat": "XML",
    "templateId": "1",
    "async": false,
    "encrypt": "false",
    "inline": "true",
    "password": "",
    "profiles": "",
    "storeResult": false
}'
JavaScript
var myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");
myHeaders.append("x-api-key", "");

var raw = JSON.stringify({
 "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
 "outputFormat": "XML",
 "templateId": "1",
 "async": false,
 "encrypt": "false",
 "inline": "true",
 "password": "",
 "profiles": "",
 "storeResult": false
});

var requestOptions = {
	method: 'POST',
	headers: myHeaders,
	body: raw,
	redirect: 'follow'
};

fetch("https://api.pdf.co/v1/pdf/documentparser", requestOptions)
	.then(response => response.text())
	.then(result => console.log(result))
	.catch(error => console.log('error', error));
NodeJs
var request = require('request');
var options = {
	'method': 'POST',
	'url': 'https://api.pdf.co/v1/pdf/documentparser',
	'headers': {
		'Content-Type': 'application/json',
		'x-api-key': ''
	},
	body: JSON.stringify({
	 "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
	 "outputFormat": "XML",
	 "templateId": "1",
	 "async": false,
	 "encrypt": "false",
	 "inline": "true",
	 "password": "",
	 "profiles": "",
	 "storeResult": false
	})

};
request(options, function (error, response) {
	if (error) throw new Error(error);
	console.log(response.body);
});

PHP
<?php

$curl = curl_init();

curl_setopt_array($curl, array(
	CURLOPT_URL => 'https://api.pdf.co/v1/pdf/documentparser',
	CURLOPT_RETURNTRANSFER => true,
	CURLOPT_ENCODING => '',
	CURLOPT_MAXREDIRS => 10,
	CURLOPT_TIMEOUT => 0,
	CURLOPT_FOLLOWLOCATION => true,
	CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
	CURLOPT_CUSTOMREQUEST => 'POST',
	CURLOPT_POSTFIELDS =>'{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
    "outputFormat": "XML",
    "templateId": "1",
    "async": false,
    "encrypt": "false",
    "inline": "true",
    "password": "",
    "profiles": "",
    "storeResult": false
}',
	CURLOPT_HTTPHEADER => array(
		'Content-Type: application/json',
		'x-api-key: '
	),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;

Java
import java.io.*;
import okhttp3.*;
public class main {
	public static void main(String []args) throws IOException{
		OkHttpClient client = new OkHttpClient().newBuilder()
			.build();
		MediaType mediaType = MediaType.parse("application/json");
		RequestBody body = RequestBody.create(mediaType, "{\n    \"url\": \"https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf\",\n    \"outputFormat\": \"XML\",\n    \"templateId\": \"1\",\n    \"async\": false,\n    \"encrypt\": \"false\",\n    \"inline\": \"true\",\n    \"password\": \"\",\n    \"profiles\": \"\",\n    \"storeResult\": false\n}");
		Request request = new Request.Builder()
			.url("https://api.pdf.co/v1/pdf/documentparser")
			.method("POST", body)
			.addHeader("Content-Type", "application/json")
			.addHeader("x-api-key", "")
			.build();
		Response response = client.newCall(request).execute();
		System.out.println(response.body().string());
	}
}

C#
using System;
using RestSharp;
namespace HelloWorldApplication {
	class HelloWorld {
		static void Main(string[] args) {
			var client = new RestClient("https://api.pdf.co/v1/pdf/documentparser");
			client.Timeout = -1;
			var request = new RestRequest(Method.POST);
			request.AddHeader("Content-Type", "application/json");
			request.AddHeader("x-api-key", "");
			var body = @"{" + "\n" +
			@"    ""url"": ""https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf""," + "\n" +
			@"    ""outputFormat"": ""XML""," + "\n" +
			@"    ""templateId"": ""1""," + "\n" +
			@"    ""async"": false," + "\n" +
			@"    ""encrypt"": ""false""," + "\n" +
			@"    ""inline"": ""true""," + "\n" +
			@"    ""password"": """"," + "\n" +
			@"    ""profiles"": """"," + "\n" +
			@"    ""storeResult"": false" + "\n" +
			@"}";
			request.AddParameter("application/json", body,  ParameterType.RequestBody);
			IRestResponse response = client.Execute(request);
			Console.WriteLine(response.Content);
		}
	}
}

Python
import requests
import json

url = "https://api.pdf.co/v1/pdf/documentparser"

payload = json.dumps({
 "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
 "outputFormat": "XML",
 "templateId": "1",
 "async": False,
 "encrypt": "false",
 "inline": "true",
 "password": "",
 "profiles": "",
 "storeResult": False
})
headers = {
	'Content-Type': 'application/json',
	'x-api-key': ''
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)

Powershell
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Content-Type", "application/json")
$headers.Add("x-api-key", "")

$body = "{`n    `"url`": `"https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf`",`n    `"outputFormat`": `"XML`",`n    `"templateId`": `"1`",`n    `"async`": false,`n    `"encrypt`": `"false`",`n    `"inline`": `"true`",`n    `"password`": `"`",`n    `"profiles`": `"`",`n    `"storeResult`": false`n}"

$response = Invoke-RestMethod 'https://api.pdf.co/v1/pdf/documentparser' -Method 'POST' -Headers $headers -Body $body
$response | ConvertTo-Json

[POST] /pdf/documentparser (output as CSV)

Description: Gets data from documents using a data extraction template. With this API method you may extract data from custom areas, by search, form fields, tables, multiple pages and more!

Additional tools and guides:

  • Document Parser Template Editor
  • Template Manual Coding Guide

Parameters

  • url. required. URL to the source file. Supports links from Google Drive, Dropbox, and the built-in PDF.co file storage. To upload files via the API, see the Files Upload section. If you intermittently get Too Many Requests or Access Denied errors for your input URL, try adding cache: to the URL to enable built-in URL caching.
  • templateId. required. ID of the document parser template to use. View and manage your templates at https://app.pdf.co/document-parser
  • template. optional. Code of the document parser template to use, passed directly.
  • inline. optional. Set to true to return results inside the response. Otherwise the endpoint returns a link to the generated output file.
  • outputFormat. optional. Default is JSON. Set to CSV or XML to generate CSV or XML output instead.
  • storeResult. optional. Set to true to store the generated results in PDF.co. You can view stored data extraction results at https://app.pdf.co/document-parser
  • password. optional. Password of the PDF file. Must be a String.
  • async. optional. Runs processing asynchronously. Returns a JobId that you can use with /job/check to check the state of the processing (possible states: working, failed, aborted and success). Must be one of: true, false.
  • encrypt. optional. Enables encryption for the output file. Must be one of: true, false.
  • name. optional. File name for the generated output. Must be a String.
  • profiles. optional. Must be a String. Sets additional options for custom configuration. See profiles samples for examples.

Description

  • Method: POST
  • URL: /v1/pdf/documentparser

Query parameters

No query parameters accepted.

Body payload

{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
    "templateId": "1",
    "outputFormat": "CSV",
    "generateCsvHeaders": true,

    "async": false,
    "encrypt": "false",
    "inline": "true",
    "password": "",
    "storeResult": false

}

Example responses

/pdf/documentparser (output as CSV)
{
    "body": "companyName,companyName2,invoiceId,dateIssued,dateDue,bankAccount,total,subTotal,tax,tableNames,tables\r\n\"Amazon Web Services, Inc\",\"Amazon Web Services, Inc\",123456789,2018-04-03T00:00:00,2018-04-03T00:00:00,123456789012,6.58,,1.01,table,\r\n\r\n",
    "pageCount": 1,
    "error": false,
    "status": 200,
    "name": "sample-invoice.csv",
    "remainingCredits": 60804
}
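
With inline output, the CSV arrives as a single string in "body". It could be turned into rows with Python's standard csv module; this is a sketch in which the body string is copied from the example response above and the helper name is hypothetical.

```python
import csv
import io

# "body" value copied from the example response above.
csv_body = (
    "companyName,companyName2,invoiceId,dateIssued,dateDue,bankAccount,"
    "total,subTotal,tax,tableNames,tables\r\n"
    "\"Amazon Web Services, Inc\",\"Amazon Web Services, Inc\",123456789,"
    "2018-04-03T00:00:00,2018-04-03T00:00:00,123456789012,6.58,,1.01,table,\r\n"
    "\r\n"
)


def rows_from_csv(body):
    """Parse the CSV body into a list of dicts keyed by the header row."""
    # DictReader skips fully blank lines, such as the trailing one here.
    reader = csv.DictReader(io.StringIO(body, newline=""))
    return list(reader)


rows = rows_from_csv(csv_body)
print(rows[0]["companyName"], rows[0]["total"])
```

All values come back as strings, so numeric fields such as total need an explicit conversion, and quoted values that contain commas (the company name) stay intact.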

Code Snippets

CURL
curl --location --request POST 'https://api.pdf.co/v1/pdf/documentparser' \
--header 'Content-Type: application/json' \
--header 'x-api-key: ' \
--data-raw '{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
    "templateId": "1",
    "outputFormat": "CSV",
    "generateCsvHeaders": true,

    "async": false,
    "encrypt": "false",
    "inline": "true",
    "password": "",
    "storeResult": false

}'
JavaScript
var myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");
myHeaders.append("x-api-key", "");

var raw = JSON.stringify({
 "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
 "templateId": "1",
 "outputFormat": "CSV",
 "generateCsvHeaders": true,
 "async": false,
 "encrypt": "false",
 "inline": "true",
 "password": "",
 "storeResult": false
});

var requestOptions = {
	method: 'POST',
	headers: myHeaders,
	body: raw,
	redirect: 'follow'
};

fetch("https://api.pdf.co/v1/pdf/documentparser", requestOptions)
	.then(response => response.text())
	.then(result => console.log(result))
	.catch(error => console.log('error', error));
NodeJs
var request = require('request');
var options = {
	'method': 'POST',
	'url': 'https://api.pdf.co/v1/pdf/documentparser',
	'headers': {
		'Content-Type': 'application/json',
		'x-api-key': ''
	},
	body: JSON.stringify({
	 "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
	 "templateId": "1",
	 "outputFormat": "CSV",
	 "generateCsvHeaders": true,
	 "async": false,
	 "encrypt": "false",
	 "inline": "true",
	 "password": "",
	 "storeResult": false
	})

};
request(options, function (error, response) {
	if (error) throw new Error(error);
	console.log(response.body);
});

PHP
<?php

$curl = curl_init();

curl_setopt_array($curl, array(
	CURLOPT_URL => 'https://api.pdf.co/v1/pdf/documentparser',
	CURLOPT_RETURNTRANSFER => true,
	CURLOPT_ENCODING => '',
	CURLOPT_MAXREDIRS => 10,
	CURLOPT_TIMEOUT => 0,
	CURLOPT_FOLLOWLOCATION => true,
	CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
	CURLOPT_CUSTOMREQUEST => 'POST',
	CURLOPT_POSTFIELDS =>'{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
    "templateId": "1",
    "outputFormat": "CSV",
    "generateCsvHeaders": true,

    "async": false,
    "encrypt": "false",
    "inline": "true",
    "password": "",
    "storeResult": false

}',
	CURLOPT_HTTPHEADER => array(
		'Content-Type: application/json',
		'x-api-key: '
	),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;

Java
import java.io.*;
import okhttp3.*;
public class main {
	public static void main(String []args) throws IOException{
		OkHttpClient client = new OkHttpClient().newBuilder()
			.build();
		MediaType mediaType = MediaType.parse("application/json");
		RequestBody body = RequestBody.create(mediaType, "{\n    \"url\": \"https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf\",\n    \"templateId\": \"1\",\n    \"outputFormat\": \"CSV\",\n    \"generateCsvHeaders\": true,\n\n    \"async\": false,\n    \"encrypt\": \"false\",\n    \"inline\": \"true\",\n    \"password\": \"\",\n    \"storeResult\": false\n\n}");
		Request request = new Request.Builder()
			.url("https://api.pdf.co/v1/pdf/documentparser")
			.method("POST", body)
			.addHeader("Content-Type", "application/json")
			.addHeader("x-api-key", "")
			.build();
		Response response = client.newCall(request).execute();
		System.out.println(response.body().string());
	}
}

C#
using System;
using RestSharp;
namespace HelloWorldApplication {
	class HelloWorld {
		static void Main(string[] args) {
			var client = new RestClient("https://api.pdf.co/v1/pdf/documentparser");
			client.Timeout = -1;
			var request = new RestRequest(Method.POST);
			request.AddHeader("Content-Type", "application/json");
			request.AddHeader("x-api-key", "");
			var body = @"{" + "\n" +
			@"    ""url"": ""https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf""," + "\n" +
			@"    ""templateId"": ""1""," + "\n" +
			@"    ""outputFormat"": ""CSV""," + "\n" +
			@"    ""generateCsvHeaders"": true," + "\n" +
			@"" + "\n" +
			@"    ""async"": false," + "\n" +
			@"    ""encrypt"": ""false""," + "\n" +
			@"    ""inline"": ""true""," + "\n" +
			@"    ""password"": """"," + "\n" +
			@"    ""storeResult"": false" + "\n" +
			@"" + "\n" +
			@"}";
			request.AddParameter("application/json", body,  ParameterType.RequestBody);
			IRestResponse response = client.Execute(request);
			Console.WriteLine(response.Content);
		}
	}
}

Python
import requests
import json

url = "https://api.pdf.co/v1/pdf/documentparser"

payload = json.dumps({
 "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf",
 "templateId": "1",
 "outputFormat": "CSV",
 "generateCsvHeaders": True,
 "async": False,
 "encrypt": "false",
 "inline": "true",
 "password": "",
 "storeResult": False
})
headers = {
	'Content-Type': 'application/json',
	'x-api-key': ''
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)

Powershell
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Content-Type", "application/json")
$headers.Add("x-api-key", "")

$body = "{`n    `"url`": `"https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/sample-invoice.pdf`",`n    `"templateId`": `"1`",`n    `"outputFormat`": `"CSV`",`n    `"generateCsvHeaders`": true,`n`n    `"async`": false,`n    `"encrypt`": `"false`",`n    `"inline`": `"true`",`n    `"password`": `"`",`n    `"storeResult`": false`n`n}"

$response = Invoke-RestMethod 'https://api.pdf.co/v1/pdf/documentparser' -Method 'POST' -Headers $headers -Body $body
$response | ConvertTo-Json

[POST] /pdf/documentparser (output as JSON, custom template code)

Description: Parses a document and extracts data using a custom data extraction template prepared in advance. With this API method you can extract data from custom areas, by search, and from form fields, tables, multiple pages, and more.

Parameters

  • url required. URL of the source file. Supports links from Google Drive, Dropbox, and the built-in PDF.co file storage. To upload files via the API, see the Files Upload section. If you intermittently get a Too Many Requests or Access Denied error for your input URL, try adding cache: in front of the URL to enable built-in URL caching.
  • templateId required. ID of the document parser template to use. View and manage your templates at https://app.pdf.co/document-parser
  • template optional. Document parser template code, passed directly in the request.
  • inline optional. Set to true to return results inside the response. Otherwise the endpoint returns a link to the generated output file.
  • outputFormat optional. Default is JSON. Set to CSV or XML to generate CSV or XML output instead.
  • storeResult optional. Set to true to store the generated results in PDF.co. You can view stored data extraction results at https://app.pdf.co/document-parser
  • password optional. Password of the PDF file. Must be a String.
  • async optional. Runs processing asynchronously. Returns a JobId that you can use with /job/check to check the state of the processing (possible states: working, failed, aborted and success). Must be one of: true, false.
  • encrypt optional. Enables encryption for the output file. Must be one of: true, false.
  • name optional. File name for the generated output. Must be a String.
  • profiles optional. Must be a String. Sets additional options for custom configuration. See profiles samples for examples.
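
When async is set to true, the call returns a job ID rather than the result, so the job state has to be polled until it leaves the working state. A minimal Python sketch of such a polling loop follows. The `fetch_status` callable is a hypothetical stand-in for your own code that calls /job/check and returns the state string; the request and response shape of /job/check is not described in this section, so it is deliberately left out. The function name, interval, and retry limit are illustrative assumptions; only the four state names come from the parameter list above.

```python
import time
from typing import Callable

# Final states, as listed for the async parameter above.
FINAL_STATES = {"success", "failed", "aborted"}


def wait_for_job(job_id: str,
                 fetch_status: Callable[[str], str],
                 interval_seconds: float = 3.0,
                 max_checks: int = 100) -> str:
    """Poll until the job leaves the 'working' state and return the final state.

    fetch_status is a hypothetical callable that wraps a /job/check call
    and returns one of: working, failed, aborted, success.
    """
    for _ in range(max_checks):
        status = fetch_status(job_id)
        if status in FINAL_STATES:
            return status
        if status != "working":
            raise ValueError(f"unexpected job state: {status!r}")
        time.sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} still working after {max_checks} checks")
```

Passing the status lookup in as a callable keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned states.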

Description

  • Method: POST
  • URL: /v1/pdf/documentparser

Query parameters

No query parameters accepted.

Body payload

{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/MultiPageTable.pdf",
    "template": "{\r\n  \"templateVersion\": 3,\r\n  \"templatePriority\": 0,\r\n  \"sourceId\": \"Multipage Table Test\",\r\n  \"detectionRules\": {\r\n    \"keywords\": [\r\n      \"Sample document with multi-page table\"\r\n    ]\r\n  },\r\n  \"fields\": {\r\n    \"total\": {\r\n      \"type\": \"regex\",\r\n      \"expression\": \"TOTAL \",\r\n      \"dataType\": \"decimal\"\r\n    }\r\n  },\r\n  \"tables\": [\r\n    {\r\n      \"name\": \"table1\",\r\n      \"start\": {\r\n        \"expression\": \"Item\\\\s+Description\\\\s+Price\\\\s+Qty\\\\s+Extended Price\"\r\n      },\r\n      \"end\": {\r\n        \"expression\": \"TOTAL\\\\s+\\\\d+\\\\.\\\\d\\\\d\"\r\n      },\r\n      \"row\": {\r\n        \"expression\": \"^\\\\s*(?<itemNo>\\\\d+)\\\\s+(?<description>.+?)\\\\s+(?<price>\\\\d+\\\\.\\\\d\\\\d)\\\\s+(?<qty>\\\\d+)\\\\s+(?<extPrice>\\\\d+\\\\.\\\\d\\\\d)\"\r\n      },\r\n      \"columns\": [\r\n        {\r\n          \"name\": \"itemNo\",\r\n          \"type\": \"integer\"\r\n        },\r\n        {\r\n          \"name\": \"description\",\r\n          \"type\": \"string\"\r\n        },\r\n        {\r\n          \"name\": \"price\",\r\n          \"type\": \"decimal\"\r\n        },\r\n        {\r\n          \"name\": \"qty\",\r\n          \"type\": \"integer\"\r\n        },\r\n        {\r\n          \"name\": \"extPrice\",\r\n          \"type\": \"decimal\"\r\n        }\r\n      ],\r\n      \"multipage\": true\r\n    }\r\n  ]\r\n}",
    "outputFormat": "JSON",
    "async": false,
    "encrypt": "false",
    "inline": "true",
    "profiles": "",
    "password": "",
    "storeResult": false
}
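
The template value in the payload above is itself a JSON document serialized into a string, which is why every quote and backslash in it appears escaped. Rather than escaping by hand, one option is to build the template as a dictionary and serialize it twice. The Python sketch below assumes that approach; the helper name `build_request_body` is illustrative only, the template keys are copied from the payload above, and the optional encrypt, profiles, and password parameters are omitted.

```python
import json

# Template structure copied from the example payload above.
template = {
    "templateVersion": 3,
    "templatePriority": 0,
    "sourceId": "Multipage Table Test",
    "detectionRules": {"keywords": ["Sample document with multi-page table"]},
    "fields": {
        "total": {"type": "regex", "expression": "TOTAL ", "dataType": "decimal"}
    },
    "tables": [
        {
            "name": "table1",
            # Raw strings hold each regex exactly as the parser should see it.
            "start": {"expression": r"Item\s+Description\s+Price\s+Qty\s+Extended Price"},
            "end": {"expression": r"TOTAL\s+\d+\.\d\d"},
            "row": {
                "expression": r"^\s*(?<itemNo>\d+)\s+(?<description>.+?)\s+"
                              r"(?<price>\d+\.\d\d)\s+(?<qty>\d+)\s+(?<extPrice>\d+\.\d\d)"
            },
            "columns": [
                {"name": "itemNo", "type": "integer"},
                {"name": "description", "type": "string"},
                {"name": "price", "type": "decimal"},
                {"name": "qty", "type": "integer"},
                {"name": "extPrice", "type": "decimal"},
            ],
            "multipage": True,
        }
    ],
}


def build_request_body(source_url: str, template: dict) -> str:
    """Return the JSON request body with the template embedded as a string."""
    payload = {
        "url": source_url,
        "template": json.dumps(template, indent=2),  # inner serialization
        "outputFormat": "JSON",
        "async": False,
        "inline": "true",
        "storeResult": False,
    }
    return json.dumps(payload)  # outer serialization adds the escaping
```

The returned string can be sent as the request body by any of the clients shown in the code snippets for this endpoint.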

Example responses

POST /pdf/documentparser
{
    "body": {
        "objects": [
            {
                "name": "companyName",
                "objectType": "field",
                "value": "Amazon Web Services, Inc",
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "companyName2",
                "objectType": "field",
                "value": "Amazon Web Services, Inc",
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "invoiceId",
                "objectType": "field",
                "value": "123456789",
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "dateIssued",
                "objectType": "field",
                "value": "2018-04-03T00:00:00",
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "dateDue",
                "objectType": "field",
                "value": "2018-04-03T00:00:00",
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "total",
                "objectType": "field",
                "value": 6.58,
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "name": "subTotal",
                "objectType": "field",
                "value": ""
            },
            {
                "name": "tax",
                "objectType": "field",
                "value": 1.01,
                "pageIndex": 0,
                "rectangle": [
                    0,
                    0,
                    0,
                    0
                ]
            },
            {
                "objectType": "table",
                "name": "table",
                "rows": []
            }
        ],
        "templateName": "Generic Invoice [en]",
        "templateVersion": "4",
        "timestamp": "2020-07-16T22:04:25"
    },
    "pageCount": 1,
    "error": false,
    "status": 200,
    "name": "sample-invoice.json",
    "remainingCredits": 77731
}
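
In the response above, the extracted values sit in body.objects, and each entry is tagged with an objectType of either field or table. The following Python sketch shows one way to read such a response once it has been decoded from JSON. The function name `split_objects` is hypothetical, and the sample is a trimmed copy of the response above.

```python
import json
from typing import Any, Dict, Tuple


def split_objects(response: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, list]]:
    """Separate extracted fields from tables in a decoded API response.

    Returns (fields, tables): fields maps name -> value,
    tables maps name -> list of rows.
    """
    if response.get("error"):
        raise RuntimeError(f"parser call failed with status {response.get('status')}")
    fields: Dict[str, Any] = {}
    tables: Dict[str, list] = {}
    for obj in response["body"]["objects"]:
        if obj["objectType"] == "field":
            fields[obj["name"]] = obj["value"]
        elif obj["objectType"] == "table":
            tables[obj["name"]] = obj.get("rows", [])
    return fields, tables


# Trimmed copy of the example response above.
sample = json.loads("""
{
    "body": {
        "objects": [
            {"name": "invoiceId", "objectType": "field", "value": "123456789"},
            {"name": "total", "objectType": "field", "value": 6.58},
            {"name": "subTotal", "objectType": "field", "value": ""},
            {"objectType": "table", "name": "table", "rows": []}
        ]
    },
    "pageCount": 1,
    "error": false,
    "status": 200
}
""")

fields, tables = split_objects(sample)
print(fields["total"])
print(tables)
```

Note that subTotal comes back as an empty string in the sample, so callers may want to check for empty values before converting them to numbers.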

Code Snippets

CURL
curl --location --request POST 'https://api.pdf.co/v1/pdf/documentparser' \
--header 'Content-Type: application/json' \
--header 'x-api-key: ' \
--data-raw '{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/MultiPageTable.pdf",
    "template": "{\r\n  \"templateVersion\": 3,\r\n  \"templatePriority\": 0,\r\n  \"sourceId\": \"Multipage Table Test\",\r\n  \"detectionRules\": {\r\n    \"keywords\": [\r\n      \"Sample document with multi-page table\"\r\n    ]\r\n  },\r\n  \"fields\": {\r\n    \"total\": {\r\n      \"type\": \"regex\",\r\n      \"expression\": \"TOTAL \",\r\n      \"dataType\": \"decimal\"\r\n    }\r\n  },\r\n  \"tables\": [\r\n    {\r\n      \"name\": \"table1\",\r\n      \"start\": {\r\n        \"expression\": \"Item\\\\s+Description\\\\s+Price\\\\s+Qty\\\\s+Extended Price\"\r\n      },\r\n      \"end\": {\r\n        \"expression\": \"TOTAL\\\\s+\\\\d+\\\\.\\\\d\\\\d\"\r\n      },\r\n      \"row\": {\r\n        \"expression\": \"^\\\\s*(?<itemNo>\\\\d+)\\\\s+(?<description>.+?)\\\\s+(?<price>\\\\d+\\\\.\\\\d\\\\d)\\\\s+(?<qty>\\\\d+)\\\\s+(?<extPrice>\\\\d+\\\\.\\\\d\\\\d)\"\r\n      },\r\n      \"columns\": [\r\n        {\r\n          \"name\": \"itemNo\",\r\n          \"type\": \"integer\"\r\n        },\r\n        {\r\n          \"name\": \"description\",\r\n          \"type\": \"string\"\r\n        },\r\n        {\r\n          \"name\": \"price\",\r\n          \"type\": \"decimal\"\r\n        },\r\n        {\r\n          \"name\": \"qty\",\r\n          \"type\": \"integer\"\r\n        },\r\n        {\r\n          \"name\": \"extPrice\",\r\n          \"type\": \"decimal\"\r\n        }\r\n      ],\r\n      \"multipage\": true\r\n    }\r\n  ]\r\n}",
    "outputFormat": "JSON",
    "async": false,
    "encrypt": "false",
    "inline": "true",
    "profiles": "",
    "password": "",
    "storeResult": false
}'
JavaScript
var myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");
myHeaders.append("x-api-key", "");

var raw = JSON.stringify({
 "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/MultiPageTable.pdf",
 "template": "{\r\n  \"templateVersion\": 3,\r\n  \"templatePriority\": 0,\r\n  \"sourceId\": \"Multipage Table Test\",\r\n  \"detectionRules\": {\r\n    \"keywords\": [\r\n      \"Sample document with multi-page table\"\r\n    ]\r\n  },\r\n  \"fields\": {\r\n    \"total\": {\r\n      \"type\": \"regex\",\r\n      \"expression\": \"TOTAL \",\r\n      \"dataType\": \"decimal\"\r\n    }\r\n  },\r\n  \"tables\": [\r\n    {\r\n      \"name\": \"table1\",\r\n      \"start\": {\r\n        \"expression\": \"Item\\\\s+Description\\\\s+Price\\\\s+Qty\\\\s+Extended Price\"\r\n      },\r\n      \"end\": {\r\n        \"expression\": \"TOTAL\\\\s+\\\\d+\\\\.\\\\d\\\\d\"\r\n      },\r\n      \"row\": {\r\n        \"expression\": \"^\\\\s*(?<itemNo>\\\\d+)\\\\s+(?<description>.+?)\\\\s+(?<price>\\\\d+\\\\.\\\\d\\\\d)\\\\s+(?<qty>\\\\d+)\\\\s+(?<extPrice>\\\\d+\\\\.\\\\d\\\\d)\"\r\n      },\r\n      \"columns\": [\r\n        {\r\n          \"name\": \"itemNo\",\r\n          \"type\": \"integer\"\r\n        },\r\n        {\r\n          \"name\": \"description\",\r\n          \"type\": \"string\"\r\n        },\r\n        {\r\n          \"name\": \"price\",\r\n          \"type\": \"decimal\"\r\n        },\r\n        {\r\n          \"name\": \"qty\",\r\n          \"type\": \"integer\"\r\n        },\r\n        {\r\n          \"name\": \"extPrice\",\r\n          \"type\": \"decimal\"\r\n        }\r\n      ],\r\n      \"multipage\": true\r\n    }\r\n  ]\r\n}",
 "outputFormat": "JSON",
 "async": false,
 "encrypt": "false",
 "inline": "true",
 "profiles": "",
 "password": "",
 "storeResult": false
});

var requestOptions = {
	method: 'POST',
	headers: myHeaders,
	body: raw,
	redirect: 'follow'
};

fetch("https://api.pdf.co/v1/pdf/documentparser", requestOptions)
	.then(response => response.text())
	.then(result => console.log(result))
	.catch(error => console.log('error', error));
NodeJs
var request = require('request');
var options = {
	'method': 'POST',
	'url': 'https://api.pdf.co/v1/pdf/documentparser',
	'headers': {
		'Content-Type': 'application/json',
		'x-api-key': ''
	},
	body: JSON.stringify({
	 "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/MultiPageTable.pdf",
	 "template": "{\r\n  \"templateVersion\": 3,\r\n  \"templatePriority\": 0,\r\n  \"sourceId\": \"Multipage Table Test\",\r\n  \"detectionRules\": {\r\n    \"keywords\": [\r\n      \"Sample document with multi-page table\"\r\n    ]\r\n  },\r\n  \"fields\": {\r\n    \"total\": {\r\n      \"type\": \"regex\",\r\n      \"expression\": \"TOTAL \",\r\n      \"dataType\": \"decimal\"\r\n    }\r\n  },\r\n  \"tables\": [\r\n    {\r\n      \"name\": \"table1\",\r\n      \"start\": {\r\n        \"expression\": \"Item\\\\s+Description\\\\s+Price\\\\s+Qty\\\\s+Extended Price\"\r\n      },\r\n      \"end\": {\r\n        \"expression\": \"TOTAL\\\\s+\\\\d+\\\\.\\\\d\\\\d\"\r\n      },\r\n      \"row\": {\r\n        \"expression\": \"^\\\\s*(?<itemNo>\\\\d+)\\\\s+(?<description>.+?)\\\\s+(?<price>\\\\d+\\\\.\\\\d\\\\d)\\\\s+(?<qty>\\\\d+)\\\\s+(?<extPrice>\\\\d+\\\\.\\\\d\\\\d)\"\r\n      },\r\n      \"columns\": [\r\n        {\r\n          \"name\": \"itemNo\",\r\n          \"type\": \"integer\"\r\n        },\r\n        {\r\n          \"name\": \"description\",\r\n          \"type\": \"string\"\r\n        },\r\n        {\r\n          \"name\": \"price\",\r\n          \"type\": \"decimal\"\r\n        },\r\n        {\r\n          \"name\": \"qty\",\r\n          \"type\": \"integer\"\r\n        },\r\n        {\r\n          \"name\": \"extPrice\",\r\n          \"type\": \"decimal\"\r\n        }\r\n      ],\r\n      \"multipage\": true\r\n    }\r\n  ]\r\n}",
	 "outputFormat": "JSON",
	 "async": false,
	 "encrypt": "false",
	 "inline": "true",
	 "profiles": "",
	 "password": "",
	 "storeResult": false
	})

};
request(options, function (error, response) {
	if (error) throw new Error(error);
	console.log(response.body);
});

PHP
<?php

$curl = curl_init();

curl_setopt_array($curl, array(
	CURLOPT_URL => 'https://api.pdf.co/v1/pdf/documentparser',
	CURLOPT_RETURNTRANSFER => true,
	CURLOPT_ENCODING => '',
	CURLOPT_MAXREDIRS => 10,
	CURLOPT_TIMEOUT => 0,
	CURLOPT_FOLLOWLOCATION => true,
	CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
	CURLOPT_CUSTOMREQUEST => 'POST',
	CURLOPT_POSTFIELDS =>'{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/MultiPageTable.pdf",
    "template": "{\\r\\n  \\"templateVersion\\": 3,\\r\\n  \\"templatePriority\\": 0,\\r\\n  \\"sourceId\\": \\"Multipage Table Test\\",\\r\\n  \\"detectionRules\\": {\\r\\n    \\"keywords\\": [\\r\\n      \\"Sample document with multi-page table\\"\\r\\n    ]\\r\\n  },\\r\\n  \\"fields\\": {\\r\\n    \\"total\\": {\\r\\n      \\"type\\": \\"regex\\",\\r\\n      \\"expression\\": \\"TOTAL \\",\\r\\n      \\"dataType\\": \\"decimal\\"\\r\\n    }\\r\\n  },\\r\\n  \\"tables\\": [\\r\\n    {\\r\\n      \\"name\\": \\"table1\\",\\r\\n      \\"start\\": {\\r\\n        \\"expression\\": \\"Item\\\\\\\\s+Description\\\\\\\\s+Price\\\\\\\\s+Qty\\\\\\\\s+Extended Price\\"\\r\\n      },\\r\\n      \\"end\\": {\\r\\n        \\"expression\\": \\"TOTAL\\\\\\\\s+\\\\\\\\d+\\\\\\\\.\\\\\\\\d\\\\\\\\d\\"\\r\\n      },\\r\\n      \\"row\\": {\\r\\n        \\"expression\\": \\"^\\\\\\\\s*(?<itemNo>\\\\\\\\d+)\\\\\\\\s+(?<description>.+?)\\\\\\\\s+(?<price>\\\\\\\\d+\\\\\\\\.\\\\\\\\d\\\\\\\\d)\\\\\\\\s+(?<qty>\\\\\\\\d+)\\\\\\\\s+(?<extPrice>\\\\\\\\d+\\\\\\\\.\\\\\\\\d\\\\\\\\d)\\"\\r\\n      },\\r\\n      \\"columns\\": [\\r\\n        {\\r\\n          \\"name\\": \\"itemNo\\",\\r\\n          \\"type\\": \\"integer\\"\\r\\n        },\\r\\n        {\\r\\n          \\"name\\": \\"description\\",\\r\\n          \\"type\\": \\"string\\"\\r\\n        },\\r\\n        {\\r\\n          \\"name\\": \\"price\\",\\r\\n          \\"type\\": \\"decimal\\"\\r\\n        },\\r\\n        {\\r\\n          \\"name\\": \\"qty\\",\\r\\n          \\"type\\": \\"integer\\"\\r\\n        },\\r\\n        {\\r\\n          \\"name\\": \\"extPrice\\",\\r\\n          \\"type\\": \\"decimal\\"\\r\\n        }\\r\\n      ],\\r\\n      \\"multipage\\": true\\r\\n    }\\r\\n  ]\\r\\n}",
    "outputFormat": "JSON",
    "async": false,
    "encrypt": "false",
    "inline": "true",
    "profiles": "",
    "password": "",
    "storeResult": false
}',
	CURLOPT_HTTPHEADER => array(
		'Content-Type: application/json',
		'x-api-key: '
	),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;

Java
import java.io.*;
import okhttp3.*;
public class main {
	public static void main(String []args) throws IOException{
		OkHttpClient client = new OkHttpClient().newBuilder()
			.build();
		MediaType mediaType = MediaType.parse("application/json");
		RequestBody body = RequestBody.create(mediaType, "{\n    \"url\": \"https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/MultiPageTable.pdf\",\n    \"template\": \"{\\r\\n  \\\"templateVersion\\\": 3,\\r\\n  \\\"templatePriority\\\": 0,\\r\\n  \\\"sourceId\\\": \\\"Multipage Table Test\\\",\\r\\n  \\\"detectionRules\\\": {\\r\\n    \\\"keywords\\\": [\\r\\n      \\\"Sample document with multi-page table\\\"\\r\\n    ]\\r\\n  },\\r\\n  \\\"fields\\\": {\\r\\n    \\\"total\\\": {\\r\\n      \\\"type\\\": \\\"regex\\\",\\r\\n      \\\"expression\\\": \\\"TOTAL \\\",\\r\\n      \\\"dataType\\\": \\\"decimal\\\"\\r\\n    }\\r\\n  },\\r\\n  \\\"tables\\\": [\\r\\n    {\\r\\n      \\\"name\\\": \\\"table1\\\",\\r\\n      \\\"start\\\": {\\r\\n        \\\"expression\\\": \\\"Item\\\\\\\\s+Description\\\\\\\\s+Price\\\\\\\\s+Qty\\\\\\\\s+Extended Price\\\"\\r\\n      },\\r\\n      \\\"end\\\": {\\r\\n        \\\"expression\\\": \\\"TOTAL\\\\\\\\s+\\\\\\\\d+\\\\\\\\.\\\\\\\\d\\\\\\\\d\\\"\\r\\n      },\\r\\n      \\\"row\\\": {\\r\\n        \\\"expression\\\": \\\"^\\\\\\\\s*(?<itemNo>\\\\\\\\d+)\\\\\\\\s+(?<description>.+?)\\\\\\\\s+(?<price>\\\\\\\\d+\\\\\\\\.\\\\\\\\d\\\\\\\\d)\\\\\\\\s+(?<qty>\\\\\\\\d+)\\\\\\\\s+(?<extPrice>\\\\\\\\d+\\\\\\\\.\\\\\\\\d\\\\\\\\d)\\\"\\r\\n      },\\r\\n      \\\"columns\\\": [\\r\\n        {\\r\\n          \\\"name\\\": \\\"itemNo\\\",\\r\\n          \\\"type\\\": \\\"integer\\\"\\r\\n        },\\r\\n        {\\r\\n          \\\"name\\\": \\\"description\\\",\\r\\n          \\\"type\\\": \\\"string\\\"\\r\\n        },\\r\\n        {\\r\\n          \\\"name\\\": \\\"price\\\",\\r\\n          \\\"type\\\": \\\"decimal\\\"\\r\\n        },\\r\\n        {\\r\\n          \\\"name\\\": \\\"qty\\\",\\r\\n          \\\"type\\\": \\\"integer\\\"\\r\\n        },\\r\\n        {\\r\\n          \\\"name\\\": \\\"extPrice\\\",\\r\\n          \\\"type\\\": \\\"decimal\\\"\\r\\n        }\\r\\n      ],\\r\\n      \\\"multipage\\\": true\\r\\n    }\\r\\n  ]\\r\\n}\",\n    \"outputFormat\": \"JSON\",\n    \"async\": false,\n    \"encrypt\": \"false\",\n    \"inline\": \"true\",\n    \"profiles\": \"\",\n    \"password\": \"\",\n    \"storeResult\": false\n}");
		Request request = new Request.Builder()
			.url("https://api.pdf.co/v1/pdf/documentparser")
			.method("POST", body)
			.addHeader("Content-Type", "application/json")
			.addHeader("x-api-key", "")
			.build();
		Response response = client.newCall(request).execute();
		System.out.println(response.body().string());
	}
}

C#
using System;
using RestSharp;
namespace HelloWorldApplication {
	class HelloWorld {
		static void Main(string[] args) {
			var client = new RestClient("https://api.pdf.co/v1/pdf/documentparser");
			client.Timeout = -1;
			var request = new RestRequest(Method.POST);
			request.AddHeader("Content-Type", "application/json");
			request.AddHeader("x-api-key", "");
			var body = @"{" + "\n" +
			@"    ""url"": ""https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/MultiPageTable.pdf""," + "\n" +
			@"    ""template"": ""{\r\n  \""templateVersion\"": 3,\r\n  \""templatePriority\"": 0,\r\n  \""sourceId\"": \""Multipage Table Test\"",\r\n  \""detectionRules\"": {\r\n    \""keywords\"": [\r\n      \""Sample document with multi-page table\""\r\n    ]\r\n  },\r\n  \""fields\"": {\r\n    \""total\"": {\r\n      \""type\"": \""regex\"",\r\n      \""expression\"": \""TOTAL \"",\r\n      \""dataType\"": \""decimal\""\r\n    }\r\n  },\r\n  \""tables\"": [\r\n    {\r\n      \""name\"": \""table1\"",\r\n      \""start\"": {\r\n        \""expression\"": \""Item\\\\s+Description\\\\s+Price\\\\s+Qty\\\\s+Extended Price\""\r\n      },\r\n      \""end\"": {\r\n        \""expression\"": \""TOTAL\\\\s+\\\\d+\\\\.\\\\d\\\\d\""\r\n      },\r\n      \""row\"": {\r\n        \""expression\"": \""^\\\\s*(?<itemNo>\\\\d+)\\\\s+(?<description>.+?)\\\\s+(?<price>\\\\d+\\\\.\\\\d\\\\d)\\\\s+(?<qty>\\\\d+)\\\\s+(?<extPrice>\\\\d+\\\\.\\\\d\\\\d)\""\r\n      },\r\n      \""columns\"": [\r\n        {\r\n          \""name\"": \""itemNo\"",\r\n          \""type\"": \""integer\""\r\n        },\r\n        {\r\n          \""name\"": \""description\"",\r\n          \""type\"": \""string\""\r\n        },\r\n        {\r\n          \""name\"": \""price\"",\r\n          \""type\"": \""decimal\""\r\n        },\r\n        {\r\n          \""name\"": \""qty\"",\r\n          \""type\"": \""integer\""\r\n        },\r\n        {\r\n          \""name\"": \""extPrice\"",\r\n          \""type\"": \""decimal\""\r\n        }\r\n      ],\r\n      \""multipage\"": true\r\n    }\r\n  ]\r\n}""," + "\n" +
			@"    ""outputFormat"": ""JSON""," + "\n" +
			@"    ""async"": false," + "\n" +
			@"    ""encrypt"": ""false""," + "\n" +
			@"    ""inline"": ""true""," + "\n" +
			@"    ""profiles"": """"," + "\n" +
			@"    ""password"": """"," + "\n" +
			@"    ""storeResult"": false" + "\n" +
			@"}";
			request.AddParameter("application/json", body,  ParameterType.RequestBody);
			IRestResponse response = client.Execute(request);
			Console.WriteLine(response.Content);
		}
	}
}

Python
import requests
import json

url = "https://api.pdf.co/v1/pdf/documentparser"

payload = json.dumps({
 "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/MultiPageTable.pdf",
 "template": "{\r\n  \"templateVersion\": 3,\r\n  \"templatePriority\": 0,\r\n  \"sourceId\": \"Multipage Table Test\",\r\n  \"detectionRules\": {\r\n    \"keywords\": [\r\n      \"Sample document with multi-page table\"\r\n    ]\r\n  },\r\n  \"fields\": {\r\n    \"total\": {\r\n      \"type\": \"regex\",\r\n      \"expression\": \"TOTAL \",\r\n      \"dataType\": \"decimal\"\r\n    }\r\n  },\r\n  \"tables\": [\r\n    {\r\n      \"name\": \"table1\",\r\n      \"start\": {\r\n        \"expression\": \"Item\\\\s+Description\\\\s+Price\\\\s+Qty\\\\s+Extended Price\"\r\n      },\r\n      \"end\": {\r\n        \"expression\": \"TOTAL\\\\s+\\\\d+\\\\.\\\\d\\\\d\"\r\n      },\r\n      \"row\": {\r\n        \"expression\": \"^\\\\s*(?<itemNo>\\\\d+)\\\\s+(?<description>.+?)\\\\s+(?<price>\\\\d+\\\\.\\\\d\\\\d)\\\\s+(?<qty>\\\\d+)\\\\s+(?<extPrice>\\\\d+\\\\.\\\\d\\\\d)\"\r\n      },\r\n      \"columns\": [\r\n        {\r\n          \"name\": \"itemNo\",\r\n          \"type\": \"integer\"\r\n        },\r\n        {\r\n          \"name\": \"description\",\r\n          \"type\": \"string\"\r\n        },\r\n        {\r\n          \"name\": \"price\",\r\n          \"type\": \"decimal\"\r\n        },\r\n        {\r\n          \"name\": \"qty\",\r\n          \"type\": \"integer\"\r\n        },\r\n        {\r\n          \"name\": \"extPrice\",\r\n          \"type\": \"decimal\"\r\n        }\r\n      ],\r\n      \"multipage\": true\r\n    }\r\n  ]\r\n}",
 "outputFormat": "JSON",
 "async": False,
 "encrypt": "false",
 "inline": "true",
 "profiles": "",
 "password": "",
 "storeResult": False
})
headers = {
	'Content-Type': 'application/json',
	'x-api-key': ''
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)

Powershell
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Content-Type", "application/json")
$headers.Add("x-api-key", "")

$body = "{`n    `"url`": `"https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/document-parser/MultiPageTable.pdf`",`n    `"template`": `"{`\r`\n  `\`"templateVersion`\`": 3,`\r`\n  `\`"templatePriority`\`": 0,`\r`\n  `\`"sourceId`\`": `\`"Multipage Table Test`\`",`\r`\n  `\`"detectionRules`\`": {`\r`\n    `\`"keywords`\`": [`\r`\n      `\`"Sample document with multi-page table`\`"`\r`\n    ]`\r`\n  },`\r`\n  `\`"fields`\`": {`\r`\n    `\`"total`\`": {`\r`\n      `\`"type`\`": `\`"regex`\`",`\r`\n      `\`"expression`\`": `\`"TOTAL `\`",`\r`\n      `\`"dataType`\`": `\`"decimal`\`"`\r`\n    }`\r`\n  },`\r`\n  `\`"tables`\`": [`\r`\n    {`\r`\n      `\`"name`\`": `\`"table1`\`",`\r`\n      `\`"start`\`": {`\r`\n        `\`"expression`\`": `\`"Item`\`\`\`\s+Description`\`\`\`\s+Price`\`\`\`\s+Qty`\`\`\`\s+Extended Price`\`"`\r`\n      },`\r`\n      `\`"end`\`": {`\r`\n        `\`"expression`\`": `\`"TOTAL`\`\`\`\s+`\`\`\`\d+`\`\`\`\.`\`\`\`\d`\`\`\`\d`\`"`\r`\n      },`\r`\n      `\`"row`\`": {`\r`\n        `\`"expression`\`": `\`"^`\`\`\`\s*(?<itemNo>`\`\`\`\d+)`\`\`\`\s+(?<description>.+?)`\`\`\`\s+(?<price>`\`\`\`\d+`\`\`\`\.`\`\`\`\d`\`\`\`\d)`\`\`\`\s+(?<qty>`\`\`\`\d+)`\`\`\`\s+(?<extPrice>`\`\`\`\d+`\`\`\`\.`\`\`\`\d`\`\`\`\d)`\`"`\r`\n      },`\r`\n      `\`"columns`\`": [`\r`\n        {`\r`\n          `\`"name`\`": `\`"itemNo`\`",`\r`\n          `\`"type`\`": `\`"integer`\`"`\r`\n        },`\r`\n        {`\r`\n          `\`"name`\`": `\`"description`\`",`\r`\n          `\`"type`\`": `\`"string`\`"`\r`\n        },`\r`\n        {`\r`\n          `\`"name`\`": `\`"price`\`",`\r`\n          `\`"type`\`": `\`"decimal`\`"`\r`\n        },`\r`\n        {`\r`\n          `\`"name`\`": `\`"qty`\`",`\r`\n          `\`"type`\`": `\`"integer`\`"`\r`\n        },`\r`\n        {`\r`\n          `\`"name`\`": `\`"extPrice`\`",`\r`\n          `\`"type`\`": `\`"decimal`\`"`\r`\n        }`\r`\n      ],`\r`\n      `\`"multipage`\`": true`\r`\n    }`\r`\n  ]`\r`\n}`",`n    `"outputFormat`": `"JSON`",`n    `"async`": false,`n    `"encrypt`": `"false`",`n    `"inline`": `"true`",`n    `"profiles`": `"`",`n    `"password`": `"`",`n    `"storeResult`": false`n}"

$response = Invoke-RestMethod 'https://api.pdf.co/v1/pdf/documentparser' -Method 'POST' -Headers $headers -Body $body
$response | ConvertTo-Json

[GET] /pdf/documentparser/templates

Returns all document parser data extraction templates available to the current user. Use a GET request.

Description

  • Method: GET
  • URL: /v1/pdf/documentparser/templates

Query parameters

No query parameters accepted.

Body payload

No body parameters accepted.

Example responses

pdf/documentparser/templates
{
    "templates": [
        {
            "id": 40,
            "type": "user",
            "title": "Untitled",
            "description": "Untitled"
        },
        {
            "id": 1,
            "type": "system",
            "title": "Invoice Parser",
            "description": "Parses invoices and extracts invoice number, company name, due date, amount, tax"
        }
    ],
    "remainingCredits": 94229
}
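
Each entry in the templates array above carries an id and a type; the sample shows one user template and one system template. The Python sketch below groups template ids by type from a decoded response, for example to find the ids you can pass as templateId. The function name `template_ids_by_type` is hypothetical, and only the two type values visible in the sample are assumed to exist.

```python
import json
from typing import Any, Dict, List


def template_ids_by_type(response: Dict[str, Any]) -> Dict[str, List[int]]:
    """Group template ids from a decoded templates response by their type."""
    grouped: Dict[str, List[int]] = {}
    for entry in response["templates"]:
        grouped.setdefault(entry["type"], []).append(entry["id"])
    return grouped


# Copy of the example response above.
sample = json.loads("""
{
    "templates": [
        {"id": 40, "type": "user", "title": "Untitled", "description": "Untitled"},
        {"id": 1, "type": "system", "title": "Invoice Parser",
         "description": "Parses invoices and extracts invoice number, company name, due date, amount, tax"}
    ],
    "remainingCredits": 94229
}
""")

print(template_ids_by_type(sample))
```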

Code Snippets

CURL
curl --location --request GET 'https://api.pdf.co/v1/pdf/documentparser/templates' \
--header 'Content-Type: application/json' \
--header 'x-api-key: '
JavaScript
var myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");
myHeaders.append("x-api-key", "");

var requestOptions = {
	method: 'GET',
	headers: myHeaders,
	redirect: 'follow'
};

fetch("https://api.pdf.co/v1/pdf/documentparser/templates", requestOptions)
	.then(response => response.text())
	.then(result => console.log(result))
	.catch(error => console.log('error', error));
NodeJs
var request = require('request');
var options = {
	'method': 'GET',
	'url': 'https://api.pdf.co/v1/pdf/documentparser/templates',
	'headers': {
		'Content-Type': 'application/json',
		'x-api-key': ''
	}
};
request(options, function (error, response) {
	if (error) throw new Error(error);
	console.log(response.body);
});

PHP
<?php

$curl = curl_init();

curl_setopt_array($curl, array(
	CURLOPT_URL => 'https://api.pdf.co/v1/pdf/documentparser/templates',
	CURLOPT_RETURNTRANSFER => true,
	CURLOPT_ENCODING => '',
	CURLOPT_MAXREDIRS => 10,
	CURLOPT_TIMEOUT => 0,
	CURLOPT_FOLLOWLOCATION => true,
	CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
	CURLOPT_CUSTOMREQUEST => 'GET',
	CURLOPT_HTTPHEADER => array(
		'Content-Type: application/json',
		'x-api-key: '
	),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;

Java
import java.io.*;
import okhttp3.*;
public class main {
	public static void main(String []args) throws IOException{
		OkHttpClient client = new OkHttpClient().newBuilder()
			.build();
		Request request = new Request.Builder()
			.url("https://api.pdf.co/v1/pdf/documentparser/templates")
			.method("GET", null)
			.addHeader("Content-Type", "application/json")
			.addHeader("x-api-key", "")
			.build();
		Response response = client.newCall(request).execute();
		System.out.println(response.body().string());
	}
}

C#
using System;
using RestSharp;
namespace HelloWorldApplication {
	class HelloWorld {
		static void Main(string[] args) {
			var client = new RestClient("https://api.pdf.co/v1/pdf/documentparser/templates");
			client.Timeout = -1;
			var request = new RestRequest(Method.GET);
			request.AddHeader("Content-Type", "application/json");
			request.AddHeader("x-api-key", "");
			IRestResponse response = client.Execute(request);
			Console.WriteLine(response.Content);
		}
	}
}

Python
import requests
import json

url = "https://api.pdf.co/v1/pdf/documentparser/templates"

payload={}
headers = {
	'Content-Type': 'application/json',
	'x-api-key': ''
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)

Powershell
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Content-Type", "application/json")
$headers.Add("x-api-key", "")

$response = Invoke-RestMethod 'https://api.pdf.co/v1/pdf/documentparser/templates' -Method 'GET' -Headers $headers
$response | ConvertTo-Json

[GET] /pdf/documentparser/templates/:id

Returns detailed information about a document parser template, identified by the template’s id. Use a GET request.

Description

  • Method: GET
  • URL: /v1/pdf/documentparser/templates/:id
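
The :id segment in the URL is a placeholder for the numeric template id, such as the 1 used in the code snippets for this endpoint. As a small illustration, the Python sketch below builds the request with the standard library without sending it. The helper name `build_template_request` and the `API_BASE` constant are assumptions made for this example; the host and the x-api-key header are taken from the snippets on this page.

```python
import urllib.request
from urllib.parse import quote

# Host and version prefix as used in the code snippets on this page.
API_BASE = "https://api.pdf.co/v1"


def build_template_request(template_id: int, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one template's details."""
    # Substitute the :id placeholder with the actual template id.
    url = f"{API_BASE}/pdf/documentparser/templates/{quote(str(template_id), safe='')}"
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method="GET")


request = build_template_request(1, api_key="")  # fill in your own API key
print(request.get_method(), request.full_url)
```

To send it, pass the request object to urllib.request.urlopen and decode the response body as JSON.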

Query parameters

No query parameters accepted.

Body payload

No body parameters accepted.

Example responses

No example responses saved.

Code Snippets

CURL
curl --location --request GET 'https://api.pdf.co/v1/pdf/documentparser/templates/1' \
--header 'Content-Type: application/json' \
--header 'x-api-key: '
JavaScript
var myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");
myHeaders.append("x-api-key", "");

var requestOptions = {
	method: 'GET',
	headers: myHeaders,
	redirect: 'follow'
};

fetch("https://api.pdf.co/v1/pdf/documentparser/templates/1", requestOptions)
	.then(response => response.text())
	.then(result => console.log(result))
	.catch(error => console.log('error', error));
NodeJs
var request = require('request');
var options = {
	'method': 'GET',
	'url': 'https://api.pdf.co/v1/pdf/documentparser/templates/1',
	'headers': {
		'Content-Type': 'application/json',
		'x-api-key': ''
	}
};
request(options, function (error, response) {
	if (error) throw new Error(error);
	console.log(response.body);
});

PHP
<?php

$curl = curl_init();

curl_setopt_array($curl, array(
	CURLOPT_URL => 'https://api.pdf.co/v1/pdf/documentparser/templates/1',
	CURLOPT_RETURNTRANSFER => true,
	CURLOPT_ENCODING => '',
	CURLOPT_MAXREDIRS => 10,
	CURLOPT_TIMEOUT => 0,
	CURLOPT_FOLLOWLOCATION => true,
	CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
	CURLOPT_CUSTOMREQUEST => 'GET',
	CURLOPT_HTTPHEADER => array(
		'Content-Type: application/json',
		'x-api-key: '
	),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;

Java
import java.io.*;
import okhttp3.*;
public class main {
	public static void main(String []args) throws IOException{
		OkHttpClient client = new OkHttpClient().newBuilder()
			.build();
		Request request = new Request.Builder()
			.url("https://api.pdf.co/v1/pdf/documentparser/templates/1")
			.method("GET", null)
			.addHeader("Content-Type", "application/json")
			.addHeader("x-api-key", "")
			.build();
		Response response = client.newCall(request).execute();
		System.out.println(response.body().string());
	}
}

C#
using System;
using RestSharp;
namespace HelloWorldApplication {
	class HelloWorld {
		static void Main(string[] args) {
			var client = new RestClient("https://api.pdf.co/v1/pdf/documentparser/templates/1");
			client.Timeout = -1;
			var request = new RestRequest(Method.GET);
			request.AddHeader("Content-Type", "application/json");
			request.AddHeader("x-api-key", "");
			IRestResponse response = client.Execute(request);
			Console.WriteLine(response.Content);
		}
	}
}

Python
import requests
import json

url = "https://api.pdf.co/v1/pdf/documentparser/templates/1"

headers = {
	'Content-Type': 'application/json',
	'x-api-key': ''
}

response = requests.request("GET", url, headers=headers)

print(response.text)

Powershell
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Content-Type", "application/json")
$headers.Add("x-api-key", "")

$response = Invoke-RestMethod 'https://api.pdf.co/v1/pdf/documentparser/templates/1' -Method 'GET' -Headers $headers
$response | ConvertTo-Json

[GET] /pdf/documentparser/results

Returns all document parser results for the current user. Use a GET request.

Description

  • Method: GET
  • URL: /v1/pdf/documentparser/results

Query parameters

  • templateId: template id (example value: 1)
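The `templateId` query parameter can be appended with the standard library's `urlencode` rather than by hand-concatenating strings. This is only a sketch; the helper name `results_url` is an assumption made for illustration.

```python
from urllib.parse import urlencode

RESULTS_URL = "https://api.pdf.co/v1/pdf/documentparser/results"


def results_url(template_id):
    """Return the results URL with templateId encoded as a query parameter."""
    query = urlencode({"templateId": template_id})
    return f"{RESULTS_URL}?{query}"


print(results_url(1))
# prints https://api.pdf.co/v1/pdf/documentparser/results?templateId=1
```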

Body payload

No body parameters accepted.

Example responses

JSON pdf/documentparser/results
{
    "results": [
        {
            "id": 74,
            "templateId": 40,
            "body": {
                "fields": {
                    "date": {
                        "value": ""
                    },
                    "amount": {
                        "value": "2",
                        "pageIndex": 0
                    }
                },
                "sourceId": null,
                "templateId": null,
                "templateVersion": "3"
            },
            "createdAt": "2020-03-23T11:17:30.152Z",
            "filename": "EINPresswire-Report-512260784-bytescout-announces-release-of-its-data-extraction-tools-for-on-cloud-deployments.pdf"
        }
    ],
    "remainingCredits": 94220
}
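To illustrate the shape of this response, the sketch below pulls the extracted field values out of each result. It runs against a trimmed copy of the example above (the `filename`, `sourceId`, and `templateVersion` entries are omitted); the helper name `field_values` is hypothetical.

```python
import json

# Trimmed copy of the example response above.
sample = """
{
    "results": [
        {
            "id": 74,
            "templateId": 40,
            "body": {
                "fields": {
                    "date": {"value": ""},
                    "amount": {"value": "2", "pageIndex": 0}
                }
            },
            "createdAt": "2020-03-23T11:17:30.152Z"
        }
    ],
    "remainingCredits": 94220
}
"""


def field_values(response_text):
    """Map each result id to a {field name: value} dict."""
    data = json.loads(response_text)
    return {
        result["id"]: {
            name: field.get("value")
            for name, field in result["body"]["fields"].items()
        }
        for result in data.get("results", [])
    }


values = field_values(sample)
print(values)  # {74: {'date': '', 'amount': '2'}}
```

Note that field values arrive as strings (`"2"`, not `2`) in this example, so any numeric conversion is left to the caller.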

Code Snippets

CURL
curl --location --request GET 'https://api.pdf.co/v1/pdf/documentparser/results?templateId=1' \
--header 'Content-Type: application/json' \
--header 'x-api-key: '
JavaScript
var myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");
myHeaders.append("x-api-key", "");

var requestOptions = {
	method: 'GET',
	headers: myHeaders,
	redirect: 'follow'
};

fetch("https://api.pdf.co/v1/pdf/documentparser/results?templateId=1", requestOptions)
	.then(response => response.text())
	.then(result => console.log(result))
	.catch(error => console.log('error', error));
NodeJs
var request = require('request');
var options = {
	'method': 'GET',
	'url': 'https://api.pdf.co/v1/pdf/documentparser/results?templateId=1',
	'headers': {
		'Content-Type': 'application/json',
		'x-api-key': ''
	}
};
request(options, function (error, response) {
	if (error) throw new Error(error);
	console.log(response.body);
});

PHP
<?php

$curl = curl_init();

curl_setopt_array($curl, array(
	CURLOPT_URL => 'https://api.pdf.co/v1/pdf/documentparser/results?templateId=1',
	CURLOPT_RETURNTRANSFER => true,
	CURLOPT_ENCODING => '',
	CURLOPT_MAXREDIRS => 10,
	CURLOPT_TIMEOUT => 0,
	CURLOPT_FOLLOWLOCATION => true,
	CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
	CURLOPT_CUSTOMREQUEST => 'GET',
	CURLOPT_HTTPHEADER => array(
		'Content-Type: application/json',
		'x-api-key: '
	),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;

Java
import java.io.*;
import okhttp3.*;
public class main {
	public static void main(String []args) throws IOException{
		OkHttpClient client = new OkHttpClient().newBuilder()
			.build();
		Request request = new Request.Builder()
			.url("https://api.pdf.co/v1/pdf/documentparser/results?templateId=1")
			.method("GET", null)
			.addHeader("Content-Type", "application/json")
			.addHeader("x-api-key", "")
			.build();
		Response response = client.newCall(request).execute();
		System.out.println(response.body().string());
	}
}

C#
using System;
using RestSharp;
namespace HelloWorldApplication {
	class HelloWorld {
		static void Main(string[] args) {
			var client = new RestClient("https://api.pdf.co/v1/pdf/documentparser/results?templateId=1");
			client.Timeout = -1;
			var request = new RestRequest(Method.GET);
			request.AddHeader("Content-Type", "application/json");
			request.AddHeader("x-api-key", "");
			IRestResponse response = client.Execute(request);
			Console.WriteLine(response.Content);
		}
	}
}

Python
import requests
import json

url = "https://api.pdf.co/v1/pdf/documentparser/results?templateId=1"

payload={}
headers = {
	'Content-Type': 'application/json',
	'x-api-key': ''
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)

Powershell
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Content-Type", "application/json")
$headers.Add("x-api-key", "")

$response = Invoke-RestMethod 'https://api.pdf.co/v1/pdf/documentparser/results?templateId=1' -Method 'GET' -Headers $headers
$response | ConvertTo-Json

[POST] /pdf/documentparser/results

Creates a document parser result for the current user. Use a POST request.

Input Parameters

  • templateId: optional. Create document parser result with this template id. Must be a String.
  • result: optional. JSONB storage for storing document parser result. Must be a String.
  • resultType: optional. Result format. Valid values: JSON, YAML, XML, CSV. Must be a String.
  • fileUrl: optional. URL to source PDF File. Must be a String.
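The valid result formats listed above can be checked on the client before a request is sent. The sketch below assumes an exact, case-sensitive match; whether the server also accepts lowercase values is not stated on this page. The helper name `check_result_type` is hypothetical.

```python
VALID_RESULT_TYPES = {"JSON", "YAML", "XML", "CSV"}  # values listed above


def check_result_type(value):
    """Return value unchanged if it is a listed format; raise ValueError otherwise."""
    if value not in VALID_RESULT_TYPES:
        raise ValueError(f"unsupported result format: {value!r}")
    return value


print(check_result_type("CSV"))  # prints CSV
```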

Status Errors

  • 200: The request has succeeded
  • 400: bad input parameters
  • 401: unauthorized
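A client can turn the status codes listed above into readable messages with a simple lookup. This is a sketch only: the helper name `describe_status` and the fallback text for unlisted codes are assumptions, not part of the API.

```python
# Status codes and descriptions as listed above.
STATUS_MEANINGS = {
    200: "The request has succeeded",
    400: "bad input parameters",
    401: "unauthorized",
}


def describe_status(code):
    """Return the documented description, or a fallback for codes not listed here."""
    return STATUS_MEANINGS.get(code, "status not listed in this section")


print(describe_status(401))  # prints unauthorized
```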

Description

  • Method: POST
  • URL: /v1/pdf/documentparser/results

Query parameters

No query parameters accepted.

Body payload

{
    "fileUrl": "https://github.com/bytescout/ByteScout-SDK-SourceCode/raw/master/Document%20Parser%20SDK/DigitalOcean.pdf",
    "templateId": 48,
    "formatType": "CSV",
    "result": "companyName,companyName2,invoiceId,dateIssued,dateDue,total,subTotal,tax\r\n,,,,,450.00,,\r\n\r\n"
}
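The `result` value in the payload above is CSV text whose rows end in CR LF, so it has to be escaped when embedded in JSON. A minimal sketch, using the field names from the example payload, shows that `json.dumps` handles the escaping and that decoding restores the original text:

```python
import json

# CSV text with real CR LF line endings, as in the example payload.
csv_result = (
    "companyName,companyName2,invoiceId,dateIssued,dateDue,total,subTotal,tax\r\n"
    ",,,,,450.00,,\r\n"
    "\r\n"
)

payload = json.dumps({
    "fileUrl": "https://github.com/bytescout/ByteScout-SDK-SourceCode/raw/master/Document%20Parser%20SDK/DigitalOcean.pdf",
    "templateId": 48,
    "formatType": "CSV",
    "result": csv_result,
})

# The serialized text carries the two-character escapes \r and \n, not raw line breaks.
assert "\r" not in payload and "\\r\\n" in payload
# Decoding the JSON restores the CSV text exactly.
assert json.loads(payload)["result"] == csv_result
```

Building the body this way avoids hand-escaping the line endings, which is why the escaping differs between the language snippets that follow.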

Example responses

No example responses saved.

Code Snippets

CURL
curl --location --request POST 'https://api.pdf.co/v1/pdf/documentparser/results' \
--header 'Content-Type: application/json' \
--header 'x-api-key: ' \
--data-raw '{
    "fileUrl": "https://github.com/bytescout/ByteScout-SDK-SourceCode/raw/master/Document%20Parser%20SDK/DigitalOcean.pdf",
    "templateId": 48,
    "formatType": "CSV",
    "result": "companyName,companyName2,invoiceId,dateIssued,dateDue,total,subTotal,tax\r\n,,,,,450.00,,\r\n\r\n"
}'
JavaScript
var myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");
myHeaders.append("x-api-key", "");

var raw = JSON.stringify({
 "fileUrl": "https://github.com/bytescout/ByteScout-SDK-SourceCode/raw/master/Document%20Parser%20SDK/DigitalOcean.pdf",
 "templateId": 48,
 "formatType": "CSV",
 "result": "companyName,companyName2,invoiceId,dateIssued,dateDue,total,subTotal,tax\r\n,,,,,450.00,,\r\n\r\n"
});

var requestOptions = {
	method: 'POST',
	headers: myHeaders,
	body: raw,
	redirect: 'follow'
};

fetch("https://api.pdf.co/v1/pdf/documentparser/results", requestOptions)
	.then(response => response.text())
	.then(result => console.log(result))
	.catch(error => console.log('error', error));
NodeJs
var request = require('request');
var options = {
	'method': 'POST',
	'url': 'https://api.pdf.co/v1/pdf/documentparser/results',
	'headers': {
		'Content-Type': 'application/json',
		'x-api-key': ''
	},
	body: JSON.stringify({
	 "fileUrl": "https://github.com/bytescout/ByteScout-SDK-SourceCode/raw/master/Document%20Parser%20SDK/DigitalOcean.pdf",
	 "templateId": 48,
	 "formatType": "CSV",
	 "result": "companyName,companyName2,invoiceId,dateIssued,dateDue,total,subTotal,tax\r\n,,,,,450.00,,\r\n\r\n"
	})

};
request(options, function (error, response) {
	if (error) throw new Error(error);
	console.log(response.body);
});

PHP
<?php

$curl = curl_init();

curl_setopt_array($curl, array(
	CURLOPT_URL => 'https://api.pdf.co/v1/pdf/documentparser/results',
	CURLOPT_RETURNTRANSFER => true,
	CURLOPT_ENCODING => '',
	CURLOPT_MAXREDIRS => 10,
	CURLOPT_TIMEOUT => 0,
	CURLOPT_FOLLOWLOCATION => true,
	CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
	CURLOPT_CUSTOMREQUEST => 'POST',
	CURLOPT_POSTFIELDS =>'{
    "fileUrl": "https://github.com/bytescout/ByteScout-SDK-SourceCode/raw/master/Document%20Parser%20SDK/DigitalOcean.pdf",
    "templateId": 48,
    "formatType": "CSV",
    "result": "companyName,companyName2,invoiceId,dateIssued,dateDue,total,subTotal,tax\\r\\n,,,,,450.00,,\\r\\n\\r\\n"
}',
	CURLOPT_HTTPHEADER => array(
		'Content-Type: application/json',
		'x-api-key: '
	),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;

Java
import java.io.*;
import okhttp3.*;
public class main {
	public static void main(String []args) throws IOException{
		OkHttpClient client = new OkHttpClient().newBuilder()
			.build();
		MediaType mediaType = MediaType.parse("application/json");
		RequestBody body = RequestBody.create(mediaType, "{\n    \"fileUrl\": \"https://github.com/bytescout/ByteScout-SDK-SourceCode/raw/master/Document%20Parser%20SDK/DigitalOcean.pdf\",\n    \"templateId\": 48,\n    \"formatType\": \"CSV\",\n    \"result\": \"companyName,companyName2,invoiceId,dateIssued,dateDue,total,subTotal,tax\\r\\n,,,,,450.00,,\\r\\n\\r\\n\"\n}");
		Request request = new Request.Builder()
			.url("https://api.pdf.co/v1/pdf/documentparser/results")
			.method("POST", body)
			.addHeader("Content-Type", "application/json")
			.addHeader("x-api-key", "")
			.build();
		Response response = client.newCall(request).execute();
		System.out.println(response.body().string());
	}
}

C#
using System;
using RestSharp;
namespace HelloWorldApplication {
	class HelloWorld {
		static void Main(string[] args) {
			var client = new RestClient("https://api.pdf.co/v1/pdf/documentparser/results");
			client.Timeout = -1;
			var request = new RestRequest(Method.POST);
			request.AddHeader("Content-Type", "application/json");
			request.AddHeader("x-api-key", "");
			var body = @"{" + "\n" +
			@"    ""fileUrl"": ""https://github.com/bytescout/ByteScout-SDK-SourceCode/raw/master/Document%20Parser%20SDK/DigitalOcean.pdf""," + "\n" +
			@"    ""templateId"": 48," + "\n" +
			@"    ""formatType"": ""CSV""," + "\n" +
			@"    ""result"": ""companyName,companyName2,invoiceId,dateIssued,dateDue,total,subTotal,tax\r\n,,,,,450.00,,\r\n\r\n""" + "\n" +
			@"}";
			request.AddParameter("application/json", body,  ParameterType.RequestBody);
			IRestResponse response = client.Execute(request);
			Console.WriteLine(response.Content);
		}
	}
}

Python
import requests
import json

url = "https://api.pdf.co/v1/pdf/documentparser/results"

payload = json.dumps({
 "fileUrl": "https://github.com/bytescout/ByteScout-SDK-SourceCode/raw/master/Document%20Parser%20SDK/DigitalOcean.pdf",
 "templateId": 48,
 "formatType": "CSV",
 "result": "companyName,companyName2,invoiceId,dateIssued,dateDue,total,subTotal,tax\r\n,,,,,450.00,,\r\n\r\n"
})
headers = {
	'Content-Type': 'application/json',
	'x-api-key': ''
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)

Powershell
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Content-Type", "application/json")
$headers.Add("x-api-key", "")

$body = "{`n    `"fileUrl`": `"https://github.com/bytescout/ByteScout-SDK-SourceCode/raw/master/Document%20Parser%20SDK/DigitalOcean.pdf`",`n    `"templateId`": 48,`n    `"formatType`": `"CSV`",`n    `"result`": `"companyName,companyName2,invoiceId,dateIssued,dateDue,total,subTotal,tax`\r`\n,,,,,450.00,,`\r`\n`\r`\n`"`n}"

$response = Invoke-RestMethod 'https://api.pdf.co/v1/pdf/documentparser/results' -Method 'POST' -Headers $headers -Body $body
$response | ConvertTo-Json

[DELETE] /pdf/documentparser/results/:id

Deletes the document parser result with the given id. Use a DELETE request, replacing :id in the URL with the id of the result to delete.

Description

  • Method: DELETE
  • URL: /v1/pdf/documentparser/results/:id

Query parameters

No query parameters accepted.

Body payload

No body parameters accepted.
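As a sketch of how the `:id` segment might be filled in, the helper below builds (but does not send) a DELETE request and rejects an empty id, since a URL ending in a bare slash names no result. The function name `build_delete_request` and the `YOUR_API_KEY` placeholder are assumptions; the id `74` is borrowed from the example results response earlier on this page.

```python
import urllib.request

RESULTS_URL = "https://api.pdf.co/v1/pdf/documentparser/results"


def build_delete_request(result_id, api_key):
    """Build a DELETE request for one stored result (hypothetical helper)."""
    if result_id is None or str(result_id).strip() == "":
        # Without an id the URL would end in a bare slash and name no result.
        raise ValueError("result_id is required")
    url = f"{RESULTS_URL}/{result_id}"
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method="DELETE")


req = build_delete_request(74, "YOUR_API_KEY")  # placeholder key, replace with a real one
print(req.get_method(), req.full_url)
```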

Example responses

No example responses saved.

Code Snippets

CURL
curl --location --request DELETE 'https://api.pdf.co/v1/pdf/documentparser/results/' \
--header 'Content-Type: application/json' \
--header 'x-api-key: '
JavaScript
var myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");
myHeaders.append("x-api-key", "");

var requestOptions = {
	method: 'DELETE',
	headers: myHeaders,
	redirect: 'follow'
};

fetch("https://api.pdf.co/v1/pdf/documentparser/results/", requestOptions)
	.then(response => response.text())
	.then(result => console.log(result))
	.catch(error => console.log('error', error));
NodeJs
var request = require('request');
var options = {
	'method': 'DELETE',
	'url': 'https://api.pdf.co/v1/pdf/documentparser/results/',
	'headers': {
		'Content-Type': 'application/json',
		'x-api-key': ''
	}
};
request(options, function (error, response) {
	if (error) throw new Error(error);
	console.log(response.body);
});

PHP
<?php

$curl = curl_init();

curl_setopt_array($curl, array(
	CURLOPT_URL => 'https://api.pdf.co/v1/pdf/documentparser/results/',
	CURLOPT_RETURNTRANSFER => true,
	CURLOPT_ENCODING => '',
	CURLOPT_MAXREDIRS => 10,
	CURLOPT_TIMEOUT => 0,
	CURLOPT_FOLLOWLOCATION => true,
	CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
	CURLOPT_CUSTOMREQUEST => 'DELETE',
	CURLOPT_HTTPHEADER => array(
		'Content-Type: application/json',
		'x-api-key: '
	),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;

Java
import java.io.*;
import okhttp3.*;
public class main {
	public static void main(String []args) throws IOException{
		OkHttpClient client = new OkHttpClient().newBuilder()
			.build();
		MediaType mediaType = MediaType.parse("application/json");
		RequestBody body = RequestBody.create(mediaType, "");
		Request request = new Request.Builder()
			.url("https://api.pdf.co/v1/pdf/documentparser/results/")
			.method("DELETE", body)
			.addHeader("Content-Type", "application/json")
			.addHeader("x-api-key", "")
			.build();
		Response response = client.newCall(request).execute();
		System.out.println(response.body().string());
	}
}

C#
using System;
using RestSharp;
namespace HelloWorldApplication {
	class HelloWorld {
		static void Main(string[] args) {
			var client = new RestClient("https://api.pdf.co/v1/pdf/documentparser/results/");
			client.Timeout = -1;
			var request = new RestRequest(Method.DELETE);
			request.AddHeader("Content-Type", "application/json");
			request.AddHeader("x-api-key", "");
			IRestResponse response = client.Execute(request);
			Console.WriteLine(response.Content);
		}
	}
}

Python
import requests
import json

url = "https://api.pdf.co/v1/pdf/documentparser/results/"

payload={}
headers = {
	'Content-Type': 'application/json',
	'x-api-key': ''
}

response = requests.request("DELETE", url, headers=headers, data=payload)

print(response.text)

Powershell
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Content-Type", "application/json")
$headers.Add("x-api-key", "")

$response = Invoke-RestMethod 'https://api.pdf.co/v1/pdf/documentparser/results/' -Method 'DELETE' -Headers $headers
$response | ConvertTo-Json