PDF To XLSX
Convert PDF to spreadsheet with layout and fonts preserved.
Available Methods
[POST] /pdf/convert/to/xlsx (xlsx output)
Auto Classification of Incoming Documents
Use the /pdf/classifier (Document Classifier) endpoint to automatically sort incoming documents, or to detect the class of a document, using keyword-based rules. For example, you can define rules that identify which vendor provided a document and then choose which template to apply.
Parameters
- url: required. URL of the source file. Supports links from Google Drive, Dropbox, and the built-in PDF.co files storage. To upload files via the API, see the Files Upload section. If you intermittently get Too Many Requests or Access Denied errors for your input URL, try adding cache: to enable built-in URL caching. You can also encrypt output files and decrypt input files with user-controlled data encryption (strong AES encryption with your own keys).
- httpusername: optional. HTTP auth user name, if required to access the source url.
- httppassword: optional. HTTP auth password, if required to access the source url.
- pages: optional. Comma-separated list of page indices (or ranges) to process. IMPORTANT: the very first page starts at 0 (zero). To set a range, use a dash (-), for example: 0,2-5,7-. To set a range from an index to the last page, use an open range such as 2- (from page #3, since the index starts at zero, to the end of the document). For ALL pages, leave this parameter empty. Example: 0,2-5,7- means the first page, then the 3rd through 6th pages, and then the 8th page (index = 7) through the end of the document. Must be a String.
- unwrap: optional. Unwraps lines into a single line within table cells when lineGrouping is enabled. Must be one of: true, false.
- rect: optional. Defines coordinates for extraction, e.g. 51.8, 114.8, 235.5, 204.0. You can use the PDF.co PDF Viewer with coordinates to select and copy coordinates. Must be a String.
- lang: optional. Sets the OCR (text from image) language used when extracting text from scanned PDF, PNG, and JPG input. Default is “eng”. Other supported languages include deu, spa, chi_sim, jpn, and many others (see the full list of supported OCR languages). You can also use two languages simultaneously, for example eng+deu or jpn+kor (any combination).
- inline: optional. Must be one of: true to return data inline, or false to return a link to the output file (default).
- lineGrouping: optional. Line grouping within table cells. Set to 1 to enable grouping. Must be a String.
- encrypt: legacy. All files are now stored in encrypted cloud storage by default.
- async: optional. Runs processing asynchronously. Returns a JobId that you can use with /job/check to check the state of the processing (possible states: working, failed, aborted, and success). Must be one of: true, false.
- name: optional. File name for the generated output. Must be a String.
- expiration: optional. Output link expiration in minutes. Default is 60 (i.e. 60 minutes or 1 hour). After this period, generated output files (if any) are automatically removed from PDF.co temporary files storage. The maximum allowed expiration period depends on your current subscription plan. To store permanent input files (e.g. reusable images, PDFs, documents), use the PDF.co built-in Files Storage instead.
- profiles: optional. Must be a String. Use this parameter to set additional configuration for fine-tuning and extra options. Explore the PDF.co knowledgebase for profile examples. For example, to change the CSV separator: { 'profiles': [ { 'profile1': { 'CSVSeparatorSymbol': ';' } } ] }
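The zero-based range syntax of the pages parameter is easy to misread, so here is a minimal client-side sketch of how a string such as 0,2-5,7- maps to page indices. The function name expand_pages and the page_count argument are hypothetical and exist only for illustration; this is not the parser the service uses, and it assumes well-formed input.

```python
from __future__ import annotations


def expand_pages(spec: str, page_count: int) -> list[int]:
    """Expand a `pages` string such as "0,2-5,7-" into zero-based page indices.

    Illustrative only: mirrors the rules described for the `pages` parameter.
    """
    if not spec.strip():
        # An empty value means all pages.
        return list(range(page_count))

    last_index = page_count - 1
    indices: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An open range such as "7-" runs to the last page.
            end = int(end_text) if end_text.strip() else last_index
            indices.extend(range(start, min(end, last_index) + 1))
        else:
            index = int(part)
            if index <= last_index:
                indices.append(index)
    return indices
```

For a 10-page document, expand_pages("0,2-5,7-", 10) yields the first page, the 3rd through 6th pages, and the 8th page through the last one, expressed as zero-based indices.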
- Method: POST
- URL: /v1/pdf/convert/to/xlsx
Body payload
{
"url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/pdf-to-excel/sample.pdf",
"async": false
}
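The payload above sets async to false. When async is true, the parameter description says the job state can be checked through /job/check until it is no longer working. The polling loop could be sketched as follows. The check_state callable is a hypothetical, caller-supplied function that queries /job/check for one job and returns its state string; the request and response format of /job/check is not described here, so it is deliberately left out of the sketch.

```python
import time
from typing import Callable

# Final states listed in the `async` parameter description.
TERMINAL_STATES = {"success", "failed", "aborted"}


def wait_for_job(
    check_state: Callable[[], str],
    interval_seconds: float = 3.0,
    max_attempts: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job leaves the "working" state, then return the final state.

    `check_state` is a hypothetical function that asks /job/check about one
    job and returns "working", "failed", "aborted", or "success".
    """
    for _ in range(max_attempts):
        state = check_state()
        if state in TERMINAL_STATES:
            return state
        if state != "working":
            raise ValueError(f"unexpected job state: {state!r}")
        # Still working: wait before asking again.
        sleep(interval_seconds)
    raise TimeoutError("job did not finish within the allowed attempts")
```

Passing the sleep function as an argument keeps the loop testable without real delays; the interval and attempt limits are arbitrary defaults, not documented values.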
Example responses
/pdf/convert/to/xlsx
{
"url": "https://pdf-temp-files.s3.amazonaws.com/60c6b9f50280495a9567f73a0a394252/sample.xlsx",
"pageCount": 1,
"error": false,
"status": 200,
"name": "sample.xlsx",
"remainingCredits": 60568
}
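A caller will usually want the output link from a response like the one above. The sketch below reads only the fields shown in that example (error, status, url) and assumes inline is false, so that the response carries a link. The names extract_output_url and ConversionError are hypothetical, and the exact shape of an error response is an assumption.

```python
import json


class ConversionError(Exception):
    """Raised when the response reports an error (hypothetical helper type)."""


def extract_output_url(response_text: str) -> str:
    """Return the output file link from a conversion response body."""
    data = json.loads(response_text)
    if data.get("error"):
        # Assumption: error responses also carry a "status" field.
        raise ConversionError(
            f"conversion failed with status {data.get('status')}"
        )
    return data["url"]
```

Because output links expire (60 minutes by default, per the expiration parameter), the file should be downloaded or moved soon after the link is obtained.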
Code Snippet
CURL
curl --location --request POST 'https://api.pdf.co/v1/pdf/convert/to/xlsx' \
--header 'x-api-key: <YOUR_API_KEY>' \
--header 'Content-Type: application/json' \
--data-raw '{
"url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/pdf-to-excel/sample.pdf",
"async": false
}'
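For readers working in Python, the same request can be sketched with the standard library alone. The function names build_request and convert are hypothetical, and the Content-Type header is an assumption based on the JSON body; the endpoint URL, x-api-key header, and payload fields are taken from the cURL snippet above.

```python
import json
import urllib.request

API_URL = "https://api.pdf.co/v1/pdf/convert/to/xlsx"


def build_request(
    api_key: str, source_url: str, run_async: bool = False
) -> urllib.request.Request:
    """Build (but do not send) the POST request for the conversion."""
    payload = {"url": source_url, "async": run_async}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            # Assumption: the JSON body is declared explicitly.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def convert(api_key: str, source_url: str) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(api_key, source_url)) as response:
        return json.load(response)
```

Separating build_request from convert makes it possible to inspect the request without any network traffic; calling convert requires a valid API key and consumes credits.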
Knowledgebase
PDF to CSV/PDF to JSON - Enable consideration of font colors when separating columns
PDF to CSV - English language scanned PDF’s output csv contains characters of other language
PDF To Text, PDF to CSV, PDF To JSON, PDF TO XML - Extracting information about vector drawings
PDF to JSON/PDF to Text - fixing malformed PDF or incorrectly embedded font
PDF to CSV/XLS conversion fixing spaces in columns detection
PDF to JSON/XML/CSV/Text - forcing rotation of PDF prior to data extraction
PDF To XML and forcing OCR for text extraction from scanned images inside PDF
PDF to XLS, PDF to CSV, PDF to JSON - Issue with incorrect text line grouping split of cells
PDF to XML, PDF to JSON, PDF to CSV - Line Grouping Mode not working
PDF to CSV/PDF to JSON - Has some disappearing or weird Unicode characters
PDF to XLS fixing some numbers are removed after the conversion
PDF to Text with scanned documents configuring OCR text corrections
PDF to JSON/XML/CSV/Text - optimizing speed when it is slow due to a huge number of vector objects
PDF to CSV - Number of columns don’t match in a 2-page table.
PDF to CSV output adds strange character when opening in Excel
PDF to JSON - How to separate header and body text based on font size
PDF to CSV/JSON/XML - Performing explicit page rotation before converting PDF to XML/JSON/CSV
PDF to XML - Enable saving images inside xml as base64 encoded strings
Samples
- Blazor - Convert PDF To XLSX From Uploaded File
- C# - Convert PDF To XLSX From Uploaded File
- C# - Convert PDF To XLSX From URL
- C# - Convert PDF To XLSX From URL Asynchronously
- cURL - Convert PDF to XLSX
- Java - Convert PDF To XLSX From Uploaded File
- Java - Convert PDF To XLSX From URL
- JavaScript - Convert PDF To XLSX From Uploaded File (Node.js)
- JavaScript - Convert PDF To XLSX From Uploaded File (Node.js) - Async API
- JavaScript - Convert PDF To XLSX From URL (Node.js)
- JavaScript - Convert PDF To XLSX From URL (Node.js) - Async API
- JavaScript - Convert PDF To XLSX in JQuery
- JavaScript - Convert PDF To XLSX in JQuery - Async API
- PHP - Convert PDF To XLSX Asynchronously
- PHP - Convert PDF To XLSX From Uploaded File
- PowerShell - Convert PDF To XLSX From Uploaded File
- PowerShell - Convert PDF To XLSX From URL
- PowerShell - Convert PDF To XLSX From URL Asynchronously
- Python - Advanced Conversion Options
- Python - Advanced Conversion Options With Rotated Input
- Python - Convert PDF To Excel From Uploaded File
- Python - Convert PDF To Excel From Uploaded File Asynchronously
- VB.NET - Convert PDF To XLSX From Uploaded File
- VB.NET - Convert PDF To XLSX From URL
- VB.NET - Convert PDF To XLSX From URL Asynchronously
Copyright © 2016 - 2022 PDF.co