


Convert PDF to spreadsheet with layout and fonts preserved.

Available Methods

[POST] /pdf/convert/to/xlsx (xlsx output)

Auto Classification of Incoming Documents

Use the /pdf/classifier (Document Classifier) endpoint to automatically sort documents or detect a document's class based on keyword-based rules. For example, you can define rules that identify which vendor provided a document and then choose which template to apply.


  • url required. URL to the source file. Supports links from Google Drive, Dropbox, and the built-in file storage. To upload files via the API, see the Files Upload section. If you intermittently get a Too Many Requests or Access Denied error for your input URL, try adding cache: to enable built-in URL caching. You can also encrypt output files and decrypt input files with user-controlled data encryption (strong AES encryption with your own keys).
  • httpusername optional. HTTP auth user name, if required to access the source URL.
  • httppassword optional. HTTP auth password, if required to access the source URL.
  • pages optional. Comma-separated list of page indices (or ranges) to process. IMPORTANT: the first page has index 0 (zero). Use a dash - to set a range, and leave the end open to run to the last page: 2- means from the 3rd page (index 2) to the end of the document. Example: 0,2-5,7- means the first page, then the 3rd through 6th pages, and then everything from the 8th page (index 7) to the end of the document. To process ALL pages, leave this parameter empty. Must be a String.
  • unwrap optional. Unwrap lines into a single line within table cells when lineGrouping is enabled. Must be one of: true, false.
  • rect optional. Defines coordinates for extraction, e.g. 51.8, 114.8, 235.5, 204.0. You can use PDF Viewer with coordinates to easily select and copy coordinates. Must be a String.
  • lang optional. Sets the OCR (text from image) language to use when extracting text from scanned PDF, PNG, or JPG input. Default is “eng”. Other languages are also supported: deu, spa, chi_sim, jpn and many others (see the full list of supported OCR languages). You can also use two languages simultaneously, for example eng+deu or jpn+kor (any combination).
  • inline optional. Set to true to return data inline, or false to return a link to the output file (default). Must be one of: true, false.
  • lineGrouping optional. Line grouping within table cells. Set to 1 to enable grouping. Must be a String.

  • encrypt legacy. All files are now stored in encrypted cloud storage by default.
  • async optional. Runs processing asynchronously. Returns a JobId that you can use with /job/check to check the state of the processing (possible states: working, failed, aborted and success). Must be one of: true, false.
  • name optional. File name for generated output. Must be a String.
  • expiration optional. Output link expiration in minutes. Default is 60 (i.e. 1 hour). After this period, generated output files (if any) are automatically removed from temporary file storage. The maximum allowed expiration period depends on your current subscription plan. To store input files permanently (e.g. reusable images, PDFs, documents), use the built-in Files Storage instead.
  • profiles optional. Must be a String. Use this param to set additional configuration for fine tuning and extra options. Explore knowledgebase for profile examples. For example, to change CSV separator: { 'profiles': [ { 'profile1': { 'CSVSeparatorSymbol': ';' } } ] }
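
The zero-based page syntax described for the pages parameter can be easy to misread, so here is a minimal Python sketch of how a spec such as 0,2-5,7- expands into page indices. The function name expand_pages and the page_count argument are hypothetical and exist only for illustration; this is a local interpretation of the rules stated above, not the service's own parser.

```python
def expand_pages(spec: str, page_count: int) -> list[int]:
    """Expand a zero-based page spec such as "0,2-5,7-" into page indices."""
    if not spec.strip():
        # An empty spec means "all pages".
        return list(range(page_count))
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An open-ended range such as "7-" runs to the last page.
            end = int(end_text) if end_text.strip() else page_count - 1
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(part))
    return pages


# For a 10-page document: first page, 3rd to 6th pages, then 8th page to the end.
print(expand_pages("0,2-5,7-", page_count=10))  # prints [0, 2, 3, 4, 5, 7, 8, 9]
```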

  • Method: POST
  • URL: /v1/pdf/convert/to/xlsx

Body payload

    {
        "url": "",
        "async": false
    }

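As a rough illustration of how the optional parameters fit into the body payload, the following Python sketch assembles the JSON and omits anything that was not set. The helper name build_payload, its argument names, and the example.com source URL are assumptions made for this example; only the JSON keys (url, async, pages, lang, inline, name) come from the parameter list.

```python
import json


def build_payload(url, pages=None, lang=None, inline=None, run_async=False, name=None):
    """Build the JSON body, leaving out optional parameters that were not set."""
    payload = {"url": url, "async": run_async}
    optional = {"pages": pages, "lang": lang, "inline": inline, "name": name}
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


# Hypothetical source URL; replace it with a link to your own file.
body = build_payload(
    "https://example.com/sample.pdf",
    pages="0,2-5",
    lang="eng+deu",
    name="sample.xlsx",
)
print(json.dumps(body, indent=4))
```

The resulting JSON string would be sent as the POST body to /v1/pdf/convert/to/xlsx, together with the x-api-key header.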
Example responses

    {
        "url": "",
        "pageCount": 1,
        "error": false,
        "status": 200,
        "name": "sample.xlsx",
        "remainingCredits": 60568
    }

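A caller will usually want to check the error flag before using the output link. The sketch below shows one possible way to do that in Python, assuming the response has already been parsed into a dictionary with the fields shown in the example response. The function name output_link and the example.com link are hypothetical.

```python
def output_link(response: dict) -> str:
    """Return the output file link, or raise if the response reports an error."""
    if response.get("error"):
        raise RuntimeError(f"Conversion failed with status {response.get('status')}")
    return response["url"]


sample = {
    "url": "https://example.com/output/sample.xlsx",  # hypothetical link
    "pageCount": 1,
    "error": False,
    "status": 200,
    "name": "sample.xlsx",
    "remainingCredits": 60568,
}
print(output_link(sample))
```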
Code Snippet

curl --location --request POST '' \
--header 'x-api-key: ' \
--data-raw '{
    "url": "",
    "async": false