PDF Search Text
Search text in PDF and get coordinates. Supports regular expressions.
Available Methods
[POST] /pdf/find
url
required. URL to the source file. Supports links from Google Drive, Dropbox and the built-in PDF.co files storage.
To upload files via the API, see the Files Upload section.
If you randomly get Too Many Requests or Access Denied errors for your input url, try adding cache: to it to enable built-in URL caching.
You can also encrypt output files and decrypt input files with user-controlled data encryption (uses strong AES encryption with your own keys).
httpusername
optional. HTTP auth user name, if required to access the source url.
httppassword
optional. HTTP auth password, if required to access the source url.
searchString
Text to search for. Can contain regular expressions if you set the regexSearch param to true.
pages
optional. Comma-separated list of page indices (or ranges) to process. IMPORTANT: the very first page starts at 0 (zero). To set a range use the dash -, for example: 0,2-5,7-. To set a range from an index to the last page, leave the end of the range open, like this: 2- (from page #3, as the index starts at zero, till the end of the document). For ALL pages just leave this param empty. Example: 0,2-5,7- means the first page, then the 3rd page to the 6th page, and then the range from the 8th page (index = 7) till the end of the document. Must be a String.
inline
optional. Must be one of: true, false.
wordMatchingMode
optional. Must be a String. Values can be either ‘SmartMatch’, ‘ExactMatch’ or ‘None’.
password
optional. Password of the PDF file. Must be a String.
regexSearch
optional. Must be one of: true, false.
async
optional. Runs processing asynchronously. Returns a JobId that you may use with /job/check to check the state of the background job (possible states: working, failed, aborted and success). Must be one of: true, false.
name
optional. File name for the generated output. Must be a String.
expiration
optional. Output link expiration in minutes. Default is 60 (i.e. 60 minutes or 1 hour). After this delay, generated output file(s) (if any) will be auto-removed from PDF.co temporary files storage. The maximum allowed expiration period depends on your current subscription plan. To store permanent input files (e.g. re-usable images, pdf, documents), please use PDF.co built-in Files Storage instead.
profiles
optional. Must be a String. Use this parameter to set additional configuration for fine tuning and extra options. Explore the PDF.co knowledgebase for profile examples.
- Method: POST
- URL: /v1/pdf/find
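The pages syntax described in the parameter list can be easier to follow with a small local expansion. The sketch below is only an illustration of the documented notation: the function name expand_pages and the page_count argument are hypothetical, and the clamping of out-of-range indices is an assumption made here, not documented service behavior.

```python
def expand_pages(spec: str, page_count: int) -> list[int]:
    """Expand a pages string such as "0,2-5,7-" into zero-based page indices."""
    if not spec.strip():
        # An empty value means all pages.
        return list(range(page_count))
    indices: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An open-ended range such as "7-" runs to the last page.
            end = int(end_text) if end_text.strip() else page_count - 1
        else:
            start = end = int(part)
        # Clamp to existing pages (an assumption of this sketch).
        indices.extend(range(start, min(end, page_count - 1) + 1))
    return indices


# For a 10-page document, "0,2-5,7-" selects pages 0, 2 to 5, and 7 to 9.
print(expand_pages("0,2-5,7-", 10))
```

Running this for a 10-page document prints [0, 2, 3, 4, 5, 7, 8, 9], which matches the worded example in the pages description: the first page, the 3rd to 6th pages, and the 8th page to the end.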
Query parameters
No query parameters accepted.
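When async is set to true, the caller has to poll the background job until it reaches a final state. The following is a minimal polling sketch built only on the four states named in the async description. How the state is actually fetched from /job/check is not shown in this section, so get_status is a hypothetical caller-supplied function, and the interval and retry limit are arbitrary illustrative values.

```python
import time
from typing import Callable

# Final states taken from the async parameter description; "working" is not final.
FINAL_STATES = {"success", "failed", "aborted"}


def wait_for_job(
    get_status: Callable[[], str],
    interval_seconds: float = 2.0,
    max_checks: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status until it reports a final state or max_checks is reached."""
    for _ in range(max_checks):
        status = get_status()
        if status in FINAL_STATES:
            return status
        # "working" (or anything unrecognized) means: wait and ask again.
        sleep(interval_seconds)
    raise TimeoutError("job did not finish within the allowed number of checks")


# Stand-in for a real /job/check call: reports "working" twice, then "success".
fake_states = iter(["working", "working", "success"])
print(wait_for_job(lambda: next(fake_states), sleep=lambda _seconds: None))
```

Passing the status lookup and the sleep function as arguments keeps the loop independent of any HTTP client and makes it testable without network access.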
Body payload
{
  "async": "false",
  "encrypt": "false",
  "url": "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-to-text/sample.pdf",
  "searchString": "Invoice Date \\d+/\\d+/\\d+",
  "regexSearch": "true",
  "name": "output",
  "pages": "0-",
  "inline": "true",
  "wordMatchingMode": "",
  "password": ""
}
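The doubled backslashes in searchString above are JSON escaping: the regular expression itself uses single backslashes. The sketch below shows one way to build an equivalent payload in Python and let json.dumps handle the escaping. The local check with Python's re module is only an approximation, since the regex dialect used by the service is not specified in this section and may differ from Python's.

```python
import json
import re

# The regular expression as it is meant to be read: single backslashes.
pattern = r"Invoice Date \d+/\d+/\d+"

payload = {
    "url": "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-to-text/sample.pdf",
    "searchString": pattern,
    "regexSearch": "true",
    "pages": "0-",
    "inline": "true",
    "async": "false",
}

# json.dumps doubles each backslash, giving the form shown in the body payload.
body = json.dumps(payload, indent=2)
print(body)

# Local sanity check against the text from the example response
# (Python's regex engine is assumed to be close enough for this pattern).
sample_line = "Invoice Date 01/01/2016"
match = re.search(pattern, sample_line)
print(match.group(0) if match else "no match")
```

Building the payload from a dictionary avoids hand-escaping mistakes, which are easy to make when a regular expression is typed directly into a JSON string.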
Example responses
/pdf/find
{
  "body": [
    {
      "text": "Invoice Date 01/01/2016",
      "left": 436.5400085449219,
      "top": 130.4599995137751,
      "width": 122.85311957550027,
      "height": 11.040000486224898,
      "pageIndex": 0,
      "bounds": {
        "location": {
          "isEmpty": false,
          "x": 436.54,
          "y": 130.46
        },
        "size": "122.853119, 11.0400009",
        "x": 436.54,
        "y": 130.46,
        "width": 122.853119,
        "height": 11.0400009,
        "left": 436.54,
        "top": 130.46,
        "right": 559.3931,
        "bottom": 141.5,
        "isEmpty": false
      },
      "elementCount": 1,
      "elements": [
        {
          "index": 0,
          "left": 436.5400085449219,
          "top": 130.4599995137751,
          "width": 122.85311957550027,
          "height": 11.040000486224898,
          "angle": 0,
          "text": "Invoice Date 01/01/2016",
          "isNewLine": true,
          "fontIsBold": true,
          "fontIsItalic": false,
          "fontName": "Helvetica-Bold",
          "fontSize": 11,
          "fontColor": "0, 0, 0",
          "fontColorAsOleColor": 0,
          "fontColorAsHtmlColor": "#000000",
          "bounds": {
            "location": {
              "isEmpty": false,
              "x": 436.54,
              "y": 130.46
            },
            "size": "122.853119, 11.0400009",
            "x": 436.54,
            "y": 130.46,
            "width": 122.853119,
            "height": 11.0400009,
            "left": 436.54,
            "top": 130.46,
            "right": 559.3931,
            "bottom": 141.5,
            "isEmpty": false
          }
        }
      ]
    }
  ],
  "pageCount": 1,
  "error": false,
  "status": 200,
  "name": "output",
  "remainingCredits": 59970
}
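To show how the coordinates in this response relate to each other, here is a small sketch that flattens each match into a record and derives the right and bottom edges from left, top, width and height. It uses a trimmed copy of the example response above; the function name extract_matches and the choice to raise on "error": true are illustrative assumptions, not part of the API.

```python
import json

# A trimmed copy of the example response (only the fields used below).
response_text = """
{
  "body": [
    {
      "text": "Invoice Date 01/01/2016",
      "left": 436.5400085449219,
      "top": 130.4599995137751,
      "width": 122.85311957550027,
      "height": 11.040000486224898,
      "pageIndex": 0
    }
  ],
  "pageCount": 1,
  "error": false,
  "status": 200
}
"""


def extract_matches(response: dict) -> list[dict]:
    """Return one flat record per match, with the box corners computed."""
    if response.get("error"):
        raise RuntimeError(f"search failed with status {response.get('status')}")
    matches = []
    for item in response.get("body", []):
        matches.append({
            "text": item["text"],
            "page": item["pageIndex"],
            "left": item["left"],
            "top": item["top"],
            "right": item["left"] + item["width"],
            "bottom": item["top"] + item["height"],
        })
    return matches


for m in extract_matches(json.loads(response_text)):
    print(m["page"], m["text"], round(m["right"], 4), round(m["bottom"], 4))
```

The computed right and bottom values agree with the "right": 559.3931 and "bottom": 141.5 fields reported inside bounds, so either representation can be used when highlighting or cropping a match.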
Code Snippet
CURL
curl --location --request POST 'https://api.pdf.co/v1/pdf/find' \
  --header 'x-api-key: ' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "async": "false",
    "encrypt": "false",
    "url": "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-to-text/sample.pdf",
    "searchString": "Invoice Date \\d+/\\d+/\\d+",
    "regexSearch": "true",
    "name": "output",
    "pages": "0-",
    "inline": "true",
    "wordMatchingMode": "",
    "password": ""
  }'
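For readers working in Python, the cURL call above could be approximated with the standard library alone. This is a sketch under stated assumptions rather than an official client: the helper names build_find_request and find_text are hypothetical, "YOUR_API_KEY" is a placeholder, and the 30-second timeout is an arbitrary choice. The endpoint URL and header names are taken from the cURL snippet.

```python
import json
import urllib.request

API_URL = "https://api.pdf.co/v1/pdf/find"


def build_find_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for /v1/pdf/find."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def find_text(api_key: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (needs network access)."""
    request = build_find_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = {
    "url": "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-to-text/sample.pdf",
    "searchString": r"Invoice Date \d+/\d+/\d+",
    "regexSearch": "true",
    "pages": "0-",
    "inline": "true",
    "async": "false",
}

# Inspect the request without sending it; call find_text(...) to run it for real.
request = build_find_request("YOUR_API_KEY", payload)
print(request.get_method(), request.full_url)
```

Separating request construction from sending makes it possible to inspect the method, URL and body before any credits are spent on a live call.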
Samples
- C# - Async file upload and async Search Text
- C# - PDF Text Search from URL
- C# - PDF Text Search from URL Asynchronously
- C# - PDF Text Search from Uploaded File
- C# - PDF Text Search from Uploaded File Asynchronously
- Java - PDF Text Search from URL
- Java - PDF Text Search from URL Asynchronously
- Java - PDF Text Search from Uploaded File
- Java - PDF Text Search from Uploaded File Asynchronously
- JavaScript - PDF Text Search from URL (Node js)
- JavaScript - PDF Text Search from URL (Node js) - Async API
- JavaScript - PDF Text Search from Uploaded File (Node js)
- JavaScript - PDF Text Search from Uploaded File (Node js) - Async API
- PowerShell - PDF Text Search from URL
- PowerShell - PDF Text Search from URL Asynchronously
- PowerShell - PDF Text Search from Uploaded File
- PowerShell - PDF Text Search from Uploaded File Asynchronously
- Python - PDF Text Search from Uploaded File
- Python - PDF Text Search from Uploaded File Asynchronously
- VB.NET - Async file upload and async Search Text
- VB.NET - PDF Text Search from URL
- VB.NET - PDF Text Search from URL Asynchronously
- VB.NET - PDF Text Search from Uploaded File
- VB.NET - PDF Text Search from Uploaded File Asynchronously
- Zapier - Search in PDF
- cURL - PDF Search Text
Copyright © 2016 - 2023 PDF.co