PDF Search Text
Search text in PDF and get coordinates. Supports regular expressions.
Available Methods
[POST] /pdf/find
Attributes |
---|
url required URL to the source file. Supports links from Google Drive, Dropbox, and PDF.co built-in file storage. To upload files via the API, see the Files Upload section. Note: If you experience intermittent Too Many Requests or Access Denied errors, add the cache: prefix to enable built-in URL caching (e.g. cache:https://example.com/file1.pdf). For data security, you can encrypt output files and decrypt input files. Learn more about user-controlled data encryption. |
httpusername optional HTTP auth user name, if required to access the source url. |
httppassword optional HTTP auth password, if required to access the source url. |
searchString required Text to search for. It is treated as a regular expression if you set the regexSearch param to true. |
pages optional Comma-separated list of page indices (or ranges) to process. IMPORTANT: the very first page starts at 0 (zero). To set a range, use the dash -, for example: 2-5. To set a range from an index to the last page, leave the end open: 2- means from page #3 (the index starts at zero) to the end of the document. Example: 0,2-5,7- means the first page, then the 3rd page to the 6th page, and then the range from the 8th page (index = 7) to the end of the document. For ALL pages, just leave this param empty. The input must be in string format. |
inline optional Must be one of: true or false. |
wordMatchingMode optional Values can be either ‘SmartMatch’, ‘ExactMatch’, or ‘None’. |
password optional Password of the PDF file. The input must be in string format. |
regexSearch optional Must be one of: true or false. |
async optional Set async to true to run long processes in the background. The API then returns a jobId, which you can use with the /job/check endpoint to check the status of the process and retrieve the output, so you can proceed with other tasks without waiting for the process to finish. IMPORTANT: Also set the inline param to true to get a direct link to the final output pdf in both sync and async modes. Otherwise, you get a direct link to the pdf in sync mode but a link to a .json file in async mode. |
profiles optional Use this parameter to set additional configurations for fine-tuning and extra options. Explore the PDF.co knowledgebase for profile examples. The input must be in string format. |
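
The pages syntax described in the table can be illustrated with a small client-side sketch. This is not PDF.co code: the function name expand_pages and the page_count argument are hypothetical, and clamping out-of-range indices to the last page is an assumption made here for illustration, not documented API behavior.

```python
from typing import List


def expand_pages(spec: str, page_count: int) -> List[int]:
    """Expand a pages string such as "0,2-5,7-" into zero-based page indices."""
    if not spec.strip():
        # An empty value means "all pages".
        return list(range(page_count))

    indices: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An open-ended range such as "7-" runs to the last page.
            end = int(end_text) if end_text.strip() else page_count - 1
        else:
            start = end = int(part)
        # Assumption: indices past the end of the document are clamped.
        indices.extend(range(start, min(end, page_count - 1) + 1))
    return indices


print(expand_pages("0,2-5,7-", 10))  # [0, 2, 3, 4, 5, 7, 8, 9]
```

For a 10-page document, the example string selects the first page, the 3rd through 6th pages, and the 8th page through the end.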
- Method: POST
- URL: /v1/pdf/find
Query parameters
No query parameters accepted.
Body payload
{
"async": "false",
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
"searchString": "Invoice Date \\d+/\\d+/\\d+",
"regexSearch": "true",
"name": "output",
"pages": "0-",
"inline": "true",
"wordMatchingMode": "",
"password": ""
}
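A payload like the one above could be assembled and sent from Python roughly as follows. This is a minimal sketch using only the standard library. The helper names build_find_request and find_text are hypothetical, the API key is a placeholder, and the endpoint URL and x-api-key header are taken from the cURL snippet on this page. Note that json.dumps doubles the backslashes of the regular expression automatically, which is why the JSON body shows \\d+.

```python
import json
import urllib.request

API_URL = "https://api.pdf.co/v1/pdf/find"  # as used in the cURL snippet


def build_find_request(api_key, source_url, search_string, regex=False, pages="0-"):
    """Build (but do not send) a POST request for /v1/pdf/find."""
    payload = {
        "url": source_url,
        "searchString": search_string,
        # The sample payload passes booleans as strings, so this sketch does too.
        "regexSearch": "true" if regex else "false",
        "pages": pages,
        "inline": "true",
        "async": "false",
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def find_text(api_key, source_url, search_string, **kwargs):
    """Send the request and return the decoded JSON response (needs network access)."""
    request = build_find_request(api_key, source_url, search_string, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building the request does not touch the network; only find_text() does.
request = build_find_request(
    "YOUR_API_KEY",  # placeholder
    "https://example.com/file1.pdf",  # placeholder source file
    r"Invoice Date \d+/\d+/\d+",
    regex=True,
)
print(request.data.decode("utf-8"))
```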
Example responses
/pdf/find
{
"body": [
{
"text": "Invoice Date 01/01/2016",
"left": 436.5400085449219,
"top": 130.4599995137751,
"width": 122.85311957550027,
"height": 11.040000486224898,
"pageIndex": 0,
"bounds": {
"location": {
"isEmpty": false,
"x": 436.54,
"y": 130.46
},
"size": "122.853119, 11.0400009",
"x": 436.54,
"y": 130.46,
"width": 122.853119,
"height": 11.0400009,
"left": 436.54,
"top": 130.46,
"right": 559.3931,
"bottom": 141.5,
"isEmpty": false
},
"elementCount": 1,
"elements": [
{
"index": 0,
"left": 436.5400085449219,
"top": 130.4599995137751,
"width": 122.85311957550027,
"height": 11.040000486224898,
"angle": 0,
"text": "Invoice Date 01/01/2016",
"isNewLine": true,
"fontIsBold": true,
"fontIsItalic": false,
"fontName": "Helvetica-Bold",
"fontSize": 11,
"fontColor": "0, 0, 0",
"fontColorAsOleColor": 0,
"fontColorAsHtmlColor": "#000000",
"bounds": {
"location": {
"isEmpty": false,
"x": 436.54,
"y": 130.46
},
"size": "122.853119, 11.0400009",
"x": 436.54,
"y": 130.46,
"width": 122.853119,
"height": 11.0400009,
"left": 436.54,
"top": 130.46,
"right": 559.3931,
"bottom": 141.5,
"isEmpty": false
}
}
]
}
],
"pageCount": 1,
"error": false,
"status": 200,
"name": "output",
"remainingCredits": 59970
}
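To show how a response of this shape might be consumed, here is a short sketch that reduces each match to its text, page index, and bounding box. The function name summarize_matches is hypothetical, and the sample dictionary is an abbreviated copy of the response above. The right and bottom edges are derived as left + width and top + height, which agrees with the bounds values in the example.

```python
def summarize_matches(response: dict) -> list:
    """Return one small dict per match: text, page, and bounding box edges."""
    if response.get("error"):
        raise RuntimeError(f"search failed with status {response.get('status')}")

    matches = []
    for item in response.get("body", []):
        matches.append({
            "text": item["text"],
            "page": item["pageIndex"],  # zero-based, like the pages param
            "left": item["left"],
            "top": item["top"],
            "right": item["left"] + item["width"],
            "bottom": item["top"] + item["height"],
        })
    return matches


# Abbreviated copy of the example response shown above.
sample_response = {
    "body": [
        {
            "text": "Invoice Date 01/01/2016",
            "left": 436.5400085449219,
            "top": 130.4599995137751,
            "width": 122.85311957550027,
            "height": 11.040000486224898,
            "pageIndex": 0,
        }
    ],
    "pageCount": 1,
    "error": False,
    "status": 200,
}

for match in summarize_matches(sample_response):
    print(match["page"], match["text"], round(match["right"], 2), round(match["bottom"], 2))
```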
Code Snippet
CURL
curl --location --request POST 'https://api.pdf.co/v1/pdf/find' \
--header 'x-api-key: ' \
--header 'Content-Type: application/json' \
--data-raw '{
"async": "false",
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
"searchString": "Invoice Date \\d+/\\d+/\\d+",
"regexSearch": "true",
"name": "output",
"pages": "0-",
"inline": "true",
"wordMatchingMode": "",
"password": ""
}'
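When async is set to true, the attributes table says the API returns a jobId to use with the /job/check endpoint. A polling loop for that flow could look roughly like the sketch below. It is deliberately transport-agnostic: check_job is a hypothetical callable you would implement to call /job/check and return its decoded JSON, and treating the status value "working" as "still in progress" is an assumption to verify against the job/check documentation.

```python
import time


def wait_for_job(job_id, check_job, interval=2.0, timeout=120.0, sleep=time.sleep):
    """Poll check_job(job_id) until the job leaves the in-progress state."""
    deadline = time.monotonic() + timeout
    while True:
        result = check_job(job_id)
        # Assumption: "working" marks a job that has not finished yet.
        if result.get("status") != "working":
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        sleep(interval)


# Offline demonstration with canned responses instead of real API calls.
canned = iter([{"status": "working"}, {"status": "working"}, {"status": "success"}])
final = wait_for_job(
    "hypothetical-job-id",
    lambda job_id: next(canned),
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(final["status"])
```

Passing the checker and the sleep function as arguments keeps the loop easy to test without network access. In real use, check_job would send the jobId to /job/check with the same x-api-key header.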
Samples
- C# - Async file upload and async Search Text
- C# - PDF Text Search from URL
- C# - PDF Text Search from URL Asynchronously
- C# - PDF Text Search from Uploaded File
- C# - PDF Text Search from Uploaded File Asynchronously
- Java - PDF Text Search from URL
- Java - PDF Text Search from URL Asynchronously
- Java - PDF Text Search from Uploaded File
- Java - PDF Text Search from Uploaded File Asynchronously
- JavaScript - PDF Text Search from URL (Node js)
- JavaScript - PDF Text Search from URL (Node js) - Async API
- JavaScript - PDF Text Search from Uploaded File (Node js)
- JavaScript - PDF Text Search from Uploaded File (Node js) - Async API
- PowerShell - PDF Text Search from URL
- PowerShell - PDF Text Search from URL Asynchronously
- PowerShell - PDF Text Search from Uploaded File
- PowerShell - PDF Text Search from Uploaded File Asynchronously
- Python - PDF Text Search from Uploaded File
- Python - PDF Text Search from Uploaded File Asynchronously
- Salesforce - Search Text From PDF
- VB.NET - Async file upload and async Search Text
- VB.NET - PDF Text Search from URL
- VB.NET - PDF Text Search from URL Asynchronously
- VB.NET - PDF Text Search from Uploaded File
- VB.NET - PDF Text Search from Uploaded File Asynchronously
- Zapier - Search in PDF
- cURL - PDF Search Text
Copyright © 2016 - 2023 PDF.co