Parse From URL - PowerShell
Document Parser sample in PowerShell demonstrating 'Parse From URL'
MultiPageTable-template1.yml
templateName: Multipage Table Test
templateVersion: 4
templatePriority: 0
detectionRules:
  keywords:
  - Sample document with multi-page table
objects:
- name: total
  objectType: field
  fieldProperties:
    fieldType: macros
    expression: TOTAL{{Spaces}}({{Number}})
    regex: true
    dataType: decimal
- name: table1
  objectType: table
  tableProperties:
    start:
      expression: Item{{Spaces}}Description{{Spaces}}Price
      regex: true
    end:
      expression: TOTAL{{Spaces}}{{Number}}
      regex: true
    row:
      expression: '{{LineStart}}{{Spaces}}(?<itemNo>{{Digits}}){{Spaces}}(?<description>{{SentenceWithSingleSpaces}}){{Spaces}}(?<price>{{Number}}){{Spaces}}(?<qty>{{Digits}}){{Spaces}}(?<extPrice>{{Number}})'
      regex: true
    columns:
    - name: itemNo
      dataType: integer
    - name: description
      dataType: string
    - name: price
      dataType: decimal
    - name: qty
      dataType: integer
    - name: extPrice
      dataType: decimal
    multipage: true
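To see how the row expression's named groups map onto the declared columns, the following is a minimal offline sketch. It assumes simple regex stand-ins for the template macros ({{Spaces}}, {{Digits}}, {{Number}}, and so on); the service's actual macro definitions are not shown in this sample and may differ. The sample line is made up for illustration.

```powershell
# Illustrative stand-ins for the template macros (assumptions, not the service's definitions).
$macros = [ordered]@{
    LineStart                = '^'
    Spaces                   = '\s+'
    Digits                   = '\d+'
    Number                   = '\d+(?:\.\d+)?'
    SentenceWithSingleSpaces = '\S+(?: \S+)*'
}

# Row expression copied from the template above.
$rowTemplate = '{{LineStart}}{{Spaces}}(?<itemNo>{{Digits}}){{Spaces}}(?<description>{{SentenceWithSingleSpaces}}){{Spaces}}(?<price>{{Number}}){{Spaces}}(?<qty>{{Digits}}){{Spaces}}(?<extPrice>{{Number}})'

# Expand each {{Macro}} placeholder into its stand-in regex fragment.
$rowPattern = $rowTemplate
foreach ($name in $macros.Keys) {
    $rowPattern = $rowPattern.Replace('{{' + $name + '}}', $macros[$name])
}

# A made-up table line in the Item / Description / Price layout.
$line = '   1   Blue widget large   12.50   3   37.50'

$row = $null
if ($line -match $rowPattern) {
    # Cast each named group to the dataType declared under "columns".
    $row = [pscustomobject]@{
        itemNo      = [int]$Matches['itemNo']
        description = $Matches['description']
        price       = [decimal]$Matches['price']
        qty         = [int]$Matches['qty']
        extPrice    = [decimal]$Matches['extPrice']
    }
}
$row
```

Each `(?<name>...)` group in the row expression corresponds to one entry under `columns`, which is why the group names and the column names must agree.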
ParseFromUrl.ps1
# The authentication key (API Key).
# Get your own by registering at https://app.pdf.co
$API_KEY = "***********************************"
# Source PDF file url
$SourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/document-parser/MultiPageTable.pdf"
# Destination JSON file name
$DestinationFile = ".\result.json"
try {
    # Parse url

    # Template text. Use Document Parser (https://pdf.co/document-parser, https://app.pdf.co/document-parser)
    # to create templates.
    # Read template from file:
    $templateContent = [IO.File]::ReadAllText(".\MultiPageTable-template1.yml")

    # Prepare URL for `Document Parser` API call
    $query = "https://api.pdf.co/v1/pdf/documentparser"

    # Content
    $Body = @{
        "url" = $SourceFileUrl;
        "template" = $templateContent;
    }

    # Execute request
    $jsonResponse = Invoke-RestMethod -Method 'Post' -Headers @{ "x-api-key" = $API_KEY } -Uri $query -Body ($Body|ConvertTo-Json) -ContentType "application/json"

    if ($jsonResponse.error -eq $false) {
        # Get URL of generated JSON file
        $resultFileUrl = $jsonResponse.url;

        # Download output file
        Invoke-WebRequest -Headers @{ "x-api-key" = $API_KEY } -OutFile $DestinationFile -Uri $resultFileUrl

        Write-Host "Generated output file saved as `"$($DestinationFile)`" file."
    }
    else {
        # Display service reported error
        Write-Host $jsonResponse.message
    }
}
catch {
    # Display request error
    Write-Host $_.Exception
}
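The request body and the response check in ParseFromUrl.ps1 can be exercised without calling the service. The sketch below runs offline: the response objects are hand-built stand-ins that assume only the `error`, `message`, and `url` fields the script already reads, the URLs are placeholders, and `Get-ParserResultUrl` is a hypothetical helper name rather than part of any PDF.co tooling.

```powershell
# Build the same body shape the script sends; the template text is a short placeholder.
$body = @{
    url      = 'https://example.com/sample.pdf'   # placeholder URL, not a real document
    template = "templateName: Demo`ntemplateVersion: 4"
}
$json = $body | ConvertTo-Json

# ConvertTo-Json escapes the newline, so a multi-line template travels as one JSON string.
$roundTrip = $json | ConvertFrom-Json

# Hypothetical helper: return the result URL, or throw with the service's message.
function Get-ParserResultUrl {
    param([Parameter(Mandatory)] $Response)
    if ($Response.error -eq $false) {
        return $Response.url
    }
    throw "Document Parser reported an error: $($Response.message)"
}

# Hand-built stand-ins for a successful and a failed response.
$ok     = [pscustomobject]@{ error = $false; url = 'https://example.com/result.json' }
$failed = [pscustomobject]@{ error = $true; message = 'Invalid template' }

$resultUrl = Get-ParserResultUrl -Response $ok

$failure = $null
try {
    Get-ParserResultUrl -Response $failed | Out-Null
}
catch {
    $failure = $_.Exception.Message
}
```

Throwing on a service-reported error is one possible design: it would let a single `catch` block handle both transport failures and errors returned in the response body.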
run.bat
@echo off
powershell -NoProfile -ExecutionPolicy Bypass -Command "& .\ParseFromUrl.ps1"
echo Script finished with errorlevel=%errorlevel%
pause
PDF.co Web API: a Web API with a set of tools for document manipulation, data conversion, data extraction, and splitting and merging of documents. It includes image recognition, built-in OCR, barcode generation, and barcode decoders for reading barcodes from scans, pictures, and PDFs.
Copyright © 2016 - 2023 PDF.co