Parse From Url - VB.NET
Document Parser sample in VB.NET demonstrating ‘Parse From Url’
Module1.vb
Imports System.Collections.Specialized
Imports System.IO
Imports System.Net
Imports System.Text
Imports Newtonsoft.Json.Linq
Module Module1

    ' The authentication key (API Key).
    ' Get your own by registering at https://app.pdf.co
    Const API_KEY As String = "***********************************"

    ' Source PDF file URL
    ' You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/
    Const SourceFileUrl As String = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/document-parser/MultiPageTable.pdf"

    ' Destination JSON file name
    Const DestinationFile As String = ".\result.json"

    Sub Main()

        ' Template text. Use Document Parser (https://pdf.co/document-parser, https://app.pdf.co/document-parser)
        ' to create templates.
        ' Read template from file
        Dim templateText As String = File.ReadAllText("MultiPageTable-template1.yml")

        ' Create standard .NET web client instance
        Dim webClient As WebClient = New WebClient()

        ' Set API Key
        webClient.Headers.Add("x-api-key", API_KEY)

        Try
            ' PARSE PDF DOCUMENT FROM URL

            ' URL for `Document Parser` API call
            Dim query As String = "https://api.pdf.co/v1/pdf/documentparser"

            ' Form fields sent with the request: source file URL and template text
            Dim requestBody As New NameValueCollection()
            requestBody.Add("url", SourceFileUrl)
            requestBody.Add("template", templateText)

            ' Execute request
            Dim responseBytes As Byte() = webClient.UploadValues(query, "POST", requestBody)
            Dim response As String = Encoding.UTF8.GetString(responseBytes)

            ' Parse JSON response
            Dim json As JObject = JObject.Parse(response)

            If json("error").ToObject(Of Boolean) = False Then
                ' Get URL of generated JSON file
                Dim resultFileUrl As String = json("url").ToString()

                ' Download JSON file
                webClient.DownloadFile(resultFileUrl, DestinationFile)

                Console.WriteLine("Generated JSON file saved as {0} file.", DestinationFile)
            Else
                ' The API reported an error; print its message
                Console.WriteLine(json("message").ToString())
            End If
        Catch ex As WebException
            Console.WriteLine(ex.ToString())
        End Try

        webClient.Dispose()

        Console.WriteLine()
        Console.WriteLine("Press any key...")
        Console.ReadKey()

    End Sub

End Module
MultiPageTable-template1.yml
templateName: Multipage Table Test
templateVersion: 4
templatePriority: 0
detectionRules:
  keywords:
  - Sample document with multi-page table
objects:
- name: total
  objectType: field
  fieldProperties:
    fieldType: macros
    expression: TOTAL{{Spaces}}({{Number}})
    regex: true
    dataType: decimal
- name: table1
  objectType: table
  tableProperties:
    start:
      expression: Item{{Spaces}}Description{{Spaces}}Price
      regex: true
    end:
      expression: TOTAL{{Spaces}}{{Number}}
      regex: true
    row:
      expression: '{{LineStart}}{{Spaces}}(?<itemNo>{{Digits}}){{Spaces}}(?<description>{{SentenceWithSingleSpaces}}){{Spaces}}(?<price>{{Number}}){{Spaces}}(?<qty>{{Digits}}){{Spaces}}(?<extPrice>{{Number}})'
      regex: true
    columns:
    - name: itemNo
      dataType: integer
    - name: description
      dataType: string
    - name: price
      dataType: decimal
    - name: qty
      dataType: integer
    - name: extPrice
      dataType: decimal
    multipage: true
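RowExpressionSketch.vb
The row expression in the template mixes macros such as {{Spaces}} with regex named groups such as (?<itemNo>...), and each named group corresponds to one entry under columns. The sketch below illustrates that idea locally with the standard .NET Regex class. It is only an approximation: the regex patterns substituted for the macros and the sample text line are assumptions made for illustration, and the patterns PDF.co actually uses for these macros may differ.

```vb
Imports System
Imports System.Text.RegularExpressions

Module RowExpressionSketch

    Sub Main()
        ' Row expression copied from the template above
        Dim pattern As String =
            "{{LineStart}}{{Spaces}}(?<itemNo>{{Digits}}){{Spaces}}" &
            "(?<description>{{SentenceWithSingleSpaces}}){{Spaces}}" &
            "(?<price>{{Number}}){{Spaces}}(?<qty>{{Digits}}){{Spaces}}" &
            "(?<extPrice>{{Number}})"

        ' ASSUMED macro expansions, for illustration only;
        ' these are not taken from the PDF.co documentation.
        pattern = pattern.Replace("{{LineStart}}", "^")
        pattern = pattern.Replace("{{Spaces}}", "\s+")
        pattern = pattern.Replace("{{Digits}}", "\d+")
        pattern = pattern.Replace("{{Number}}", "\d+(?:\.\d+)?")
        pattern = pattern.Replace("{{SentenceWithSingleSpaces}}", "\S+(?: \S+)*")

        ' Hypothetical table row, shaped like a line of extracted PDF text
        Dim sampleLine As String = "  1   Item description one   10.00   2   20.00"

        Dim rowMatch As Match = Regex.Match(sampleLine, pattern)

        If rowMatch.Success Then
            ' Each named group plays the role of one column in the template
            For Each columnName As String In New String() {"itemNo", "description", "price", "qty", "extPrice"}
                Console.WriteLine("{0} = {1}", columnName, rowMatch.Groups(columnName).Value)
            Next
        Else
            Console.WriteLine("The sample line does not match the row expression.")
        End If
    End Sub

End Module
```

Because the group names in the expression match the column names in the template, a new column would need both a named group in the row expression and a matching entry under columns.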
PDF.co Web API: a Web API with a set of tools for document manipulation, data conversion, data extraction, and splitting and merging of documents. It includes image recognition, built-in OCR, barcode generation, and barcode decoders that read barcodes from scans, pictures, and PDF files.
Copyright © 2016 - 2023 PDF.co