How to parse documents with OCR using the Document Parser API in Java and PDF.co Web API

PDF.co Web API is a flexible Web API that includes a full set of functions, from e-signature requests to data extraction, OCR, image recognition, PDF splitting and PDF merging. It can also generate barcodes and read barcodes from images, scans and PDFs.

On-demand (REST Web API) version:
 Web API (on-demand version)

On-premise offline SDK for Windows:
 60 Day Free Trial (on-premise)

ByteScoutWebApiExample.iml

      
<?xml version="1.0" encoding="UTF-8"?>
<module type="JAVA_MODULE" version="4">
  <component name="NewModuleRootManager" inherit-compiler-output="true">
    <exclude-output />
    <content url="file://$MODULE_DIR$">
      <sourceFolder url="file://$MODULE_DIR$/src" isTestSource="false" />
    </content>
    <orderEntry type="inheritedJdk" />
    <orderEntry type="sourceFolder" forTests="false" />
    <orderEntry type="library" name="com.google.code.gson:gson:2.8.1" level="project" />
    <orderEntry type="library" name="com.squareup.okhttp3:okhttp:3.8.1" level="project" />
  </component>
</module>
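The module file above declares Gson and OkHttp as libraries, which suggests the sample builds a JSON payload and sends it over HTTP. The sample's Java source is not shown here, so what follows is only a minimal sketch that uses the JDK's built-in java.net.http (Java 11+) in place of OkHttp and builds the request without sending it. The endpoint URL, the x-api-key header, and the url and template body fields are assumptions that should be checked against the PDF.co API reference; the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class DocumentParserRequestSketch {
    // Assumed endpoint; verify against the PDF.co API reference before use.
    static final String ENDPOINT = "https://api.pdf.co/v1/pdf/documentparser";

    /** Escapes quotes, backslashes and common control characters for a JSON string. */
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds a JSON body; the field names "url" and "template" are assumptions. */
    static String buildBody(String fileUrl, String templateYaml) {
        return "{\"url\":\"" + jsonEscape(fileUrl)
             + "\",\"template\":\"" + jsonEscape(templateYaml) + "\"}";
    }

    /** Builds (but does not send) a POST request carrying the API key header. */
    static HttpRequest buildRequest(String apiKey, String body) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("x-api-key", apiKey) // assumed header name
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        String body = buildBody("https://example.com/invoice.pdf", "templateVersion: 3\n");
        HttpRequest request = buildRequest("YOUR_API_KEY", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

To actually send the request, it could be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. A JSON library such as Gson would be a more robust choice than the hand-written escaping used here.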

DigitalOcean.yml

      
---
templateVersion: 3
templatePriority: 0
sourceId: DigitalOcean Invoice
detectionRules:
  keywords:
  # Template will match documents containing the following phrases:
  - DigitalOcean
  - 101 Avenue of the Americas
  - Invoice Number
fields:
  # Static field that will add "DigitalOcean" to the result
  companyName:
    type: static
    expression: DigitalOcean
  # Macro field that will find the text "Invoice Number: 1234567" and return "1234567" to the result
  invoiceId:
    type: macros
    expression: 'Invoice Number: ({{Digits}})'
  # Macro field that will find the text "Date Issued: February 1, 2016" and return the date "February 1, 2016" in ISO format to the result
  dateIssued:
    type: macros
    expression: 'Date Issued: ({{SmartDate}})'
    dataType: date
    dateFormat: auto-mdy
  # Macro field that will find the text "Total: $110.00" and return "110.00" to the result
  total:
    type: macros
    expression: 'Total: {{Dollar}}({{Number}})'
    dataType: decimal
  # Static field that will add "USD" to the result
  currency:
    type: static
    expression: USD
tables:
- name: table1
  # The table will start after the text "Description Hours"
  start:
    expression: 'Description{{Spaces}}Hours'
  # The table will end before the text "Total:"
  end:
    expression: 'Total:'
  # Macro expression that will find table rows "Website-Dev (1GB) 744 01-01 00:00 01-31 23:59 $10.00", etc.
  row:
    # Groups <description>, <hours>, <start>, <end> and <unitPrice> will become columns in the result table.
    expression: '{{LineStart}}{{Spaces}}(?<description>{{SentenceWithSingleSpaces}}){{Spaces}}(?<hours>{{Digits}}){{Spaces}}(?<start>{{2Digits}}{{Minus}}{{2Digits}}{{Space}}{{2Digits}}{{Colon}}{{2Digits}}){{Spaces}}(?<end>{{2Digits}}{{Minus}}{{2Digits}}{{Space}}{{2Digits}}{{Colon}}{{2Digits}}){{Spaces}}{{Dollar}}(?<unitPrice>{{Number}})'
  # Suggest data types for table columns (missing columns will have the default "string" type):
  columns:
  - name: hours
    type: integer
  - name: unitPrice
    type: decimal
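To make the template's macro expressions more concrete, the sketch below approximates a few of them with plain java.util.regex patterns and runs them against the sample strings quoted in the template comments. The regex stand-ins chosen for {{Digits}}, {{Dollar}}, {{Number}}, {{Spaces}} and {{SentenceWithSingleSpaces}} are assumptions made for illustration; the parser's real macro definitions may differ. The class name and helpers are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateMacroSketch {
    // Assumed regex equivalents of the template macros (illustrative only).
    static final String DIGITS = "\\d+";
    static final String DOLLAR = "\\$";
    static final String NUMBER = "\\d+(?:\\.\\d+)?";
    static final String TIME_STAMP = "\\d{2}-\\d{2} \\d{2}:\\d{2}"; // e.g. "01-01 00:00"

    // Rough analogue of the table "row" expression, using the same group names.
    static final Pattern ROW = Pattern.compile(
            "^\\s*(?<description>\\S+(?: \\S+)*)\\s+(?<hours>" + DIGITS + ")\\s+"
          + "(?<start>" + TIME_STAMP + ")\\s+(?<end>" + TIME_STAMP + ")\\s+"
          + DOLLAR + "(?<unitPrice>" + NUMBER + ")");

    /** Returns the first capture group of regex in text, or null if nothing matches. */
    static String firstGroup(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    /** Analogue of the invoiceId field: 'Invoice Number: ({{Digits}})'. */
    static String invoiceId(String text) {
        return firstGroup("Invoice Number: (" + DIGITS + ")", text);
    }

    /** Analogue of the total field: 'Total: {{Dollar}}({{Number}})'. */
    static String total(String text) {
        return firstGroup("Total: " + DOLLAR + "(" + NUMBER + ")", text);
    }

    public static void main(String[] args) {
        System.out.println(invoiceId("Invoice Number: 1234567")); // prints 1234567
        System.out.println(total("Total: $110.00"));              // prints 110.00

        Matcher row = ROW.matcher("Website-Dev (1GB) 744 01-01 00:00 01-31 23:59 $10.00");
        if (row.find()) {
            System.out.println(row.group("description")); // prints Website-Dev (1GB)
            System.out.println(row.group("hours"));       // prints 744
            System.out.println(row.group("unitPrice"));   // prints 10.00
        }
    }
}
```

The parentheses in each field expression mark the part of the match that is returned, which is why "Invoice Number: 1234567" yields only "1234567". In the row expression, each named group becomes a column of the result table.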
