PDF.co Document Parser is a cloud-based platform that provides tools for automated data extraction and document processing. It allows you to extract structured data from PDF and scanned documents, automate document workflows, and integrate with other software applications using APIs.
It is designed for businesses and individuals who need to process large volumes of documents without manual data entry. Typical uses include invoice processing, document archiving, and data entry. The platform includes several tools for automating these workflows: the Template Editor, the Document Parser API, the PDF Extractor API, and the PDF Converter API.
The tutorial below shows how to extract a phone number from a PDF using the PDF.co Web API. It covers the following steps:
- Open PDF.co Account
- Document Parser Page
- Create New Template
- Run Template Result
- Created Template ID
- Request Tester Menu
- Request Tester Tool
- Add Template ID
- Run Request Result
- Extracted JSON Output
We will use this sample PDF document and extract the phone number from it using the PDF.co Document Parser.
Step 1: Open PDF.co Account
- Log into your PDF.co account and click the Document Parser menu.
Step 2: Document Parser Page
- On your Document Parser page, click on the New Template button.
Step 3: Create New Template
Let’s extract the phone number and create a template ID.
- First, click on the Load Test PDF or Image button to load the source file.
- Next, click the Add Object button and select Add FIELD from the RECTANGLE section.
- Then, drag the rectangle over the phone number to mark the area to extract, and set the object name in its properties.
- Once the template is all set up, click on the Run Template button to see the result.
Step 4: Run Template Result
- Once the template runs successfully and returns the extracted phone number, click the Save Template and Return button.
Step 5: Created Template ID
- Here’s the ID of the template that extracts the phone number.
Now, we will pass this template ID to the PDF.co API to extract the data.
Step 6: Request Tester Menu
- Let’s go back to the PDF.co dashboard and click on the Request Tester menu.
Step 7: Request Tester Tool
- For the Choose PDF.co API Endpoint field, select /pdf/documentparser (Output as JSON). This API endpoint extracts data from documents based on a document parser extraction template. With this API method, you can extract data from custom areas, by text search, and from form fields, tables, and multiple pages. You may also choose XML or CSV as the output format.
- For the Input parameters field, replace the default URL param with a link to your source file, or upload the file instead.
Step 8: Add Template ID
- For the templateId param, add the ID of the template you created from PDF.co Document Parser Template Editor. Here’s a quick way to create a template.
Note: The default URL will automatically be replaced once you’ve added your source file.
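The request that the Request Tester assembles in Steps 7 and 8 can also be sketched in code. The Python snippet below is a minimal sketch, not an official client: it only builds the POST request and does not send it. The full endpoint URL, the `x-api-key` header, and the `outputFormat` parameter name are assumptions based on common PDF.co usage (the tutorial itself only names the `/pdf/documentparser` path and the `url` and `templateId` params), and the key, link, and ID values are placeholders.

```python
import json
import urllib.request

# Assumed full URL; the tutorial only names the path /pdf/documentparser.
API_URL = "https://api.pdf.co/v1/pdf/documentparser"


def build_parser_request(api_key, source_url, template_id, output_format="JSON"):
    """Build (but do not send) a POST request for the Document Parser endpoint."""
    payload = {
        "url": source_url,              # link to the source PDF (Step 7)
        "templateId": template_id,      # ID of the saved template (Step 8)
        "outputFormat": output_format,  # assumed parameter name: JSON, XML, or CSV
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        # Assumed authentication header name.
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values for illustration only.
request = build_parser_request("YOUR_API_KEY", "https://example.com/sample.pdf", "1234")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To actually send it, you could pass the request object to `urllib.request.urlopen` and read the JSON response, once you have confirmed the parameter names against the PDF.co API documentation.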
Step 9: Run Request Result
- Great! PDF.co successfully processed our request and returned an output URL. Click the resulting URL to view the output, or download the output file directly.
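If you call the API from code instead of the Request Tester, you would read the output URL from the response yourself. The sketch below assumes the response is a JSON object with an `error` flag, an optional `message`, and a `url` field pointing at the output file. These field names are assumptions to verify against a real response, and the sample response is made up for illustration.

```python
def output_url_from_response(response):
    """Return the output URL from a parsed API response, or raise on failure.

    Assumes the response dict has 'error', 'message', and 'url' keys.
    """
    if response.get("error"):
        raise RuntimeError("Request failed: " + str(response.get("message", "unknown error")))
    url = response.get("url")
    if not url:
        raise RuntimeError("Response did not include an output URL")
    return url


# Hypothetical response, for illustration only.
sample_response = {"error": False, "url": "https://example.com/output.json"}
print(output_url_from_response(sample_response))
```

Raising an exception on the error flag keeps a failed request from being mistaken for an empty result further down the workflow.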
Step 10: Extracted JSON Output
- Here’s the extracted phone number from a PDF document in JSON format.
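Once you have the JSON output, picking out the phone number takes only a few lines. The sketch below assumes the output contains an `objects` list whose entries carry the `name` you set in the template and the extracted `value`. That layout, the field name `phone`, and the sample data are all assumptions for illustration, so compare them with your own output file before relying on them.

```python
import json


def extract_fields(parser_output):
    """Map each named object in the parser output to its extracted value."""
    fields = {}
    for obj in parser_output.get("objects", []):
        name = obj.get("name")
        if name is not None:
            fields[name] = obj.get("value")
    return fields


# Made-up sample in the assumed layout; not real PDF.co output.
sample_output = json.loads("""
{
  "objects": [
    {"name": "phone", "objectType": "field", "value": "555-0100"}
  ]
}
""")

fields = extract_fields(sample_output)
print(fields.get("phone"))
```

Returning a plain dictionary keyed by object name means the same helper keeps working if you later add more fields to the template.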
This tutorial showed you how to extract a phone number using the PDF.co Web API. You learned how to create a new template and obtain its template ID, and how to use the PDF.co Document Parser API with that ID to extract data from a PDF document.