How to Extract a Phone Number from a PDF Using the PDF.co Web API
The PDF.co Document Parser API is a cloud-based platform that offers tools to automatically extract data and process documents. It helps you extract organized data from both PDFs and scanned documents, streamline document workflows, and connect with other software through APIs.
PDF.co Document Parser is designed for businesses and individuals who want to automate document processing, extract data from large document volumes, and integrate with other software. It can be used for various purposes like processing invoices, archiving documents, entering data, and more. The platform provides tools such as Template Editor, Document Parser API, PDF Extractor API, and PDF Converter API to automate document processing workflows.
In this tutorial, we will guide you through the process of extracting phone numbers from PDF documents using the PDF.co Document Parser API. This API allows you to automate the extraction of structured data from PDF files, making it easier to retrieve specific information like phone numbers.
Getting Started
Here's an easy guide that explains how to extract phone numbers. Just follow these steps one by one.
Log in to Your PDF.co Account
To get started, please sign in to your PDF.co account. After signing in, go to the menu and select API Tools.
Create a New Document Parser Template
On the API Tools page, locate the Extract section and click Document Parser Templates.
On your Document Parser Templates page, click on the "Create New Template" button to create a template for the Document Parser.
To extract the phone number and create a template, follow these steps:
Step 1: Load File
Start by clicking on the "Load Test PDF or Image" button to load the file you want to extract data from.
Step 2: Add Object
Click on the "Add Object" button and choose the "Add FIELD from RECTANGLE selection" option.
Step 3: Set Properties
Drag the rectangle over the phone number area to specify the data you want to extract, then set the object's properties, including its name.
Step 4: Run Template
Once you have set up the template, click on the "Run Template" button to see the result.
Run Template Result
Once the template successfully extracts the phone number, you can choose to either directly download the file in your preferred output format or save the template for future use.
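Before relying on the extracted value, it can help to check that the text inside the rectangle actually looks like a phone number. The following is a minimal sketch of such a check, not part of PDF.co itself: the function name `looks_like_phone`, the loose pattern, and the 7 to 15 digit range are all assumptions chosen for illustration, and real phone formats may need stricter rules.

```python
import re

# Loose pattern (an assumption): an optional leading "+", then only digits,
# whitespace, parentheses, dots, and hyphens.
PHONE_CHARS = re.compile(r"^\+?[\d\s().-]+$")


def looks_like_phone(text, min_digits=7, max_digits=15):
    """Return True if text plausibly holds a phone number.

    The digit limits are illustrative defaults, not a PDF.co rule.
    """
    candidate = text.strip()
    if not PHONE_CHARS.match(candidate):
        return False
    digit_count = sum(ch.isdigit() for ch in candidate)
    return min_digits <= digit_count <= max_digits
```

A check like this can flag a rectangle that was drawn slightly off and picked up a label such as "Phone:" along with the number.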
Next, we will show how to use the saved template's Template ID to run the extraction through the PDF.co API.
Copy Template ID
To use the template through the API, copy the Template ID that was generated when you saved it, then click on the API Tools menu.
Set Up the Request Tester
Once you're on the API Tools page, find and click on the Request Tester section.
To set up the Request Tester for PDF.co, follow these steps:
Step 1: Select Endpoint
In the PDF.co API Endpoint field, search for and choose the /pdf/documentparser endpoint. This endpoint is used for extracting specific data from a PDF document. It allows you to select output formats like JSON, CSV, or XML.
Step 2: Set Input Parameters
In the Input parameters field, you have two options. You can either provide a URL link to the PDF document or upload the file directly as your input.
Step 3: Specify Desired Output
In the JSON code, specify the desired output format (e.g., JSON, CSV, or XML). Additionally, include the Template ID of the template you saved for extracting the phone number. This tells PDF.co which template to apply, so the phone number is included in the output.
After completing the configuration, you can simply click the "Run Request" button to send your request to PDF.co.
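The same request can also be sent from code instead of the Request Tester. The sketch below uses only the Python standard library and rests on several assumptions that should be confirmed against the PDF.co API documentation: the base URL `https://api.pdf.co/v1`, the `x-api-key` header, and the body parameter names `url`, `templateId`, and `outputFormat`. The helper names `build_parser_request` and `run_parser_request` are hypothetical.

```python
import json
import urllib.request

# Assumed full endpoint URL; verify it against the PDF.co API documentation.
API_URL = "https://api.pdf.co/v1/pdf/documentparser"


def build_parser_request(api_key, pdf_url, template_id, output_format="JSON"):
    """Build (but do not send) a POST request for the Document Parser endpoint."""
    # Parameter names below are assumptions based on the Request Tester fields.
    payload = {
        "url": pdf_url,
        "templateId": template_id,
        "outputFormat": output_format,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def run_parser_request(request):
    """Send the prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the body before any network call. To run it, pass your own API key, a PDF URL, and the Template ID you copied to `build_parser_request`, then hand the result to `run_parser_request`.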
Run Request Result
Congratulations! Your request has been successfully processed by the PDF.co API, and it has generated the file you requested. To view the output, click on the generated file.
JSON Output
Here is the phone number extracted from the PDF document, provided in JSON format. To save this output, simply click on the "Download as file" option.
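Once the JSON file is downloaded, the phone number can be read from it programmatically. The sketch below assumes the output contains a top-level `objects` list whose entries carry `name` and `value` keys; check this against your own downloaded file, since the layout is an assumption here. The function `find_field_value`, the field name `phoneNumber`, and the sample data are all illustrative and not real PDF.co output.

```python
import json


def find_field_value(parser_output, field_name):
    """Return the value of the first object whose name matches, else None."""
    # Assumes a top-level "objects" list with "name" and "value" keys.
    for obj in parser_output.get("objects", []):
        if obj.get("name") == field_name:
            return obj.get("value")
    return None


# Illustrative sample only; replace with the contents of your downloaded file.
sample_json = (
    '{"objects": [{"name": "phoneNumber", "objectType": "field", '
    '"value": "(555) 010-0199"}]}'
)
sample_output = json.loads(sample_json)
phone = find_field_value(sample_output, "phoneNumber")
```

The field name passed to `find_field_value` should match the object name you set in the template's properties.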
In this tutorial, you were shown how to extract phone numbers using the PDF.co Document Parser API. You learned the process of creating a new template and obtaining a template ID. Additionally, you gained knowledge on using the PDF.co Document Parser API to extract data from a PDF document.
By leveraging the Document Parser API, you can automate the extraction of phone numbers from PDF documents, saving time and effort in manual data extraction tasks.