How to Extract a Phone Number from a PDF Using the PDF.co Web API

PDF.co Document Parser API is a cloud-based platform that offers tools to automatically extract data and process documents. It helps you extract organized data from both PDF and scanned documents, streamline document workflows, and connect with other software through APIs.

PDF.co Document Parser is designed for businesses and individuals who want to automate document processing, extract data from large document volumes, and integrate with other software. It can be used for various purposes like processing invoices, archiving documents, entering data, and more. The platform provides tools such as Template Editor, Document Parser API, PDF Extractor API, and PDF Converter API to automate document processing workflows.

In this tutorial, we will guide you through the process of extracting phone numbers from PDF documents using the PDF.co Document Parser API. This API allows you to automate the extraction of structured data from PDF files, making it easier to retrieve specific information like phone numbers.

Getting Started

Here's an easy guide that explains how to extract phone numbers. Just follow these steps one by one.

Screenshot of sample PDF invoice

Log in to Your PDF.co Account

To get started, please sign in to your PDF.co account. After signing in, go to the menu and select API Tools.

Screenshot of PDF.co Dashboard

Create a New Document Parser Template

On the API Tools page, locate the Extract section and click Document Parser Templates.

Screenshot of API Tools Page

On your Document Parser Templates page, click on the "Create New Template" button to create a template for the Document Parser.

Screenshot of Document Parser Templates

To extract the phone number and create a template, follow these steps:

Step 1: Load File

Start by clicking on the "Load Test PDF or Image" button to load the file you want to extract data from.

Step 2: Add Object

Click on the "Add Object" button and choose the "Add FIELD from RECTANGLE selection" option.

Step 3: Set Properties

Drag the rectangle over the phone number area to mark the data you want to extract, then set the object's properties, such as its name.

Step 4: Run Template

Once you have set up the template, click on the "Run Template" button to see the result.

Screenshot of Document Parser Template Editor

Run Template Result

Once the template successfully extracts the phone number, you can choose to either directly download the file in your preferred output format or save the template for future use.

Screenshot of Extracted Phone Number

Next, we will show how to use the Template ID of the saved template to run the same extraction through the PDF.co API.

Copy Template ID

To reuse the template, copy the Template ID that was generated when you saved it, then click on the API Tools menu.

Screenshot of Template ID

Set Up the Request Tester

Once you're on the API Tools page, find and click on the Request Tester section.

Screenshot of Request Tester Tab

To set up the Request Tester for PDF.co, follow these steps:

Step 1: Select Endpoint

In the PDF.co API Endpoint field, search for and choose the /pdf/documentparser endpoint. This endpoint is used for extracting specific data from a PDF document. It allows you to select output formats like JSON, CSV, or XML.

Step 2: Set Input Parameters

In the Input parameters field, you have two options. You can either provide a URL link to the PDF document or upload the file directly as your input.

Step 3: Specify Desired Output

In the JSON code, specify the desired output format (e.g., JSON, CSV, or XML). Additionally, include the Template ID of the template you created for the phone number. This tells the API which template to apply to the document, so the phone number field is included in the output.
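Step 2 mentions uploading the file directly instead of providing a URL. Outside the Request Tester, that upload can be sketched in Python as shown here. This is only a sketch under stated assumptions: the base URL, the /file/upload/get-presigned-url endpoint, the x-api-key header, and the presignedUrl and url response keys are taken from PDF.co's public samples and should be checked against the current API documentation. The API key and file path are placeholders.

```python
import json
import os
import urllib.parse
import urllib.request

# Assumed base URL; confirm it in your PDF.co account documentation.
API_BASE = "https://api.pdf.co/v1"


def build_presign_request(api_key, file_path):
    """Build (but do not send) the request that asks for a temporary upload URL."""
    query = urllib.parse.urlencode({
        "name": os.path.basename(file_path),
        "contenttype": "application/octet-stream",
    })
    return urllib.request.Request(
        f"{API_BASE}/file/upload/get-presigned-url?{query}",
        headers={"x-api-key": api_key},
        method="GET",
    )


def upload_file(api_key, file_path):
    """Upload a local file and return the URL to use as parser input (needs network)."""
    with urllib.request.urlopen(build_presign_request(api_key, file_path)) as response:
        info = json.load(response)
    with open(file_path, "rb") as handle:
        put = urllib.request.Request(
            info["presignedUrl"],  # assumed response key
            data=handle.read(),
            headers={"Content-Type": "application/octet-stream"},
            method="PUT",
        )
    urllib.request.urlopen(put).close()
    return info["url"]  # assumed response key


# Placeholder values for illustration only; nothing is sent here.
presign = build_presign_request("YOUR_API_KEY", "invoices/sample-invoice.pdf")
print(presign.full_url)
```

The URL returned by upload_file would then take the place of a public link in the input parameters.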

Screenshot of Request Tester Page

After completing the configuration, you can simply click the "Run Request" button to send your request to PDF.co.
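The request configured above can also be sent from your own code. The following Python sketch builds the same kind of call to the /pdf/documentparser endpoint. It is an illustration under assumptions, not a definitive client: the base URL, the x-api-key header, and the url, templateId, outputFormat, and inline parameter names are assumed from PDF.co's public documentation, and the API key, PDF link, and Template ID are placeholders. Compare the payload with the JSON shown in the Request Tester before relying on it.

```python
import json
import urllib.request

# Assumed base URL; confirm it in your PDF.co account documentation.
API_BASE = "https://api.pdf.co/v1"


def build_parser_request(api_key, pdf_url, template_id, output_format="JSON"):
    """Build (but do not send) a POST request for /pdf/documentparser."""
    payload = {
        "url": pdf_url,                 # link to the source PDF
        "templateId": template_id,      # ID copied from the saved template
        "outputFormat": output_format,  # JSON, CSV, or XML
        "inline": True,                 # assumed flag: return the result in the response
    }
    return urllib.request.Request(
        f"{API_BASE}/pdf/documentparser",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder values for illustration only; nothing is sent here.
request = build_parser_request(
    "YOUR_API_KEY", "https://example.com/sample-invoice.pdf", "1234"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Calling send(request) with a real API key, PDF link, and Template ID would perform the same action as the "Run Request" button.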

Run Request Result

Congratulations! Your request has been successfully processed by the PDF.co API, and it has generated the file you requested. To view the output, click on the generated file.

Screenshot of PDF.co Request Result

JSON Output

Here is the phone number extracted from the PDF document, provided in JSON format. To save it, click on the "Download as file" option.

Screenshot of Extracted Data in JSON Format
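Once you have the JSON output, reading the phone number in code takes only a few lines. The sketch below assumes the output contains an "objects" list whose entries carry "name" and "value" keys. That shape, the field name phoneNumber, and the sample value are hypothetical, so adjust them to match your own downloaded file and the object name you set in the template.

```python
import json

# Hypothetical sample shaped like a Document Parser result; the real output
# may contain different or additional keys.
sample_output = """
{
  "objects": [
    {"name": "phoneNumber", "objectType": "field", "value": "(555) 010-0199"}
  ]
}
"""


def extract_field(parser_output, field_name):
    """Return the value of the first object with the given name, or None."""
    for obj in parser_output.get("objects", []):
        if obj.get("name") == field_name:
            return obj.get("value")
    return None


result = json.loads(sample_output)
phone = extract_field(result, "phoneNumber")
print(phone)
```

Returning None for a missing field lets the calling code decide how to handle a document where the template did not match.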

In this tutorial, you learned how to extract phone numbers using the PDF.co Document Parser API: creating a new template, obtaining its Template ID, and calling the API with that ID to extract data from a PDF document.

By leveraging the Document Parser API, you can automate the extraction of phone numbers from PDF documents, saving time and effort in manual data extraction tasks.