How to Extract a Phone Number from a PDF Using the PDF.co Web API

PDF.co Document Parser is a cloud-based platform that provides tools for automated data extraction and document processing. It allows you to extract structured data from PDF and scanned documents, automate document workflows, and integrate with other software applications using APIs.

PDF.co Document Parser is designed for businesses and individuals who need to extract data from large volumes of documents. It can be used for a wide range of applications, such as invoice processing, document archiving, and data entry. The Document Parser platform includes several tools for automating these workflows: Template Editor, Document Parser API, PDF Extractor API, and PDF Converter API.

Here is a sample tutorial on extracting phone numbers using PDF.co Web API.

  1. Open PDF.co Account
  2. Document Parser Page
  3. Create New Template
  4. Run Template Result
  5. Created Template ID
  6. Request Tester Menu
  7. Request Tester Tool
  8. Add Template ID
  9. Run Request Result
  10. Extracted JSON Output

We will use this sample PDF document and extract the phone number from it using the PDF.co Document Parser.

Sample PDF Document

Here’s a simple step-by-step tutorial to extract phone numbers.

Step 1: Open PDF.co Account

  • Log into your PDF.co account and click the Document Parser menu.

Open PDF.co Account

Step 2: Document Parser Page

  • On your Document Parser page, click on the New Template button.

Create New Template

Step 3: Create New Template

Let’s extract the phone number and create a template ID.

  • First, click on the Load Test PDF or Image button to load the source file.
  • Next, click the Add Object button and select Add FIELD from the RECTANGLE selection.
  • Then, drag the rectangle over the phone number to mark the data to extract, and set the object's name property.
  • Once the template is all set up, click the Run Template button to see the result.

Template Editor

Step 4: Run Template Result

  • Once the template runs successfully and returns the extracted phone number, click the Save Template and Return button.

Run Template Result

Step 5: Created Template ID

  • Here’s the ID of the template that extracts the phone number.

Created Template ID

Now, we will use this template ID to request the extracted data from PDF.co.

Step 6: Request Tester Menu

  • Let’s go back to the PDF.co dashboard and click on the Request Tester menu.

Log into PDF.co Account

Step 7: Request Tester Tool

  • For the Choose PDF.co API Endpoint field, select /pdf/documentparser (Output as JSON). This API endpoint extracts data from documents based on a Document Parser extraction template. With this API method, you can extract data from custom areas, form fields, tables, and multiple pages, or by searching for text. You can also choose XML or CSV as the output format.
  • For the Input parameters field, you can override the URL parameter with a link to your own document or supply a file as input.

Request Tester Tool

Step 8: Add Template ID

  • Add the template ID from Step 5 to the input parameters.

Note: The default URL will automatically be replaced once you’ve added your source file.

Add Template ID

After adding the template ID, click on the Run Request button to send a request to PDF.co.
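
For readers who prefer to script this step, the request that the Request Tester sends can be approximated in Python. This is a minimal sketch, not PDF.co's official sample: the full endpoint URL, the x-api-key header, and the parameter names url, templateId, and outputFormat are assumptions to check against the PDF.co API documentation, and the helper names build_parser_request and send are hypothetical.

```python
import json
import urllib.request

# Assumed full URL for the /pdf/documentparser endpoint; verify in the PDF.co docs.
API_URL = "https://api.pdf.co/v1/pdf/documentparser"


def build_parser_request(api_key, file_url, template_id):
    """Build (but do not send) a POST request for the Document Parser endpoint."""
    payload = {
        "url": file_url,            # link to the source PDF (assumed parameter name)
        "templateId": template_id,  # template ID from Step 5 (assumed parameter name)
        "outputFormat": "JSON",     # assumed parameter name; XML and CSV are the alternatives
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request over the network and decode the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling build_parser_request("YOUR_API_KEY", "https://example.com/sample.pdf", "YOUR_TEMPLATE_ID") returns a request object that can be inspected before it is passed to send; all three arguments are placeholders.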

Step 9: Run Request Result

  • Great! PDF.co successfully processed our request and returned an output URL. Click the resulting URL to view the output, or download the output file directly.
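
In a script, the same check could look like the sketch below. It assumes the decoded reply is a dictionary with an error flag, an optional message, and a url field pointing at the output file. Those field names are assumptions about the reply format, and output_url is a hypothetical helper.

```python
def output_url(reply):
    """Return the output URL from a decoded API reply, or raise if the call failed."""
    if reply.get("error"):
        # Assumed: failed calls set "error" to true and describe the problem in "message".
        raise RuntimeError(reply.get("message", "PDF.co reported an error"))
    url = reply.get("url")
    if not url:
        raise ValueError("reply has no 'url' field")
    return url


# Made-up reply for illustration only; a real reply may carry more fields.
sample_reply = {"error": False, "url": "https://example.com/result.json"}
print(output_url(sample_reply))
```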

Run Request Result

Step 10: Extracted JSON Output

  • Here’s the extracted phone number from a PDF document in JSON format.
Extracted JSON Output
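
Once the JSON output is downloaded, the phone number can be read from it in code. The sketch below assumes the output holds an "objects" list whose entries carry "name" and "value" keys, and that the field was named "phone" in the template. The sample data is invented for illustration and field_value is a hypothetical helper, so compare it with your own output before relying on it.

```python
import json

# Invented sample; the layout, the field name "phone", and the number are placeholders.
SAMPLE_OUTPUT = """
{
  "objects": [
    {"name": "phone", "objectType": "field", "value": "555-0100"}
  ]
}
"""


def field_value(parser_output, field_name):
    """Return the value of the first extracted object with the given name, or None."""
    for obj in parser_output.get("objects", []):
        if obj.get("name") == field_name:
            return obj.get("value")
    return None


result = json.loads(SAMPLE_OUTPUT)
print(field_value(result, "phone"))
```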

This tutorial taught you how to extract phone numbers using PDF.co Web API. You learned how to create a new template and a template ID. You also learned how to use the PDF.co Document Parser API to extract data from a PDF document.