In this tutorial, we will show you how to extract tables from a PDF in JavaScript using the PDF.co Web API. The image below shows the source PDF invoice and the extracted table with text output in JSON format.

Here are the main steps for this simple tutorial:

  1. Source Code and Template
  2. Install Requests Module
  3. Insert API Key
  4. Source and Destination File
  5. Add Template
  6. Run JavaScript Program
  7. Extract Table with Text Demo
  8. How to Extract a Table from a PDF – Video Guide
  9. Extract Specific Data from Invoices – Use Cases

 

Source PDF and Extracted Table with Text Output

Step 1: Source Code and Template

To begin extracting tables from a PDF, open Visual Studio Code or your preferred editor and save the source code and template files shown below.

JavaScript File Folder

Step 2: Install Requests Module

To install the requests module, run npm install requests in your terminal. The program uses this module to upload the source file.

Step 3: Insert API Key

On line 12, insert your API key between the double quotes. You can get the API key from your PDF.co dashboard.

PDF.co API Key
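As a minimal sketch of how the key from this step might reach the API, the helper below builds the request headers. It assumes the key is sent in an x-api-key header, which is worth confirming against the PDF.co documentation. The name buildHeaders and the PDFCO_API_KEY environment variable are hypothetical and are not part of the tutorial's source code.

```javascript
// Hypothetical helper (not from the tutorial's source): builds the HTTP
// headers for a PDF.co request. Assumes the key goes in "x-api-key".
function buildHeaders(apiKey) {
  if (typeof apiKey !== "string" || apiKey.trim() === "") {
    throw new Error("Missing API key: paste it between the double quotes");
  }
  return {
    "x-api-key": apiKey.trim(),
    "Content-Type": "application/json",
  };
}

// Usage: either paste the key as a string literal, or read it from an
// environment variable (PDFCO_API_KEY is an assumed name) so that the key
// stays out of version control.
//   const headers = buildHeaders(process.env.PDFCO_API_KEY);
```

Failing early on an empty key tends to give a clearer message than an authentication error returned by the server.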

Step 4: Source and Destination File

On line 15, set the path to your source PDF file, and on line 19, type the desired output file name. Besides JSON, you can also extract tables with text in CSV and XML formats.

Source and Destination File
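Since the output can be JSON, CSV, or XML, the destination file name should match the chosen format. The sketch below derives it from the source file name. The helper destinationFor and the file name sample.pdf are hypothetical and are used only for illustration.

```javascript
const path = require("node:path");

// Output formats mentioned in this step, mapped to file extensions.
const EXTENSIONS = { JSON: ".json", CSV: ".csv", XML: ".xml" };

// Hypothetical helper: derives the destination file name from the source
// PDF path and the requested output format (case-insensitive).
function destinationFor(sourceFile, outputFormat = "JSON") {
  const ext = EXTENSIONS[outputFormat.toUpperCase()];
  if (!ext) {
    throw new Error(`Unsupported output format: ${outputFormat}`);
  }
  // Strip the original extension (e.g. ".pdf") and append the new one.
  const base = path.basename(sourceFile, path.extname(sourceFile));
  return path.join(path.dirname(sourceFile), base + ext);
}

const sourceFile = "./sample.pdf"; // hypothetical source PDF
const destinationFile = destinationFor(sourceFile, "CSV");
console.log(destinationFile);
```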

Step 5: Add Template

On line 96, add the template file name. The Document Parser supports both JSON and YAML template formats.

For more details about Document Parser templates, see the Document Parser template documentation on PDF.co.

Add Template
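To illustrate how a template file might be picked up by the program, the sketch below reads it from disk, works out from the extension whether it is JSON or YAML, and places its text into a request body. The helper names are hypothetical, and the body fields url, template, and outputFormat are assumptions about the Document Parser endpoint that should be verified against the API documentation.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: reads a Document Parser template from disk and
// reports which of the two supported formats it appears to be.
function loadTemplate(templatePath) {
  const ext = path.extname(templatePath).toLowerCase();
  const text = fs.readFileSync(templatePath, "utf8");
  if (ext === ".json") {
    JSON.parse(text); // throws early if the JSON template is malformed
    return { format: "JSON", text };
  }
  if (ext === ".yml" || ext === ".yaml") {
    // Not validated here: Node.js has no built-in YAML parser.
    return { format: "YAML", text };
  }
  throw new Error(`Unrecognized template extension: ${ext}`);
}

// Assumed request body shape; the field names are not confirmed by this
// tutorial and may differ in the actual API.
function buildParserPayload(uploadedFileUrl, template, outputFormat = "JSON") {
  return { url: uploadedFileUrl, template: template.text, outputFormat };
}
```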

Step 6: Run JavaScript Program

To run the program, type node file.js in the terminal, where file.js is the name of your JavaScript file.
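What happens when the program runs can be roughly sketched as a single POST request to the Document Parser. This sketch uses the fetch function built into Node.js 18 and later instead of the requests module used by the tutorial's source code. The endpoint URL, the x-api-key header, and the error and message response fields are assumptions to check against the PDF.co API documentation, and parseDocument is a hypothetical name.

```javascript
// Assumed endpoint; verify against the PDF.co API documentation.
const PARSER_ENDPOINT = "https://api.pdf.co/v1/pdf/documentparser";

// Hypothetical helper: sends the parse request and returns the decoded
// JSON response. fetchImpl is injectable so the function can be exercised
// without network access.
async function parseDocument(payload, apiKey, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(PARSER_ENDPOINT, {
    method: "POST",
    headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  const result = await response.json();
  // Treat both HTTP failures and an API-level error flag as failures.
  if (!response.ok || result.error) {
    throw new Error(result.message || `Request failed with status ${response.status}`);
  }
  return result;
}

// Usage (not executed here because it needs a real key and file URL):
//   parseDocument({ url: "...", outputFormat: "JSON" }, "YOUR_API_KEY")
//     .then((result) => console.log(result))
//     .catch((err) => console.error(err.message));
```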

Step 7: Extract Table with Text Demo

Here’s a quick demo of extracting a table with text from the PDF.

Extract tables with Text Demo

In this tutorial, you learned how to extract a table with text from a PDF in JavaScript using the PDF.co Web API. You learned how to set up the JavaScript source code so that the program runs successfully, and how the PDF.co Document Parser parses the table with text from the PDF.

How to Extract a Table from a PDF – Video Guide


Extract Specific Data from Invoices – Use Cases

Here are a few use cases for the extraction of specific data from invoices:

Accounts Payable Automation

By extracting data such as invoice numbers, dates, vendor names, and amounts from scanned invoices, companies can streamline the process of paying bills, reducing errors and saving time.

Expense Tracking

Extracting data from receipts and invoices for business expenses helps individuals keep track of their spending and ensure that expenses are recorded accurately for tax and accounting purposes.

Compliance

Many industries have strict regulations around financial reporting and record-keeping. By extracting data from scanned invoices, companies can ensure that they are complying with these regulations and avoid fines and penalties.

Business Intelligence

By extracting data from scanned invoices and other financial documents, companies can gain insights into their business performance. For example, they may be able to identify trends in spending, compare the profitability of different products or services, or assess the financial health of their business.
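As one illustration of the spending analysis described above, the sketch below totals invoice amounts per vendor. The record shape (vendor and amount fields) is hypothetical and does not reflect the actual Document Parser output, which would first need to be mapped into this form. Amounts are assumed to use a period as the decimal separator.

```javascript
// Converts an amount such as "$1,250.50" to a number. Assumes a period as
// the decimal separator; currency symbols and thousands commas are dropped.
function parseAmount(raw) {
  const cleaned = String(raw).replace(/[^0-9.-]/g, "");
  const value = Number.parseFloat(cleaned);
  if (Number.isNaN(value)) {
    throw new Error(`Cannot parse amount: ${raw}`);
  }
  return value;
}

// Sums amounts per vendor. Each invoice is a hypothetical record of the
// form { vendor: string, amount: string | number }.
function totalsByVendor(invoices) {
  const totals = new Map();
  for (const { vendor, amount } of invoices) {
    totals.set(vendor, (totals.get(vendor) || 0) + parseAmount(amount));
  }
  return totals;
}

// Made-up sample records for illustration only.
const sampleInvoices = [
  { vendor: "Acme", amount: "$1,250.50" },
  { vendor: "Acme", amount: "49.50" },
  { vendor: "Globex", amount: 10 },
];
console.log(totalsByVendor(sampleInvoices));
```

For real accounting work, storing amounts as integer cents would avoid floating-point rounding; plain numbers are used here to keep the sketch short.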