Read Table Data from PDF in Python using PDF.co Web API

Sep 2, 2024·3 Minutes Read

In this demonstration, we will show you how to read table data from a PDF in Python. We use the /v1/pdf/documentparser endpoint, with the output returned as JSON.

Step 1: Add Files in Folder

First, add your files to the Python program folder. You can download our sample file and Python code from our documentation.

Step 2: Install Requests Module

Next, install the requests module. Type python -m pip install requests in your command line and press Enter.
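To confirm that the installation worked before running the sample, a quick check like the one below can help. It is only a sketch: the helper name is_installed is made up for illustration and is not part of the PDF.co sample code.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the named top-level module can be found by Python."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("requests"):
    print("requests is available")
else:
    print("requests is missing; run: python -m pip install requests")
```

Run it with the same Python interpreter that will run the sample, since each interpreter has its own set of installed packages.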

Step 3: Add your API Key

Now, open the Python sample code and go to line 6. Between the double quotes, enter your PDF.co API Key. You can find your API Key in your PDF.co dashboard.

PDF.co API Key
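If you would rather not keep the key in the source file, one option is to read it from an environment variable. The sketch below assumes a variable named PDFCO_API_KEY, which is our own choice and not something the sample code defines. The key is passed to the API in the x-api-key request header.

```python
import os


def load_api_key(env_var: str = "PDFCO_API_KEY") -> str:
    """Read the API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your PDF.co API Key")
    return key


def auth_headers(api_key: str) -> dict:
    """Build the request header that carries the API key."""
    return {"x-api-key": api_key}
```

With this in place, the sample's API key line could become API_KEY = load_api_key(), so the key never ends up in version control.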

Step 4: Source File and Destination

In lines 12 and 15, add the path to your source PDF file and the name of the JSON output file. You can also use other output formats such as XML, CSV, and JSON (custom template code).

Source File and Output
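If you switch between output formats, it can be handy to derive the output filename from the source filename. The following is a minimal sketch, not part of the sample code: the function name destination_for and the format-to-extension mapping are assumptions made for illustration.

```python
from pathlib import Path

# Assumed mapping from output format name to file extension.
EXTENSIONS = {"JSON": ".json", "XML": ".xml", "CSV": ".csv"}


def destination_for(source_file: str, output_format: str = "JSON") -> Path:
    """Return the output path: same name as the source, new extension."""
    fmt = output_format.upper()
    if fmt not in EXTENSIONS:
        raise ValueError(f"Unsupported output format: {output_format}")
    return Path(source_file).with_suffix(EXTENSIONS[fmt])


print(destination_for("sample.pdf"))         # JSON output next to the source
print(destination_for("sample.pdf", "csv"))  # same name with a .csv extension
```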

Step 5: Add Template File

In line 20, enter the name of your template file. If you do not have a template yet, use our Document Parser Template Editor to create a new one. Check out this tutorial on how to create a new template.

Add Template File
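To show how the API key, source file, and template fit together, here is a rough sketch of the request the program sends. It is not the downloadable sample itself. The endpoint URL and the form field names url, template, and outputFormat are assumptions to verify against the PDF.co API reference, and the sketch also assumes the PDF is already reachable at a URL.

```python
from pathlib import Path

# Assumed full URL of the Document Parser endpoint.
PARSER_URL = "https://api.pdf.co/v1/pdf/documentparser"


def build_parser_request(api_key, file_url, template_text, output_format="JSON"):
    """Assemble the URL, headers, and form data for a Document Parser call."""
    return {
        "url": PARSER_URL,
        "headers": {"x-api-key": api_key},
        # Field names below are assumptions; check them in the API reference.
        "data": {
            "url": file_url,
            "template": template_text,
            "outputFormat": output_format,
        },
    }


def parse_document(api_key, file_url, template_file, output_format="JSON"):
    """Read the template file, send the request, and return the JSON response."""
    import requests  # imported here so the builder above works without it

    template_text = Path(template_file).read_text(encoding="utf-8")
    req = build_parser_request(api_key, file_url, template_text, output_format)
    response = requests.post(
        req["url"], headers=req["headers"], data=req["data"], timeout=60
    )
    response.raise_for_status()  # raise on HTTP errors such as 401 or 500
    return response.json()
```

Keeping the request assembly in its own function makes it easy to inspect what will be sent before any network call is made.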

Step 6: Run Program

Now, let’s run the program and check the folder to view the output.

Python Program Output
Output in JSON Format
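Once the JSON file is written, you will usually want to pull the table rows out of it. The sketch below assumes the output contains a top-level objects list in which table entries have objectType set to "table" and a rows list. That layout, the helper name table_rows, and the sample data are assumptions for illustration, so compare them with your actual output file first.

```python
import json


def table_rows(parsed: dict) -> list:
    """Collect the rows of every table object in the parsed output."""
    rows = []
    for obj in parsed.get("objects", []):
        if obj.get("objectType") == "table":
            rows.extend(obj.get("rows", []))
    return rows


# Hypothetical, hand-written sample; real output depends on your template.
sample = json.loads("""
{
  "objects": [
    {"name": "invoiceId", "objectType": "field", "value": "123"},
    {"name": "items", "objectType": "table", "rows": [
      {"column1": {"value": "Widget"}, "column2": {"value": "2"}},
      {"column1": {"value": "Gadget"}, "column2": {"value": "5"}}
    ]}
  ]
}
""")

for row in table_rows(sample):
    print(row)
```

To use it on the real result, replace the sample with json.load on the output file written by the program.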

In this tutorial, you learned how to read table data from a PDF in Python using the PDF.co Document Parser Web API. Along the way, you installed the requests module and used the Document Parser Template Editor to create a new template.
