Read Table Data from PDF in Python using PDF.co Web API
In this demonstration, we will show you how to read table data from a PDF in Python. We use the /v1/pdf/documentparser endpoint (output as JSON).
Step 1: Add Files in Folder
First, add your files to the folder that holds the Python program. You can download our sample file and Python code from our documentation.
Step 2: Install Requests Module
Next, install the requests module. Type python -m pip install requests in your command line and press Enter to install it with pip.
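To confirm the installation worked before running the sample, a small check like the one below can help. It is a minimal sketch that uses only the standard library; the helper name is_installed is made up for this illustration.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("requests"):
        print("requests is available")
    else:
        print("requests is missing; run: python -m pip install requests")
```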
Step 3: Add your API Key
Now, open the Python sample code and go to line 6. Between the double quotes, enter your PDF.co API Key. You can find your API Key in your PDF.co dashboard.
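Pasting the key into the source file is fine for a quick test. As an alternative that keeps the key out of the code, the sketch below reads it from an environment variable. The variable name PDFCO_API_KEY and the helper load_api_key are assumptions chosen for this example, not names from the sample code.

```python
import os


def load_api_key(env_var: str = "PDFCO_API_KEY") -> str:
    """Read the API key from an environment variable (the name is hypothetical)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your PDF.co API key"
        )
    return key
```

With this in place, the key assignment in the sample could become API_KEY = load_api_key().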
Step 4: Source File and Destination
In lines 12 and 15, enter the name of your source PDF file and the name of the JSON output file. Other output formats are also available, such as XML, CSV, and JSON (custom template code).
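If you switch between output formats, the destination filename should change with them. The following sketch derives it from the source filename. The extension mapping and the helper destination_for are illustrative assumptions and are not part of the sample code.

```python
from pathlib import Path

# File extensions for the output formats mentioned above (illustrative mapping).
EXTENSIONS = {"JSON": ".json", "XML": ".xml", "CSV": ".csv"}


def destination_for(source_file: str, output_format: str = "JSON") -> Path:
    """Return the source path with its suffix swapped for the output format's."""
    fmt = output_format.upper()
    if fmt not in EXTENSIONS:
        raise ValueError(f"Unsupported output format: {output_format}")
    return Path(source_file).with_suffix(EXTENSIONS[fmt])
```

For example, destination_for("sample.pdf", "CSV") gives a path named sample.csv next to the source file.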
Step 5: Add Template File
In line 20, enter the name of your template file. If you do not have a template yet, use our Document Parser Template Editor to create one. Check out this tutorial on how to create a new template.
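To show roughly how the API key, source file, and template fit together in one request, here is a hedged sketch. The host in PARSER_URL, the x-api-key header, and the form fields url, template, and outputFormat are assumptions about the API and should be checked against the PDF.co API reference. The sketch also assumes the PDF is already reachable at a URL; uploading a local file is a separate step that it does not cover.

```python
from pathlib import Path

# Assumed full endpoint URL; this tutorial only names the /v1/pdf/... path.
PARSER_URL = "https://api.pdf.co/v1/pdf/documentparser"


def build_parser_request(api_key, file_url, template_text, output_format="JSON"):
    """Return (headers, form data) for a parser call. Field names are assumptions."""
    headers = {"x-api-key": api_key}
    data = {
        "url": file_url,            # where the service can fetch the PDF
        "template": template_text,  # full text of the template file
        "outputFormat": output_format,
    }
    return headers, data


def parse_document(api_key, file_url, template_file):
    """Send the request and return the decoded JSON reply."""
    import requests  # third-party; installed in Step 2

    template_text = Path(template_file).read_text(encoding="utf-8")
    headers, data = build_parser_request(api_key, file_url, template_text)
    response = requests.post(PARSER_URL, headers=headers, data=data, timeout=60)
    response.raise_for_status()
    return response.json()
```

Keeping the request construction in its own function makes it easy to inspect what would be sent without calling the service.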
Step 6: Run Program
Now, let’s run the program and check the folder to view the output.
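Once the output file exists, the table rows can be pulled out of the JSON in code. The exact shape of the output is not described in this tutorial, so the sketch below makes one assumption: table rows are stored as a list under a key named rows somewhere in the document. The sample data is invented purely to exercise the function.

```python
import json


def collect_rows(node, rows_key="rows"):
    """Recursively gather the items of every list stored under rows_key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == rows_key and isinstance(value, list):
                found.extend(value)
            else:
                found.extend(collect_rows(value, rows_key))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_rows(item, rows_key))
    return found


# Hypothetical output, invented for illustration only.
sample = json.loads(
    '{"objects": [{"name": "table", "rows": '
    '[{"item": "A", "qty": "2"}, {"item": "B", "qty": "5"}]}]}'
)
rows = collect_rows(sample)
for row in rows:
    print(row)
```

To run it on the real output, load the destination file with json.load and pass the result to collect_rows. If your output uses a different key, pass it as rows_key.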
In this tutorial, you learned how to read table data from a PDF in Python using the PDF.co Document Parser Web API. You also learned how to install the requests module and how to create a new template with the Document Parser Template Editor.