How to Extract Text from PDF and Paste in Excel using Python and PDF.co Web API

Jan 10, 2025·3 Minutes Read

In this tutorial, we will walk you through extracting text from a PDF document and saving it to an Excel file using Python and the PDF.co Web API. The demonstration uses a sample PDF document and converts it into an Excel file, step by step.

IN THIS TUTORIAL

Install the requests Library

Access the Source Code

Configure the Python Code

Save Python Program

Run the Program

View the Extracted Data in Excel Format

Step 1: Install the requests Library

Before we begin, make sure that the requests library is installed in your Python environment. This library is essential for making HTTP requests to the PDF.co Web API.

Open your terminal or command-line interface (CLI).
Run the following command to install the requests library:

    python -m pip install requests
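
To confirm that the installation worked, a quick check from Python can help. The snippet below is a minimal sketch that uses only the standard library; the helper name is_installed is our own and is not part of requests or PDF.co.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the import system can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("requests"):
        print("requests is available")
    else:
        print("requests is missing; run: python -m pip install requests")
```

If the check reports that requests is missing, make sure the pip command ran with the same Python interpreter that runs your script.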

Step 2: Access the Source Code

Next, prepare the Python script that will handle the conversion of the PDF file into Excel format.

Copy the sample Python code from the link provided.
Paste the code into your preferred Python code editor, such as Visual Studio Code, PyCharm, or any other Python-compatible editor.
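
To give a rough idea of the shape such a script's request might take, here is a minimal sketch. It is not the official PDF.co sample. The base URL, the endpoint path, the x-api-key header, and the url, name, and async payload fields are assumptions that should be checked against the current PDF.co API reference, and both function names are hypothetical. The sketch also assumes the PDF is already reachable at a URL; uploading a local file is not shown.

```python
from typing import Any, Dict, Tuple

# Assumed base URL and endpoint path; confirm both in the PDF.co API reference.
BASE_URL = "https://api.pdf.co/v1"
CONVERT_ENDPOINT = "/pdf/convert/to/xlsx"


def build_convert_request(
    api_key: str, pdf_url: str, output_name: str, use_async: bool = False
) -> Tuple[str, Dict[str, str], Dict[str, Any]]:
    """Return (endpoint URL, headers, JSON payload) for a PDF-to-Excel request."""
    headers = {"x-api-key": api_key}  # assumed authentication header
    payload = {
        "url": pdf_url,       # where the service can fetch the source PDF
        "name": output_name,  # file name for the generated spreadsheet
        "async": use_async,   # True asks the service to run the job in the background
    }
    return BASE_URL + CONVERT_ENDPOINT, headers, payload


def convert(api_key: str, pdf_url: str, output_name: str) -> Dict[str, Any]:
    """Send the conversion request and return the decoded JSON response."""
    import requests  # third-party; imported here so the builder above has no dependencies

    endpoint, headers, payload = build_convert_request(api_key, pdf_url, output_name)
    response = requests.post(endpoint, headers=headers, json=payload, timeout=60)
    response.raise_for_status()  # raise on HTTP 4xx/5xx responses
    return response.json()
```

Keeping the request builder separate from the network call makes it easy to inspect what would be sent before spending any API credits.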

Step 3: Configure the Python Code

With the sample code in hand, let’s configure it to suit your specific settings.

API Key:

Obtain your API Key from your PDF.co dashboard.
Insert your API key into the designated section in the Python script.

Source File:

Provide the name of the PDF file whose data you want to extract and convert into an Excel file.

Output Excel Name:

Specify the name of the output Excel file where the extracted data will be saved.

Asynchronous Mode:

We recommend using Asynchronous Mode, especially for large documents. In this mode, the conversion runs in the background as a job, so the API call returns without waiting for the conversion to finish. The program can then continue executing and check on the job until the result is ready.
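
A background job is usually paired with a polling loop that waits for it to complete. The following is a minimal sketch of that pattern, not PDF.co's own code. The function wait_for_job and the check_status callable are hypothetical names, and the "working" status string is an assumption; substitute whatever your job-status call actually returns.

```python
import time
from typing import Callable


def wait_for_job(
    check_status: Callable[[], str],
    poll_seconds: float = 3.0,
    max_attempts: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call check_status until it returns something other than "working".

    Returns the final status string, or raises TimeoutError once
    max_attempts polls have all reported "working".
    """
    for _ in range(max_attempts):
        status = check_status()
        if status != "working":  # assumed in-progress marker
            return status
        sleep(poll_seconds)  # pause before asking again
    raise TimeoutError("job did not finish within the polling limit")
```

In a real script, check_status would wrap the API's job-status request. Passing sleep as a parameter keeps the loop easy to exercise without real delays.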

Step 4: Save Python Program

Once you’ve updated the script with your settings, save the Python program to your preferred directory.

Step 5: Run the Program

Now it’s time to run the program.

Execute the Python script from your terminal, for example by passing the script’s file name to the python command. If everything is set up correctly, the script will start the extraction process and produce an Excel file containing the extracted data.
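
Before running, it can help to catch configuration slips early. The sketch below checks the three settings from Step 3. The preflight function is a hypothetical helper of our own, and treating a key that starts with "YOUR_" as a placeholder is an assumption about how the sample marks an unset key.

```python
from pathlib import Path
from typing import List


def preflight(api_key: str, source_file: str, output_file: str) -> List[str]:
    """Return a list of problems found; an empty list means the settings look usable."""
    problems = []
    # Assumed placeholder convention: an unset key is empty or starts with "YOUR_".
    if not api_key.strip() or api_key.startswith("YOUR_"):
        problems.append("API key is empty or still a placeholder")
    if not Path(source_file).is_file():
        problems.append(f"source file not found: {source_file}")
    elif Path(source_file).suffix.lower() != ".pdf":
        problems.append(f"source file does not end in .pdf: {source_file}")
    if Path(output_file).suffix.lower() not in {".xls", ".xlsx"}:
        problems.append(f"output name should end in .xls or .xlsx: {output_file}")
    return problems
```

Calling preflight with your own values and printing any returned messages before the conversion starts avoids a failed API call over a simple typo.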

Step 6: View the Extracted Data in Excel Format

Once the script has finished running, you can access the output Excel file.

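Navigate to the directory where the Python script is saved. If you gave the output as a plain file name, the file is typically written to the directory the script was run from.
The extracted data from your PDF document will now be available there in Excel format.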
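
If you would like to confirm that the output really is an Excel file before opening it, its first bytes can be inspected. This is a minimal sketch using only the standard library; detect_excel_format is a hypothetical helper name. It identifies only the container type (ZIP for .xlsx, OLE2 compound file for .xls), so it cannot prove that the contents are a valid workbook.

```python
from pathlib import Path

# Leading bytes of the two common Excel container formats.
XLSX_MAGIC = b"PK\x03\x04"                       # ZIP archive, used by .xlsx
XLS_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # OLE2 compound file, used by .xls


def detect_excel_format(path: str) -> str:
    """Return "xlsx", "xls", or "unknown" based on the file's first bytes."""
    with Path(path).open("rb") as handle:
        header = handle.read(8)
    if header.startswith(XLSX_MAGIC):
        return "xlsx"
    if header.startswith(XLS_MAGIC):
        return "xls"
    return "unknown"
```

A result of "unknown" often means the saved file is an error message rather than a spreadsheet, which is worth checking by opening it in a text editor.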

In this tutorial, you learned how to extract the contents of a PDF and save them to an Excel file using Python and the PDF.co Web API. You also saw how to get started right away by configuring and running the Python sample code.