PDF to Excel API Benefits

Extract PDF to Excel
Extract Structured Data from PDF

The PDF to Excel API engine analyzes input PDF documents and recreates the original layout of tables and text objects. Compared to other PDF to Excel converter tools, it lets you extract data from PDF not only to Excel but also to CSV, JSON, and XML formats.

Supports scanned and damaged text

Our built-in OCR (Optical Character Recognition) supports PDF files with mixed content and multiple languages, so PDF.co can easily convert scanned and damaged text inside your PDF.

PDF to Excel API Supports Multiple Languages

The PDF.co platform can extract PDF data to Excel, CSV, JSON, and XML from programming languages such as PHP, JavaScript, .NET and ASP.NET, C#, Java, Visual Basic, and many others. Find source code samples in our API documentation.

Business Automation Platform Integrations

If you are not a developer, you can also easily automate your PDF operations via popular business automation platforms: Zapier, Make, Airtable, Salesforce, Google Apps Script, and 300+ more.

Enterprise Solutions

For enterprise customers, there is a Dedicated API Server that runs as a dedicated private server with dedicated private cloud storage in the hosting region of your choice.

PDF to Excel API – Sample & Demo

Here’s the workflow for data extraction from PDF to Excel. For this demo, I am going to use a Sample PDF File.

Screenshot of Source File

We’ll use the code snippets below, written in different programming languages, to convert the sample PDF file above into Excel.

After you extract data from PDF to Excel, the final result will look like this.

Screenshot of XLS Output

Before we proceed with the code, let us first review the /v1/pdf/convert/to/xls parameters and their uses.

Endpoint

URL: https://api.pdf.co/v1/pdf/convert/to/xls
Method: POST
Parameter: Description
url: required. Link to the source file.
lang: optional. English by default. Sets the OCR (image-to-text extraction) language used when a scanned document is detected or the input is a PNG or JPG image. Other supported values: eng, spa, deu, fra, jpn, chi_sim, chi_tra, kor. You can also specify two languages to be used on the same page, for example, eng+deu, jpn+kor, or other combinations.
inline: optional. Must be one of: true to return data inline, or false to return a link to the output file (default).
unwrap: optional. Unwraps lines to a single line within table cells when lineGrouping is enabled. Must be one of: true or false.
pages: optional. Comma-separated list of page indices (or ranges) to process. IMPORTANT: the very first page starts at 0 (zero). To set a range, use a dash (-), for example: 0, 2-5, 7-.
rect: optional. Defines coordinates for extraction, e.g. 51.8, 114.8, 235.5, 204.0. Must be a string.
encrypt: optional. Enables encryption for the output file. Must be one of: true or false.
async: optional. Runs processing asynchronously. Returns a jobId to use with job/check. Must be one of: true or false.
name: optional. Output file name.
profiles: optional. Sets custom configuration. Must be a string. See profiles examples here.
lineGrouping: optional. Line grouping within table cells. Set to 1 to enable the grouping. Must be a string.
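As a rough illustration of how these parameters fit together, the Python sketch below assembles a JSON request body for the endpoint. The helper name build_payload and the async_job argument (renamed because async is a reserved word in Python) are hypothetical names chosen for this example, and the sketch assumes the API accepts true/false values as JSON booleans. It only builds the body and does not call the API.

```python
import json

# Sample file taken from the cURL example on this page.
SAMPLE_PDF = (
    "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/"
    "cloud-api/pdf-to-excel/sample.pdf"
)


def build_payload(url, pages=None, lang=None, inline=None,
                  async_job=None, name=None):
    """Assemble a request body for /v1/pdf/convert/to/xls (hypothetical helper).

    Only "url" is required. Optional parameters are added only when the
    caller sets them, so the API defaults apply otherwise.
    """
    if not url:
        raise ValueError("url is required")
    payload = {"url": url}
    optional = {
        "pages": pages,      # zero-based indices or ranges, e.g. "0,2-5,7-"
        "lang": lang,        # OCR language(s), e.g. "eng+deu"
        "inline": inline,    # True returns data inline, False returns a link
        "async": async_job,  # True runs the job asynchronously
        "name": name,        # output file name
    }
    for key, value in optional.items():
        if value is not None:
            payload[key] = value
    return payload


if __name__ == "__main__":
    body = build_payload(SAMPLE_PDF, pages="0,2-5,7-", lang="eng+deu")
    print(json.dumps(body, indent=2))
```

Leaving unset parameters out of the body keeps the request minimal and avoids overriding defaults by accident.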

Now we are ready to write some code.

cURL Code Snippet for Data Extraction from PDF to Excel

curl --location --request POST 'https://api.pdf.co/v1/pdf/convert/to/xls' \
--header 'x-api-key: YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/pdf-to-excel/sample.pdf"
}'

This sample code and other cURL source code samples are available here.
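For readers who prefer Python, the same request can be sketched with the standard library alone. This is a minimal, unofficial equivalent of the cURL call above, not the sample from the PDF.co repository. The function names and the PDFCO_API_KEY environment variable are assumptions made for this example. The response format is not described in this section, so the sketch simply returns the parsed JSON.

```python
import json
import os
import urllib.request

ENDPOINT = "https://api.pdf.co/v1/pdf/convert/to/xls"
SAMPLE_PDF = (
    "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/"
    "cloud-api/pdf-to-excel/sample.pdf"
)


def make_request(api_key, source_url):
    """Build the POST request with the same headers and body as the cURL call."""
    body = json.dumps({"url": source_url}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def convert(api_key, source_url, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    request = make_request(api_key, source_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # PDFCO_API_KEY is a hypothetical variable name used for this sketch.
    key = os.environ.get("PDFCO_API_KEY")
    if key:
        print(json.dumps(convert(key, SAMPLE_PDF), indent=2))
    else:
        print("Set PDFCO_API_KEY to send the request.")
```

Separating make_request from convert makes it possible to inspect the headers and body before anything is sent over the network.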

Now let’s see this program in action and extract PDF data to Excel.

Output XLS using cURL

The source code samples for PDF to Excel in JavaScript are located here.

The source code samples for PDF to Excel in PHP are located here.

The sample code for PDF to Excel in Python is here.

The source code samples for PDF to Excel in Java are located here.

The source code samples for PDF to Excel in C# are located here.


NOTE: Use the PDF.co Document Classifier to identify the source of the document. You can easily create and maintain classification rules with the desktop-based Classifier Testing Tool (see the details here).

You have learned how to extract specific data from PDF to Excel and have followed the steps to run the program using cURL code snippets.
