PDF to CSV API Benefits

Organized CSV as output

With the help of our AI-based engine, PDF.co analyzes input PDF documents and recreates the original structure of tables and text as CSV data. This makes downstream processing and preparation much easier than with regular PDF to CSV tools.

Supports damaged and scanned texts

Some PDF files contain damaged or scanned text. The PDF.co engine can recognize such text in multiple languages with the help of its built-in OCR (Optical Character Recognition).

API and Business Automation Platforms Integrations

The PDF.co platform can be used by software developers from programming languages such as JavaScript, PHP, Java, .NET and ASP.NET, C#, Visual Basic, and many others.

If you are not a developer, you can also automate your PDF operations through business automation platforms such as Zapier, Integromat, and hundreds of others.

On-Prem and Private Instances for Enterprise

The PDF.co platform runs on secure and certified cloud infrastructure, but Enterprise customers who are required to process sensitive data in-house can use the on-premise version, which can be installed on your own server and can work completely offline when required.

PDF to CSV API Sample & Demo

For this demo, I am going to use a Sample PDF File.

Screenshot of Source File

We’ll use the code snippets below, which are written in different programming languages, to convert the Sample PDF File above into CSV. The final result will look like this.

"Your Company Name","","","",
"Your Address","","","",
"City, State Zip","","","",
"","","","Invoice No. 123456",
"","","","Invoice Date 01/01/2016",
"Client Name","","","",
"Address","","","",
"City, State Zip","","","",
"Notes","","","",
"Item","Quantity","Price","Total",
"Item 1","1","40.00","40.00",
"Item 2","2","30.00","60.00",
"Item 3","3","20.00","60.00",
"Item 4","4","10.00","40.00",
"","","TOTAL","200.00",

Output CSV
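Every row in the output above ends with a trailing comma, which most CSV parsers read as one extra empty field. The following is a minimal Python sketch of one way to load such output and check the invoice total. It assumes the standard library csv module, uses only the item rows copied from the sample above, and the helper name parse_rows is hypothetical, not part of PDF.co.

```python
import csv
import io

# Item table copied verbatim from the sample output above.
SAMPLE_CSV = '''"Item","Quantity","Price","Total",
"Item 1","1","40.00","40.00",
"Item 2","2","30.00","60.00",
"Item 3","3","20.00","60.00",
"Item 4","4","10.00","40.00",
"","","TOTAL","200.00",
'''


def parse_rows(text):
    """Parse CSV text and drop the single empty field created by each trailing comma."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if row and row[-1] == "":
            row = row[:-1]  # remove only the field produced by the trailing comma
        rows.append(row)
    return rows


rows = parse_rows(SAMPLE_CSV)
header, items, total_row = rows[0], rows[1:-1], rows[-1]

# Sum the "Total" column of the item rows and compare with the TOTAL row.
line_total = sum(float(item[3]) for item in items)
print(header)
print(line_total, total_row[3])
```

Only one trailing field is dropped per row, so cells that are genuinely empty in the source table are preserved.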

Before we proceed with the code, let us first check the /v1/pdf/convert/to/csv parameters and their uses.

Endpoint

URL: https://api.pdf.co/v1/pdf/convert/to/csv
Method: POST
Parameters:
url: required. Link to the source file.
lang: optional. english by default. Sets the OCR (image to text extraction) language used for a scanned PDF when a scanned document is detected or the input is a PNG or JPG image. Other supported values: eng, spa, deu, fra, jpn, chi_sim, chi_tra, kor. You can also specify two languages to be used on the same page, for example: eng+deu, jpn+kor or other combinations.
inline: optional. Must be one of: true to return data inline, or false to return a link to the output file (default).
unwrap: optional. Unwraps lines to a single line within table cells when lineGrouping is enabled. Must be one of true or false.
pages: optional. Comma-separated list of page indices (or ranges) to process. IMPORTANT: the very first page starts with 0 (zero). To set a range use a dash (-), for example: 0, 2-5, 7-.
rect: optional. Defines coordinates for extraction, e.g. 51.8, 114.8, 235.5, 204.0. Must be a string.
encrypt: optional. Enables encryption for the output file. Must be true or false.
async: optional. Runs processing asynchronously and returns a jobId to use with job/check. Must be true or false.
name: optional. Output file name.
profiles: optional. Sets custom configuration. Must be a string. See the profiles examples.
lineGrouping: optional. Line grouping within table cells. Set to 1 to enable the grouping. Must be a string.
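To make the zero-based pages syntax concrete, here is a small Python sketch that expands a value such as 0, 2-5, 7- into explicit page indices. It only illustrates the syntax as described above and is not how the service itself parses the parameter. The function name expand_pages and the page_count argument are hypothetical, and clamping ranges to the last page is an assumption.

```python
def expand_pages(spec, page_count):
    """Expand a spec such as "0, 2-5, 7-" into a sorted list of zero-based page indices."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An open-ended range such as "7-" runs to the last page.
            end = int(end_text) if end_text.strip() else page_count - 1
        else:
            start = end = int(part)
        # Assumption: indices past the end of the document are ignored.
        pages.update(range(start, min(end, page_count - 1) + 1))
    return sorted(pages)


# For a 10-page document, pages 1, 6 are skipped (indices 1 and 6).
print(expand_pages("0, 2-5, 7-", 10))  # [0, 2, 3, 4, 5, 7, 8, 9]
```

Note that 0- selects every page, which is the value used in the cURL sample.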

Now we are ready to write some code.

cURL Code Snippet

curl --location --request POST 'https://api.pdf.co/v1/pdf/convert/to/csv' \
--header 'Content-Type: application/json' \
--header 'x-api-key: YOUR_API_KEY' \
--data-raw '{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/pdf-to-csv/sample.pdf",
    "lang": "eng",
    "inline": "true",
    "unwrap": "",
    "pages": "0-",
    "rect": "",
    "async": "false",
    "encrypt": "false",
    "name": "result.csv",
    "password": "",
    "lineGrouping": "",
    "profiles": ""
}'

This sample and other cURL samples are available here.

Now let’s see this program in action.

Output CSV using cURL
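For readers who prefer a script to the command line, the same request can be sketched in Python with only the standard library. This is an illustrative sketch rather than an official PDF.co client: the function names build_request and convert are hypothetical, the payload includes only a subset of the parameters from the cURL sample, and treating the response body as JSON is an assumption.

```python
import json
import urllib.request

API_URL = "https://api.pdf.co/v1/pdf/convert/to/csv"
SAMPLE_PDF = (
    "https://bytescout-com.s3-us-west-2.amazonaws.com"
    "/files/demo-files/cloud-api/pdf-to-csv/sample.pdf"
)


def build_request(api_key, source_url, pages="0-", lang="eng"):
    """Build (but do not send) the POST request for the PDF to CSV endpoint."""
    payload = {
        "url": source_url,
        "lang": lang,
        "inline": "true",
        "pages": pages,
        "async": "false",
        "name": "result.csv",
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def convert(api_key, source_url):
    """Send the request; needs network access and a valid API key.

    Assumption: the service answers with a JSON document.
    """
    with urllib.request.urlopen(build_request(api_key, source_url)) as response:
        return json.loads(response.read().decode("utf-8"))


# Inspect the request locally without calling the API.
request = build_request("YOUR_API_KEY", SAMPLE_PDF)
print(request.get_method(), request.full_url)
print(json.loads(request.data))
```

Keeping request construction separate from sending makes it possible to inspect the payload before spending API calls; call convert with a real key to perform the conversion.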

JavaScript sample code for the PDF to CSV API is available in our repository here.

PHP sample code for the PDF to CSV API is available in our repository here.

Java sample code for the PDF to CSV API is available in our repository here.

C# sample code for the PDF to CSV API is available in our repository here.
