PDF to CSV Data Accuracy with PDF.co
The PDF.co API offers many file manipulation and conversion features. You can convert PDF files to CSV, XLS, XLSX, JSON, XML, or Text in a matter of seconds. This works even when a PDF contains scanned images: the PDF.co API extracts text from unstructured documents and images using built-in AI-powered OCR.

Let us explore the following headings to learn PDF conversion:

  1. PDF to XLS
  2. PDF to XLSX
  3. PDF to XML
  4. PDF to JSON
  5. PDF to Text
  6. PDF to CSV

Why PDF.co API

  • Security: The PDF.co API runs on secure and certified Amazon AWS infrastructure. All data transfers are encrypted with SSL/TLS. See the security page for more details. The On-Prem version can run on any hosting provider and cloud storage of your choice.
  • Asynchronous mode is supported so you can process large files and documents with hundreds of pages in the cloud.
  • Battle-tested: our engines are used in production by thousands of enterprise users.
  • Credits-based system. Credits are deducted from your account for every page or image processed. You can purchase credits with one-time payments or subscribe for monthly credits. Separate methods such as uploading and background jobs also consume credits. (Cloud version only.) You can check usage with the credits property in the output and see how many credits are left with the remaining credits property. For details, please explore your API logs.
  • On-Prem API Server and On-Prem SDK are available. Enterprise users may obtain ByteScout API Server for use on their own server (Windows Server). The On-Prem version is available here.
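
As a small illustration of the credits bullet above, the sketch below reads credit usage from a parsed API response. The field names `credits` and `remainingCredits`, as well as the sample values, are assumptions made for illustration; check your own API responses or logs for the exact property names.

```python
import json


def credit_summary(response_text: str) -> str:
    """Summarize credit usage from a JSON response body.

    Assumes (hypothetically) that the response carries `credits` (spent on
    this call) and `remainingCredits` (left on the account).
    """
    data = json.loads(response_text)
    spent = data.get("credits")
    remaining = data.get("remainingCredits")
    if spent is None or remaining is None:
        return "credit information not present in response"
    return f"spent {spent} credits, {remaining} remaining"


# Hypothetical response body, not real API output.
sample = '{"error": false, "credits": 2, "remainingCredits": 98}'
print(credit_summary(sample))
```

Logging this summary after each call is one simple way to notice when an account is running low before a batch job fails.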

Extract PDF to XLS

This API converts your PDF files and scanned images to a spreadsheet with layout and fonts preserved. You only need to provide the URL of the source file as input, and the API takes care of the rest. You can pass a link to a file from Google Drive, Dropbox, or another online file service that can generate shareable links. You can find this API’s demo on our GitHub repository.

Now let me walk you through the PDF.co API endpoint documentation.

Endpoint

URL: https://api.pdf.co/v1/pdf/convert/to/xls
Method: POST
Parameter   Description
url         Points to the source file to be converted.
async       Optional. Set to true to run as an async job in the background (recommended for heavy documents).
name        Optional. The filename for the generated output. Must be a string.
pages       Optional. Comma-separated list of page indices (or ranges) to process.
rect        Optional. Defines coordinates for extraction, e.g. 51.8, 114.8, 235.5, 204.0.
lang        Optional. Sets the OCR language used when extracting text from scanned PDF, PNG, and JPG documents. Default is eng.
inline      Optional. Returns the link to the output file (default).
encrypt     Optional. Enables encryption for the output file.
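
The async parameter above implies a submit-then-poll workflow. The sketch below shows only the polling loop, with the status lookup passed in as a function so that nothing here depends on a specific endpoint. The status strings `working` and `success` and the `check_status` callable are assumptions; consult the PDF.co background job documentation for the real values.

```python
import time
from typing import Callable


def wait_for_job(check_status: Callable[[], str],
                 interval_s: float = 3.0,
                 max_checks: int = 20) -> str:
    """Poll until the job leaves the assumed 'working' state or we give up."""
    for _ in range(max_checks):
        status = check_status()
        if status != "working":
            return status
        time.sleep(interval_s)
    return "timeout"


# Stand-in for a real status request: two "working" replies, then "success".
replies = iter(["working", "working", "success"])
print(wait_for_job(lambda: next(replies), interval_s=0.0))
```

Capping the number of checks keeps a stuck job from blocking the caller indefinitely.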

PDF to XLS Demo

PDF.co PDF to XLS Demo

PDF to XLS cURL Code Snippet

curl --location --request POST 'https://api.pdf.co/v1/pdf/convert/to/xls' \
--header 'x-api-key: {{x-api-key}}' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/pdf-to-excel/sample.pdf"
}'
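
For readers who prefer Python, the cURL call above might be expressed roughly as follows using only the standard library. This is a sketch, not the official PDF.co sample: the helper name `build_request` and the `YOUR_API_KEY` placeholder are hypothetical, and the request is built but not sent so the example runs without network access.

```python
import json
import urllib.request

XLS_ENDPOINT = "https://api.pdf.co/v1/pdf/convert/to/xls"
SAMPLE_PDF = ("https://bytescout-com.s3-us-west-2.amazonaws.com/files/"
              "demo-files/cloud-api/pdf-to-excel/sample.pdf")


def build_request(endpoint: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request with the x-api-key header."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    XLS_ENDPOINT,
    api_key="YOUR_API_KEY",  # placeholder, replace with your own key
    payload={"url": SAMPLE_PDF},
)
print(request.get_method(), request.full_url)

# To actually send it (requires network access and a valid key):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```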

Extract PDF to XLSX

This API converts your PDF file to a spreadsheet with layout and fonts preserved. You only need to provide the URL of the source file as input, and the API takes care of the rest.

You can find this API’s demo on our GitHub repository.

Now let me walk you through the PDF.co API endpoint documentation.

Endpoint

URL: https://api.pdf.co/v1/pdf/convert/to/xlsx
Method: POST
Parameter   Description
url         Points to the source file to be converted.
password    Optional. Specify the password if your PDF is password protected.
name        Optional. The filename for the generated output. Must be a string.

PDF to XLSX Demo

PDF.co PDF to XLSX Demo

PDF to XLSX cURL Code Snippet

curl --location --request POST 'https://api.pdf.co/v1/pdf/convert/to/xlsx' \
--header 'x-api-key: {{x-api-key}}' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url":"https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-to-excel/sample.pdf",
    "name": "result.xlsx"
}'
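
Because password and name are optional, a client can leave them out of the request body when they are not needed. The sketch below is one way to build such a body; the function name `xlsx_payload` and the example.com URL are hypothetical.

```python
import json
from typing import Optional


def xlsx_payload(url: str,
                 password: Optional[str] = None,
                 name: Optional[str] = None) -> str:
    """Serialize the request body, leaving out optional fields that are unset."""
    body = {"url": url}
    if password:
        body["password"] = password
    if name:
        body["name"] = name
    return json.dumps(body)


print(xlsx_payload("https://example.com/sample.pdf", name="result.xlsx"))
```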

Extract PDF to XML

This API converts your PDF file to XML with information about text values, tables, fonts, images, and object positions. You can find this API’s demo on our GitHub repository.

Now let me walk you through the PDF.co API endpoint documentation.

Endpoint

URL: https://api.pdf.co/v1/pdf/convert/to/xml
Method: POST
Parameter   Description
url         Points to the source file to be converted.
inline      Optional. Set to "true" to return data inline or "false" to return a link to the output file.

PDF to XML Demo

PDF.co PDF to XML Demo

PDF to XML cURL Code Snippet

curl --location --request POST 'https://api.pdf.co/v1/pdf/convert/to/xml' \
--header 'x-api-key: {{x-api-key}}' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/pdf-to-xml/sample.pdf"
}'
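
The inline parameter changes where the converted data ends up, so client code usually needs to handle both cases. The sketch below assumes, hypothetically, that inline results arrive under a `body` field, that linked results arrive under a `url` field, and that failures set `error` with a `message`; verify these names against a real response before relying on them.

```python
from typing import Tuple


def locate_result(response: dict) -> Tuple[str, str]:
    """Return ('inline', data) or ('link', url) depending on what came back."""
    if response.get("error"):
        raise ValueError(response.get("message", "conversion failed"))
    if "body" in response:
        return "inline", response["body"]
    return "link", response["url"]


# Hypothetical responses for illustration only.
print(locate_result({"error": False, "body": "<document></document>"}))
print(locate_result({"error": False, "url": "https://example.com/result.xml"}))
```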

Extract PDF to JSON

This API converts your PDF file into a JSON representation with text, fonts, images, vectors, and formatting preserved. You can find this API’s demo on our GitHub repository. We have also created a basic course on the JSON file format, which you can see on our YouTube channel.

Now let me walk you through the PDF.co API endpoint documentation.

Endpoint

URL: https://api.pdf.co/v1/pdf/convert/to/json2
Method: POST
Parameter   Description
url         Points to the source file to be converted.
inline      Optional. Set to "true" to return data inline or "false" to return a link to the output file.

PDF to JSON Demo

PDF.co PDF to JSON Demo

PDF to JSON cURL Code Snippet

curl --location --request POST 'https://api.pdf.co/v1/pdf/convert/to/json2' \
--header 'Content-Type: application/json' \
--header 'x-api-key: {{x-api-key}}' \
--data-raw '{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/pdf-to-json/sample.pdf",
    "inline": true
}'

Extract PDF to Text

This API will convert your PDF file to Text with layout preserved. You can find this API’s demo on our GitHub repository.

Now let me walk you through the PDF.co API endpoint documentation.

Endpoint

URL: https://api.pdf.co/v1/pdf/convert/to/text
Method: POST
Parameter   Description
url         Points to the source file to be converted.
inline      Optional. Set to "true" to return data inline or "false" to return a link to the output file.

PDF to Text Demo

PDF.co PDF to Text Demo

PDF to Text cURL Code Snippet

curl --location --request POST 'https://api.pdf.co/v1/pdf/convert/to/text' \
--header 'Content-Type: application/json' \
--header 'x-api-key: {{x-api-key}}' \
--data-raw '{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/pdf-to-text/sample.pdf"
}'

Extract PDF to CSV

This API converts your PDF file into a CSV representation that preserves layout, columns, rows, and tables. CSV files store tabular data as plain text, with the values in each row usually separated by commas. We have also created a basic course on the CSV file format, which you can see on our YouTube channel.

Before we see this API in action, let me walk you through the PDF.co API endpoint documentation that we are going to use in our demo application.

Endpoint

URL: https://api.pdf.co/v1/pdf/convert/to/csv
Method: POST
Parameter   Description
url         Points to the source file to be converted.
name        Optional. The filename for the generated output. Must be a string.

PDF to CSV Demo

PDF.co PDF to CSV Demo

PDF to CSV cURL Code Snippet

curl --location --request POST 'https://api.pdf.co/v1/pdf/convert/to/csv' \
--header 'Content-Type: application/json' \
--header 'x-api-key: {{x-api-key}}' \
--data-raw '{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/pdf-to-csv/sample.pdf",
    "lang": "eng",
    "inline": true,
    "unwrap": "",
    "pages": "0-",
    "rect": "",
    "async": false,
    "encrypt": false,
    "name": "result.csv",
    "password": "",
    "lineGrouping": "",
    "profiles": ""
}'
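
Once the CSV comes back, a quick structural check can help catch extraction problems such as a table row that was split or merged. The sketch below flags rows whose column count differs from the header row. It is a simple heuristic rather than anything PDF.co provides, and the sample text is made up for illustration.

```python
import csv
import io
from typing import List


def ragged_rows(csv_text: str) -> List[int]:
    """Return 1-based row numbers whose column count differs from the first row's."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    expected = len(rows[0])
    return [i for i, row in enumerate(rows, start=1) if len(row) != expected]


# Made-up CSV text standing in for a converted file.
sample = 'Item,Qty,Price\nWidget,2,9.99\n"Gadget, large",1\n'
print(ragged_rows(sample))  # row 3 is missing its Price column
```

Using the csv module rather than splitting on commas matters here, because quoted values such as "Gadget, large" contain commas that are not column separators.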

PDF.co API sample source code is available in many programming languages, including JavaScript, Python, PHP, Java, C#, Visual Basic, ASP.NET, PowerShell, and CLI. You can explore hundreds of source code sample apps on GitHub. Stay tuned for more.

 
