PDF to HTML API Benefits

Retains the exact format

With the PDF.co API platform, you can easily convert a PDF file to HTML while retaining the exact format of the original PDF. The API is also a good fit when you need to render the information quickly.

Support for Multiple Languages

Our API can be called from many programming languages, so you can use the PDF to HTML API from JavaScript, Python, Java, C#, PHP, Visual Basic, and other languages, including on .NET and ASP.NET. The PDF to HTML source code samples are available for use as long as you have your API key.

If you don’t have your API key yet, you can get it here. You can also get updated code samples that you can copy and use from GitHub, if you prefer testing the API in different languages first.

Our PDF to HTML API is Secure

Data security is an important consideration when handling client information. We use SSL to secure our API connections and host our servers on Amazon’s AWS infrastructure. This helps protect your data and supports your business in meeting ePHI requirements.

In that regard, you can use our API for PDF to HTML5 conversion, or use the PDF to HTML source code. Furthermore, our API automatically deletes any data you upload to its temporary storage after 1 hour. You can also use the “Delete” parameter to delete the data immediately.

API Integrations for non-programmers

You can still use the PDF.co API platform even if you’re not a programmer. We have released integrations between our API and popular automation platforms, including Zapier and Integromat plugins and UiPath and BluePrism extensions. Through 300+ PDF.co API integrations, you can easily connect with commonly used applications.

On-Premise Version Available for Enterprise Customers

The PDF.co API Platform runs on secure and certified cloud infrastructure. Enterprise customers can opt for the on-premise version to process ultra-sensitive data in-house. It can work completely offline when needed.

PDF to HTML API Sample & Demo

We’ll be using the sample PDF file below for this demo.

Screenshot of Source File

The sample code snippets below convert the sample PDF file above into HTML.

The final result will look like this.

Screenshot of Output HTML

Before we proceed with the code, let us first check the parameters of /v1/pdf/convert/to/html and their uses.

Endpoint

URL: https://api.pdf.co/v1/pdf/convert/to/html
Method: POST
Parameter | Description
url | Required. Link to the source file.
lang | Optional. Sets the OCR (image to text extraction) language used when a scanned PDF is detected or the input is a PNG or JPG image. Defaults to English. Supported values include eng, spa, deu, fra, jpn, chi_sim, chi_tra, kor. You can also specify two languages to be used on the same page, for example eng+deu, jpn+kor, or other combinations.
inline | Optional. Must be one of: true to return the data inline, or false to return a link to the output file (default).
unwrap | Optional. Unwraps lines to a single line within table cells when lineGrouping is enabled. Must be true or false.
pages | Optional. Comma-separated list of page indices (or ranges) to process. IMPORTANT: page numbering starts at 0 (zero). To set a range, use a dash (-), for example: 0, 2-5, 7-.
rect | Optional. Defines the coordinates for extraction, e.g. 51.8, 114.8, 235.5, 204.0. Must be a string.
encrypt | Optional. Enables encryption for the output file. Must be true or false.
async | Optional. Runs processing asynchronously and returns a jobId to use with job/check. Must be true or false.
name | Optional. Output file name.
profiles | Optional. Sets a custom configuration. Must be a string. See profiles examples here.
lineGrouping | Optional. Enables line grouping within table cells. Set to 1 to enable the grouping. Must be a string.
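
To make the parameters above more concrete, here is a minimal Python sketch that assembles a JSON request body. The helper names format_pages and build_payload are hypothetical and are not part of any PDF.co library; only the parameter names (url, inline, pages, lang) come from the table, and the example.com URL is a placeholder.

```python
import json


def format_pages(specs):
    """Turn page specs into the comma-separated string used by `pages`.

    Each spec is a zero-based page index (int) or a (first, last) tuple;
    last=None means "from `first` to the end of the document".
    """
    parts = []
    for spec in specs:
        if isinstance(spec, int):
            parts.append(str(spec))
        else:
            first, last = spec
            parts.append(f"{first}-" if last is None else f"{first}-{last}")
    return ",".join(parts)


def build_payload(url, pages=None, lang=None, inline=False):
    """Assemble the request body; optional parameters are left out when unset."""
    payload = {"url": url, "inline": inline}
    if pages is not None:
        payload["pages"] = format_pages(pages)
    if lang is not None:
        payload["lang"] = lang
    return payload


payload = build_payload(
    "https://example.com/sample.pdf",  # placeholder source file URL
    pages=[0, (2, 5), (7, None)],      # page 0, pages 2 to 5, page 7 onward
    lang="eng+deu",                    # two OCR languages on the same page
)
print(json.dumps(payload, indent=2))
```

Leaving unset optional parameters out of the body lets the API apply its own defaults instead of hard-coding them on the client side.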

Now we are ready to write some code.

cURL Code Snippet

curl --location --request POST 'https://api.pdf.co/v1/pdf/convert/to/html' \
--header 'x-api-key: YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/pdf-to-html/sample.pdf",
    "inline": false
}'

This sample and other cURL samples are available here.
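
For readers who prefer Python, the same request can be sketched with the standard library alone. This is an illustrative equivalent of the cURL call, not official PDF.co sample code: the function names make_request and convert are hypothetical, and the response is assumed to be a JSON document.

```python
import json
import urllib.request

API_URL = "https://api.pdf.co/v1/pdf/convert/to/html"
SAMPLE_PDF = (
    "https://bytescout-com.s3-us-west-2.amazonaws.com"
    "/files/demo-files/cloud-api/pdf-to-html/sample.pdf"
)


def make_request(api_key, source_url, inline=False):
    """Build (but do not send) the POST request shown in the cURL snippet."""
    body = json.dumps({"url": source_url, "inline": inline}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def convert(api_key, source_url):
    """Send the request and decode the response (assumed to be JSON)."""
    request = make_request(api_key, source_url)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request needs no network access, so it is safe to inspect.
request = make_request("YOUR_API_KEY", SAMPLE_PDF)
print(request.get_method(), request.full_url)

# To actually run the conversion, replace the placeholder key and call:
# result = convert("YOUR_API_KEY", SAMPLE_PDF)
```

Separating request construction from sending makes it easy to check the headers and body before spending any API credits.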

Now let’s see this program in action.

Output HTML using cURL

The sample code for PDF to HTML in JavaScript is located here.

The sample code for PDF to HTML in PHP is located here.

The sample code for PDF to HTML in Java is located here.

The sample code for PDF to HTML in C# is located here.

How to Execute PDF to HTML Converter API Asynchronously

Large files may time out during PDF to HTML conversion if the call is not asynchronous. To run the API asynchronously in the background, call the API method with the “url” param as the input and set the “async” param to “true”.

The API will return an output URL along with the task’s “jobId”. You can use the “jobId” to check the job’s execution status with the /job/check API method, and the output URL to access the converted HTML file.
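
The asynchronous flow can be sketched as a small polling loop in Python. Several details here are assumptions rather than facts from this page: the helper names build_async_payload and wait_for_job are hypothetical, and the status value "working" for a job that is still running should be checked against the /job/check reference. The status lookup is passed in as a function, so the sketch runs without network access.

```python
import time


def build_async_payload(source_url):
    """Request body for an asynchronous conversion ("async" set to true)."""
    return {"url": source_url, "async": True}


def wait_for_job(job_id, check_status, interval=3.0, max_checks=100,
                 sleep=time.sleep):
    """Poll until the job is no longer running and return its final status.

    `check_status` is any callable that takes a jobId and returns a status
    string. "working" as the in-progress status is an assumption.
    """
    for _ in range(max_checks):
        status = check_status(job_id)
        if status != "working":
            return status
        sleep(interval)
    raise TimeoutError(f"job {job_id} still running after {max_checks} checks")


# Stand-in for a real /job/check call: reports "working" twice, then "success".
fake_statuses = iter(["working", "working", "success"])
final = wait_for_job(
    "demo-job",
    lambda job_id: next(fake_statuses),
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(final)  # success
```

In real use, check_status would call the /job/check method with the jobId returned by the conversion request and return the status reported in the response. Capping the number of checks keeps a stuck job from blocking the caller forever.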

 
