How to Convert HTML to PDF Using PDF.co Web API in Python
PDF is a complex file format, and building one directly requires detailed knowledge of the PDF specification.
HTML, by contrast, is easy to understand. If you need a PDF, you can write the HTML you want and convert it to PDF using the PDF.co RESTful API.
The PDF.co API converts the given HTML to PDF without any additional effort on your side. It also understands Handlebars (Mustache) templates, so you can define a template and populate the final PDF with the parameters you specify.
Let’s see how to convert HTML to PDF in Python with the help of the requests HTTP library. If requests is not yet installed, install it from the terminal with the following command: pip install requests.
Step 1: Set an API Key
First, obtain your API key using the signup URL.
The API key must be passed in the URL or in a header with each request to the PDF.co API; otherwise, you will get a 401 Unauthorized error response:
API_KEY = "__YOUR_API_KEY__"
headers={"x-api-key": API_KEY}
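Hard-coding the key is fine for a quick test. As an alternative, here is a minimal sketch that reads the key from an environment variable so it stays out of source control. The variable name PDFCO_API_KEY and the helpers load_api_key and build_headers are assumptions made for illustration, not something PDF.co requires.

```python
import os


def load_api_key(env_var="PDFCO_API_KEY"):
    """Read the API key from an environment variable (the variable name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your PDF.co API key")
    return key


def build_headers(api_key):
    """Build the header dict that carries the key with every request."""
    return {"x-api-key": api_key}
```

With this in place, headers = build_headers(load_api_key()) can be used instead of the hard-coded assignment above.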
Step 2: Set the Request Parameters
The only required request parameter is “html”, where you specify the valid HTML code to be converted. The rest are optional and let you set additional document properties, such as header, footer, and orientation. For a full list of supported parameters, please refer to the online docs.
import os

# Example values; replace them with your own HTML and output path
sampleHtml = "<h1>Hello, World</h1>"
destinationFile = "result.pdf"

parameters = {}
# Input HTML code to be converted. Required.
parameters["html"] = sampleHtml
# Name of resulting file
parameters["name"] = os.path.basename(destinationFile)
# Set to CSS-style margins like 10px or 5px 5px 5px 5px.
parameters["margins"] = "5px 5px 5px 5px"
# Can be Letter, A4, A5, A6 or custom size like 200x200
parameters["paperSize"] = "Letter"
# Set to Portrait or Landscape. Portrait by default.
parameters["orientation"] = "Portrait"
There is a special parameter, templateData. If you set it to a valid JSON string, the engine will process the input HTML (or the content at the given URL) as a Handlebars HTML template.
For example, given
parameters["html"] = "<h1>{{title}}</h1> <div>{{greeting}}</div>"
parameters["templateData"] = json.dumps({"title": "Example", "greeting": "Hello, World"})
the resulting HTML will be:
“<h1>Example</h1> <div>Hello, World</div>”
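To check a template before spending an API call on it, the substitution can be approximated locally. The sketch below is a simplified stand-in and not the engine PDF.co runs: the hypothetical helper preview_template only replaces plain {{name}} placeholders and ignores Handlebars sections, helpers, and HTML escaping.

```python
import json
import re


def preview_template(html, template_data):
    """Substitute simple {{name}} placeholders locally (no sections, helpers, or escaping)."""
    values = json.loads(template_data)

    def substitute(match):
        name = match.group(1)
        # Leave unknown placeholders untouched so missing keys are easy to spot
        return str(values[name]) if name in values else match.group(0)

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, html)


html = "<h1>{{title}}</h1> <div>{{greeting}}</div>"
template_data = json.dumps({"title": "Example", "greeting": "Hello, World"})
print(preview_template(html, template_data))
# prints: <h1>Example</h1> <div>Hello, World</div>
```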
Another important parameter to remember is async. If you add this parameter and set it to “true”, the job will run asynchronously. Whenever a job's running time exceeds 25 seconds, you have to run it asynchronously; otherwise, a timeout error will occur and you risk losing the results. After placing an asynchronous job, you can periodically poll the job check API for its status.
To check the asynchronous job status the following code can be used:
import time
import requests

url = "https://api.pdf.co/v1/job/check"
payload = {"jobid": "123"}  # ID of the asynchronous job to check
headers = {"x-api-key": API_KEY}
while True:
    response = requests.post(url, headers=headers, data=payload)
    # The response body is JSON; the job state is in its "status" field
    status = response.json().get("status")
    if status == "success":
        print("Job has been done successfully")
        break
    elif status == "working":
        # wait for several seconds
        time.sleep(3)
    else:
        # some error occurred
        print(response.text)
        break
A downloadable example can be found at the following URL.
More information about asynchronous jobs can be found here.
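The polling loop shown earlier has no upper bound, so a job that never leaves the “working” state would keep it running forever. One way to guard against that is to add a deadline. This is a minimal sketch: wait_for_job is a hypothetical helper, and the 300-second default timeout is an arbitrary assumption. The status check is passed in as a function, so the loop itself does not depend on any HTTP library.

```python
import time


def wait_for_job(check_status, timeout=300.0, interval=3.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll check_status() until it stops returning "working" or the timeout expires."""
    deadline = clock() + timeout
    while True:
        status = check_status()
        if status != "working":
            # "success" or any error status ends the wait; the caller decides what to do
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Job still working after {timeout} seconds")
        sleep(interval)
```

With requests, check_status could be a small function that posts to the job check URL and returns response.json().get("status").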
Step 3: Execute the Request
Having prepared the header and body parameters, you can send the request to the HTML to PDF API:
# Base URL for PDF.co Web API requests
BASE_URL = "https://api.pdf.co/v1"
# Prepare URL for 'HTML To PDF' API request
url = "{}/pdf/convert/from/html".format(BASE_URL)
# Execute request and get response as JSON
response = requests.post(url, data=parameters, headers={ "x-api-key": API_KEY })
Step 4: Download the Converted PDF
Provided that no error occurred, the response JSON will contain a URL you can download the result PDF file from:
if response.status_code == 200:
    # Named "result" so it does not shadow the json module used earlier
    result = response.json()
    if not result["error"]:
        # Get URL of result file
        resultFileUrl = result["url"]
        # Download result file
        r = requests.get(resultFileUrl, stream=True)
        if r.status_code == 200:
            with open(destinationFile, 'wb') as file:
                for chunk in r:
                    file.write(chunk)
            print(f"Result file saved as \"{destinationFile}\" file.")
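The snippet above does nothing when a check fails, which makes problems hard to diagnose. A possible way to surface them is sketched below. ConversionError and result_url are hypothetical names, and reading the error text from a “message” field is an assumption about the response body rather than something stated in this article.

```python
class ConversionError(Exception):
    """Raised when the conversion response does not contain a usable result URL."""


def result_url(status_code, body):
    """Return the result file URL, or raise ConversionError describing what went wrong."""
    if status_code != 200:
        raise ConversionError(f"Request failed with HTTP status {status_code}")
    if body.get("error"):
        # Assumption: the service puts a human-readable explanation in "message"
        raise ConversionError(body.get("message", "Service reported an error"))
    url = body.get("url")
    if not url:
        raise ConversionError("Response did not contain a result URL")
    return url
```

It could be called as result_url(response.status_code, response.json()) right before the download step.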
That’s basically it. We converted the input HTML string to PDF without any special effort; the PDF.co API did all of the conversion work.
A full code sample in Python, including error handling, can be found on the company’s GitHub page and on the documentation page.
A historical log of API requests and responses is available here. It shows the details of every request and response, including the credits used for each call and the estimated cost.