How to Convert PDF to JSON Using PDF.co API
In this article, we will see how to convert a PDF to JSON with PDF.co, using a sample implementation. First, we have a placeholder for the API key, and we provide the source file. The source file here is a standard invoice. We also specify the pages to convert as a comma-separated list.
Step 1: Enter API Key and Generate Request String
Here we list whichever pages we want to convert to JSON. If the input PDF is password-protected, there is a placeholder for the password. That is not the case here, so we leave it empty. The destination file name is result.json. In the code, we create an instance of WebClient, set the API key in the request header, and then generate the request string.
The request URL points to the PDF to JSON endpoint. To it we add the output file name, the password, and the pages we want to convert to JSON.
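The request string described above can be sketched in Python as follows. This is only an illustration, not the WebClient code from the walkthrough: the endpoint path, the `x-api-key` header name, and the query parameter names are assumptions that should be checked against the PDF.co documentation, and `build_request` is a hypothetical helper. It also assumes the source file is passed as a URL.

```python
from urllib.parse import urlencode

# Assumed endpoint for PDF to JSON conversion; verify against the PDF.co docs.
BASE_URL = "https://api.pdf.co/v1/pdf/convert/to/json"


def build_request(api_key, source_url, pages="", password="", name="result.json"):
    """Return (url, headers) for a PDF to JSON conversion request."""
    params = {
        "name": name,          # destination file name
        "password": password,  # left empty when the PDF is not protected
        "pages": pages,        # comma-separated list of pages to convert
        "url": source_url,     # location of the source PDF (assumed to be a URL)
    }
    headers = {"x-api-key": api_key}  # the API key travels in a header
    return f"{BASE_URL}?{urlencode(params)}", headers
```

Calling `build_request("YOUR_KEY", "https://example.com/invoice.pdf", pages="0,1")` returns the full request URL together with the header dictionary, ready to hand to any HTTP client.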
Step 2: Set Input Parameters and Generate Query
The conversion to JSON accepts several input parameters. If we want the conversion to run asynchronously, we can set the async parameter. In that case the API returns a job ID, and we need to check periodically whether the job is complete by using the job status check.
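The periodic check for an asynchronous job can be sketched as a small polling loop. Here `wait_for_job` is a hypothetical helper, `check_status` stands for whatever function we write to call the job status check, and the `"working"` status string is an assumption about how an unfinished job is reported.

```python
import time


def wait_for_job(check_status, job_id, interval=3.0, max_attempts=20, sleep=time.sleep):
    """Poll check_status(job_id) until it reports something other than 'working'.

    check_status is a caller-supplied function that returns a status string.
    """
    for _ in range(max_attempts):
        status = check_status(job_id)
        if status != "working":  # assumed marker for a job still in progress
            return status
        sleep(interval)  # wait before asking again
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} checks")
```

Passing the status function in as an argument keeps the loop independent of any HTTP client, and the attempt limit keeps a stuck job from blocking the program forever.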
Let's go through the remaining parameters. If we want the output in encrypted format, we enable the encryption parameter. If we want the JSON returned inline in the response body, we specify the inline option; otherwise, the API gives us a link to the output file. For the input, we can either provide a URL directly with the url parameter or upload the file with the file parameter. Finally, the profiles parameter lets us provide additional settings.
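One way to keep these optional settings manageable is to add them to the query only when they are switched on. The sketch below assumes the parameters are named `async`, `inline`, `encrypt`, and `profiles` and that boolean values are sent as the string `"true"`; `optional_params` itself is a hypothetical helper.

```python
def optional_params(run_async=False, inline=False, encrypt=False, profiles=None):
    """Return only the optional parameters that were explicitly enabled."""
    params = {}
    if run_async:
        params["async"] = "true"     # assumed name: run the job in the background
    if inline:
        params["inline"] = "true"    # assumed name: return JSON in the response body
    if encrypt:
        params["encrypt"] = "true"   # assumed name: encrypt the output
    if profiles:
        params["profiles"] = profiles  # assumed name: additional settings
    return params
```

The returned dictionary can be merged into the required parameters before the query string is generated, so a request with no options enabled stays as short as the one in Step 1.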
Let us see this in action. We have created the WebClient instance, provided the API key in the header, and generated the query.
The request goes to the PDF to JSON endpoint with the output file name and the source URL. After processing, we get the response back. Since we did not provide the inline parameter, the response contains the result file URL. The code then fetches that URL and downloads the result file.
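Reading the result file URL out of the response could look like the sketch below. It assumes the response body is JSON with an `error` flag, an optional `message`, and a `url` field pointing to the output file; these field names are assumptions, and `result_location` is a hypothetical helper.

```python
import json


def result_location(response_text):
    """Return the result file URL from a response body, or raise on an API error."""
    data = json.loads(response_text)
    if data.get("error"):  # assumed error flag in the response
        raise RuntimeError(data.get("message", "conversion failed"))
    return data["url"]     # assumed field holding the link to the output file
```

The returned URL can then be downloaded and saved under the destination file name, result.json in this walkthrough.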
Step 3: Check the Result
Let's look at the result file. We did not provide an explicit output path, so the file is in the debug folder, and here it is. Let us see what it contains. The API has generated the JSON for our input file, with all the necessary fields: the text, the font name, its size, its style, its coordinates, its color, and its width and height. So we get very detailed information back.
For every element in the input, we get its coordinates and the rest of this information back. That's how easy it is to get JSON from a PDF file using the PDF.co Web API.
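To get a quick overview of which fields the result file contains, we can walk the parsed JSON and collect every key name. The sketch below makes no assumption about the exact layout of the output; `collect_keys` is a hypothetical helper that works on any nested JSON value.

```python
def collect_keys(node, found=None):
    """Recursively gather every distinct key name in a parsed JSON value."""
    found = set() if found is None else found
    if isinstance(node, dict):
        for key, value in node.items():
            found.add(key)
            collect_keys(value, found)   # descend into nested objects
    elif isinstance(node, list):
        for item in node:
            collect_keys(item, found)    # descend into array elements
    return found
```

After loading result.json with `json.load`, calling `sorted(collect_keys(data))` lists the field names once each, which makes it easy to confirm that the font, size, color, and coordinate information is present.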