How to Convert PDF to JSON from an Uploaded File in PHP Using PDF.co Web API
PDF.co Web API is a REST API that provides a set of tools for document manipulation, data conversion, data extraction, and splitting and merging of PDF files. It includes built-in OCR and image recognition, and it can generate barcodes and read them from images, scans, and PDFs.
Today you are going to learn how to convert an uploaded PDF file to JSON in PHP using PDF.co Web API.
You will save time on writing and testing code, because you can take the sample below and use it in your application. Copy it into your code editor, supply your PDF.co API key, and you are ready to try it. The sample is a starting point; adding further validation and error handling will make it more robust. Full source code can be found at this link.
Step-by-Step Tutorial: Converting an Uploaded PDF File to JSON in PHP
Let’s review the source code and its output first, then we’ll analyze the code briefly.
pdf-to-json.php
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>PDF To JSON Extraction Results</title>
</head>
<body>

<?php

// Note: If you have input files larger than 200kb we highly recommend to check "async" mode example.

// Get submitted form data
$apiKey = $_POST["apiKey"]; // The authentication key (API Key). Get your own by registering at https://app.pdf.co/documentation/api
$pages = $_POST["pages"];

// 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
// * If you already have the direct PDF file link, go to the step 3.

// Create URL (the file name is URL-encoded so spaces and special characters are safe)
$url = "https://api.pdf.co/v1/file/upload/get-presigned-url" .
    "?name=" . urlencode($_FILES["file"]["name"]) .
    "&contenttype=application/octet-stream";

// Create request
$curl = curl_init();
curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey));
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
// Execute request
$result = curl_exec($curl);

if (curl_errno($curl) == 0)
{
    $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);

    if ($status_code == 200)
    {
        $json = json_decode($result, true);

        // Get URL to use for the file upload
        $uploadFileUrl = $json["presignedUrl"];
        // Get URL of uploaded file to use with later API calls
        $uploadedFileUrl = $json["url"];

        // 2. UPLOAD THE FILE TO CLOUD.

        $localFile = $_FILES["file"]["tmp_name"];
        $fileHandle = fopen($localFile, "rb");

        curl_setopt($curl, CURLOPT_URL, $uploadFileUrl);
        curl_setopt($curl, CURLOPT_HTTPHEADER, array("content-type: application/octet-stream"));
        curl_setopt($curl, CURLOPT_PUT, true);
        curl_setopt($curl, CURLOPT_INFILE, $fileHandle);
        curl_setopt($curl, CURLOPT_INFILESIZE, filesize($localFile));

        // Execute request (keep the response so that upload errors can be displayed)
        $result = curl_exec($curl);

        fclose($fileHandle);

        if (curl_errno($curl) == 0)
        {
            $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);

            if ($status_code == 200)
            {
                // 3. CONVERT UPLOADED PDF FILE TO JSON

                ExtractJSON($apiKey, $uploadedFileUrl, $pages);
            }
            else
            {
                // Display request error
                echo "<p>Status code: " . $status_code . "</p>";
                echo "<p>" . $result . "</p>";
            }
        }
        else
        {
            // Display CURL error
            echo "Error: " . curl_error($curl);
        }
    }
    else
    {
        // Display service reported error
        echo "<p>Status code: " . $status_code . "</p>";
        echo "<p>" . $result . "</p>";
    }

    curl_close($curl);
}
else
{
    // Display CURL error
    echo "Error: " . curl_error($curl);
}

function ExtractJSON($apiKey, $uploadedFileUrl, $pages)
{
    // Create URL
    $url = "https://api.pdf.co/v1/pdf/convert/to/json";

    // Prepare request params
    $parameters = array();
    $parameters["url"] = $uploadedFileUrl;
    $parameters["pages"] = $pages;

    // Create JSON payload
    $data = json_encode($parameters);

    // Create request
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-type: application/json"));
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_POST, true);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_POSTFIELDS, $data);

    // Execute request
    $result = curl_exec($curl);

    if (curl_errno($curl) == 0)
    {
        $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);

        if ($status_code == 200)
        {
            $json = json_decode($result, true);

            if ($json["error"] == false)
            {
                $resultFileUrl = $json["url"];

                // Display link to the file with conversion results
                echo "<div><h2>Conversion Result:</h2><a href='" . $resultFileUrl . "' target='_blank'>" . $resultFileUrl . "</a></div>";
            }
            else
            {
                // Display service reported error
                echo "<p>Error: " . $json["message"] . "</p>";
            }
        }
        else
        {
            // Display request error
            echo "<p>Status code: " . $status_code . "</p>";
            echo "<p>" . $result . "</p>";
        }
    }
    else
    {
        // Display CURL error
        echo "Error: " . curl_error($curl);
    }

    // Cleanup
    curl_close($curl);
}

?>

</body>
</html>
Output
Now that we’ve reviewed the source code and the output, let’s look at the important code snippets. First, the code calls the /v1/file/upload/get-presigned-url endpoint, passing the input file name as a parameter. The endpoint returns two URLs: $json["presignedUrl"], to which the actual file is uploaded, and $json["url"], which refers to the uploaded file in later API calls. After this endpoint responds successfully, the code uploads the PDF file to the presigned URL with an HTTP PUT request.
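The first step can be isolated into two small helpers, which makes it easier to test without touching the network. This is a minimal sketch, not part of the PDF.co sample: the function names buildPresignedUrlRequest and parsePresignedResponse are hypothetical, while the endpoint and the presignedUrl and url fields are taken from the sample above. It assumes the service accepts standard URL-encoded query parameters.

```php
<?php
// Hypothetical helpers for step 1; only the endpoint and field names come from the sample.

/** Build the request URL for the presigned-upload endpoint, URL-encoding the file name. */
function buildPresignedUrlRequest(string $fileName): string
{
    $query = http_build_query([
        'name' => $fileName,
        'contenttype' => 'application/octet-stream',
    ]);

    return 'https://api.pdf.co/v1/file/upload/get-presigned-url?' . $query;
}

/**
 * Extract both URLs from the endpoint's JSON response.
 * Returns [upload URL, uploaded file URL]; throws if either field is missing.
 */
function parsePresignedResponse(string $body): array
{
    $json = json_decode($body, true);

    if (!is_array($json) || empty($json['presignedUrl']) || empty($json['url'])) {
        throw new RuntimeException('Unexpected response from get-presigned-url');
    }

    return [$json['presignedUrl'], $json['url']];
}
```

Encoding the file name matters because uploaded files often contain spaces or other characters that are not valid in a query string.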
After the file has been uploaded to the presigned URL, the code proceeds to extract JSON from the uploaded file. All of the PDF to JSON conversion code is in the ExtractJSON function, which calls the /v1/pdf/convert/to/json PDF.co endpoint. The request parameters are the URL of the input PDF and the pages to convert. This use case demonstrates a very simple scenario; for advanced API parameters, please see the API documentation.
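The request body for the conversion call can be built separately from the cURL code. The sketch below uses a hypothetical helper name, buildConvertPayload, with the url and pages parameters from the sample. Leaving out pages when the form field is empty is an assumption about how the service treats a missing value; check the API documentation for the accepted pages syntax and default. JSON_THROW_ON_ERROR requires PHP 7.3 or later.

```php
<?php
// Hypothetical helper for step 3; parameter names "url" and "pages" come from the sample.

/** Build the JSON body for /v1/pdf/convert/to/json. */
function buildConvertPayload(string $fileUrl, string $pages = ''): string
{
    $parameters = ['url' => $fileUrl];

    // Assumption: an absent "pages" parameter is acceptable to the service.
    $pages = trim($pages);
    if ($pages !== '') {
        $parameters['pages'] = $pages;
    }

    // Throw instead of silently returning false if encoding fails.
    return json_encode($parameters, JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR);
}
```

The returned string can be passed directly to CURLOPT_POSTFIELDS in place of $data.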
The response of the PDF to JSON API call mainly consists of the URL of the generated JSON file, which the script displays as a link in the output HTML. Try running this demo on your machine to become more familiar with the API.
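Response handling can also be pulled out of ExtractJSON so that it is easy to check on its own. This is a rough sketch under the assumption that the response carries the error, message, and url fields used in the sample; the helper name extractResultUrl is hypothetical.

```php
<?php
// Hypothetical helper; the "error", "message" and "url" fields are those read by the sample.

/** Return the result file URL from a conversion response, or throw with the service message. */
function extractResultUrl(string $body): string
{
    $json = json_decode($body, true);

    if (!is_array($json)) {
        throw new RuntimeException('Response is not valid JSON');
    }

    if (!empty($json['error'])) {
        $message = isset($json['message']) && is_string($json['message'])
            ? $json['message']
            : 'Unknown service error';
        throw new RuntimeException($message);
    }

    if (empty($json['url']) || !is_string($json['url'])) {
        throw new RuntimeException('Response has no result URL');
    }

    return $json['url'];
}
```

When echoing the returned URL into the page, wrapping it in htmlspecialchars() avoids emitting unescaped characters into the HTML.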
Thank you for reading!