In this tutorial, we will show you how to extract text from a scanned PDF in PHP using the PDF.co Web API. Below is an image of the scanned PDF source file and its extracted text output.
Step 1 – Source Files
First, open the HTML and PHP source files in your favorite editor. You can get the source code samples here
We highly recommend saving the files in a folder inside your web server's document root: the \www directory if you are using WampServer, or the \htdocs directory if you are using a stack such as XAMPP.
Step 2 – Start Server
Next, let’s start the Apache server so we can run our program.
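Once Apache is running, a quick way to confirm that it actually serves PHP from your folder is to drop a small check file next to the samples and open it in the browser. The sketch below is a hypothetical helper (the file name check.php and the function environmentReport are our own, not part of the PDF.co samples). It assumes the sample relies on the cURL and JSON extensions, which is common for PHP API clients but should be verified against the sample code itself.

```php
<?php
// check.php: hypothetical helper, not part of the PDF.co sample files.
// Reports the PHP version and whether the extensions we ASSUME the
// sample needs (cURL for HTTP calls, JSON for decoding) are loaded.

function environmentReport(): array
{
    return [
        'php_version' => PHP_VERSION,
        'curl_loaded' => extension_loaded('curl'),
        'json_loaded' => extension_loaded('json'),
    ];
}

// Serve plain text so each entry shows on its own line in the browser.
if (!headers_sent()) {
    header('Content-Type: text/plain');
}

foreach (environmentReport() as $name => $value) {
    echo $name . ': ' . var_export($value, true) . PHP_EOL;
}
```

If you open localhost/folder-name/check.php and see the raw PHP source instead of the report, Apache is running but PHP is not being executed for that folder.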
Step 3 – Run Program
Now, let’s run our program and extract text from the scanned PDF.
- In the browser address bar, type localhost/folder-name/sample.html. Here, folder-name is the folder inside the /www directory where you stored the files, if you are using WampServer.
- In the API Key field, enter your PDF.co API Key. You can get it in your PDF.co dashboard.
- Choose the scanned PDF file.
- Leave the page number field empty so it extracts all the PDF pages.
Then, click on the Proceed button.
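To give a rough idea of what happens after you click Proceed, here is a minimal sketch of how a PHP script could turn the form values (API key, file, page range) into a PDF to Text request. It is not the sample's actual code. The endpoint URL, the x-api-key header, and the url, pages, and inline parameters are assumptions based on PDF.co's public API documentation, so check them against the current docs. The function names buildTextRequest and sendTextRequest are hypothetical, and the sketch assumes the PDF is already reachable at a URL, whereas the sample form works with a file chosen from your computer.

```php
<?php
// Sketch only. Endpoint, header name, and parameter names are ASSUMPTIONS
// taken from PDF.co's public documentation, not from the sample files.
const PDFCO_TEXT_ENDPOINT = 'https://api.pdf.co/v1/pdf/convert/to/text';

/**
 * Build the request pieces from the form values (hypothetical helper).
 * An empty $pages string mirrors leaving the page number field empty,
 * which the tutorial says extracts all pages.
 */
function buildTextRequest(string $apiKey, string $fileUrl, string $pages = ''): array
{
    $payload = [
        'url'    => $fileUrl,
        'pages'  => trim($pages),
        'inline' => true, // assumed flag: return the text in the response itself
    ];

    return [
        'endpoint' => PDFCO_TEXT_ENDPOINT,
        'headers'  => [
            'x-api-key: ' . $apiKey,
            'Content-Type: application/json',
        ],
        'body'     => json_encode($payload, JSON_UNESCAPED_SLASHES),
    ];
}

/**
 * Send the request with cURL and return the decoded JSON response
 * (hypothetical helper; requires the cURL extension).
 */
function sendTextRequest(array $request): array
{
    $curl = curl_init($request['endpoint']);
    curl_setopt_array($curl, [
        CURLOPT_POST           => true,
        CURLOPT_HTTPHEADER     => $request['headers'],
        CURLOPT_POSTFIELDS     => $request['body'],
        CURLOPT_RETURNTRANSFER => true,
    ]);

    $raw = curl_exec($curl);
    if ($raw === false) {
        throw new RuntimeException('Request failed: ' . curl_error($curl));
    }

    $decoded = json_decode($raw, true);
    if (!is_array($decoded)) {
        throw new RuntimeException('Response was not valid JSON.');
    }

    return $decoded;
}

// Example call (placeholders, left commented out so nothing is sent by accident):
// $result = sendTextRequest(buildTextRequest('YOUR_API_KEY', 'https://example.com/scanned.pdf'));
// print_r($result);
```

Keeping the request building separate from the network call makes it easy to inspect exactly what would be sent before spending any API credits.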
Step 4 – Demo
Here’s a demo of the PDF to Text Web API in action.
In this tutorial, you learned how to extract text from a scanned PDF in PHP using the PDF.co Web API. You also learned how to set up the source code samples so you can get up and running right away.