In this tutorial, we’re going to see how you can search for text within a PDF. Let us look at the documentation first. It lists the API URL and the different input parameters. For example, if you want to run the job as an async operation, you can set the async parameter to true.
If we want to encrypt the output file, we can do that too. We can supply the input either by uploading the file directly or by URL. The search string is what you want to search for in the PDF, and we provide it here. The name of the generated output file is an optional parameter. The pages parameter is optional as well: if it’s not provided, the search runs over all pages, but if it is provided, the search string is looked up only in the specified pages.
This optional parameter controls whether we want to enable the inline option or not. Here we can choose whether to provide the word matching mode, and whether we want to search by regex. If we enable regex search, we need to provide the regex in the search string parameter.
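The parameters covered in this walkthrough can be collected into a single payload. The following is a minimal sketch in Python, not the code shown in the video. The parameter names (`url`, `searchString`, `pages`, `regexSearch`, `wordMatchingMode`, `password`, `async`) are assumptions based on the description and should be checked against the PDF.co documentation, and the helper `build_find_params` is hypothetical.

```python
from typing import Optional


def build_find_params(
    url: str,
    search_string: str,
    pages: str = "",
    regex_search: bool = False,
    word_matching_mode: Optional[str] = None,
    password: str = "",
    run_async: bool = False,
) -> dict:
    """Collect search parameters, leaving out optional ones that are unset.

    All key names here are assumed, not confirmed by the tutorial.
    """
    params = {
        "url": url,
        "searchString": search_string,
        "regexSearch": regex_search,
        "async": run_async,
    }
    if pages:  # empty means "search all pages"
        params["pages"] = pages
    if word_matching_mode is not None:
        params["wordMatchingMode"] = word_matching_mode
    if password:  # only needed for encrypted PDFs
        params["password"] = password
    return params


# Example: plain-text search over every page of a placeholder document.
params = build_find_params("https://example.com/invoice.pdf", "Invoice")
print(params)
```

Leaving `pages` out of the payload mirrors the behaviour described above, where an unset pages value means the whole document is searched.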
And if the file is encrypted, we can provide the password of the PDF file here. Here are some responses. The response status is pretty standard, and the response body contains all the results with their coordinates on the pages. So if you want to perform advanced operations on the matches, you can use these coordinates. So let’s get started with the code.
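To make the response description concrete, here is a hedged sketch of reading such a response in Python. The field names (`body`, `text`, `left`, `top`, `width`, `height`, `pageIndex`, `error`, `message`) are assumptions about the response shape, and the sample JSON is made up for illustration rather than real API output.

```python
import json

# Illustrative sample only; field names and values are assumed.
SAMPLE_RESPONSE = """
{
  "body": [
    {"text": "123", "left": 100.5, "top": 50.0,
     "width": 20.0, "height": 10.0, "pageIndex": 0},
    {"text": "4567", "left": 300.0, "top": 120.25,
     "width": 28.0, "height": 10.0, "pageIndex": 1}
  ],
  "error": false,
  "status": 200
}
"""


def extract_matches(raw_json: str) -> list:
    """Return (text, page, left, top) tuples for every match in the response."""
    data = json.loads(raw_json)
    if data.get("error"):
        raise RuntimeError(data.get("message", "search request failed"))
    return [
        (item["text"], item["pageIndex"], item["left"], item["top"])
        for item in data.get("body", [])
    ]


for text, page, left, top in extract_matches(SAMPLE_RESPONSE):
    print(f"Found '{text}' on page {page} at ({left}, {top})")
```

Keeping the coordinates alongside the text is what makes follow-up operations possible, such as highlighting or redacting the matched regions.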
We already have an implementation prepared. This is the placeholder for the API key. Then we have the source file; let us see what is inside. It is a standard invoice PDF, which we’re going to provide as input. Next is the placeholder for the pages, which we’re leaving empty because we want to search all the pages. Then there is a placeholder for the password and a placeholder for the searchString.
We are passing a regex here because we want to get all the numbers; this regex finds all the numbers in the document. Then we create the WebClient object and provide the API key in the header. Next, we prepare the query.
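The transcript does not show the exact pattern used, so the following Python sketch assumes a simple one for integers and decimals and tries it locally before it would be sent as the search string. The pattern and the helper `find_numbers` are illustrative assumptions, and the regex dialect used by the service may differ from Python's `re` module.

```python
import re

# Assumed pattern: one or more digits, optionally followed by a decimal part.
NUMBER_PATTERN = r"\d+(?:\.\d+)?"


def find_numbers(text: str) -> list:
    """Return every number-like substring in the given text."""
    return re.findall(NUMBER_PATTERN, text)


# Try the pattern on a line that resembles invoice text.
print(find_numbers("Invoice 123, total 45.60"))
```

Checking the pattern on a short local string first is a cheap way to catch mistakes before spending an API call on it.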
This is the URL for it, and we pass all the parameters: the password, pages, URL, searchString, and the regex flag. Once the response arrives, we parse it, and we only display the found text with its coordinates. Let’s see this in action.
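The query preparation described here can be sketched in Python with the standard library. This is not the WebClient code from the video: the endpoint URL, the `x-api-key` header name, and the `regexSearch` parameter name are assumptions to verify against the PDF.co documentation, and `build_find_request` is a hypothetical helper. The request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint for the PDF search operation.
FIND_ENDPOINT = "https://api.pdf.co/v1/pdf/find"


def build_find_request(
    api_key: str,
    source_url: str,
    search_string: str,
    pages: str = "",
    password: str = "",
    regex_search: bool = True,
) -> Request:
    """Build a GET request carrying the search parameters in the query string."""
    query = urlencode(
        {
            "password": password,
            "pages": pages,  # empty string: search all pages
            "url": source_url,
            "searchString": search_string,
            "regexSearch": "true" if regex_search else "false",
        }
    )
    # The API key travels in a request header (header name assumed).
    return Request(f"{FIND_ENDPOINT}?{query}", headers={"x-api-key": api_key})


request = build_find_request(
    "YOUR_API_KEY", "https://example.com/invoice.pdf", r"\d+"
)
print(request.full_url)
# Sending it would be: urllib.request.urlopen(request), which is left out here.
```

Using `urlencode` takes care of escaping the regex, since characters such as `\` and `+` are not safe to place in a query string as they are.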
We have provided the API key in the header here, and we have generated the query here. We send the request and look at the response. Let us see what it contains. It has all the numbers with their coordinates, and we parse the JSON here. Then we display all the numbers with their coordinates.
It’s that easy to find any text in a PDF, either by direct text or by regex, using PDF.co.
Check this PDF.co video tutorial using PDF.co Web API and follow us on YouTube!
PDF.co REST Web API tutorials
- Introduction to PDF.co Web API
- How to Convert Images to PDF
- How to Convert PDF to Image
- How to Find PDF Information
- How to Make PDF Searchable
- How to Merge PDF using API
- How to Split PDF into Pages
- How to Search PDF for Text
- Add Image to Existing PDF
- Add Text to Existing PDF
- How to Convert PDF to Text
- How to Convert PDF to JSON
- How to Convert PDF to XML
- How to Convert PDF to HTML
- How to Generate BarCodes
- How to Read BarCodes