The PDF.co PDF Search API covers several search functions: searching for text, removing or replacing text with other text or an image, and making PDF documents searchable.

The Web API is capable of making multiple text replacements, getting text coordinates, employing advanced pattern search using a regular expression, and converting scanned PDFs and images into text-searchable PDFs.

The Web API works with any programming language, including PHP, JavaScript, C#, .NET and ASP.NET, Java, Visual Basic, and many others.

All documents transmitted through our Web API are encrypted and secure. To learn more, please read our security page: https://pdf.co/security

SIGN UP FOR FREE

PDF Search API Benefits

Regular Expression Support

The PDF Search API supports advanced pattern search on top of regular text search. Pattern search uses a regular expression (regex) to return only the text or data that matches the given expression.
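
As a rough local illustration of what a pattern search matches, the sketch below applies an invoice-date pattern (equivalent to the one in the demo request further down this page) to a sample string with Python's re module. The sample text is made up, and the regex engine behind the API may differ from Python's in some details, so treat this only as a way to try a pattern before sending it.

```python
import re

# The label "Invoice Date" followed by a slash-separated date.
INVOICE_DATE_PATTERN = r"Invoice Date \d+/\d+/\d+"

# Made-up text standing in for the text layer of a PDF page.
sample_text = "Invoice No 1234  Invoice Date 01/02/2016  Due Date 15/02/2016"


def find_matches(pattern: str, text: str) -> list[str]:
    """Return every substring of text that matches pattern."""
    return [m.group(0) for m in re.finditer(pattern, text)]


matches = find_matches(INVOICE_DATE_PATTERN, sample_text)
print(matches)
```

Only the date that follows the "Invoice Date" label is returned; the due date is skipped because its label does not match the pattern.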

Get Text Coordinates

The PDF Search API returns the coordinates of each text search result. This information is helpful when recreating the PDF or when parsing specific data with the extraction or document parsing module.
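
A minimal sketch of turning returned coordinates into bounding boxes. The response shape used here (a body list whose items carry text, left, top, width, height, and pageIndex) is an assumption made for illustration, as are the sample values; check the response of your own request before relying on these field names.

```python
import json

# Hypothetical response; field names and values are assumptions.
sample_response = json.loads("""
{
  "body": [
    {"text": "Invoice Date 01/01/2016", "left": 436.5, "top": 130.4,
     "width": 122.8, "height": 11.0, "pageIndex": 0}
  ],
  "error": false
}
""")


def bounding_boxes(response: dict) -> list[dict]:
    """Turn each match into a page index plus an (x0, y0, x1, y1) box."""
    boxes = []
    for match in response.get("body", []):
        x0, y0 = match["left"], match["top"]
        boxes.append({
            "text": match["text"],
            "page": match["pageIndex"],
            "box": (x0, y0, x0 + match["width"], y0 + match["height"]),
        })
    return boxes


boxes = bounding_boxes(sample_response)
for item in boxes:
    print(item["page"], item["box"], item["text"])
```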

Scanned PDF and Images Conversion

The PDF Search API converts scanned PDFs, whether partially or fully made from scanned images, into text-searchable PDFs. It runs OCR and adds an invisible text layer on top of your document that can be used for text search, text indexing, etc.

Text and Image Replacement Support

The PDF Search API can search for text and replace it with either text or an image. The search can be narrowed down to specific pages and page ranges, or it can cover the whole document.

High-Quality PDF Generation

The PDF.co platform converts images and scanned PDFs into high-quality PDF files that can be searched for text. The built-in OCR engine supports multiple languages including English, Spanish, German, Chinese, Japanese, and others.

Web API and Business Automation Platforms Integrations

The PDF.co platform can be used by software developers working in programming languages such as JavaScript, PHP, Java, .NET and ASP.NET, C#, Visual Basic, and many others.

If you are not a developer, you can also easily automate your PDF operations through business automation platforms such as Zapier, Integromat, and hundreds of others.

Enterprise Solutions

For enterprise customers, there is a Dedicated API Server that runs as a dedicated private server with dedicated private cloud storage in the hosting region of your choice.



PDF Search API Sample & Demo

In this demonstration, we will find the Invoice Date in a PDF invoice using a search string that combines literal text with a regular expression. We will set the inline parameter to true so that we can view the result in the response body. When the inline parameter is set to false, PDF.co returns a downloadable JSON file with the contents of the PDF Search Text API result.
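
The two inline modes can be handled with a small helper such as the sketch below. The field names body, url, error, and message are assumptions about the response shape and are not taken from this page, and locate_result is a hypothetical helper name.

```python
def locate_result(response: dict) -> tuple[str, object]:
    """Say where the search result lives for a given response.

    Returns ("inline", matches) when the matches are in the response body,
    or ("download", link) when the response points to a JSON file instead.
    Field names are assumed, not confirmed by this page.
    """
    if response.get("error"):
        raise ValueError(response.get("message", "search request failed"))
    if "body" in response:
        return ("inline", response["body"])
    return ("download", response["url"])


# Hypothetical responses for inline=true and inline=false.
inline_result = locate_result({"error": False, "body": [{"text": "Invoice Date 01/01/2016"}]})
download_result = locate_result({"error": False, "url": "https://example.com/result.json"})
print(inline_result[0], download_result[0])
```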

Below are the images of our source PDF Invoice and output.

Images of the sample PDF invoice and the text search output

Let’s review the /v1/pdf/find endpoint’s parameters and their corresponding functions.

Endpoint

URL: https://api.pdf.co/v1/pdf/find
Method: POST
Parameter         Description
url               required. Link to the source file.
searchString      Text to search for. Can contain a regex.
pages             optional. Comma-separated list of page indices (or ranges) to process.
inline            optional. Must be one of true, false.
wordMatchingMode  optional. Must be a String.
password          optional. The password of the PDF file. Must be a String.
regexSearch       optional. Must be one of true, false.
encrypt           optional. Enables encryption of the output file.
async             optional. Must be one of true, false. Runs processing asynchronously and returns a jobId to use with job/check.
name              optional. Output file name.
profiles          optional. Must be a String. Sets custom configuration. See the profiles examples.
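
The parameters above can be assembled into a request body with a helper like the following sketch. The function name build_find_payload and the example URL are hypothetical. The sketch sends true/false values as JSON booleans, which is a choice made here; the sample request on this page passes them as strings.

```python
def build_find_payload(url, search_string, *, regex_search=False,
                       pages=None, inline=True, password=None,
                       run_async=False, name=None):
    """Assemble the JSON body for /v1/pdf/find from the listed parameters."""
    if not url:
        raise ValueError("url is required")
    if not search_string:
        raise ValueError("searchString must not be empty")
    payload = {
        "url": url,
        "searchString": search_string,
        "regexSearch": regex_search,
        "inline": inline,
        "async": run_async,
    }
    # Optional parameters are only included when the caller sets them.
    optional = {"pages": pages, "password": password, "name": name}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_find_payload(
    "https://example.com/invoice.pdf",  # placeholder URL, not a real file
    r"Invoice Date \d+/\d+/\d+",
    regex_search=True,
    pages="0-",
)
print(payload)
```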

 

cURL Code Snippet

curl --location --request POST 'https://api.pdf.co/v1/pdf/find' \
--header 'x-api-key: {{x-api-key}}' \
--header 'Content-Type: application/json' \
--data-raw '{
    "async": "false",
    "encrypt": "false",
    "url": "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-to-text/sample.pdf",
    "searchString": "Invoice Date \\d+\\/\\d+\\/\\d+",
    "regexSearch": "true",
    "name": "output",
    "pages": "0-",
    "inline": "true",
    "password": ""
}'
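
For readers who prefer Python, the same request can be sketched with the standard library alone. This is an illustrative equivalent of the cURL call above, not the official sample: make_find_request is a hypothetical helper, YOUR_API_KEY is a placeholder, and the request is only built here, not sent.

```python
import json
import urllib.request

API_URL = "https://api.pdf.co/v1/pdf/find"


def make_find_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the cURL call."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=data,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Same values as the cURL snippet; the raw string keeps the regex backslashes.
payload = {
    "async": "false",
    "encrypt": "false",
    "url": "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-to-text/sample.pdf",
    "searchString": r"Invoice Date \d+\/\d+\/\d+",
    "regexSearch": "true",
    "name": "output",
    "pages": "0-",
    "inline": "true",
    "password": "",
}

request = make_find_request("YOUR_API_KEY", payload)

# To actually send it (requires a valid API key and network access):
# with urllib.request.urlopen(request) as resp:
#     result = json.load(resp)
```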

The PDF Search Text API cURL source code samples are available here.

Let’s see the PDF Search Text API in action.

PDF Search Text API Demonstration

The PDF Search Text API JavaScript source code samples are available here.

The PDF Search Text API Python source code samples are available here.

The PDF Search Text API Java source code samples are available here.

The PDF Search Text API C# source code samples are available here.


Related Samples: