The PDF Search API converts scanned PDF documents (where pages consist fully or partially of scanned images) into text-searchable PDFs. It runs OCR and adds an invisible text layer on top of your document, which can then be used for text search, text indexing, and similar tasks.

All documents sent through our Web API are encrypted. We use SSL, TLS, and file encryption to protect your data. To learn more, please read our security page: https://pdf.co/security

PDF Search API Benefits

No Licenses Required

The PDF.co Web API uses credit-based payment, so no software license is needed. You can get credits through a subscription plan or a credit pack. To learn more about the available subscription plans and credit packs, visit here.

Multiple Programming Languages and Integrations

The PDF.co API platform can be used from multiple programming languages: PHP, JavaScript, Java, C#, .NET and ASP.NET, Visual Basic, and others. If you’re not a programmer, you can use PDF.co through automation platforms such as Integromat, Zapier, UiPath, Blue Prism, and others.

On-premise Version

The PDF.co platform operates on secure and certified cloud infrastructure. For enterprise customers who process sensitive data, we also offer an on-premise version that runs on your own server and can work completely offline when required.

PDF Search API Sample & Demo

We will use this sample scanned PDF for the demo.

Screenshot of Sample Scanned PDF

The code snippets below are in different programming languages. Using those code snippets, you can convert the sample scanned PDF file above into a text-searchable PDF.

The result would look like this.

Screenshot of output text-searchable PDF

Before we proceed with the code, let us first check the /v1/pdf/makesearchable parameters and their uses.

Endpoint

URL: https://api.pdf.co/v1/pdf/makesearchable
Method: POST
Parameters

url: required. Link to the source file.
pages: optional. Comma-separated list of page indices (or ranges) to process. IMPORTANT: the very first page has index 0 (zero). To set a range, use a dash (-), for example: 0, 2-5, 7-.
expiration: optional. Output link expiration in minutes. Default is 60 (i.e. 60 minutes or 1 hour). After this period, any generated output files are automatically removed from PDF.co temporary file storage. The maximum allowed expiration period depends on your current subscription plan. To store input files permanently (e.g. reusable images, PDFs, documents), please use the PDF.co built-in Files Storage instead.
encrypt: optional. Enables encryption for the output file: true or false.
async: optional. Runs processing asynchronously and returns a jobId to use with job/check: true or false.
name: optional. Output file name.
profiles: optional. Must be a string. Sets custom configuration. See profiles examples here.
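
To make the pages syntax concrete, here is a minimal Python sketch that expands a value such as 0, 2-5, 7- into the zero-based page indices it selects. The function name expand_pages and the page_count argument are hypothetical and used only for illustration. Treating an empty string as "all pages" is an assumption, as is clamping ranges to the last page; the API itself may behave differently.

```python
def expand_pages(spec: str, page_count: int) -> list[int]:
    """Expand a pages value such as "0, 2-5, 7-" into sorted zero-based indices.

    Illustrative helper only; it is not part of the PDF.co API.
    """
    if not spec.strip():
        # Assumption: an empty value selects every page.
        return list(range(page_count))

    last_page = page_count - 1
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An open-ended range such as "7-" runs to the last page.
            end = int(end_text) if end_text.strip() else last_page
        else:
            start = end = int(part)
        # Assumption: indices past the end of the document are ignored.
        for index in range(start, min(end, last_page) + 1):
            selected.add(index)
    return sorted(selected)
```

For a 10-page document, expand_pages("0, 2-5, 7-", 10) selects the first page, pages 2 through 5, and everything from page 7 to the end, all counted from zero.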

cURL Code Snippet

curl --location --request POST 'https://api.pdf.co/v1/pdf/makesearchable' \
--header 'x-api-key: YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"url": "https://bytescout-com.s3-us-west-2.amazonaws.com/files/demo-files/cloud-api/pdf-make-searchable/sample.pdf",
"lang": "eng",
"pages": "",
"name": "result.pdf", 
"password": "",
"async": false,
"encrypt": false,
"profiles": ""
}'
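
The same request can be sketched in Python using only the standard library. This is a rough equivalent of the cURL call above, not official sample code: the helper names build_request and make_searchable are hypothetical, and parsing the response body as JSON is an assumption, since the response format is not described here.

```python
import json
import urllib.request

API_URL = "https://api.pdf.co/v1/pdf/makesearchable"


def build_request(api_key: str, source_url: str, name: str = "result.pdf",
                  pages: str = "", lang: str = "eng") -> urllib.request.Request:
    """Build the POST request with the same headers and fields as the cURL sample."""
    payload = {
        "url": source_url,
        "lang": lang,
        "pages": pages,
        "name": name,
        "async": False,
        "encrypt": False,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def make_searchable(api_key: str, source_url: str) -> dict:
    """Send the request and return the decoded response.

    Assumption: the API answers with a JSON body.
    """
    request = build_request(api_key, source_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires a valid API key and network access):
# result = make_searchable("YOUR_API_KEY", "https://example.com/scanned.pdf")
```

Keeping the request construction separate from the network call makes it possible to inspect the payload and headers before anything is sent.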

The PDF Search API cURL sample codes are available here.
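
When async is set to true, the call returns a jobId that is then used with job/check. A polling loop for that case might look like the sketch below. The job/check request itself is deliberately left to a caller-supplied function, check_job, because its exact request and response format is not covered here. The "status" field and the "working" value are assumptions made for illustration, and wait_for_job is a hypothetical name.

```python
import time
from typing import Callable


def wait_for_job(check_job: Callable[[str], dict], job_id: str,
                 interval_seconds: float = 3.0, max_attempts: int = 20,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call check_job(job_id) repeatedly until the job is no longer in progress.

    check_job is supplied by the caller and should perform the job/check request.
    Assumption: its result is a dict whose "status" is "working" while the job runs.
    """
    for _ in range(max_attempts):
        result = check_job(job_id)
        if result.get("status") != "working":
            # Any other status (finished or failed) is handed back to the caller.
            return result
        sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} still running after {max_attempts} checks")
```

Passing the check function in as an argument keeps the loop independent of any HTTP library and makes it easy to exercise with a stub.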

Let’s see the PDF Search API in action.

Output Searchable PDF using cURL

The PDF Search API sample codes for JavaScript, Python, Java, C#, and PHP are available here.

Related Samples: