How to Check if a PDF Requires OCR Using PDF.co and Zapier

In this tutorial, we will show you how to check whether a PDF requires OCR using PDF.co and Zapier.

  1. Create a Zap
  2. Google Drive App
  3. Setup Trigger
  4. Test Trigger
  5. Add PDF.co App
  6. Setup PDF.co Configuration
  7. PDF.co Test Result
  8. Add Action Path
  9. Edit and Rename Path
  10. Setup Rules and Settings
  11. Testing Result
  12. Add Action and Event
  13. Setup Action
  14. Test Result
  15. Rename and Edit Path
  16. Setup Rules and Testing
  17. Rules Testing Result

We will use this sample scanned PDF document and check whether it requires OCR.

Sample Scanned PDF Document

Step 1: Create a Zap

  • To begin, log in to your Zapier account and click the Create Zap button.

Step 2: Google Drive App

  • Next, select the Google Drive app and choose New File in Folder as the trigger.

Google Drive App

Step 3: Setup Trigger

Let’s set up the Google Drive configuration.

  • In the Drive field, select My Google Drive.
  • In the Folder field, enter the specific folder where the file resides.

Setup Trigger

Step 4: Test Trigger

  • Then, click the Test Trigger button to make sure the trigger is set up correctly.

Test Trigger
Test Trigger Result

Once the trigger test succeeds, you can add another app.

Step 5: Add PDF.co App

  • Let’s add another app and select PDF.co. Then, choose the PDF to Anything Converter to convert PDF into any document.

Add PDF.co App

Step 6: Setup PDF.co Configuration

Let’s set up the configuration…

  • For the Output Format field, select JSON (text objects and structure).
  • For the PDF URL field, choose the Web Content Link from your Google Drive.
  • In the Pages field, add the list of pages separated by a comma. Type 0 for the first page.
  • For the Inline Output field, set it to true to return data as inline.

Setup PDF.co Configuration

After setting up the configuration, click on the test and review button.
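If it helps to picture the configuration as data, the fields above can be thought of as a small set of request parameters. The sketch below is illustrative only: the helper names `pages_param` and `build_convert_params` are hypothetical, and the parameter names `url`, `pages`, and `inline` are assumed to mirror the Zapier fields. Its main point is that page indexes are zero-based, so the first page of the document is written as 0.

```python
def pages_param(page_numbers):
    """Turn 1-based page numbers into a comma-separated list of 0-based indexes."""
    if any(n < 1 for n in page_numbers):
        raise ValueError("page numbers start at 1")
    return ",".join(str(n - 1) for n in page_numbers)


def build_convert_params(pdf_url, page_numbers):
    """Assemble parameters mirroring the Zapier fields (names are assumptions)."""
    return {
        "url": pdf_url,                      # PDF URL field
        "pages": pages_param(page_numbers),  # Pages field, zero-based
        "inline": True,                      # Inline Output field
    }


if __name__ == "__main__":
    # Placeholder URL, not a real file.
    print(build_convert_params("https://example.com/scan.pdf", [1]))
```

Checking only the first page keeps the test quick, although a document whose first page is a scanned cover followed by text pages may need more pages listed.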

Step 7: PDF.co Test Result

  • Great! PDF.co processed our request successfully. We can now move to the next step.

PDF.co Test Result

Now, let’s build a path for different steps and rules.

Step 8: Add Action Path

  • Let’s add an action and select the built-in Path tool.

Add Action Path

Step 9: Edit and Rename Path

  • Next, type in your desired Path name for the first rule. This path will check whether the PDF requires OCR.
  • Then, click on the Edit button.

Edit and Rename Path

Step 10: Setup Rules and Settings

Let’s set up the first rule.

  • First, select the output data from the PDF.co app as the field to check.
  • Next, choose the condition you want to apply to that field data.
  • Then, enter the text to compare against the field data and click on the Continue button to test it out.

Setup Rules and Settings
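The rule can be pictured as a simple branch. The sketch below is an assumption about how such a rule might work, since the exact field and condition depend on your PDF.co output: it treats a page that yields no extracted text as one that needs OCR. The function names `requires_ocr` and `choose_path` and the path labels are hypothetical.

```python
def requires_ocr(extracted_text, min_chars=1):
    """Return True when the extracted text is too short to count as real text.

    Assumption: a scanned page yields (almost) no text objects, so an empty
    or whitespace-only result signals that OCR is needed.
    """
    text = (extracted_text or "").strip()
    return len(text) < min_chars


def choose_path(extracted_text):
    """Mimic the two Zapier paths: one for scanned files, one for text PDFs."""
    if requires_ocr(extracted_text):
        return "make_searchable"   # first path: run PDF Make Searchable
    return "no_ocr_needed"         # second path: the file already has text


if __name__ == "__main__":
    print(choose_path(""))             # scanned page, nothing extracted
    print(choose_path("Invoice 123"))  # page with a real text layer
```

Raising `min_chars` is one way to tolerate stray characters on an otherwise scanned page, at the cost of misclassifying very short text pages.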

Step 11: Testing Result

  • Great! We successfully set up the first rule. Now, let’s add an action to make scanned PDFs searchable.

Testing Result

Step 12: Add Action and Event

  • Let’s add the PDF.co app and choose the PDF Make Searchable to turn images or PDFs into searchable text.

Add Action Event

Step 13: Setup Action

  • For the PDF URL field, input the direct URL of the source file.
  • For the OCR Language field, select the language to use when extracting text from the scanned PDF. Let’s use English as the default.
  • Then, type in your desired output file name.

Setup Action

After setting up, click on the test and review button to see the result.
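For comparison, here is roughly what the same settings might look like as a JSON request body. This is a sketch under assumptions, not a reference: the endpoint path, the field names `url`, `lang`, and `name`, and the language code `eng` should all be checked against the PDF.co API documentation, and `build_make_searchable_body` is a hypothetical helper.

```python
import json

# Assumed endpoint path for the Make Searchable action; verify before use.
MAKE_SEARCHABLE_PATH = "/v1/pdf/makesearchable"


def build_make_searchable_body(pdf_url, output_name, lang="eng"):
    """Return the JSON body for a make-searchable request (field names assumed)."""
    if not output_name.lower().endswith(".pdf"):
        output_name += ".pdf"   # keep the output recognizable as a PDF
    return json.dumps({"url": pdf_url, "lang": lang, "name": output_name})


if __name__ == "__main__":
    # Placeholder URL and file name, not real resources.
    print(build_make_searchable_body("https://example.com/scan.pdf", "result"))
```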

Step 14: Test Result

  • Excellent! Our test was a success. Copy the URL and paste it into your browser to view the output.

Test Result

Now, let’s make a path for files that don’t require OCR.

Step 15: Rename and Edit Path

  • Type in your desired Path name for the second rule. This path handles files that don’t require OCR.
  • Then, click on the Edit button.

Rename and Edit Path

Step 16: Setup Rules and Testing

Let’s set up the second rule.

  • First, add the data from the PDF.co app.
  • Next, add the opposite condition, for example one that matches when the data does not contain your field data.
  • Then, enter the text to compare against your field data and click the Continue button to test it.

Setup Rules and Testing

Step 17: Rules Testing Result

  • Great! We successfully set up the second rule, and the test confirms that the file doesn’t require OCR.

Rules Testing Result

In this tutorial, you learned how to check whether a PDF requires OCR using PDF.co and Zapier. You learned how to build a path with different steps and rules. You also learned how to make scanned PDFs searchable.

Please note: You can also use the v1/pdf/convert/to/text-simple endpoint to check if a file requires OCR. To try this endpoint, please use the PDF.co Custom API Call action.
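As a rough illustration of the endpoint mentioned in the note above, the sketch below builds (but does not send) a request to it and shows one way to interpret the reply. Several details are assumptions to verify against the PDF.co documentation: the base URL, the `x-api-key` header, the `url`, `pages`, and `inline` parameters, and the idea that inline text comes back in a `body` field. The helper names are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.pdf.co"               # assumed base URL
ENDPOINT = "/v1/pdf/convert/to/text-simple"   # endpoint named in the note above


def build_text_request(api_key, pdf_url, pages="0"):
    """Build (but do not send) a POST request for plain-text extraction."""
    body = json.dumps({"url": pdf_url, "pages": pages, "inline": True}).encode("utf-8")
    return urllib.request.Request(
        API_BASE + ENDPOINT,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def looks_scanned(response_json):
    """Assumption: extracted text arrives in "body"; empty text suggests OCR is needed."""
    return not str(response_json.get("body", "")).strip()


if __name__ == "__main__":
    # Placeholder key and URL; sending would need urllib.request.urlopen(req).
    req = build_text_request("YOUR_API_KEY", "https://example.com/scan.pdf")
    print(req.get_method(), req.full_url)
```

If the extracted text is empty, the file would go down the OCR path; otherwise it can be treated as already searchable.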