Have you ever found yourself spending hours manually typing out text from scanned PDF documents? With the Document Parser module and the PDF.co connector in Power Automate, you can automate this process and save yourself a great deal of time and effort.

Imagine having a stack of scanned PDFs that you need to extract information from. The Document Parser module, powered by AI and OCR technology, can intelligently analyze these documents and accurately extract the text, even if they have complex layouts. It understands different data fields, tables, and sections within the document, allowing it to extract specific information with precision.

In this article, we will demonstrate the use of PDF.co in Power Automate. By doing so, you will be able to establish automated workflows that easily extract text from scanned PDFs and store the extracted information in a format of your choice, such as CSV, XML, or JSON. This convenient setup allows you to configure the workflow once and let it handle the task, thereby freeing up your time for more important activities.

  1. Create a New Custom Flow
  2. Add a Flow
  3. Add PDF.co Connector
  4. Configure Document Parser Settings
  5. Create Email Notifications
  6. Test and Run Workflow
  7. Email Notifications Received
  8. Extracted Text Output
  9. Demo

Here is a sample scanned PDF invoice, and our objective is to extract specific text from this document.

Sample Scanned PDF Document

Before building the automation, we first need to add the PDF.co Custom Connector to Power Automate Cloud. The connector is what makes the PDF.co modules, including Document Parser, available inside a flow. To set it up, follow the instructions on how to Add PDF.co Custom Connector for Power Automate Cloud.

Step 1: Create a New Custom Flow

  • Once the connector is in place, sign in to your Power Automate account and start creating a flow.

Create a New Custom Flow

Step 2: Add a Flow

  • Next, create a new custom flow for Document Parser and choose Manually trigger a flow as the trigger. Click the Create button to continue.

Add a Flow

Step 3: Add PDF.co Connector

  • Then, select the Custom option and open the PDF.co Connector. From its list of actions, choose Document Parser, which extracts specific text from scanned PDFs.

Add PDF.co Connector

Step 4: Configure Document Parser Settings

Let’s proceed with configuring the settings for the Document Parser:

  • In the URL field, enter the direct URL of your source file. Alternatively, if the file is stored locally, you can upload it to the PDF.co File Storage service, which returns a URL for the uploaded file. This URL is intended for PDF.co processing only.
  • Next, choose your desired output format, such as JSON, XML, or CSV, from the Output Format field.
  • Then, provide the template ID in the Template ID field. The template defines which fields, tables, and sections to extract from the scanned PDF. You can create one using the PDF.co Document Parser Template Editor. For instructions on creating a template, please refer to the tutorial guide.

Configure Document Parser Settings
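To make the three settings concrete, here is a minimal Python sketch that collects them into a single payload and validates them before a run. The function name `build_parser_request` and the key names `url`, `templateId`, and `outputFormat` are assumptions chosen to mirror the connector fields above; check the PDF.co documentation for the exact names before using them in a real request.

```python
import json

# Output formats mentioned in this article.
ALLOWED_OUTPUT_FORMATS = {"JSON", "XML", "CSV"}


def build_parser_request(source_url, template_id, output_format="JSON"):
    """Collect the three Document Parser settings into one dict.

    Hypothetical helper: the key names mirror the connector fields
    (URL, Template ID, Output Format) and are assumptions.
    """
    fmt = output_format.upper()
    if fmt not in ALLOWED_OUTPUT_FORMATS:
        raise ValueError(f"Unsupported output format: {output_format!r}")
    if not source_url.startswith(("http://", "https://")):
        raise ValueError("The source file must be given as a direct URL")
    return {
        "url": source_url,
        "templateId": str(template_id),
        "outputFormat": fmt,
    }


# Placeholder values for illustration only.
payload = build_parser_request(
    "https://example.com/sample-invoice.pdf", "1", "json"
)
print(json.dumps(payload, indent=2))
```

Validating the format and URL up front catches a mistyped setting before the flow runs, rather than after the parser rejects it.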

Step 5: Create Email Notifications

  • In this step, we will set up Notifications so that an email is sent once the automation has finished processing. Specify the subject of the email and place the Output URL in the body, so the link to the result arrives with the notification.

Create Email Notifications
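For readers who prefer to see the notification as code, the following Python sketch composes an equivalent message with the standard library's `email.message.EmailMessage`. It only builds the message and does not send it. The helper name `build_notification`, the default subject line, and the example addresses and URL are all hypothetical placeholders, not values taken from the flow.

```python
from email.message import EmailMessage


def build_notification(recipient, output_url,
                       subject="Document Parser output is ready"):
    """Compose a notification email that carries the output URL in its body.

    Hypothetical helper: subject text and wording are illustrative only.
    """
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(
        "The Document Parser run has finished.\n"
        f"Output URL: {output_url}\n"
    )
    return msg


# Placeholder recipient and URL for illustration only.
message = build_notification(
    "user@example.com", "https://example.com/output.json"
)
print(message["Subject"])
print(message.get_content())
```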

Step 6: Test and Run Workflow

  • Once all the steps have been added, save the flow and run a test.

Test and Run Workflow

Step 7: Email Notifications Received

  • After the flow has run successfully, go to your email inbox and open the notification email. Click the URL in the message to view the output.

Step 8: Extracted Text Output

  • Below is the extracted text from the scanned PDF invoice, presented in JSON format. If desired, you can choose alternative output formats such as XML or CSV based on your specific requirements and preferences.
Extracted Text in JSON Format
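If you want to post-process the JSON output, a short Python sketch like the one below can turn it into a CSV row. Note that the sample JSON here is made up for illustration: the `objects` list with `name` and `value` keys is an assumed shape, not the actual output shown above, so adjust the key names to match what your template returns.

```python
import csv
import io
import json

# Hypothetical sample, NOT real parser output: the "objects", "name",
# and "value" keys are assumptions made for this illustration.
SAMPLE_OUTPUT = """
{
  "objects": [
    {"name": "invoiceNumber", "value": "INV-001"},
    {"name": "total", "value": "100.00"}
  ]
}
"""


def fields_from_output(raw_json):
    """Map each extracted field name to its value."""
    data = json.loads(raw_json)
    return {obj["name"]: obj.get("value", "") for obj in data.get("objects", [])}


def fields_to_csv(fields):
    """Render the fields as a header row plus one data row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields.keys())
    writer.writerow(fields.values())
    return buffer.getvalue()


fields = fields_from_output(SAMPLE_OUTPUT)
print(fields_to_csv(fields))
```

In practice, selecting CSV in the Output Format field gives you a CSV file directly; the sketch is only useful when you need to reshape the JSON yourself.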

Step 9: Demo

  • Please take a look at this demonstration that showcases the setup and automation process for extracting text from scanned PDFs using the PDF.co Connector and Power Automate.
Extract Text from Scanned PDF Demo

Integrating PDF.co's Document Parser module with Power Automate lets you automate text extraction from scanned PDFs, saving you valuable time and effort. Using AI and OCR technology, the Document Parser analyzes complex PDF layouts and extracts the specific information defined in your template. Once you have configured the source URL, output format, and template ID, the workflow handles the extraction for you, and an email notification delivers a link to the extracted text. Users have found PDF.co to be a reliable solution, praising its time savings, accurate document processing, and ease of use. Together, PDF.co and Power Automate provide an efficient way to extract text from scanned PDFs.