How to Convert PDF to TEXT with Blue Prism
This tutorial explains how to convert PDF to text with Blue Prism and the extension that we have developed for PDF.co. I have already created one process named testing_1, which we’ll be using just for demo purposes. Now, drag and drop one Action onto the page and configure it. Name it PDF to TEXT. In the Business Object field, select PDF.co, and in the Action field, select PDF to TEXT.
PDF.co Blue Prism Extension Explained
As you can see, the Action has several inputs: the URL of the PDF, the API key, the pages from which we want to extract data, and a rectangle (rect), which expects region coordinates. If we only need to extract data from a particular region of the page, we can use rect. There is also an inline input: if we want the whole text returned directly in the result, we can enable this flag. We will see this in a moment.
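To make these inputs concrete, here is a minimal Python sketch of how they could be collected into a single request body. The field names `url`, `pages`, `rect` and `inline` simply mirror the Action inputs described above, and the function name `build_pdf_to_text_payload` is hypothetical. The exact value formats for `pages` and `rect` are assumptions, so check them against the PDF.co Postman collection before relying on them.

```python
import json
from typing import Optional


def build_pdf_to_text_payload(
    url: str,
    pages: Optional[str] = None,
    rect: Optional[str] = None,
    inline: bool = False,
) -> dict:
    """Collect the Action inputs into one request body, skipping unset ones.

    Field names mirror the Action inputs; the value formats for `pages`
    and `rect` are assumptions and are passed through unchanged.
    """
    if not url:
        raise ValueError("url is required")
    payload = {"url": url, "inline": inline}
    if pages:
        payload["pages"] = pages  # assumed: a page selection string
    if rect:
        payload["rect"] = rect  # assumed: region coordinates as a string
    return payload


if __name__ == "__main__":
    # Placeholder URL, used only to show the resulting JSON shape.
    example = build_pdf_to_text_payload("https://example.com/invoice.pdf", inline=True)
    print(json.dumps(example, indent=2))
```

Leaving `pages` and `rect` out of the body when they are empty matches the demo, where only the URL and the API key are filled in.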
To supply the input, I will use the Postman collection from PDF.co, which I have already downloaded from the PDF.co website. If you want it, go to API and Integration under REST Web API, where you can download the Postman collection. On my machine, I have the full PDF.co API Postman collection. Here is the PDF to TEXT request, and if I navigate to it, we can see what the body contains: a URL for the input PDF among its parameters. Let’s see what that input PDF is. It is an invoice PDF, so it will be interesting to see this data in textual format.
Convert PDF to Text with Blue Prism
I add the URL, and I already have an API key from PDF.co, which I have configured. I’m not setting any other parameters here. We could fine-tune the inputs if we wanted to, but for this demo we will not. For the outputs, we store the output body in a data item named Result_body and the output URL in a data item named Result_URL, and both data items are created here. Now check whether anything else is required. I think we are good to go, so click OK.
The Action is created, its parameters are defined, and it is linked to the Start and End stages. Now execute it. Once the process has completed, check the output variable: it contains one URL, which points to a text file. Open it, and we can see all the data in textual format.
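For readers who want to see roughly what the Action does behind the scenes, here is a hedged Python sketch of the same call made directly. The endpoint URL, the `x-api-key` header and the `error`, `message` and `url` response fields are assumptions based on the public PDF.co REST API, not something shown in this walkthrough, so confirm them against the Postman collection. The function names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint; confirm it against the PDF to TEXT request in the Postman collection.
PDF_TO_TEXT_ENDPOINT = "https://api.pdf.co/v1/pdf/convert/to/text"


def build_request(api_key: str, pdf_url: str) -> urllib.request.Request:
    """Build a POST request that asks for the result as a link (inline off)."""
    body = json.dumps({"url": pdf_url, "inline": False}).encode("utf-8")
    return urllib.request.Request(
        PDF_TO_TEXT_ENDPOINT,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def convert_pdf_to_text(api_key: str, pdf_url: str, send=urllib.request.urlopen) -> str:
    """Send the request and return the URL of the generated text file.

    `send` is injectable so the function can be exercised without network access.
    """
    with send(build_request(api_key, pdf_url)) as response:
        result = json.loads(response.read().decode("utf-8"))
    if result.get("error"):
        # Assumed error shape: a truthy "error" flag plus a "message" string.
        raise RuntimeError(result.get("message", "conversion failed"))
    return result["url"]
```

The returned value plays the same role as Result_URL in the process: a link to the text file rather than the text itself.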
Now, let’s try one more thing. Open the Action again, set inline to True, and click OK, so that we receive the whole text directly. Execute the process again. After execution is completed, check the Result_body variable: it now holds all the data in textual format.
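The difference between the two runs can be sketched in Python as follows. This assumes the result is a JSON object that carries the text in a `body` field when inline is on and a link in a `url` field otherwise, mirroring the Result_body and Result_URL outputs. Those field names and the helper names `extract_text` and `download_text` are assumptions made for illustration.

```python
import urllib.request
from typing import Callable


def download_text(url: str) -> str:
    """Fetch the generated text file when only a link was returned."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def extract_text(result: dict, fetch: Callable[[str], str] = download_text) -> str:
    """Return the extracted text from either kind of result.

    Inline on: the text is assumed to arrive in result["body"].
    Inline off: the text is assumed to sit behind result["url"].
    """
    if result.get("error"):
        raise RuntimeError(result.get("message", "conversion failed"))
    body = result.get("body")
    if body is not None:
        return body
    return fetch(result["url"])
```

Handling both shapes in one place means the rest of a script does not need to know whether inline was enabled.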
This is how we can use the PDF.co API with Blue Prism. We can try different parameters, and combinations of them, to get the output we need. Now you’ve learned how to convert PDF to text with Blue Prism and the PDF.co extension.