Extract Text Info from PDF and Turn into XML File (Neglecting Tables and Images) using PDF.co and Zapier
In this tutorial, we will show you how to extract the text from a PDF and turn it into an XML file using PDF.co and Zapier.
This is the Sample Source File that we will use for this demonstration.
Step 1: Make A Zap
First, click the Make a Zap button at the upper left corner of your dashboard.
Step 2: Choose PDF.co App
Next, select the PDF.co app for the App Event.
Step 3: PDF To Anything Converter
For the Action Event, select the PDF to Anything Converter to convert PDF to JPG, PNG, CSV, JSON, XML, and other formats.
Step 4: Connect the PDF.co Account
Now, let’s connect our PDF.co account to perform the Zap.
Step 5: Set Up the Action
Next, let’s set up the Action and fill out the Output Format, PDF URL, and Pages.
- For the Output Format, choose XML Code because we want to extract the text in our PDF and turn it into XML.
- In the PDF URL field, enter the URL of the source PDF document. If the file is stored on a cloud service such as Google Drive or Dropbox, set its sharing option to Anyone with link.
- In the Pages field, type 0 for page 1, since page numbering starts at 0.
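For readers who want to see the same three settings outside Zapier, here is a minimal Python sketch. It assumes PDF.co's REST endpoint `https://api.pdf.co/v1/pdf/convert/to/xml`, the `x-api-key` header, and the `url` and `pages` body fields; none of these appear in this tutorial, so check them against the current PDF.co API documentation before use. The function names are hypothetical helpers for illustration only.

```python
import json
import urllib.request

# Assumed endpoint; verify against the current PDF.co API docs.
PDFCO_XML_ENDPOINT = "https://api.pdf.co/v1/pdf/convert/to/xml"


def to_zero_based_pages(page_numbers):
    """Turn human page numbers (1, 2, ...) into the zero-based,
    comma-separated string expected in the Pages field."""
    if any(n < 1 for n in page_numbers):
        raise ValueError("page numbers start at 1")
    return ",".join(str(n - 1) for n in page_numbers)


def build_request_body(pdf_url, page_numbers):
    """Mirror the Zap fields: PDF URL and Pages (field names are assumed)."""
    return {"url": pdf_url, "pages": to_zero_based_pages(page_numbers)}


def convert_pdf_to_xml(api_key, pdf_url, page_numbers):
    """Send the conversion request and return the parsed JSON reply.
    Needs a real API key and network access, so it is not called here."""
    body = json.dumps(build_request_body(pdf_url, page_numbers)).encode("utf-8")
    request = urllib.request.Request(
        PDFCO_XML_ENDPOINT,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `build_request_body("https://example.com/sample.pdf", [1])` produces a body with `pages` set to `"0"`, matching the value typed into the Pages field above.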
Step 6: Test and Review
Now, click the Test and Review button to make sure that there are no errors in our configuration.
Step 7: Test Result
Excellent! Our test was successful. PDF.co returned a URL so that we can view the output. You can now turn on the Zap.
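If a later step needs that output URL, a small check like the one below may help. It is only a sketch: the field names `error`, `message`, and `url` are assumptions about the shape of the result, and the sample dictionary is made up for illustration, so compare them with what your own test result actually shows.

```python
def output_url_from_result(result):
    """Return the output URL from a conversion result, or raise if the
    result reports a failure. Field names are assumed, not confirmed."""
    if result.get("error"):
        raise RuntimeError(result.get("message", "conversion failed"))
    url = result.get("url")
    if not url:
        raise RuntimeError("no output URL in result")
    return url


# Hypothetical result, for illustration only.
sample_result = {"error": False, "url": "https://example.com/output.xml"}
print(output_url_from_result(sample_result))
```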
Step 8: Source File Output
When you open the output URL, the source file output looks like this.
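Once you have the XML, you may want the plain text back out of it. The sketch below uses Python's standard `xml.etree.ElementTree` and walks every element without relying on specific tag names, since the exact structure of the output is not described here. The sample XML is a made-up stand-in, not the real output of this Zap.

```python
import xml.etree.ElementTree as ET


def extract_text_lines(xml_string):
    """Collect every non-empty piece of text in the XML, in document
    order, with internal whitespace collapsed to single spaces."""
    root = ET.fromstring(xml_string)
    lines = []
    for chunk in root.itertext():
        text = " ".join(chunk.split())
        if text:
            lines.append(text)
    return lines


# Hypothetical XML, for illustration only.
SAMPLE_XML = """
<document>
  <page index="0">
    <row>
      <column><text>Invoice</text></column>
      <column><text>No.   123</text></column>
    </row>
  </page>
</document>
"""

for line in extract_text_lines(SAMPLE_XML):
    print(line)
```

Because the function ignores tag names, it should keep working if the real output uses a different layout, at the cost of losing any row or column structure.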
In this tutorial, you learned how to extract the text from a PDF and turn it into an XML file using PDF.co and Zapier. You also learned how to set up the PDF to Anything Converter module, which supports converting PDF to Text, XLS, CSV, JSON, XML, and image formats.