Extract Text Info from PDF and Turn into XML File (Neglecting Tables and Images) using PDF.co and Zapier

Extract Text from PDF and Turn into XML – Guide

  1. Make A Zap
  2. Choose PDF.co App
  3. PDF to Anything Converter
  4. Connect the PDF.co Account
  5. Set up the Action
  6. Test and Review
  7. Test Result
  8. Source File Output

In this tutorial, we will show you how to extract text from a PDF and turn it into an XML file using PDF.co and Zapier.

This is the Sample Source File that we will use for this demonstration.

 

Sample Source File

 

Step 1: Make A Zap

First, click the Make a Zap button in the upper-left corner of your dashboard.

Step 2: Choose PDF.co App

Next, select the PDF.co app for the App Event.

PDF.co App

Step 3: PDF to Anything Converter

For the Action Event, select the PDF to Anything Converter to convert PDF to JPG, PNG, CSV, JSON, XML, and other formats.

PDF To Anything Converter

Step 4: Connect the PDF.co Account

Now, let’s connect our PDF.co account to perform the Zap.

PDF.co Account

Step 5: Set Up the Action

Next, let’s set up the Action and fill out the Output Format, PDF URL, and Pages.

  • For the Output Format, choose XML Code because we want to extract the text from our PDF and turn it into XML.
  • In the PDF URL field, enter the URL of the source PDF document. If the file is stored in a cloud service such as Google Drive or Dropbox, set its sharing option to Anyone with link.
  • In the Pages field, type 0 to process page 1. Page numbering starts at 0, so 0 refers to the first page.
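The zero-based page numbering is easy to get wrong, so here is a minimal Python sketch of the conversion. The helper name to_zero_based_pages, the dictionary keys, and the example URL are hypothetical. The sketch also assumes the Pages field accepts a comma-separated list of zero-based indices; this tutorial only confirms the single value 0.

```python
def to_zero_based_pages(page_numbers):
    """Convert human page numbers (1, 2, 3, ...) to a zero-based Pages value."""
    indices = []
    for number in page_numbers:
        if number < 1:
            raise ValueError(f"page numbers start at 1, got {number}")
        indices.append(str(number - 1))
    return ",".join(indices)


# Settings mirroring the three fields from this step.
# The keys and the URL are illustrative placeholders, not Zapier's field names.
action_settings = {
    "output_format": "XML",
    "pdf_url": "https://example.com/sample.pdf",
    "pages": to_zero_based_pages([1]),  # page 1 becomes "0"
}
print(action_settings["pages"])
```

Passing [1, 2, 3] would give the value 0,1,2 under the same assumption.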

Set Up Action

Step 6: Test and Review

Now, click the Test and Review button to make sure that there are no errors in our configuration.

Test And Review

Step 7: Test Result

Excellent! Our test was successful. PDF.co returned a URL so we can view the output. You can now turn on the Zap.
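If you want to handle the test result in your own script, pulling out the output link could look roughly like the sketch below. The field names error, message, and url, and the sample payload, are assumptions about the shape of the result. They are not copied from this test, so check them against what your own test returns.

```python
import json


def extract_output_url(response_text):
    """Return the output file URL from a JSON result, or raise on failure."""
    result = json.loads(response_text)
    # Assumed convention: a truthy "error" field marks a failed conversion.
    if result.get("error"):
        raise RuntimeError(result.get("message", "conversion failed"))
    url = result.get("url")
    if not url:
        raise RuntimeError("result did not contain an output URL")
    return url


# Hypothetical sample result, for illustration only.
sample = '{"error": false, "url": "https://example.com/output/sample.xml"}'
print(extract_output_url(sample))
```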

Test Result

Step 8: Source File Output

When you open the output URL, the XML output for the source file looks like this.

 

Source File Output
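Once you have the XML, you may want just the text back out of it. The sketch below uses Python's standard xml.etree.ElementTree module and collects every non-empty text fragment without relying on particular element names. The sample XML is hypothetical; the element names in the real PDF.co output may differ.

```python
import xml.etree.ElementTree as ET


def collect_text(xml_string):
    """Return every non-empty text fragment in document order."""
    root = ET.fromstring(xml_string)
    return [fragment.strip() for fragment in root.itertext() if fragment.strip()]


# Hypothetical XML for illustration; the real output's structure may differ.
sample_xml = """
<document>
  <page index="0">
    <row><column><text>Invoice</text></column></row>
    <row><column><text>Total: 100.00</text></column></row>
  </page>
</document>
"""
print(collect_text(sample_xml))
```

Because the function walks all text nodes, it keeps working if the element names change, at the cost of ignoring any layout information the tags carry.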

 

In this tutorial, you learned how to extract text from a PDF and turn it into an XML file using PDF.co and Zapier. You also learned how to set up the PDF to Anything Converter module, which supports converting PDF to Text, XLS, CSV, JSON, XML, and image formats.