This guide uses Google Sheets, but it can also be helpful for users of other apps like Xero and QuickBooks.
How to Split a Multi-Page PDF Invoice and Extract Data using Zapier
This tutorial demonstrates using the PDF.co API to split a PDF containing 5 invoices into individual files. It shows how to identify a keyword to split the PDF, extract details from each invoice, and add the data to a Google Spreadsheet using the Zapier platform integration.
Here are the simple step-by-step instructions to automate the process using PDF.co and Zapier.
IN THIS TUTORIAL
Step 1: Google Drive New File in Folder
Use the Google Drive New File in Folder option to select the specific PDF file containing the multiple invoices that you need to process.
Step 2: Setup Google Drive Configuration
- Select My Google Drive as the drive to be used for this automation.
- Choose the specific folder within your Google Drive where the PDF file containing the multiple invoices is located.
Step 3: Test Trigger
Test trigger to access the specific file in your Google Drive folder.
Step 4: Add PDF.co App
Add the PDF.co app and the Split PDF Based on Text Search feature to split the PDF invoice into individual invoices.
Step 5: Setup PDF.co Configuration
- Set the Source PDF URL property to point to the PDF file you want to process.
- In the Search String field, enter the unique keyword or text that appears on each page where you want to split the PDF. The splitter will split the document at every page containing this keyword.
Step 6: Test PDF.co Action
Now, let’s proceed to test the action to split the PDF invoice into individual invoices.
Step 7: Formatter by Zapier
Let's use the Formatter by Zapier to get the link for each of the split pages.
Step 8: Setup Formatter Utilities
- In the Transform field, select Line - item to Text.
- In the Input field, choose the Urls generated by the PDF Split module.
- Set the Separator field to ||, which will be used to separate the individual invoice URLs.
Step 9: Test Formatter
Let's test the Formatter by Zapier to format the PDF invoice URLs for extraction.
Step 10: Looping by Zapier
Let's integrate the Looping by Zapier to our workflow to process each invoice URL individually. We'll use the Create Loop From Text option.
Step 11: Setup Looping from Text
- In the Values to Loop field, enter the variable name for the invoice URLs. This variable will reference each URL in the loop. In the next text box, select the Output Text from the previous module.
- Enter the separator || as the Text Delimiter.
Step 12: Test Looping from Text
Let's test the Looping by Zapier to iterate through each invoice URL.
Step 13: PDF.co Document Parser
Let's add the PDF.co Document Parser feature to extract information from each of these individual invoices.
Step 14: Setup Document Parser
- Select the Invoice URL from the output of the looping step.
- Add the Template Id of the custom template you created for this document. If you are new to Document Parser and haven't created a template yet, please visit the Document Parser Manager here: https://app.pdf.co/document-parser-tool/playground/choose-template-group
Step 15: Test Document Parser
After setting up the configuration, test the action to extract the data from the PDF invoice.
Step 16: Add Google Sheets
Let's integrate the Google Sheets action to save the extracted data from the invoices and select the Create Multiple Spreadsheet Rows option to save the extracted data from each invoice into separate rows.
Step 17: Setup Google Sheets
- Specify My Google Drive as the storage location.
- Choose the target Spreadsheet and Worksheet to save the extracted data.
- Map the parsed invoice data to the corresponding columns in the Google Sheet
Step 18: Test Action and See Result
Now, test the action and then open your Google Sheets spreadsheet to review the result.
In conclusion, this tutorial has shown you how to efficiently split a large PDF with multiple invoices into individual files using a specific keyword. By using the PDF.co API and the Zapier platform, you can easily extract invoice details and add them to a Google Spreadsheet. This automated process simplifies invoice management, enhances data accuracy, and saves time, allowing you to focus on more important business tasks.