Parse a Multi-paged Table – Step-by-Step Guide

  1. Go to PDF.co Document Parser Page
  2. Create a New Template
  3. Load Source PDF
  4. Add Object
  5. Set Expressions and Properties for Objects
  6. Run Template
  7. Result
  8. Save Template

By the end of this step-by-step tutorial with screenshots, you will know how to parse a multi-paged table using PDF.co’s Document Parser.

The sample PDF below contains a table that spans multiple pages:

Source PDF with Multi-paged Table

First, log in to your PDF.co account.

Step 1 – Go to PDF.co Document Parser Page

  • On your dashboard, open the Document Parser page by clicking the Document Parser menu

Go to PDF.co Document Parser Page

Step 2 – Create a New Template

  • On the Document Parser page, click New Template to start creating a template

Create a New Template

Step 3 – Load Source PDF

  • Load the PDF or image file that you will use to create the template

Load Source PDF

Step 4 – Add Object

  • To get the Total, select Add FIELD based on TEXT SEARCH object
  • To get the Table Data, select Add TABLE field based on TEXT SEARCH object

Add field based on text search object

Add table field based on text search object

Step 5 – Set Expressions and Properties for Objects

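  • To get the Total, set the Expression to TOTAL{{Spaces}}({{Number}}), check the Regex checkbox, and set the Data Type to Decimal or Currency. The parentheses form a capture group around the number that follows the TOTAL label
  • To get the Table Data, add the following properties. The start and end expressions mark the header row and the TOTAL line that bound the table, the row expression captures each column in a named group that matches a column name, and "multipage": true lets the table continue across page breaks: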
    {
      "start": {
        "expression": "Item{{Spaces}}Description{{Spaces}}Price",
        "regex": true
      },
      "end": {
        "expression": "TOTAL{{Spaces}}{{Number}}",
        "regex": true
      },
      "row": {
        "expression": "{{LineStart}}{{Spaces}}(?<itemNo>{{Digits}}){{Spaces}}(?<description>{{SentenceWithSingleSpaces}}){{Spaces}}(?<price>{{Number}}){{Spaces}}(?<qty>{{Digits}}){{Spaces}}(?<extPrice>{{Number}})",
        "regex": true
      },
      "columns": [
        {
          "name": "itemNo",
          "dataType": "integer"
        },
        {
          "name": "description",
          "dataType": "string"
        },
        {
          "name": "price",
          "dataType": "decimal"
        },
        {
          "name": "qty",
          "dataType": "integer"
        },
        {
          "name": "extPrice",
          "dataType": "decimal"
        }
      ],
      "multipage": true
    }
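
To see roughly what these expressions do, the following Python sketch mimics them offline. It is not PDF.co's implementation: the regex fragments in MACROS are assumed stand-ins for the {{Spaces}}, {{Number}}, {{Digits}}, {{LineStart}} and {{SentenceWithSingleSpaces}} macros, the helper names (expand, parse_table, parse_total) are hypothetical, and the two-page sample text is invented for illustration.

```python
import re
from decimal import Decimal

# ASSUMED stand-ins for the Document Parser macros. PDF.co's real
# definitions may differ; these only make the sketch runnable.
MACROS = {
    "Spaces": r"[ \t]+",
    "Number": r"[0-9][0-9,]*(?:\.[0-9]+)?",
    "Digits": r"[0-9]+",
    "LineStart": r"^",
    "SentenceWithSingleSpaces": r"\S+(?: \S+)*",
}


def expand(expression):
    """Substitute {{Macro}} placeholders and turn (?<name>...) into Python's (?P<name>...)."""
    pattern = re.sub(r"\{\{(\w+)\}\}", lambda m: MACROS[m.group(1)], expression)
    pattern = re.sub(r"\(\?<(\w+)>", r"(?P<\1>", pattern)
    return re.compile(pattern, re.MULTILINE)


START = expand("Item{{Spaces}}Description{{Spaces}}Price")
END = expand("TOTAL{{Spaces}}{{Number}}")
TOTAL = expand("TOTAL{{Spaces}}({{Number}})")
ROW = expand(
    "{{LineStart}}{{Spaces}}(?<itemNo>{{Digits}}){{Spaces}}"
    "(?<description>{{SentenceWithSingleSpaces}}){{Spaces}}(?<price>{{Number}})"
    "{{Spaces}}(?<qty>{{Digits}}){{Spaces}}(?<extPrice>{{Number}})"
)


def to_decimal(text):
    # Drop thousands separators before converting.
    return Decimal(text.replace(",", ""))


# Mirrors the "columns" list: column name -> converter for its dataType.
COLUMNS = {
    "itemNo": int,
    "description": str,
    "price": to_decimal,
    "qty": int,
    "extPrice": to_decimal,
}


def parse_table(pages):
    """Collect rows between the start and end expressions across all pages."""
    text = "\n".join(pages)  # multipage: treat the pages as one text stream
    start = START.search(text)
    end = END.search(text, start.end())
    body = text[start.end():end.start()]
    return [
        {name: convert(m.group(name)) for name, convert in COLUMNS.items()}
        for m in ROW.finditer(body)
    ]


def parse_total(pages):
    """Return the number captured by the Total expression."""
    return to_decimal(TOTAL.search("\n".join(pages)).group(1))


# Invented two-page sample; the table continues on the second page.
PAGES = [
    "Item  Description  Price  Qty  Ext. Price\n"
    "  1  USB cable  4.50  2  9.00\n"
    "  2  Wall charger  12.00  1  12.00\n",
    "  3  Phone case  8.25  4  33.00\n"
    "TOTAL  54.00\n",
]

if __name__ == "__main__":
    for row in parse_table(PAGES):
        print(row)
    print("Total:", parse_total(PAGES))
```

Joining the pages before searching is the simplest way to imitate the multipage setting: the start expression is found once on the first page, and rows keep matching until the end expression appears on a later page.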

Set expression for total field

Set properties for table data

Step 6 – Run Template

  • After adding the objects, run the template

Run template

Step 7 – Result

  • The parsed output contains the Total and the table rows collected from every page
Parsed Multi-paged Table
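
One way to confirm that no rows were lost at a page break is to compare the parsed rows against the parsed Total. The sketch below assumes the rows are available as dictionaries keyed by the template's column names and that extPrice equals price times qty; both are assumptions for illustration, and the sample values are invented rather than taken from the screenshot.

```python
from decimal import Decimal


def check_totals(rows, total):
    """Return a list of human-readable problems; an empty list means the data is consistent."""
    problems = []
    for row in rows:
        expected = row["price"] * row["qty"]
        if expected != row["extPrice"]:
            problems.append(
                f"item {row['itemNo']}: {row['price']} x {row['qty']} != {row['extPrice']}"
            )
    table_sum = sum((row["extPrice"] for row in rows), Decimal("0"))
    if table_sum != total:
        problems.append(f"rows sum to {table_sum}, but TOTAL is {total}")
    return problems


# Hypothetical parsed rows, shaped after the template's column names.
SAMPLE_ROWS = [
    {"itemNo": 1, "description": "USB cable", "price": Decimal("4.50"), "qty": 2, "extPrice": Decimal("9.00")},
    {"itemNo": 2, "description": "Wall charger", "price": Decimal("12.00"), "qty": 1, "extPrice": Decimal("12.00")},
    {"itemNo": 3, "description": "Phone case", "price": Decimal("8.25"), "qty": 4, "extPrice": Decimal("33.00")},
]

if __name__ == "__main__":
    print(check_totals(SAMPLE_ROWS, Decimal("54.00")))
    # Dropping the last row simulates a table cut off at the page break.
    print(check_totals(SAMPLE_ROWS[:2], Decimal("54.00")))
```

If the second page were skipped, the row sum would fall short of the Total, which makes a missing multipage setting easy to spot.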

Step 8 – Save Template

  • Once you’ve finished creating your template, save it

Save template

In this tutorial, you learned how to parse a multi-paged table using PDF.co’s Document Parser.
