Parse Multiline Items without Borders – Document Parser

  1. Go to PDF.co Document Parser
  2. Create a New Template
  3. Load your Source File
  4. Add Objects
  5. Run Template
  6. Result
  7. Save Template

In this step-by-step tutorial with screenshots, you will learn how to parse multiline items without borders using PDF.co Document Parser.

We’ll use the sample PDF below for this tutorial:

Screenshot of the Source PDF

First, log in to your PDF.co account.

Step 1 – Go to PDF.co Document Parser

  • After logging in, go to the PDF.co Document Parser page by clicking the Document Parser menu at the top of your dashboard.

Go to PDF.co’s Document Parser

Step 2 – Create a New Template

  • To create a new template, click New Template on your Document Parser page.

Create a New Template

Step 3 – Load your Source File

  • Load the source file that you will use to create your template.

Load your Source File

Step 4 – Add Objects

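  • To extract the items in the table, use Add TABLE field based on TEXT SEARCH.
  • Then set the properties below. The row expression matches the first line of each item, while subExpression1 and subExpression2 match the continuation lines that hold only item and options text, or only options text: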
    {
      "start": {
        "expression": "Product Code{{Spaces}}Item Name",
        "regex": true
      },
      "end": {
        "expression": "Total",
        "regex": true
      },
      "row": {
        "expression": "{{LineStart}}{{Spaces}}(?<code>{{UppercaseLettersOrDigits}}){{Spaces}}(?<item>{{SentenceWithSingleSpaces}}){{Spaces}}(?<options>{{SentenceWithSingleSpaces}}){{Spaces}}(?<qty>{{Number}}){{Spaces}}{{Dollar}}(?<price>{{Number}}){{Spaces}}{{Dollar}}(?<subtotal>{{Number}})",
        "subExpression1": "{{LineStart}}{{Spaces}}(?<item>{{SentenceWithSingleSpaces}}){{Spaces}}(?<options>{{SentenceWithSingleSpaces}}{{LineEnd}})",
        "subExpression2": "{{LineStart}}{{Spaces}}(?<options>{{SentenceWithSingleSpaces}}{{LineEnd}})",
        "regex": true
      },
      "columns": [
        {
          "name": "code",
          "dataType": "string"
        },
        {
          "name": "item",
          "dataType": "string"
        },
        {
          "name": "options",
          "dataType": "string"
        },
        {
          "name": "qty",
          "dataType": "integer"
        },
        {
          "name": "price",
          "dataType": "decimal"
        },
        {
          "name": "subtotal",
          "dataType": "decimal"
        }
      ],
      "multipage": true
    }

Add TABLE field based on TEXT SEARCH

Set the properties of the table field
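
Before running the template, it can be worth checking that every column has a matching named group in the row expression, since a typo in either place is easy to miss. The Python sketch below is only an illustration and is not part of PDF.co: the helper names `named_groups` and `check_template` are hypothetical, and it assumes that column names are meant to mirror the `(?<name>...)` groups, as they do in this template.

```python
import re

# Row expressions and column names copied from the template in Step 4.
ROW = {
    "expression": "{{LineStart}}{{Spaces}}(?<code>{{UppercaseLettersOrDigits}}){{Spaces}}(?<item>{{SentenceWithSingleSpaces}}){{Spaces}}(?<options>{{SentenceWithSingleSpaces}}){{Spaces}}(?<qty>{{Number}}){{Spaces}}{{Dollar}}(?<price>{{Number}}){{Spaces}}{{Dollar}}(?<subtotal>{{Number}})",
    "subExpression1": "{{LineStart}}{{Spaces}}(?<item>{{SentenceWithSingleSpaces}}){{Spaces}}(?<options>{{SentenceWithSingleSpaces}}{{LineEnd}})",
    "subExpression2": "{{LineStart}}{{Spaces}}(?<options>{{SentenceWithSingleSpaces}}{{LineEnd}})",
}
COLUMNS = ["code", "item", "options", "qty", "price", "subtotal"]


def named_groups(expression):
    """Return the names of all (?<name>...) groups, in order of appearance."""
    return re.findall(r"\(\?<([A-Za-z_]\w*)>", expression)


def check_template(row, columns):
    """Hypothetical sanity check: list mismatches between groups and columns."""
    problems = []
    main_groups = named_groups(row["expression"])
    # Every column should be captured by the main row expression.
    for name in columns:
        if name not in main_groups:
            problems.append(f"column '{name}' has no group in the row expression")
    # Every group, including those in sub-expressions, should map to a column.
    for key, expression in row.items():
        for name in named_groups(expression):
            if name not in columns:
                problems.append(f"group '{name}' in {key} has no matching column")
    return problems


if __name__ == "__main__":
    for problem in check_template(ROW, COLUMNS) or ["no mismatches found"]:
        print(problem)
```

For the template in this tutorial the check reports no mismatches, because the six groups in the row expression line up with the six columns.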

Step 5 – Run Template

  • Once you have added the objects, run your template.

Run template

Step 6 – Result

  • Here’s the result
Result of the Parsed Data
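
To see why continuation lines end up in the same row as the item they belong to, the matching logic can be approximated in plain Python. This is a rough sketch under stated assumptions, not PDF.co's implementation: the regex fragments `SENT`, `GAP`, and `NUM` are stand-ins for the template macros, the sample text is invented, the helpers `table_lines` and `parse_rows` are hypothetical, and joining continuation text with a single space is a guess about how the values are merged.

```python
import re

# Stand-ins for the template macros (assumptions, not PDF.co's definitions).
SENT = r"\S+(?: \S+)*"   # words separated by single spaces
GAP = r"[ \t]{2,}"       # a column gap of two or more spaces
NUM = r"\d+(?:\.\d+)?"   # integer or decimal number

# First line of an item: all six columns are present.
ROW = re.compile(
    rf"^[ \t]*(?P<code>[A-Z0-9]+){GAP}(?P<item>{SENT}){GAP}(?P<options>{SENT})"
    rf"{GAP}(?P<qty>{NUM}){GAP}\$(?P<price>{NUM}){GAP}\$(?P<subtotal>{NUM})[ \t]*$"
)
# Continuation line with item and options text.
SUB1 = re.compile(rf"^[ \t]*(?P<item>{SENT}){GAP}(?P<options>{SENT})[ \t]*$")
# Continuation line with options text only.
SUB2 = re.compile(rf"^[ \t]*(?P<options>{SENT})[ \t]*$")

# Invented sample text; the real source PDF will differ.
SAMPLE = """
Product Code   Item Name        Options          Qty   Price    Subtotal
A100           Office Chair     Color: Black     2     $45.00   $90.00
               with Armrests    Wheels: Yes
                                Warranty: 2 yr
B200           Desk Lamp        Bulb: LED        1     $19.99   $19.99
Total                                                           $109.99
"""


def table_lines(text):
    """Return the lines between the header (start) and the Total line (end)."""
    lines = text.splitlines()
    start = next(i for i, line in enumerate(lines)
                 if re.search(r"Product Code\s+Item Name", line))
    end = next(i for i, line in enumerate(lines)
               if i > start and re.search(r"Total", line))
    return lines[start + 1:end]


def parse_rows(lines):
    """Start a row on a ROW match; fold SUB1/SUB2 matches into the last row."""
    rows = []
    for line in lines:
        match = ROW.match(line)
        if match:
            rows.append(match.groupdict())
            continue
        if not rows:
            continue  # nothing to attach a continuation line to yet
        for sub in (SUB1, SUB2):
            match = sub.match(line)
            if match:
                for name, value in match.groupdict().items():
                    rows[-1][name] += " " + value
                break
    return rows


if __name__ == "__main__":
    for row in parse_rows(table_lines(SAMPLE)):
        print(row)
```

With the invented sample, the two continuation lines under A100 are folded into its item and options values, so the table yields two rows rather than four.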

Step 7 – Save Template

  • Once you have finished your template, save it.

Save template

You’ve learned how to parse multiline items without borders using PDF.co Document Parser through this step-by-step tutorial.
