PDF.co Document Parser can automatically parse PDF, JPG, and PNG documents to extract fields, tables, values, and barcodes from invoices, statements, orders, scanned PDFs, and other documents. Document parsing means examining the data present in a document and extracting useful information from it. In office work, you sometimes need to quickly extract tables with text from a document. Of course, you can do this manually. However, if the document is voluminous, has a complex structure, or contains dozens of tables, you may spend far more time on this monotonous task than it deserves.

To automate the parsing of large documents, we have developed the PDF.co Document Parser Template Editor, which allows you to extract tables with text from documents and save them to separate files with one click. You can choose XML, CSV, or JSON as the output format.

Now, we’re going to show you how to extract tables with text using PDF.co and Postman. The task takes five simple steps.

  1. Open Postman App
  2. Document Parser API Collections
  3. Add PDF.co API Key
  4. Add Body Parameters
  5. Document Parser Output

Here’s the sample PDF document whose table with text we’ll extract in this tutorial.

Sample PDF Document with Table Item

Follow this step-by-step guide to extract tables with text.

Step 1: Open Postman App

  • To begin, open the Postman app and click on the PDF.co Postman Collections. If you haven’t set up your PDF.co Postman Collection yet, you can check out our step-by-step guide here.

Step 2: Document Parser API Collections

  • Next, open the Document Parser folder in the PDF.co API v.1.00 collection and select the POST /pdf/documentparser (Output as JSON) request. You can also choose CSV or XML as the output format. This API method uses a document parser extraction template to extract data from documents, including custom areas, search-based fields, form fields, tables, and multiple pages.

Document Parser API Collection

Step 3: Add PDF.co API Key

  • In the Headers section, add your PDF.co API Key. You can get the API Key in your PDF.co dashboard. If you don’t have a PDF.co account, you can sign up at this link.

Add Headers Parameters
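
For readers who want to reproduce this step outside Postman, here is a minimal Python sketch of the headers the request carries. It assumes the API key is sent in an `x-api-key` header and that the key is stored in an environment variable named `PDFCO_API_KEY`; that variable name is hypothetical, and the header name should be checked against the Headers section of your own Postman collection.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return the headers for a PDF.co request (header name is an assumption)."""
    if not api_key:
        raise ValueError("a PDF.co API key is required")
    return {
        "x-api-key": api_key,  # assumed header name; verify in your collection
        "Content-Type": "application/json",
    }


if __name__ == "__main__":
    # PDFCO_API_KEY is a hypothetical environment variable name.
    key = os.environ.get("PDFCO_API_KEY", "")
    if key:
        print(sorted(build_headers(key)))  # print header names only, not the key
    else:
        print("Set PDFCO_API_KEY before running this sketch")
```

Reading the key from the environment rather than hard-coding it keeps it out of saved scripts and shared collections.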

Step 4: Add Body Parameters

Let’s set up the Body parameters.

  • For the URL param, input the direct URL of your source file.
  • For the TemplateId param, set the Id of the document parser template to be used. You can create a template and get its Id at this link. For details on how to create a template, check the guide here.
  • For the Inline param, set it to true to return the results inside the response. Setting it to false will instead produce a link to the generated output file.
  • After setting up the parameters, click the Send button to make a request to PDF.co.

Add Body Parameters

Once PDF.co processes our request successfully, the output is returned inside the response.
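
The same request can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the official client: the base URL `https://api.pdf.co/v1`, the `x-api-key` header, and the JSON field names `url`, `templateId`, `inline`, and `outputFormat` are assumed to mirror the Postman parameters above, and the key, file URL, and template Id in the usage lines are placeholders.

```python
import json
import urllib.request

# Assumed base URL plus the endpoint named in the Postman collection.
API_URL = "https://api.pdf.co/v1/pdf/documentparser"


def build_request(api_key, source_url, template_id, inline=True, output_format="JSON"):
    """Build (but do not send) the POST request for the Document Parser."""
    body = {
        "url": source_url,              # direct URL of the source file
        "templateId": template_id,      # Id of the document parser template
        "inline": inline,               # True: results inside the response
        "outputFormat": output_format,  # assumed values: JSON, CSV, or XML
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like `send(build_request("YOUR_API_KEY", "https://example.com/sample.pdf", "1"))`, where all three arguments are placeholders to replace with your own values. Building the request separately from sending it makes it easy to inspect the body before any credits are spent.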

Step 5: Document Parser Output

  • Here’s the extracted table with text in JSON format. Click Save response to save the output to a file.

Extracted Table with Text in JSON Format
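
As a rough scripted counterpart to Save response, the sketch below handles both Inline settings and writes the response to disk. The field names `error`, `message`, `body`, and `url` are assumptions about the response shape rather than something shown in this tutorial, so compare them with the JSON you actually see in Postman before relying on them.

```python
import json
from pathlib import Path


def summarize_response(payload: dict):
    """Return ("inline", data) or ("link", url) depending on the Inline setting."""
    if payload.get("error"):
        # "message" is an assumed field name for the error text.
        raise RuntimeError(payload.get("message", "PDF.co reported an error"))
    if "body" in payload:
        # Inline = true: the parsed result is assumed to sit under "body".
        return "inline", payload["body"]
    # Inline = false: the response is assumed to carry a link under "url".
    return "link", payload.get("url")


def save_response(payload: dict, path: str) -> Path:
    """Write the full response to a JSON file, like Postman's Save response."""
    target = Path(path)
    target.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return target
```

With Inline set to false, the returned link would still need to be downloaded separately to get the extracted table.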

In this tutorial, you learned how to use the PDF.co Document Parser with Postman to extract tables with text from a document. You also saw where to create a new template using the PDF.co Document Parser Template Editor.