Automate PDF Text Extraction and Record-Keeping into Google Sheets

Jul 23, 2025 · 5 Minutes Read

What You'll Have When Done: A workflow that automatically extracts text from any PDF uploaded to a Google Drive folder and logs the document details in Google Sheets. It suits document processing, content analysis, and record-keeping.

Prerequisites

Before you begin, make sure you have:

  • A PDF.co API Key (Get yours here)
  • A Google Drive account with OAuth2 credentials in n8n
  • A Google Sheets account with OAuth2 credentials in n8n
  • An n8n instance (cloud or self-hosted)
  • A Google Drive folder for PDF uploads (Create here)
  • A Google Sheets document for tracking extracted text (Sample here)

Quick Start Options

Option A: I Want It Working Now

  1. Import this workflow template → Download JSON File
  2. Connect your Google Drive, PDF.co, and Google Sheets accounts
  3. Set up your watched folder in Google Drive
  4. Configure your tracking spreadsheet
  5. Upload a test PDF
  6. Watch the automation extract and log text

Option B: I Want to Build It Step-by-Step

Follow the 4-step guide below to create the automation from scratch.

What This Automation Does (Overview)

  1. Monitors a Google Drive folder for new PDF uploads
  2. Extracts text content from PDFs using PDF.co
  3. Downloads the extracted text via HTTP request
  4. Logs the document name, extracted text, and page count to Google Sheets
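
For readers who think in code, the four stages above can be sketched as one Python function. This is only an illustration of the data flow, not how n8n runs the workflow internally: `process_pdf` and its three callable parameters are hypothetical names, while the field names (`webContentLink`, `url`, `name`, `pageCount`) and column names are the ones used in the steps of this guide.

```python
from typing import Callable, Dict


def process_pdf(
    file_meta: Dict[str, str],
    extract: Callable[[str], Dict[str, object]],
    download: Callable[[str], str],
    append_row: Callable[[Dict[str, object]], None],
) -> Dict[str, object]:
    """Run one newly uploaded file through the stages listed above.

    All three callables are hypothetical stand-ins for the n8n nodes:
    extract  -> PDF.co "Convert from PDF" (PDF to Text)
    download -> HTTP Request (GET)
    append_row -> Google Sheets "Append Row"
    """
    # Stage 1 (the Drive trigger) has already produced file_meta.
    result = extract(file_meta["webContentLink"])  # Stage 2: result URL + metadata
    text = download(str(result["url"]))            # Stage 3: fetch the text itself
    row = {
        "Document Name": result["name"],
        "Text": text,
        "Page Count": result["pageCount"],
    }
    append_row(row)                                # Stage 4: log to the sheet
    return row
```

Passing the stages in as callables keeps the sketch testable with simple fakes, which mirrors how each n8n node can be run and inspected on its own.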

Step 1: Monitor Google Drive for New PDFs

Node: Google Drive Trigger

Settings:

  • Trigger On: Changes Involving a Specific Folder
  • Folder to Watch: Select your target folder
  • Watch For: File Created

Success Looks Like: When you upload a PDF to the watched folder, the workflow automatically triggers and begins processing.

Note: Set the file's sharing setting to “Anyone with the link can view” so the next node can access it without permission issues.
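
Because the trigger fires on any file created in the folder, it can help to confirm that the new file is actually a PDF before sending it to PDF.co. The check below is a minimal Python sketch of that idea. It assumes the trigger output carries the Google Drive metadata fields `mimeType` and `name`; `is_pdf` is a hypothetical helper, and in n8n the same test would more likely live in an IF node.

```python
def is_pdf(file_meta: dict) -> bool:
    """Return True when Drive-style metadata describes a PDF file."""
    mime = str(file_meta.get("mimeType", "")).lower()
    name = str(file_meta.get("name", "")).lower()
    # Prefer the MIME type when present; fall back to the file extension.
    if mime:
        return mime == "application/pdf"
    return name.endswith(".pdf")
```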

Step 2: Extract Text from PDF

Node: PDF.co API → Convert from PDF

Settings:

  • URL: ={{ $json.webContentLink }}
  • Convert Type: PDF to Text

Success Looks Like: PDF.co returns a URL pointing to the extracted text, along with document metadata such as the page count.
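
To see roughly what this node does behind the scenes, here is a hedged Python sketch that builds the request and reads the response without sending anything. It assumes the PDF.co REST endpoint `https://api.pdf.co/v1/pdf/convert/to/text`, the `x-api-key` header, and `error` / `message` fields in the JSON response; check these against the PDF.co API documentation before relying on them. The function names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint for PDF to Text conversion; verify in the PDF.co docs.
PDFCO_ENDPOINT = "https://api.pdf.co/v1/pdf/convert/to/text"


def build_extract_request(api_key: str, pdf_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking PDF.co to convert a PDF."""
    body = json.dumps({"url": pdf_url}).encode("utf-8")
    return urllib.request.Request(
        PDFCO_ENDPOINT,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def parse_extract_response(raw: bytes) -> dict:
    """Keep only the fields the later steps use: url, name, pageCount."""
    payload = json.loads(raw.decode("utf-8"))
    if payload.get("error"):
        # Assumption: failures are flagged with "error": true plus a message.
        raise RuntimeError(payload.get("message", "PDF.co reported an error"))
    return {
        "url": payload["url"],
        "name": payload.get("name"),
        "pageCount": payload.get("pageCount"),
    }
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` and handing the response bytes to `parse_extract_response`.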

Step 3: Download Extracted Text

Node: HTTP Request

Settings:

  • Method: GET
  • URL: ={{ $json.url }}

Purpose: Downloads the actual text content from PDF.co's temporary storage URL.

Success Looks Like: The raw extracted text is now available in the workflow data for logging to Google Sheets.
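
Outside n8n, the same download is a plain HTTP GET. The sketch below uses only the Python standard library; `download_text` is a hypothetical helper, and the UTF-8 fallback is an assumption for responses that do not declare a charset.

```python
import urllib.request


def download_text(url: str, timeout: float = 30.0) -> str:
    """Fetch a URL with GET and return the body decoded as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # Use the declared charset when there is one, otherwise assume UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")
```

Since the result URL is described above as temporary storage, it is safest to download the text right away rather than saving the link for later.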

Step 4: Log to Google Sheets

Node: Google Sheets → Append Row

Settings:

  • Document From List: Select your tracking spreadsheet
  • Sheet From List: Choose target sheet (usually "Sheet1")

Column Mappings:

  • Document Name: ={{ $('PDFco Api').item.json.name }}
  • Text: ={{ $json.data }}
  • Page Count: ={{ $('PDFco Api').item.json.pageCount }}

Spreadsheet Setup: See sample Google Sheet here

Success Looks Like: Each processed PDF creates a new row with extracted text and metadata.
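
One practical limit to keep in mind: a single Google Sheets cell holds at most 50,000 characters, so the text of a long PDF may not fit in the Text column. The Python sketch below shows one way to cap the text before building the row. `build_row` and the truncation marker are hypothetical; in n8n this logic would sit in a Code node or an expression placed before the Google Sheets node.

```python
MAX_CELL_CHARS = 50_000  # Google Sheets per-cell character limit


def build_row(name: str, text: str, page_count: int,
              limit: int = MAX_CELL_CHARS) -> dict:
    """Build the row for the sheet, shortening text that exceeds the limit."""
    marker = " [truncated]"
    if len(text) > limit:
        # Leave room for the marker so the final value is exactly `limit` long.
        text = text[: limit - len(marker)] + marker
    return {"Document Name": name, "Text": text, "Page Count": page_count}
```

The keys match the column mappings above, so the returned dictionary lines up with the sheet's header row.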

Congrats! You've built an automated document processing system that turns any PDF upload into searchable, structured data.

Built something cool? Share it with us @pdfdotco
