Auto-Process Editor’s Articles from Google Drive to TXT + HTML + Compressed PDF (with timestamped filenames)

Nov 18, 2025 · 6 Minutes Read

What you’ll have when done

A hands-free pipeline that:

  1. Watches a Drive folder for new editor PDFs.
  2. Normalizes the file's created time into a human-readable date (YYYY-MM-DD).
  3. Converts the PDF into
    • Text (TXT) for search/indexing,
    • HTML for web use,
    • Compressed PDF for archiving.
  4. Downloads each converted file as binary.
  5. Uploads each to its own Drive folder, with standardized filenames that include the date (for audit and record-keeping).

Prerequisites

  • PDF.co API key (set in your “PDF.co account” credentials). Get yours at https://app.pdf.co/
  • Google Drive OAuth2 connected in n8n.
  • Three destination Drive folders (copy their IDs):
    • Output_Web_HTML folder ID: 1O9zJ2swIu4Ya8bMbFSJlMzr1D_cefFqp
    • Output_Search_TXT folder ID: 1_HHoFS4MS-wBXGcr3bXwIjx8vC6t-G7Y
    • Output_Archive_PDFs folder ID: 1Pg6I9aSacilnZXpMsGT6FdJh0hx5atki
  • One source Drive folder to watch for new PDFs (your trigger folder).
  • A sample PDF article you can test with.

Quick Start Options

Option A: I Want It Working Now

  1. Import this workflow template (Download JSON File)
  2. Connect your Google Drive account in n8n
  3. Add your PDF.co API key in the PDF.co node
  4. Update the Google Drive Trigger with your source folder
  5. Update the Google Drive Upload nodes with your destination folders
  6. Test with a sample PDF Article
  7. Activate and let it run

Option B: Step-by-Step Build Guide

Step 1: Trigger: Watch a Drive folder for new PDFs

Node: Google Drive Trigger

What it does: Fires whenever a new file is created in your chosen folder and passes the file’s metadata downstream (including webContentLink, createdTime, etc.).

Settings

  • Trigger On: Changes involving a specific folder
  • Folder From List: (pick your source folder)
  • Watch For: File Created
  • (Optional) Options → File Type: PDF

Success looks like: When you add a new PDF, the workflow starts and you can see the file metadata in the node’s output.
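If the File Type option is not available in your n8n version, or you want a second line of defense, a Code node can drop non-PDF items before any conversion runs. The sketch below is an illustration only and is not part of the original workflow. It assumes the trigger output carries Google Drive's `mimeType` and `name` fields; the helper name `isPdf` and the sample metadata are hypothetical.

```javascript
// Hypothetical helper: decide whether a Drive file's metadata describes a PDF.
// Assumes the trigger output includes Drive's `mimeType` and `name` fields.
function isPdf(file) {
  if (!file) return false;
  if (file.mimeType === 'application/pdf') return true;
  // Otherwise fall back to the file extension (covers missing or generic MIME types).
  return typeof file.name === 'string' && file.name.toLowerCase().endsWith('.pdf');
}

// Plain-Node demonstration with made-up metadata.
const sampleFiles = [
  { name: 'feature-article.pdf', mimeType: 'application/pdf' },
  { name: 'notes.docx', mimeType: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document' },
  { name: 'SCAN.PDF', mimeType: 'application/octet-stream' },
];
const pdfNames = sampleFiles.filter(isPdf).map(f => f.name);
console.log(pdfNames);

// In an n8n Code node ("Run Once for All Items"), the equivalent would be:
//   return items.filter(item => isPdf(item.json));
```

Filtering this early keeps stray documents from consuming PDF.co credits on conversions that would fail anyway.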

Step 2: Normalize the created date for filenames

Node: Code (right after the trigger)

What it does: Takes Google’s createdTime (YYYY-MM-DDTHH:mm:ss.sssZ) and produces a clean date YYYY-MM-DD you can safely put in filenames.

JS to paste (you’re already using this):

/**
 * Adds `cleanDate` = YYYY-MM-DD derived from Created At or createdTime
 */
return items.map(item => {
  const rawDate = item.json['Created At'] || item.json.createdTime || new Date().toISOString();
  const cleanDate = rawDate.split('T')[0];
  return { json: { ...item.json, cleanDate } };
});

Success looks like: The node output includes "cleanDate": "2025-11-01" (for example).
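The split on 'T' works because createdTime is already an ISO 8601 string. If 'Created At' ever arrives in another format, a slightly more defensive variant can parse the value first. This is a minimal sketch, not the workflow's own code: `toCleanDate` is a hypothetical helper, and it normalizes to the UTC calendar date, which is an assumption you may want to change.

```javascript
// Hypothetical helper: turn any parseable timestamp into YYYY-MM-DD (UTC).
// Falls back to `fallback` (default: now) when the input is missing or unparseable.
function toCleanDate(raw, fallback = new Date()) {
  const parsed = raw ? new Date(raw) : fallback;
  const date = Number.isNaN(parsed.getTime()) ? fallback : parsed;
  // toISOString() always yields YYYY-MM-DDTHH:mm:ss.sssZ, so the first 10 chars are the date.
  return date.toISOString().slice(0, 10);
}

console.log(toCleanDate('2025-11-01T09:30:00.000Z')); // "2025-11-01"

// In an n8n Code node ("Run Once for All Items"), the equivalent would be:
//   return items.map(item => ({
//     json: {
//       ...item.json,
//       cleanDate: toCleanDate(item.json['Created At'] || item.json.createdTime),
//     },
//   }));
```

For ISO timestamps ending in Z, this returns the same value as the split-based snippet; the difference only shows up for malformed or non-ISO input.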

Step 3: Convert the PDF into three formats (run in parallel)

This is where we fork: connect the output of Code to each of the three PDF.co nodes directly (do not chain them).

Step 1: Convert to Text

(PDF → Text)

Node: Convert to Text (your custom PDF.co node)

What it does: Converts the PDF to a .txt file (great for search, indexing, and quick content grabs).

Settings

  • Operation: Convert from PDF
  • URL: ={{ $json.webContentLink }}
  • Advanced Options → File Name: =FeaturePlaybook_{{ $json.cleanDate }}_Text.txt
  • Output: JSON with a url to the generated TXT file.

Step 2: Convert to HTML

(PDF → HTML)

Node: Convert to HTML

What it does: Converts the PDF to HTML (handy for web/CMS ingestion or review).

Settings

  • Operation: Convert from PDF
  • Convert Type: toHtml
  • URL: ={{ $('Code').item.json.webContentLink }}
  • Advanced Options → File Name: =FeaturePlaybook_{{ $('Code').item.json.cleanDate }}_Web.html
  • Output: JSON with a url to the generated HTML file.

Step 3: Compress the PDF

(PDF Compress)

Node: Optimize PDF

What it does: Produces a smaller, web-optimized PDF to store/ship.

Settings

  • Operation: Compress PDF
  • URL: ={{ $('Google Drive Trigger').item.json.webContentLink }}
  • Advanced Options → File Name: =FeaturePlaybook_{{ $('Code').item.json.cleanDate }}_Optimized.pdf
  • Output: JSON with a url to the compressed PDF.
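To make the fork concrete, the sketch below builds the three requests these branches would roughly correspond to if you called PDF.co's REST API directly. It is an illustration under assumptions, not the node's actual implementation: the endpoint paths and the `url`/`name` body fields are taken from PDF.co's public API and should be verified against the current documentation, and `buildConversionRequests` is a hypothetical helper. No network call is made.

```javascript
// Assumed base URL of the PDF.co REST API; verify against the official docs.
const PDFCO_BASE = 'https://api.pdf.co/v1';

// Hypothetical helper: describe the three conversions for one incoming file.
// `item` is expected to carry `webContentLink` (from Drive) and `cleanDate` (from the Code node).
function buildConversionRequests(item) {
  const { webContentLink, cleanDate } = item;
  const prefix = `FeaturePlaybook_${cleanDate}`;
  return [
    { endpoint: `${PDFCO_BASE}/pdf/convert/to/text`, body: { url: webContentLink, name: `${prefix}_Text.txt` } },
    { endpoint: `${PDFCO_BASE}/pdf/convert/to/html`, body: { url: webContentLink, name: `${prefix}_Web.html` } },
    { endpoint: `${PDFCO_BASE}/pdf/optimize`, body: { url: webContentLink, name: `${prefix}_Optimized.pdf` } },
  ];
}

// Demonstration with a placeholder link (not a real file).
const requests = buildConversionRequests({
  webContentLink: 'https://example.invalid/source.pdf',
  cleanDate: '2025-11-01',
});
for (const r of requests) {
  console.log(r.endpoint, '->', r.body.name);
}
```

All three requests take the same source URL and differ only in endpoint and output name, which is why the branches can run side by side instead of in a chain.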

Step 4: Download each converted file as binary

Google Drive Upload requires binary input. Your PDF.co nodes return a URL — so we add an HTTP Request node after each conversion to download the file as binary.

For each branch (TXT, HTML, PDF), add an HTTP Request node:

Node (example): Binary Text File / Binary HTML File / Binary Compressed file

What it does: Fetches the file and stores it as binary.data.

Settings (v4.2 of HTTP Request)

  • Method: GET
  • URL: ={{ $json.url }}
  • Options → Add Option → Response Format: File
  • Options → Add Option → Binary Property: data

Success looks like: The node execution shows a Binary tab with a data entry (not a giant HTML or PDF string in JSON).

Tip: If you see only JSON text, you didn’t set Response Format = File. Flip that and re-run.
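If you would rather have the workflow fail loudly than pass a JSON-only item to Drive, a small guard in a Code node after each HTTP Request can check for the binary property. The following is a minimal sketch, assuming n8n's usual item shape in which binary payloads live under `item.binary.<property>`; `assertBinary` is a hypothetical helper and the sample items are made up.

```javascript
// Hypothetical guard: throw if an n8n-style item lacks the expected binary property.
function assertBinary(item, property = 'data') {
  const entry = item && item.binary && item.binary[property];
  if (!entry) {
    throw new Error(
      `No binary property "${property}" found; check Response Format = File on the HTTP Request node.`
    );
  }
  return entry;
}

// Made-up items: one downloaded as a file, one mistakenly returned as JSON text.
const good = {
  json: {},
  binary: { data: { fileName: 'FeaturePlaybook_2025-11-01_Text.txt', mimeType: 'text/plain' } },
};
const bad = { json: { data: '<html>...</html>' } };

console.log(assertBinary(good).fileName);

let failure = null;
try {
  assertBinary(bad);
} catch (err) {
  failure = err.message;
}
console.log(failure);

// In an n8n Code node ("Run Once for All Items"), the equivalent would be:
//   items.forEach(item => assertBinary(item));
//   return items;
```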

Step 5: Upload each binary to its destination Drive folder

You can upload separately (one Drive node per branch) or collect and route. The simplest option for non-technical users is one upload per branch.

Step 1: Upload Compressed PDF to Archive

Node: Google Drive

What it does: Uploads the optimized PDF.

Settings

  • Resource: File
  • Operation: Upload
  • Input Data Field Name (Binary Property): data
  • Parent Drive: My Drive
  • Parent Folder by ID: 1Pg6I9aSacilnZXpMsGT6FdJh0hx5atki

Step 2: Upload HTML to Web folder

Node: Google Drive1

Settings

  • Resource: File
  • Operation: Upload
  • Input Data Field Name (Binary Property): data
  • Parent Folder by ID: 1O9zJ2swIu4Ya8bMbFSJlMzr1D_cefFqp

Step 3: Upload TXT to Search folder

Node: Google Drive2

Settings

  • Resource: File
  • Operation: Upload
  • Input Data Field Name (Binary Property): data
  • Parent Folder by ID: 1_HHoFS4MS-wBXGcr3bXwIjx8vC6t-G7Y

Success looks like: You will see three new files appear in their corresponding folders, named:

  • FeaturePlaybook_YYYY-MM-DD_Web.html
  • FeaturePlaybook_YYYY-MM-DD_Text.txt
  • FeaturePlaybook_YYYY-MM-DD_Optimized.pdf
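For the "collect and route" alternative mentioned at the start of this step, the routing decision can be reduced to a lookup by file extension. This is a minimal sketch, not part of the original workflow: `DESTINATIONS` and `destinationFor` are hypothetical names, and the folder IDs are the ones listed in the prerequisites, which you would replace with your own.

```javascript
// Extension -> destination folder, using the folder IDs from the prerequisites.
const DESTINATIONS = {
  '.html': { label: 'Output_Web_HTML', folderId: '1O9zJ2swIu4Ya8bMbFSJlMzr1D_cefFqp' },
  '.txt': { label: 'Output_Search_TXT', folderId: '1_HHoFS4MS-wBXGcr3bXwIjx8vC6t-G7Y' },
  '.pdf': { label: 'Output_Archive_PDFs', folderId: '1Pg6I9aSacilnZXpMsGT6FdJh0hx5atki' },
};

// Hypothetical helper: pick the destination for a converted file by its extension.
// Throws for missing or unknown extensions so a misnamed file never lands in the wrong folder.
function destinationFor(fileName) {
  const dot = fileName.lastIndexOf('.');
  const ext = dot === -1 ? '' : fileName.slice(dot).toLowerCase();
  const destination = DESTINATIONS[ext];
  if (!destination) {
    throw new Error(`No destination configured for "${fileName}"`);
  }
  return destination;
}

console.log(destinationFor('FeaturePlaybook_2025-11-01_Web.html').label);      // "Output_Web_HTML"
console.log(destinationFor('FeaturePlaybook_2025-11-01_Text.txt').label);      // "Output_Search_TXT"
console.log(destinationFor('FeaturePlaybook_2025-11-01_Optimized.pdf').label); // "Output_Archive_PDFs"
```

Because the lookup depends on the extension, it only works if the File Name fields in the PDF.co nodes include .html, .txt, and .pdf.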

Why each step matters

  • Trigger: listens so nobody has to click “run”.
  • Code (date): converts technical timestamps into a simple date you can read and reuse in filenames.
  • Three conversions: each format has a purpose
    • TXT → search, quick copy/paste, NLP ingestion.
    • HTML → web preview/CMS import.
    • Compressed PDF → smaller to archive/share.
  • HTTP Request (File): turns each URL into a real file (binary) so Drive can upload it.
  • Separate uploads: sends each file to the right home, keeping your Drive tidy and consistent.

The exact JS snippets you’ll reuse

A) Normalize the date (already in your workflow)

return items.map(item => {
  const rawDate = item.json['Created At'] || item.json.createdTime || new Date().toISOString();
  const cleanDate = rawDate.split('T')[0];
  return { json: { ...item.json, cleanDate } };
});

Troubleshooting (read this if something fails)

  • Drive Upload says: “expects a binary file ‘data’” → Your HTTP Request download is not set to Response Format: File + Binary Property: data. Fix and re-run.
  • Only one output shows up → Your three conversion nodes are chained in a line. Make sure the Code node forks to all three conversions directly.
  • HTML or TXT shows as big string in JSON → You didn’t download as File. Edit the HTTP Request node options.
  • Wrong folder → In Drive Upload, tick Use ID and paste the folder ID, not the folder name.
  • Filename missing extension → Ensure your PDF.co “name” fields include .html, .txt, .pdf.

Your sample workflow JSON (annotated)

You’re already very close. The big wins you implemented:

  • Forked from Code → Optimize PDF, Convert to HTML, Convert to Text.

  • After each conversion, added an HTTP Request node with:

"options": {
  "response": {
    "response": {
      "responseFormat": "file"
    }
  }
}

That’s the correct way to make binary files in v4.2.

Just ensure each Drive node’s Input Data Field Name (aka Binary Property) is data.

Done!

You now have a reliable editorial pipeline:

  • Editors drop a PDF → your workflow names it consistently by created date,
  • converts it to TXT, HTML, and compressed PDF,
  • and files each format to its proper Drive folder with timestamped filenames for clean history and audit.

Built something cool with this workflow? Share it with us @pdfdotco
