Email-to-Booklet: Parse, Personalize, and Merge Legal PDFs with n8n + PDF.co

Nov 14, 2025·6 Minutes Read


What You’ll Have When Done

A flow that:

Watches an email inbox for messages with legal documents (PDF attachments).
Uploads each attachment to PDF.co’s temporary storage (so we get stable URLs).
Runs Document Parser on the filled form to extract: project_title, client_name, reference_id, generated_at.
Generates a personalized cover page from an HTML Template (hosted in PDF.co) using those extracted values.
Merges (cover + all uploaded PDFs) into a single booklet.
Saves the final booklet in Google Drive.

Prerequisites

PDF.co API Key – sign up: https://app.pdf.co/
Mailbox with IMAP access (for Email Trigger)
Google Drive OAuth2 connected in n8n
A Google Drive folder for the final merged booklet
PDF.co HTML Template created in editor: https://app.pdf.co/html-templates-tool/editor/new
- Copy your Template ID (you can view/manage templates here: https://app.pdf.co/html-templates-tool/editor)
PDF.co Document Parser Template created here: https://app.pdf.co/document-parser/templates/new
- Test and copy its Template ID using: https://app.pdf.co/document-parser-tool/playground/choose-template

Quick Start Options

Option A — I want it working now

Import your workflow JSON (Link Here).
Open each node and:
- Set your IMAP credentials in Email Trigger (IMAP).
- In each PDFco Api node:
  - Add your PDF.co API credentials.
- In Document Parser HTTP Request, set your x-api-key and templateId.
- In PDFco Api4 (HTML→PDF from Template), set your templateId and make sure the templateData fields match your parser keys.
- In Google Drive upload node, pick your destination folder.
Run once to test, then activate.

Option B — Build it step-by-step (tutorial below)

Follow the steps. I’ll show exact settings and payloads.

Endpoint Index (you’ll use these)

PDF Upload: Docs: https://docs.pdf.co/api-reference/file-upload/upload
- POST /v1/file/upload/
Document Parser (extract fields from filled form): Docs: https://docs.pdf.co/api-reference/documentparser/overview
- POST /v1/pdf/documentparser
HTML Template → PDF (cover page): Docs: https://docs.pdf.co/api-reference/pdf-from-html/convert-from-template
- POST /v1/pdf/convert/from/html (template mode in the n8n custom node is often labeled URL/HTML to PDF → htmlTemplateToPDF)
PDF Merge (booklet): Docs: https://docs.pdf.co/api-reference/merge/pdf
- POST /v1/pdf/merge

Step-by-Step Build

IN THIS TUTORIAL

Email Trigger & attachments

Upload each attachment to PDF.co (to obtain stable, public URLs)

Parse the filled form to extract client data

Build a clean filename (optional but recommended)

Generate the cover page from your PDF.co HTML Template

Gather all PDF URLs (cover + four uploaded documents)

Merge PDFs

Download merged file and upload to Google Drive

Step 1: Email Trigger & attachments

Node: Email Trigger (IMAP)

Goal: Watch for new emails with subject “Legal Documents” and download attachments.

Settings

Download Attachments: Enabled

Options → customEmailConfig: ["UNSEEN", ["SUBJECT", "Legal Documents"]]

What happens: Each new matching email becomes one execution, with its attachments available as binary.attachment_0, binary.attachment_1, etc.
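
Because the rest of the flow assumes four PDF attachments in a fixed order, it can help to check the incoming item before uploading anything. The sketch below is one way to do that in plain JavaScript. `listAttachments`, `checkAttachments`, and `EXPECTED_COUNT` are hypothetical names, and the `fileName` and `mimeType` properties are assumed to be present on each binary entry.

```javascript
// Hypothetical guard: list attachment_N binary keys in numeric order
// and flag anything that is not a PDF.
const EXPECTED_COUNT = 4; // assumption: agreement, filled form, SOW, MNDA

function listAttachments(item) {
  const binary = item.binary || {};
  return Object.keys(binary)
    .filter((key) => /^attachment_\d+$/.test(key))
    .sort((a, b) => Number(a.split('_')[1]) - Number(b.split('_')[1]))
    .map((key) => ({
      key,
      fileName: binary[key].fileName || key,
      isPdf: binary[key].mimeType === 'application/pdf',
    }));
}

function checkAttachments(item) {
  const found = listAttachments(item);
  const problems = [];
  if (found.length !== EXPECTED_COUNT) {
    problems.push(`expected ${EXPECTED_COUNT} attachments, got ${found.length}`);
  }
  for (const a of found) {
    if (!a.isPdf) problems.push(`${a.key} (${a.fileName}) is not a PDF`);
  }
  return { found, problems };
}

// Example with a made-up item shaped like the trigger output
const sample = {
  json: { subject: 'Legal Documents' },
  binary: {
    attachment_1: { fileName: 'form.pdf', mimeType: 'application/pdf' },
    attachment_0: { fileName: 'agreement.pdf', mimeType: 'application/pdf' },
    attachment_2: { fileName: 'notes.txt', mimeType: 'text/plain' },
  },
};
const result = checkAttachments(sample);
console.log(result.problems);
```

In a Code node you could run the same check on each incoming item and stop the execution when `problems` is not empty, so a malformed email never reaches the upload step.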

Step 2: Upload each attachment to PDF.co (to obtain stable, public URLs)

Reason: PDF.co cannot fetch OAuth-protected links or private blobs, so you upload the binary data and receive a temporary URL that downstream steps (parser, merge, etc.) can read.

You can use your custom PDFco Api nodes (as in your sample), or do it transparently with two HTTP nodes:

Get Presigned URL (GET)
PUT Binary to presignedUrl
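
If you go the two-HTTP-node route, the pair of requests might look like the sketch below. It assumes the presigned-URL endpoint is `/v1/file/upload/get-presigned-url`, called with GET and `name` / `contenttype` query parameters, and that its response carries `presignedUrl` (where you PUT the bytes) and `url` (what you pass downstream). Verify these details against the upload docs before relying on them. The helper names are hypothetical.

```javascript
// Sketch of the two requests behind a presigned upload.
// Assumptions: endpoint path, HTTP method, query parameter names, response fields.
const API_BASE = 'https://api.pdf.co/v1';

function buildPresignRequest(apiKey, fileName) {
  const query = new URLSearchParams({
    name: fileName,
    contenttype: 'application/octet-stream',
  });
  return {
    method: 'GET',
    url: `${API_BASE}/file/upload/get-presigned-url?${query}`,
    headers: { 'x-api-key': apiKey },
  };
}

function buildPutRequest(presignResponse, fileBuffer) {
  if (presignResponse.error || !presignResponse.presignedUrl) {
    throw new Error(`Presign failed: ${presignResponse.message || 'no presignedUrl'}`);
  }
  return {
    method: 'PUT',
    url: presignResponse.presignedUrl,
    headers: { 'Content-Type': 'application/octet-stream' },
    body: fileBuffer,
    // Keep this: it is the URL later nodes (parser, merge) should use.
    downstreamUrl: presignResponse.url,
  };
}

// Example with a made-up presign response
const presign = buildPresignRequest('YOUR_API_KEY', 'Filled Form.pdf');
const put = buildPutRequest(
  {
    error: false,
    presignedUrl: 'https://example.com/put-here',
    url: 'https://example.com/file.pdf',
  },
  Buffer.from('%PDF-1.7')
);
console.log(presign.url);
console.log(put.downstreamUrl);
```

The first object maps onto the settings of the first HTTP Request node, the second onto the PUT node. Only `downstreamUrl` needs to travel further through the workflow.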

If you prefer the custom nodes you already have (PDFco Api, PDFco Api1, PDFco Api2, PDFco Api3), set each one to Operation: Upload File to PDF.co with binaryData = true and binaryPropertyName = attachment_n. Each node then outputs JSON containing the final url.

In your sample, you map:

attachment_0 → Engagement_Agreement
attachment_1 → Filled_Form
attachment_2 → SOW
attachment_3 → MNDA

Each of these must output a json.url field for later.

API Reference (for completeness):

POST https://api.pdf.co/v1/file/upload/ → returns url

Docs: https://docs.pdf.co/api-reference/file-upload/upload

Step 3: Parse the filled form to extract client data

Node: HTTP Request (name: HTTP Request)

Endpoint Used: Document Parser Docs: https://docs.pdf.co/api-reference/documentparser/overview

Settings

Method: POST
URL: https://api.pdf.co/v1/pdf/documentparser
Headers:
- Content-Type: application/json
- x-api-key: YOUR_API_KEY

Body: JSON

{
  "url": "{{ $json.url }}",
  "outputFormat": "JSON",
  "templateId": "YOUR_TEMPLATE_ID",
  "async": false,
  "inline": true
}

Make sure the item entering this node is the “Filled_Form” upload output (so {{ $json.url }} points to the filled form’s uploaded URL). Use your real templateId from: https://app.pdf.co/document-parser-tool/playground/choose-template

Output shape (simplified):

{
  "body": {
    "objects": [
      { "name": "project_title", "value": "XYZ Rollout" },
      { "name": "client_name", "value": "Acme Corp" },
      { "name": "reference_id", "value": "REF-4492" },
      { "name": "generated_at", "value": "2025-11-09 13:10" }
    ]
  }
}
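
Since the values arrive as an array of name/value pairs, later steps are easier to write against a plain lookup object. The following is a minimal sketch, assuming the simplified output shape shown above. `fieldsByName`, `missingFields`, and `REQUIRED_FIELDS` are hypothetical names.

```javascript
// Turn the parser's objects array into a plain { name: value } lookup
// and report which of the four expected fields are missing or blank.
const REQUIRED_FIELDS = ['project_title', 'client_name', 'reference_id', 'generated_at'];

function fieldsByName(parserResponse) {
  const objects = parserResponse?.body?.objects;
  const fields = {};
  if (!Array.isArray(objects)) return fields;
  for (const obj of objects) {
    if (obj && typeof obj.name === 'string') {
      fields[obj.name] = obj.value;
    }
  }
  return fields;
}

function missingFields(fields) {
  return REQUIRED_FIELDS.filter((name) => {
    const value = fields[name];
    return value === undefined || value === null || String(value).trim() === '';
  });
}

// Example using the simplified output shape, with one field left out
const response = {
  body: {
    objects: [
      { name: 'project_title', value: 'XYZ Rollout' },
      { name: 'client_name', value: 'Acme Corp' },
      { name: 'reference_id', value: 'REF-4492' },
    ],
  },
};
const fields = fieldsByName(response);
console.log(fields.client_name);    // Acme Corp
console.log(missingFields(fields)); // [ 'generated_at' ]
```

Looking fields up by name avoids depending on the order in which the parser template returns them.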

Step 4: Build a clean filename (optional but recommended)

Node: Code (name: Code1)

Goal: Create a safe fileName (and coverName) for later nodes, using parser output.

Code (copy–paste):

/**
 * Build a clean, safe filename from parser output
 * Uses body.objects values (prefer by key names)
 */
function sanitize(s = '') {
  return String(s)
    .trim()
    .replace(/[\/\\:*?"<>|]+/g, '')
    .replace(/\s+/g, ' ')
    .replace(/[\s]+/g, '_')
    .replace(/[^\w.-]+/g, '_')
    .replace(/_+/g, '_')
    .replace(/^_+|_+$/g, '');
}

function pick(objects, keyName, idxFallback = 0) {
  if (Array.isArray(objects)) {
    const byKey = objects.find(o =>
      (o?.name || o?.key || o?.field || '').toString().toLowerCase() === keyName
    );
    if (byKey?.value) return byKey.value;
    return objects[idxFallback]?.value;
  }
  return undefined;
}

return items.map(item => {
  const objects = item.json?.body?.objects || [];
  const rawProject = pick(objects, 'project_title', 0) ?? 'Untitled_Project';
  const rawClient  = pick(objects, 'client_name', 1)   ?? 'Unknown_Client';

  const project = sanitize(rawProject);
  const client  = sanitize(rawClient);

  const date = new Date().toISOString().slice(0,10);
  const base = `${project}_${client}_${date}`.slice(0, 180);

  return {
    json: {
      ...item.json,
      projectTitle: rawProject,
      clientName: rawClient,
      fileBase: base,
      fileName: `${base}.pdf`,
      coverName: `00_Cover_${base}.pdf`
    }
  };
});

Step 5: Generate the cover page from your PDF.co HTML Template

Node: PDFco Api4

Endpoint Used: HTML to PDF (from Template)

Docs: https://docs.pdf.co/api-reference/pdf-from-html/convert-from-template

Settings

Operation: URL/HTML to PDF
Convert Type: htmlTemplateToPDF
templateId: YOUR_HTML_TEMPLATE_ID
templateData (keys must match the placeholders your template expects):

{
  "project_title": "{{ $json.body.objects[0].value }}",
  "client_name":   "{{ $json.body.objects[1].value }}",
  "reference_id":  "{{ $json.body.objects[2].value }}",
  "generated_at":  "{{ $json.body.objects[3].value }}"
}

If your parser keys are named, use them (e.g., {{ $json.body.objects.find(o => o.name==='project_title').value }}) — but the simple index form is fine if your order is stable.

Advanced Options → name: ={{ $json.fileBase }}_Cover

What happens: PDF.co renders your template with those values → returns a url to the generated cover PDF.

Step 6: Gather all PDF URLs (cover + four uploaded documents)

Node: Merge (mode: Wait for all inputs)

Connect inputs from:

PDFco Api4 (Cover page)
PDFco Api1 (Filled_Form upload → also goes to parser)
PDFco Api (Engagement_Agreement upload)
PDFco Api2 (SOW upload)
PDFco Api3 (MNDA upload)

Then: a Code node to collect URLs and build a final merged name.

Code (copy–paste):

// Collect URLs, keep first 5, build final name
function sanitizeName(s = '') {
  return String(s)
    .trim()
    .replace(/[\/\\:*?"<>|]+/g, '')
    .replace(/\s+/g, ' ')
    .replace(/[\s]+/g, '_')
    .replace(/[^\w.\-]+/g, '_')
    .replace(/_+/g, '_')
    .replace(/^_+|_+$/g, '');
}

const incoming = $input.all();
const urls = [];
const nameHints = [];

for (const it of incoming) {
  // Most PDFco upload/convert nodes return { url, name }
  const u = it.json?.url || it.json?.fileUrl || it.json?.downloadUrl;
  if (typeof u === 'string' && /^https?:\/\//i.test(u)) {
    urls.push(u);
  }
  nameHints.push(it.json?.name, it.json?.fileName, it.json?.filename);
}

// keep order, de-dup, cap at 5
const seen = new Set();
const unique = [];
for (const u of urls) {
  if (!seen.has(u)) { seen.add(u); unique.push(u); }
  if (unique.length >= 5) break;
}

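// final merged filename (take fileBase from whichever upstream item carries it;
// $json may be unavailable when the Code node runs once for all items)
const upstreamBase = incoming.map(it => it.json?.fileBase).find(Boolean);
const base = sanitizeName(upstreamBase || 'AttachmentPacket');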
const fileName = `${base}.pdf`;
const out = { fileName, urls: unique };
unique.forEach((u, i) => out[`url${i + 1}`] = u);
return [{ json: out }];

You’ll now have:

url1..url5
fileName (e.g., XYZ_Rollout_Acme_Corp_2025-11-09.pdf, or AttachmentPacket.pdf if no fileBase reaches this node)

Step 7: Merge PDFs

Node: PDFco Api5

Endpoint Used: PDF Merge

Docs: https://docs.pdf.co/api-reference/merge/pdf

Settings

Operation: Merge PDF
url (multiple): map your ordered list:

={{ $json.urls[1] }}   // e.g., Cover
={{ $json.urls[0] }}   // Agreement
={{ $json.urls[2] }}   // Filled Form
={{ $json.urls[3] }}   // MNDA
={{ $json.urls[4] }}   // SOW

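The indexes follow the order in which items reach the Code node in Step 6, so check that node's output and adjust the mapping until the cover comes first and the rest follow in the order you want.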
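
Index-based mapping is easy to get wrong if the Merge node emits items in a different order than expected. One alternative, sketched below, is to tag each URL with a role and order by role instead. The `role` labels, `BOOKLET_ORDER`, and both helper names are hypothetical. The comma-separated `url` string and the `name` / `async` fields reflect my understanding of the merge endpoint's request body, so treat that shape as an assumption and check it against the merge docs.

```javascript
// Order URLs by document role rather than by array position.
const BOOKLET_ORDER = ['cover', 'agreement', 'filled_form', 'mnda', 'sow'];

function orderUrls(tagged, order = BOOKLET_ORDER) {
  const byRole = new Map(tagged.map((t) => [t.role, t.url]));
  const missing = order.filter((role) => !byRole.has(role));
  if (missing.length > 0) {
    throw new Error(`Missing documents: ${missing.join(', ')}`);
  }
  return order.map((role) => byRole.get(role));
}

// Assumed request body for POST /v1/pdf/merge
function buildMergeBody(tagged, fileName) {
  return {
    url: orderUrls(tagged).join(','),
    name: fileName,
    async: false,
  };
}

// Example: items arrive in arbitrary order, output is cover-first
const tagged = [
  { role: 'filled_form', url: 'https://example.com/form.pdf' },
  { role: 'sow', url: 'https://example.com/sow.pdf' },
  { role: 'cover', url: 'https://example.com/cover.pdf' },
  { role: 'mnda', url: 'https://example.com/mnda.pdf' },
  { role: 'agreement', url: 'https://example.com/agreement.pdf' },
];
const mergeBody = buildMergeBody(tagged, 'XYZ_Rollout_Acme_Corp_2025-11-09.pdf');
console.log(mergeBody.url);
```

Failing loudly on a missing role is a deliberate choice here: a booklet that silently omits the MNDA is worse than a stopped execution.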

Advanced Options → name: ={{ $json.fileName }}

Response: Returns a url to the merged booklet.
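
Before downloading, it is worth confirming that the merge call actually succeeded. The sketch below assumes the response JSON includes an `error` flag, a `message` on failure, and a `url` on success. `mergedUrlOrThrow` is a hypothetical helper, not part of the PDF.co node.

```javascript
// Return the merged file URL, or throw with the API's message if the call failed.
function mergedUrlOrThrow(response) {
  if (!response || response.error) {
    throw new Error(`Merge failed: ${response?.message || 'unknown error'}`);
  }
  if (typeof response.url !== 'string' || !/^https?:\/\//i.test(response.url)) {
    throw new Error('Merge response did not include a usable url');
  }
  return response.url;
}

// Example with made-up responses
const okUrl = mergedUrlOrThrow({ error: false, url: 'https://example.com/booklet.pdf' });
console.log(okUrl);

let failure = null;
try {
  mergedUrlOrThrow({ error: true, message: 'Input file not found' });
} catch (err) {
  failure = err.message;
}
console.log(failure);
```

Placed in a Code node between the merge and the download, a check like this keeps a failed merge from producing an empty or broken file in Drive.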

Step 8: Download merged file and upload to Google Drive

Node: HTTP Request (GET merged file)

URL: ={{ $json.url }}
Options → Response → Format: File
Binary Property: data

Node: Google Drive → Upload File

Settings:

Operation: Upload
Input Data Field Name: data
Parent Drive from List: My Drive
Parent Folder by ID: Select your destination folder

What This Does: Uploads the finalized PDF into your chosen Drive folder.

Congrats! You’ve created a streamlined legal packet builder that automatically assembles personalized, shareable, and professional PDF booklets from client emails — no manual intervention needed.

Built something cool with this workflow? Share it with us @pdfdotco and show how you’re automating document magic!