Email-to-Booklet: Parse, Personalize, and Merge Legal PDFs with n8n + PDF.co

Nov 14, 2025·6 Minutes Read

What You’ll Have When Done

A flow that:

  1. Watches an email inbox for messages with legal documents (PDF attachments).
  2. Uploads each attachment to PDF.co’s temporary storage (so we get stable URLs).
  3. Runs Document Parser on the filled form to extract: project_title, client_name, reference_id, generated_at.
  4. Generates a personalized cover page from an HTML Template (hosted in PDF.co) using those extracted values.
  5. Merges (cover + all uploaded PDFs) into a single booklet.
  6. Saves the final booklet in Google Drive.

Prerequisites

Quick Start Options

Option A — I want it working now

  • Import your workflow JSON (Link Here).
  • Open each node and:
    • Set your IMAP credentials in Email Trigger (IMAP).
    • In each PDFco Api node:
      • Add your PDF.co API credentials.
    • In Document Parser HTTP Request, set your x-api-key and templateId.
    • In PDFco Api4 (HTML→PDF from Template), set your templateId and make sure the templateData fields match your parser keys.
    • In Google Drive upload node, pick your destination folder.
  • Run once to test, then activate.

Option B — Build it step-by-step (tutorial below)

Follow the steps. I’ll show exact settings and payloads.

Endpoint Index (you’ll use these)

  • File Upload: POST https://api.pdf.co/v1/file/upload/ (docs: https://docs.pdf.co/api-reference/file-upload/upload)
  • Document Parser: POST https://api.pdf.co/v1/pdf/documentparser (docs: https://docs.pdf.co/api-reference/documentparser/overview)
  • HTML to PDF (from Template) (docs: https://docs.pdf.co/api-reference/pdf-from-html/convert-from-template)
  • PDF Merge (docs: https://docs.pdf.co/api-reference/merge/pdf)

Step-by-Step Build

Step 1: Email Trigger & attachments

Node: Email Trigger (IMAP)

Goal: Watch for new emails with subject “Legal Documents” and download attachments.

Settings

  • Download Attachments: Enabled

Options → customEmailConfig: ["UNSEEN", ["SUBJECT", "Legal Documents"]]

  • What happens: Each new matching email becomes one execution, with its attachments available as binary.attachment_0, binary.attachment_1, etc.
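
To make that shape concrete, here is a minimal sketch of how a downstream Code node could list the attachment keys on one item. The listAttachmentKeys helper and the sample item are illustrative assumptions, not part of the imported workflow; only the attachment_N key pattern comes from the description above.

```javascript
// Hypothetical helper: list binary attachment keys (attachment_0, attachment_1, ...)
// in numeric order for a single n8n-style item.
function listAttachmentKeys(item) {
  const binary = (item && item.binary) || {};
  return Object.keys(binary)
    .filter((key) => /^attachment_\d+$/.test(key))
    .sort((a, b) => Number(a.split('_')[1]) - Number(b.split('_')[1]));
}

// Sample item shaped like the trigger output described above (assumed data).
const sampleItem = {
  json: { subject: 'Legal Documents' },
  binary: {
    attachment_10: { fileName: 'extra.pdf' },
    attachment_1: { fileName: 'Filled_Form.pdf' },
    attachment_0: { fileName: 'Engagement_Agreement.pdf' },
  },
};

console.log(listAttachmentKeys(sampleItem));
```

Sorting numerically rather than alphabetically keeps attachment_10 after attachment_2 if an email ever carries more than ten files.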

Step 2: Upload each attachment to PDF.co (to obtain stable, public URLs)

Reason: PDF.co cannot download OAuth-protected links or private blobs, so you upload the binary data and get back a temporary URL to use in the downstream steps (parser, merge, etc.).

You can use your custom PDFco Api nodes (as in your sample), or do it explicitly with two HTTP Request nodes:

  1. Get Presigned URL (GET)
  2. PUT Binary to presignedUrl

If you prefer the custom nodes you already have (PDFco Api, PDFco Api1, PDFco Api2, PDFco Api3), set each one to “Operation: Upload File to PDF.co” with binaryData = true and binaryPropertyName = attachment_n. Each node then outputs JSON containing the final url.

In your sample, you map:

  • attachment_0 → Engagement_Agreement
  • attachment_1 → Filled_Form
  • attachment_2 → SOW
  • attachment_3 → MNDA

Each of these must output a json.url field for later.

API Reference (for completeness):

POST https://api.pdf.co/v1/file/upload/ → returns url

Docs: https://docs.pdf.co/api-reference/file-upload/upload
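
For the two-request alternative, a minimal sketch follows. It assumes PDF.co's get-presigned-url endpoint answers a GET request with JSON containing presignedUrl and url fields, and that a fetch implementation is available (Node 18+ or a browser). The uploadViaPresignedUrl name and its parameters are illustrative; verify the field names and HTTP methods against the PDF.co docs before relying on it.

```javascript
// Sketch of the two-request upload: (1) ask PDF.co for a presigned URL,
// (2) PUT the raw bytes to it, then use the returned `url` downstream.
// Assumption: the presign response carries `presignedUrl` and `url`.
async function uploadViaPresignedUrl(apiKey, fileName, bytes, fetchImpl = fetch) {
  const query = new URLSearchParams({
    name: fileName,
    contenttype: 'application/octet-stream',
  });
  const presignRes = await fetchImpl(
    `https://api.pdf.co/v1/file/upload/get-presigned-url?${query}`,
    { headers: { 'x-api-key': apiKey } }
  );
  const presign = await presignRes.json();
  if (!presign.presignedUrl || !presign.url) {
    throw new Error(`Presign failed: ${presign.message || 'no URL returned'}`);
  }

  const putRes = await fetchImpl(presign.presignedUrl, {
    method: 'PUT',
    headers: { 'content-type': 'application/octet-stream' },
    body: bytes,
  });
  if (!putRes.ok) {
    throw new Error(`Upload failed with HTTP ${putRes.status}`);
  }
  return presign.url; // temporary URL for the parser and merge steps
}
```

The fetchImpl parameter exists only so the function can be exercised with a stub; in normal use you would call uploadViaPresignedUrl(apiKey, 'Filled_Form.pdf', fileBuffer).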

Step 3: Parse the filled form to extract client data

Node: HTTP Request (name: HTTP Request)

Endpoint Used: Document Parser

Docs: https://docs.pdf.co/api-reference/documentparser/overview

Settings

  • Method: POST
  • URL: https://api.pdf.co/v1/pdf/documentparser
  • Headers:
    • Content-Type: application/json
    • x-api-key: YOUR_API_KEY

Body: JSON

{
  "url": "{{ $json.url }}",
  "outputFormat": "JSON",
  "templateId": "YOUR_TEMPLATE_ID",
  "async": false,
  "inline": true
}

Output shape (simplified):

{
  "body": {
    "objects": [
      { "name": "project_title", "value": "XYZ Rollout" },
      { "name": "client_name", "value": "Acme Corp" },
      { "name": "reference_id", "value": "REF-4492" },
      { "name": "generated_at", "value": "2025-11-09 13:10" }
    ]
  }
}
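
Because each field arrives as a { name, value } pair, looking values up by name is less fragile than relying on array position. A minimal sketch, assuming the simplified output shape above (the objectsToMap helper is illustrative, not part of n8n or PDF.co):

```javascript
// Turn the parser's objects array into a plain { name: value } lookup.
// Assumes each entry has `name` and `value`, as in the simplified output above.
function objectsToMap(objects) {
  const map = {};
  for (const entry of Array.isArray(objects) ? objects : []) {
    if (entry && typeof entry.name === 'string') {
      map[entry.name] = entry.value;
    }
  }
  return map;
}

// Sample data copied from the simplified output shape.
const parserResponse = {
  body: {
    objects: [
      { name: 'project_title', value: 'XYZ Rollout' },
      { name: 'client_name', value: 'Acme Corp' },
      { name: 'reference_id', value: 'REF-4492' },
      { name: 'generated_at', value: '2025-11-09 13:10' },
    ],
  },
};

const fields = objectsToMap(parserResponse.body.objects);
console.log(fields.client_name); // Acme Corp
```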

Step 4: Build a clean filename (optional but recommended)

Node: Code (name: Code1)

Goal: Create a safe fileName (and coverName) for later nodes, using parser output.

Code (copy–paste):

/**
 * Build a clean, safe filename from parser output
 * Uses body.objects values (prefer by key names)
 */
function sanitize(s = '') {
  return String(s)
    .trim()
    .replace(/[\/\\:*?"<>|]+/g, '')
    .replace(/\s+/g, ' ')
    .replace(/[\s]+/g, '_')
    .replace(/[^\w.-]+/g, '_')
    .replace(/_+/g, '_')
    .replace(/^_+|_+$/g, '');
}

function pick(objects, keyName, idxFallback = 0) {
  if (Array.isArray(objects)) {
    const byKey = objects.find(o =>
      (o?.name || o?.key || o?.field || '').toString().toLowerCase() === keyName
    );
    if (byKey?.value) return byKey.value;
    return objects[idxFallback]?.value;
  }
  return undefined;
}

return items.map(item => {
  const objects = item.json?.body?.objects || [];
  const rawProject = pick(objects, 'project_title', 0) ?? 'Untitled_Project';
  const rawClient  = pick(objects, 'client_name', 1)   ?? 'Unknown_Client';

  const project = sanitize(rawProject);
  const client  = sanitize(rawClient);

  const date = new Date().toISOString().slice(0,10);
  const base = `${project}_${client}_${date}`.slice(0, 180);

  return {
    json: {
      ...item.json,
      projectTitle: rawProject,
      clientName: rawClient,
      fileBase: base,
      fileName: `${base}.pdf`,
      coverName: `00_Cover_${base}.pdf`
    }
  };
});


Step 5: Generate the cover page from your PDF.co HTML Template

Node: PDFco Api4

Endpoint Used: HTML to PDF (from Template)

Docs: https://docs.pdf.co/api-reference/pdf-from-html/convert-from-template

Settings

  • Operation: URL/HTML to PDF
  • Convert Type: htmlTemplateToPDF
  • templateId: YOUR_HTML_TEMPLATE_ID
  • templateData (must match the keys your template expects):
{
  "project_title": "{{ $json.body.objects[0].value }}",
  "client_name":   "{{ $json.body.objects[1].value }}",
  "reference_id":  "{{ $json.body.objects[2].value }}",
  "generated_at":  "{{ $json.body.objects[3].value }}"
}
  • If your parser keys are named, use them (e.g., {{ $json.body.objects.find(o => o.name==='project_title').value }}) — but the simple index form is fine if your order is stable.

Advanced Options → name: ={{ $json.fileBase }}_Cover

  • What happens: PDF.co renders your template with those values → returns a url to the generated cover PDF.
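
If you would rather not depend on the order of body.objects, the values can be collected by name and checked before the cover is rendered. This is a minimal sketch under the assumption that the parser output has the { name, value } shape shown in Step 3; buildTemplateData and REQUIRED_KEYS are hypothetical names introduced for illustration.

```javascript
// Build templateData by field name and fail early if a field is missing.
// REQUIRED_KEYS mirrors the four fields extracted in Step 3.
const REQUIRED_KEYS = ['project_title', 'client_name', 'reference_id', 'generated_at'];

function buildTemplateData(objects) {
  const data = {};
  const missing = [];
  for (const key of REQUIRED_KEYS) {
    const match = (objects || []).find((o) => o && o.name === key);
    if (match && match.value !== undefined && match.value !== '') {
      data[key] = String(match.value);
    } else {
      missing.push(key);
    }
  }
  if (missing.length > 0) {
    throw new Error(`Parser output is missing: ${missing.join(', ')}`);
  }
  return data;
}

// Works even when the parser returns the fields in a different order.
const shuffledObjects = [
  { name: 'generated_at', value: '2025-11-09 13:10' },
  { name: 'client_name', value: 'Acme Corp' },
  { name: 'project_title', value: 'XYZ Rollout' },
  { name: 'reference_id', value: 'REF-4492' },
];
console.log(buildTemplateData(shuffledObjects));
```

Failing early here surfaces a bad parse as a clear error instead of a cover page with blank fields.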

Step 6: Gather all PDF URLs (cover + four uploaded documents)

Node: Merge (mode: Wait for all inputs)

Connect inputs from:

  • PDFco Api4 (Cover page)
  • PDFco Api1 (Filled_Form upload → also goes to parser)
  • PDFco Api (Engagement_Agreement upload)
  • PDFco Api2 (SOW upload)
  • PDFco Api3 (MNDA upload)

Then: a Code node to collect URLs and build a final merged name.

Code (copy–paste):

// Collect URLs, keep first 5, build final name
function sanitizeName(s = '') {
  return String(s)
    .trim()
    .replace(/[\/\\:*?"<>|]+/g, '')
    .replace(/\s+/g, ' ')
    .replace(/[\s]+/g, '_')
    .replace(/[^\w.\-]+/g, '_')
    .replace(/_+/g, '_')
    .replace(/^_+|_+$/g, '');
}

const incoming = $input.all();
const urls = [];

for (const it of incoming) {
  // Most PDFco upload/convert nodes return { url, name }
  const u = it.json?.url || it.json?.fileUrl || it.json?.downloadUrl;
  if (typeof u === 'string' && /^https?:\/\//i.test(u)) {
    urls.push(u);
  }
}

// keep order, de-dup, cap at 5
const seen = new Set();
const unique = [];
for (const u of urls) {
  if (!seen.has(u)) { seen.add(u); unique.push(u); }
  if (unique.length >= 5) break;
}

// final merged filename (prefer a fileBase found on any incoming item)
const upstreamBase = incoming.map(it => it.json?.fileBase).find(Boolean);
const base = sanitizeName(upstreamBase || 'AttachmentPacket');
const fileName = `${base}.pdf`;
const out = { fileName, urls: unique };
unique.forEach((u, i) => out[`url${i + 1}`] = u);
return [{ json: out }];

You’ll now have:

  • url1..url5
  • fileName (e.g., XYZ_Rollout_Acme_Corp_2025-11-09.pdf when fileBase is available from upstream; otherwise AttachmentPacket.pdf)

Step 7: Merge PDFs

Node: PDFco Api5

Endpoint Used: PDF Merge

Docs: https://docs.pdf.co/api-reference/merge/pdf

Settings

  • Operation: Merge PDF
  • url (multiple): map your ordered list:
={{ $json.urls[1] }}   // e.g., Cover
={{ $json.urls[0] }}   // Agreement
={{ $json.urls[2] }}   // Filled Form
={{ $json.urls[3] }}   // MNDA
={{ $json.urls[4] }}   // SOW
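  • The index of each URL depends on the order in which the Merge node’s inputs are connected, so check the Code node output and adjust the indexes so that the cover comes first.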

Advanced Options → name: ={{ $json.fileName }}

  • Response: Returns a url to the merged booklet.
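
Since positional indexes are easy to get wrong, one option is to label each URL by role and derive the merge order from those labels. The sketch below is illustrative only: buildMergeOrder, MERGE_ORDER, and the role names are assumptions, and it also assumes the merge endpoint accepts a comma-separated list of URLs in its url parameter.

```javascript
// Hypothetical helper: put the cover first, then the documents in a fixed order.
// Roles that have no URL are skipped rather than sent as empty entries.
const MERGE_ORDER = ['cover', 'agreement', 'filledForm', 'mnda', 'sow'];

function buildMergeOrder(urlsByRole) {
  return MERGE_ORDER
    .map((role) => urlsByRole[role])
    .filter((u) => typeof u === 'string' && u.length > 0);
}

// Placeholder URLs; the input key order does not affect the result.
const ordered = buildMergeOrder({
  sow: 'https://example.com/sow.pdf',
  cover: 'https://example.com/cover.pdf',
  agreement: 'https://example.com/agreement.pdf',
});

// Assumption: the merge endpoint accepts a comma-separated list in `url`.
console.log(ordered.join(','));
```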

Step 8: Download merged file and upload to Google Drive

Node: HTTP Request (GET merged file)

  • URL: ={{ $json.url }}
  • Options → Response → Format: File
  • Binary Property: data
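
Before handing the download to Google Drive, it can help to confirm that the response really is a PDF rather than an error page. A minimal sketch, assuming the file bytes are available as a Node.js Buffer (the looksLikePdf helper is illustrative, not an n8n or PDF.co function):

```javascript
// PDF files begin with the ASCII signature "%PDF-".
function looksLikePdf(buffer) {
  if (!Buffer.isBuffer(buffer) || buffer.length < 5) return false;
  return buffer.subarray(0, 5).toString('latin1') === '%PDF-';
}

console.log(looksLikePdf(Buffer.from('%PDF-1.7\n...'))); // true
console.log(looksLikePdf(Buffer.from('<html>error</html>'))); // false
```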

Node: Google Drive → Upload File

Settings:

  • Operation: Upload
  • Input Data Field Name: data
  • Parent Drive from List: My Drive
  • Parent Folder by ID: Select your destination folder

What This Does: Uploads the finalized PDF into your chosen Drive folder.

Congrats! You’ve created a streamlined legal packet builder that automatically assembles personalized, shareable, and professional PDF booklets from client emails — no manual intervention needed.

Built something cool with this workflow? Share it with us @pdfdotco and show how you’re automating document magic!

API Reference (links)
