How to convert PDF to XML from URL asynchronously for PDF to XML API in PowerShell and PDF.co Web API
Step By Step Tutorial: how to convert PDF to XML from URL asynchronously for PDF to XML API in PowerShell
Today you are going to learn how to convert PDF to XML from URL asynchronously in PowerShell. PDF to XML API in PowerShell can be implemented with PDF.co Web API. PDF.co Web API is the Rest API that provides set of data extraction functions, tools for documents manipulation, splitting and merging of pdf files. Includes built-in OCR, images recognition, can generate and read barcodes from images, scans and pdf.
You will save a lot of time on writing and testing code as you may just take the code below and use it in your application. Follow the instruction and copy – paste code for PowerShell into your project’s code editor. Tutorials are available along with installed PDF.co Web API if you’d like to dive deeper into the topic and the details of the API.
ByteScout free trial version is available for FREE download from our website. Programming tutorials along with source code samples are included.
On-demand (REST Web API) version:
Web API (on-demand version)
On-premise offline SDK for Windows:
60 Day Free Trial (on-premise)
ConvertPdfToXmlFromUrlAsynchronously.ps1
# Cloud API asynchronous "PDF To XML" job example. # Allows to avoid timeout errors when processing huge or scanned PDF documents. # The authentication key (API Key). # Get your own by registering at https://app.pdf.co/documentation/api $API_KEY = "***********************************" # Direct URL of source PDF file. $SourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-to-xml/sample.pdf" # Comma-separated list of page indices (or ranges) to process. Leave empty for all pages. Example: '0,2-5,7-'. $Pages = "" # PDF document password. Leave empty for unprotected documents. $Password = "" # Destination XML file name $DestinationFile = ".\result.xml" # (!) Make asynchronous job $Async = $true # Prepare URL for `PDF To XML` API call $query = "https://api.pdf.co/v1/pdf/convert/to/xml" # Prepare request body (will be auto-converted to JSON by Invoke-RestMethod) # See documentation: https://apidocs.pdf.co $body = @{ "name" = $(Split-Path $DestinationFile -Leaf) "password" = $Password "pages" = $Pages "url" = $SourceFileUrl "async" = $Async } | ConvertTo-Json try { # Execute request $response = Invoke-WebRequest -Method Post -Headers @{ "x-api-key" = $API_KEY; "Content-Type" = "application/json" } -Body $body -Uri $query $jsonResponse = $response.Content | ConvertFrom-Json if ($jsonResponse.error -eq $false) { # Asynchronous job ID $jobId = $jsonResponse.jobId # URL of generated XML file that will available after the job completion $resultFileUrl = $jsonResponse.url # Check the job status in a loop. do { $statusCheckUrl = "https://api.pdf.co/v1/job/check?jobid=" + $jobId $jsonStatus = Invoke-RestMethod -Method Get -Headers @{ "x-api-key" = $API_KEY } -Uri $statusCheckUrl # Display timestamp and status (for demo purposes) Write-Host "$(Get-date): $($jsonStatus.status)" if ($jsonStatus.status -eq "success") { # Download XML file Invoke-WebRequest -Headers @{ "x-api-key" = $API_KEY } -OutFile $DestinationFile -Uri $resultFileUrl Write-Host "Generated XML file saved as `"$($DestinationFile)`" file." break } elseif ($jsonStatus.status -eq "working") { # Pause for a few seconds Start-Sleep -Seconds 3 } else { Write-Host $jsonStatus.status break } } while ($true) } else { # Display service reported error Write-Host $jsonResponse.message } } catch { # Display request error Write-Host $_.Exception }
run.bat
@echo off powershell -NoProfile -ExecutionPolicy Bypass -Command "& .\ConvertPdfToXmlFromUrlAsynchronously.ps1" echo Script finished with errorlevel=%errorlevel% pause
VIDEO
ON-PREMISE OFFLINE SDK
See also:
ON-DEMAND REST WEB API
Get Your API Key
See also:
PDF-co-Web-API-PowerShell-Convert-PDF-To-XML-From-URL-Asynchronously.pdf