Python has a large set of libraries for many kinds of tasks. To extract data from a PDF, we will use the PDF.co Web API.
In this article, we are going to extract hyperlinks from a PDF in Python using the PDF.co Web API.
- Install Requests Module
- Python Sample Code
- PDF.co API Key
- Source File and Destination
- Custom Profile
- Run Program
- Output
- PDF to JSON Demo
Here we have a sample PDF, and we will extract its hyperlinks using Python.

Step 1: Install Requests Module
- First, install the requests module. Type
python -m pip install requests
in your command line.
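To confirm the installation worked before going further, a small check like the one below can help. It is only a sketch: the helper name is_installed is made up for this example, and it uses the standard library's importlib to look for the module without importing it.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("requests"):
        print("requests is installed")
    else:
        print("requests is missing; run: python -m pip install requests")
```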
Step 2: Python Sample Code
- Next, let’s open the Python sample code in Visual Studio Code. You can also use any other editor you prefer. Kindly click this link for the source code.
Step 3: PDF.co API Key
- Then, add your PDF.co API key to the code. You can find the API key in your PDF.co dashboard.
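If you would rather not paste the key directly into the script, one option is to read it from an environment variable. The sketch below assumes a variable named PDFCO_API_KEY and a helper named load_api_key; both names are made up for this example and are not part of the sample code.

```python
import os


def load_api_key(env_var="PDFCO_API_KEY"):
    """Read the API key from an environment variable (hypothetical name)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your PDF.co API key."
        )
    return key
```

You could then write API_KEY = load_api_key() where the sample code expects the key.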
Step 4: Source File and Destination
- In line 12, input the source PDF file name.
- In line 18, type in your desired JSON output file name.
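As a rough illustration of these two settings, the snippet below defines a source and a destination name and checks their extensions before any request is made. The file names sample.pdf and result.json and the helper check_paths are placeholders chosen for this example; the actual variable names in the sample code may differ.

```python
from pathlib import Path

# Placeholder file names; replace them with your own.
SOURCE_FILE = "sample.pdf"
DESTINATION_FILE = "result.json"


def check_paths(source, destination):
    """Return both names as Path objects after a basic extension check."""
    src = Path(source)
    dst = Path(destination)
    if src.suffix.lower() != ".pdf":
        raise ValueError(f"Source file should be a PDF: {source}")
    if dst.suffix.lower() != ".json":
        raise ValueError(f"Destination file should end in .json: {destination}")
    return src, dst
```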
Step 5: Custom Profile
- In line 56, we will set an advanced conversion profile:
{ "OutputStructure": "OnlyLinks", "OutputTransformation": "$..text" }
This profile tells the API to extract all the links in the PDF.
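To show how the profile might travel with the request, here is a minimal sketch. The endpoint URL, the field names url, name and profiles, and the x-api-key header are assumptions based on PDF.co's public samples, so please check them against the current API documentation. The function names build_payload and convert_to_json are made up for this example, and file_url is assumed to be a link the API can reach (the sample code takes care of uploading a local file first).

```python
import json

# Assumed endpoint; verify against the current PDF.co API documentation.
API_URL = "https://api.pdf.co/v1/pdf/convert/to/json"

# The profile from this step, kept as a dict so it is easy to edit.
LINKS_PROFILE = {
    "OutputStructure": "OnlyLinks",
    "OutputTransformation": "$..text",
}


def build_payload(file_url, output_name, profile=LINKS_PROFILE):
    """Build the form fields for the conversion request (assumed names)."""
    return {
        "url": file_url,
        "name": output_name,
        # The profile is sent as a JSON string, not as a nested object.
        "profiles": json.dumps(profile),
    }


def convert_to_json(api_key, file_url, output_name):
    """Send the request and return the parsed JSON response."""
    import requests  # imported here so build_payload also works offline

    response = requests.post(
        API_URL,
        data=build_payload(file_url, output_name),
        headers={"x-api-key": api_key},  # assumed header name
        timeout=60,
    )
    response.raise_for_status()
    return response.json()
```

Keeping the profile as a dictionary and serializing it with json.dumps avoids quoting mistakes that are easy to make when the JSON is typed by hand as a string.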
Step 6: Run Program
- Run the program. Once it finishes successfully, check your program folder to view the output.
Step 7: Output
- Here are the extracted links in JSON format.
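If you want to work with the links in Python afterwards, a generic approach is to walk the JSON and collect every string that looks like a link. The exact layout of the output file is not assumed here; the helper collect_links is a made-up name, and result.json is the placeholder destination file name.

```python
import json

LINK_PREFIXES = ("http://", "https://", "mailto:")


def collect_links(node, found=None):
    """Recursively gather link-like strings from parsed JSON data."""
    if found is None:
        found = []
    if isinstance(node, str):
        if node.startswith(LINK_PREFIXES):
            found.append(node)
    elif isinstance(node, dict):
        for value in node.values():
            collect_links(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_links(item, found)
    return found


if __name__ == "__main__":
    # "result.json" is a placeholder for your destination file name.
    with open("result.json", encoding="utf-8") as handle:
        for link in collect_links(json.load(handle)):
            print(link)
```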
Step 8: PDF to JSON Demo
- Here’s a quick demo of the PDF to JSON advanced conversion.

In this article, you learned how to extract hyperlinks from a PDF in Python. You also learned how to use the PDF.co Web API to extract multiple links from a PDF.