Extract Text Content from PDF or XPS in C#
In this tutorial, we will show you how to extract text content from a PDF in C# using the PDF.co PDF to Text Web API. Below is the PDF invoice that we will convert to text.
[Image: PDF Source File]
Step 1: Create New Project
To begin, create a new console project in a folder named app with this command: dotnet new console -o app
[Image: Create New Project]
Step 2: Open VSCode
Type cd app to go to the project folder, then enter code . (the word code followed by a space and a period) to open the folder in VSCode.
[Image: Open VSCode]
Step 3: Add Source Code
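Copy the C# source code of the PDF to Text from URL sample from the PDF.co documentation into the project, for example by replacing the contents of the Program.cs file that dotnet new console generated.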
Step 4: Add Package
Next, add the Newtonsoft.Json package with this command: dotnet add package Newtonsoft.Json
[Image: Add Package]
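The package is needed because the request and the reply are both JSON. A minimal sketch of both directions follows. It is not the documentation sample: the field names (url, name, async, error, message), the class name, and the example values are assumptions made for illustration and should be checked against the PDF.co documentation.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class JsonSketch
{
    // Serialize the request parameters into a JSON string.
    // The parameter names are assumptions; confirm them in the API documentation.
    public static string BuildRequestBody(string sourceFileUrl, string outputName)
    {
        var parameters = new Dictionary<string, object>
        {
            { "url", sourceFileUrl },
            { "name", outputName },
            { "async", false }
        };
        return JsonConvert.SerializeObject(parameters);
    }

    // Return the result URL from a JSON reply, or throw if the reply reports an error.
    // A missing "error" field is treated as a failure.
    public static string ReadResultUrl(string responseJson)
    {
        JObject reply = JObject.Parse(responseJson);
        if (reply.Value<bool?>("error") ?? true)
        {
            throw new InvalidOperationException(reply.Value<string>("message") ?? "Unknown error");
        }
        return reply.Value<string>("url")
            ?? throw new InvalidOperationException("The reply has no url field.");
    }

    public static void Main()
    {
        // Hypothetical values, for illustration only.
        Console.WriteLine(BuildRequestBody("https://example.com/invoice.pdf", "result.txt"));

        string sampleReply = "{\"error\": false, \"url\": \"https://example.com/result.txt\"}";
        Console.WriteLine(ReadResultUrl(sampleReply));
    }
}
```

JObject is used here because it reads individual fields without requiring a class for the whole reply.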
Step 5: Add API Key
Now, add your PDF.co API Key on line 14 of the sample code. You can get your API Key in the PDF.co dashboard.
[Image: API Key]
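The key is typically sent with each request as an HTTP header. The sketch below assumes the header is named x-api-key. Instead of hard-coding the key as the sample does, it reads it from an environment variable called PDFCO_API_KEY, a hypothetical name chosen for this example, so that the key stays out of the source file.

```csharp
using System;
using System.Net.Http;

public static class ApiKeySketch
{
    // Create an HttpClient that sends the API key with every request.
    // The header name "x-api-key" is an assumption; confirm it in the API documentation.
    public static HttpClient CreateClient(string apiKey)
    {
        if (string.IsNullOrWhiteSpace(apiKey))
        {
            throw new ArgumentException("API key must not be empty.", nameof(apiKey));
        }
        var client = new HttpClient();
        client.DefaultRequestHeaders.Add("x-api-key", apiKey);
        return client;
    }

    public static void Main()
    {
        // PDFCO_API_KEY is a hypothetical variable name used for this example.
        string apiKey = Environment.GetEnvironmentVariable("PDFCO_API_KEY") ?? "";
        if (apiKey.Length == 0)
        {
            Console.WriteLine("Set PDFCO_API_KEY before running.");
            return;
        }
        using HttpClient client = CreateClient(apiKey);
        Console.WriteLine("Client configured with the API key header.");
    }
}
```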
Step 6: Add Source File
Line 18 contains the URL of the sample PDF invoice. To try your own file, replace the sample link with a link to your PDF.
[Image: Add Source File]
Step 7: Add Destination File
On line 24, enter your desired output filename.
[Image: Add Destination File]
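With the key, source URL, and destination filename in place, the overall flow can be sketched as follows: post the source URL, read the link to the converted text from the reply, download it, and save it under the destination filename. This is not the documentation sample itself. The endpoint URL, header name, JSON field names, class name, and placeholder values are assumptions to verify against the PDF.co documentation.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class PdfToTextSketch
{
    // Placeholder settings; replace them with your own values.
    const string ApiKey = "YOUR_API_KEY";
    const string SourceFileUrl = "https://example.com/invoice.pdf"; // hypothetical URL
    const string DestinationFile = "result.txt";

    // Assumed endpoint; confirm it in the PDF.co documentation.
    const string Endpoint = "https://api.pdf.co/v1/pdf/convert/to/text";

    // Write the text to the destination file, creating its folder if needed.
    public static string SaveText(string text, string destinationFile)
    {
        string fullPath = Path.GetFullPath(destinationFile);
        string folder = Path.GetDirectoryName(fullPath) ?? ".";
        Directory.CreateDirectory(folder);
        File.WriteAllText(fullPath, text);
        return fullPath;
    }

    public static async Task Main()
    {
        using var client = new HttpClient();

        // Assumed request fields: the source URL and the output file name.
        string body = JsonConvert.SerializeObject(new
        {
            url = SourceFileUrl,
            name = Path.GetFileName(DestinationFile)
        });
        using var request = new HttpRequestMessage(HttpMethod.Post, Endpoint)
        {
            Content = new StringContent(body, Encoding.UTF8, "application/json")
        };
        request.Headers.Add("x-api-key", ApiKey); // assumed header name

        using HttpResponseMessage response = await client.SendAsync(request);
        JObject reply = JObject.Parse(await response.Content.ReadAsStringAsync());

        // Assumed reply fields: "error", "message", and "url".
        string resultUrl = reply.Value<string>("url") ?? "";
        if ((reply.Value<bool?>("error") ?? true) || resultUrl.Length == 0)
        {
            Console.WriteLine("Conversion failed: " + reply.Value<string>("message"));
            return;
        }

        // Download the converted text and save it locally.
        string text = await client.GetStringAsync(resultUrl);
        Console.WriteLine("Saved extracted text to " + SaveText(text, DestinationFile));
    }
}
```

The API key header is attached to the conversion request only, so it is not sent along when the result file is downloaded.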
Step 8: Run Project
We are now ready to run the project. In the terminal, run this command: dotnet run
[Image: Run Project]
Step 9: Extracted Text Output
Here’s our extracted text output.
[Image: Extracted Text Output]
In this tutorial, you learned how to extract text from a PDF in C# using the PDF.co Web API. You also learned how to create a new C# console project and add the Newtonsoft.Json package.