The Document Classifier is available as a desktop app and a virtual app. The desktop app is a component of the PDF Multitool and you can download it here. For more information, please visit this page.
The Document Classifier virtual app is the improved version of the desktop app. It has the desktop app functionalities plus the following advanced features:
- Uses Auto AI rules by default
- Supports loading files from local machine and file URL
- Offers expanded support for adding rule expressions
How to Use Document Classifier Virtual App
Step 1 – Load the File
There are two ways to load your source files in the app: from your local machine or through a file URL.
Let’s open the Document Classifier virtual app here and load our PDF invoice from our local machine.
Step 2 – Run Document Classifier
After the file is loaded, let’s click on the Run Document Classifier button to see the result.
The Document Classifier virtual app uses AI-generated rules by default. The AI classified our PDF file as an invoice, as finance-related, and as a document.
Step 3 – Create Custom Classification Rules
Now, let’s create our own rules. Go to the Custom Classification Rules tab and place a checkmark on Enable custom rules.
Step 4 – Add Rule
Let’s click on the Add a row button and type the Class name, select the Logic, and enter the Expression(s) of the new rule.
This invoice comes from one of our clients, ACME Inc. We want to classify all incoming files by client because some clients might require a different workflow.
Step 5 – Re-run Document Classifier
Let’s re-run the Document Classifier, and you’ll see that the result has changed. The AI-provided classes have been replaced with the classes from our custom rules.
Step 6 – Import & Export Rules
To save our work and keep a backup, we can export the rules in CSV format. The same file can later be imported back into the app.
Step 7 – Import Format
The Document Classifier follows a specific format. When you import your CSV file, please refer to the image below for the document layout.
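If you prefer to generate or inspect the rules file with a script, the sketch below shows a CSV round trip. The column layout it uses (class name, then logic, then one expression per remaining column) is an assumption for illustration only; the layout shown by the app is authoritative, so adjust the column order to match it. The helper names `rules_to_csv` and `rules_from_csv` are hypothetical.

```python
import csv
import io


def rules_to_csv(rules):
    """Serialize (class_name, logic, expressions) tuples, one rule per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for class_name, logic, expressions in rules:
        # Assumed layout: class name, logic, then each expression in its own column.
        writer.writerow([class_name, logic, *expressions])
    return buffer.getvalue()


def rules_from_csv(text):
    """Parse rows back into (class_name, logic, expressions) tuples."""
    rules = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        class_name, logic, *expressions = row
        rules.append((class_name, logic, expressions))
    return rules


rules = [("ACME", "OR", ["ACME Inc."])]
csv_text = rules_to_csv(rules)
print(csv_text)
```

Using the `csv` module instead of joining strings by hand keeps expressions that contain commas or quotes intact when the file is read back.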
Step 8 – Load More Files
Let’s load another file and this time, we’ll use the file’s URL. When you re-run the Document Classifier, you’ll see that the engine assigns classes for both files.
Step 9 – Copy Source Code
We are now ready to test our custom rules in a Document Classifier API call. Click on the Copy source code for pdf/classifier endpoint or module button and copy the code from the editor.
Step 10 – Open Request Tester Tool
Let’s open the Request Tester Tool here. Double-check that the selected PDF.co API endpoint is pdf/classifier.
Paste the source code in the editor, load your file, and then click on Run Request.
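The same request can also be prepared from a script. The sketch below only builds the request and does not send it. Apart from the pdf/classifier endpoint name, everything in it is an assumption to be checked against the source code you copied from the app: the base URL, the `x-api-key` header, and the `url` and `rulescsv` body fields. The function name `build_classifier_request` is hypothetical.

```python
import json
import urllib.request

# Assumed full URL for the pdf/classifier endpoint; verify against the copied source code.
API_URL = "https://api.pdf.co/v1/pdf/classifier"


def build_classifier_request(api_key, file_url, rules_csv=""):
    """Build (but do not send) a POST request for the classifier endpoint."""
    payload = {"url": file_url}          # assumed field name for the source file
    if rules_csv:
        payload["rulescsv"] = rules_csv  # assumed field name for custom rules
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_classifier_request(
    "YOUR_API_KEY",
    "https://example.com/invoice.pdf",   # placeholder file URL
    "ACME,OR,ACME Inc.",                 # placeholder rule in an assumed layout
)
# To actually send it: urllib.request.urlopen(request)
```

Separating request construction from sending makes it easy to inspect the payload before spending API credits on a call.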
Step 11 – Document Classifier API Result
You’ll now see the Document Classifier output in JSON format.
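Once you have the JSON, pulling out the class names takes only a few lines. The sketch below assumes a response shaped like `{"error": false, "body": {"classes": [{"class": "..."}]}}`; that shape, the `message` field, and the sample string are illustrative assumptions rather than captured output, so compare them with the JSON you actually see in the Request Tester Tool.

```python
import json


def extract_classes(response_text):
    """Return the list of class names from a classifier response (assumed shape)."""
    data = json.loads(response_text)
    if data.get("error"):
        # Assumption: failed requests set "error" to true and include a "message".
        raise RuntimeError(data.get("message", "classifier request failed"))
    return [item["class"] for item in data.get("body", {}).get("classes", [])]


# Hypothetical sample that only illustrates the assumed shape.
sample = '{"error": false, "body": {"classes": [{"class": "ACME"}]}}'
print(extract_classes(sample))  # prints ['ACME']
```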