Step 1: Download or Clone the DocVisor Tool
Clone Using Git
If you have git installed on your local machine, run the following command to clone the docvisor repository.
git clone https://github.com/ihdia/docvisor
Download Zip
If you do not have git, or you prefer a zip archive, download the zip file from here and unzip it to any location on your device.
Step 2: Data Preparation
There are three main layouts in the DocVisor tool:
- Fully Automatic
- Box Supervised
- OCR
You can load one or more of these layouts into the DocVisor tool at any time.
- To load the Fully Automatic layout, prepare your data files as described here
- To load the Box Supervised layout, prepare your data files as described here
- To load the OCR layout, prepare your data files as described here
Step 3: Setting up your environment
- Create a conda environment using the following command:
conda create --name docvisor
- Activate the environment with conda activate docvisor, then ensure that pip points to it by running which pip. If it does not, run the following command:
conda install pip
- Install the necessary requirements:
pip install -r requirements.txt
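The which pip check above can also be scripted. The following is a minimal sketch and not part of DocVisor: the helper name pip_in_env is hypothetical, and it assumes the active environment is identified by the CONDA_PREFIX variable that conda sets on activation.

```python
import os
import shutil
import sys
from pathlib import Path


def pip_in_env(pip_path, env_prefix):
    """Return True if pip_path lies somewhere inside env_prefix.

    Hypothetical helper for illustration; not part of DocVisor.
    """
    if not pip_path or not env_prefix:
        return False
    pip_resolved = Path(pip_path).resolve()
    env_resolved = Path(env_prefix).resolve()
    return env_resolved in pip_resolved.parents


if __name__ == "__main__":
    # CONDA_PREFIX is set by conda while an environment is active;
    # fall back to the running interpreter's prefix otherwise.
    env_prefix = os.environ.get("CONDA_PREFIX", sys.prefix)
    pip_path = shutil.which("pip")  # same lookup that `which pip` performs
    if pip_in_env(pip_path, env_prefix):
        print(f"pip is inside the environment: {pip_path}")
    else:
        print(f"pip ({pip_path}) is outside {env_prefix}; "
              "consider running: conda install pip")
```

Run it with the docvisor environment active. If it reports that pip is outside the environment, installing pip into the environment as described above should resolve it.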
Step 4: Modify Config File
- Place all the metaData files in one directory
The metaData directory will look like:
metaData/
- ocr_handwritten.json
- ocr_printed.json
- fullyAutomatic.json
- boxSupervised.json
- Set the metaDataDir field in tool/config.py to the path of this metaData directory.
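Before launching, it may help to confirm that every metaData file is at least well-formed JSON. The sketch below assumes nothing about the DocVisor schema. It only checks that each .json file in a directory parses, and the helper name check_metadata_dir and the placeholder file contents are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def check_metadata_dir(meta_dir):
    """Map each .json file name in meta_dir to True if it parses as JSON.

    Hypothetical helper; it does not validate the DocVisor schema.
    """
    results = {}
    for path in sorted(Path(meta_dir).glob("*.json")):
        try:
            with path.open(encoding="utf-8") as fh:
                json.load(fh)
            results[path.name] = True
        except (json.JSONDecodeError, UnicodeDecodeError):
            results[path.name] = False
    return results


if __name__ == "__main__":
    # Demo on a throwaway directory with placeholder contents
    # (not real DocVisor metaData).
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "ocr_printed.json").write_text("{}", encoding="utf-8")
        (Path(tmp) / "boxSupervised.json").write_text("{not json", encoding="utf-8")
        for name, ok in check_metadata_dir(tmp).items():
            print(name, "OK" if ok else "INVALID JSON")
```

To use it on real data, call check_metadata_dir with the directory you configured in tool/config.py and inspect any entries reported as False.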
Step 5: Launch the tool
Launch the tool by running the ./run.sh script.
Load Example Data
We have provided an example folder in the repository for all the layouts. To load the example layouts, follow the steps below:
- Ensure that the metaDataDir field in tool/config.py is set to example/metaData
- Run the ./run.sh script to load the app