Step 1: Configuring Box Supervised Layout
Step 1.1: Generation of Meta Data File
The metadata file for the Box Supervised layout acts as a configuration file: it specifies the paths to the json data for the layout and the layout type the data belongs to (Box-supervised Region Parsing in this case). It can also carry other display settings for the layout rendered from this metadata, such as whether certain outputs should be shown in the plot with a mask.
Shown below is an example metadata file:
{
"metaData":{
"pageLayout": "Box-supervised Region Parsing",
"pageName": "BoundaryNet",
"dataPaths": {
"Class 1":"/path/to/class_1.json",
"Class 2":"/path/to/class_2.json",
.
.
.
"Class N":"/path/to/class_N.json"
},
"outputMasks": {"output1":1,"output2":0, ... , "outputN":1},
"defaultDisplayed": ["output1-polygon","output2-polygon", ... , "outputM-polygon"]
}
}
- metaData: dict (required). All the relevant data is within the metaData attribute of the json data.
- pageLayout: string (required). pageLayout tells the app which layout the data should be rendered in. "Box-supervised Region Parsing" is the expected value for the data to be rendered in the Box Supervised page layout.
- pageName: string (required). pageName is the name of this instance of the page layout. Since there can be multiple instances of the same layout for different data, the pageName attribute is essential to identify the particular page of interest.
- dataPaths: dict (required). dataPaths is a dictionary whose keys are the names of the groups of data (train/test/val, for example) and whose values are the paths to the corresponding json data.
- outputMasks: dict. outputMasks is a dictionary whose keys are output names (groundTruth and modelPrediction, for example) and whose values are booleans (1 or 0, as in the example above) indicating whether the output should have a mask in the visualization plot.
- defaultDisplayed: list. defaultDisplayed is a list of all the outputs the user wishes to be displayed and locked by default. Each entry must name the exact output to display: as shown in the example above, provide not just the output name but also which of its pts, polygon or mask outputs should be shown.
An example instance of the metadata file can be found here
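A metadata file of this shape can also be generated with a short script. The following is a minimal sketch using only Python's standard json module; the helper name build_metadata, the output names and the file paths are hypothetical placeholders and are not part of DocVisor itself.

```python
import json


def build_metadata(page_name, data_paths, output_masks=None, default_displayed=None):
    """Assemble a metadata dict for the Box-supervised Region Parsing layout.

    Hypothetical helper for illustration; only the key names come from
    the format described above.
    """
    meta = {
        "pageLayout": "Box-supervised Region Parsing",
        "pageName": page_name,
        "dataPaths": dict(data_paths),
    }
    # Optional keys are written only when the caller supplies them.
    if output_masks is not None:
        meta["outputMasks"] = dict(output_masks)
    if default_displayed is not None:
        meta["defaultDisplayed"] = list(default_displayed)
    return {"metaData": meta}


metadata = build_metadata(
    page_name="BoundaryNet",
    data_paths={"Class 1": "/path/to/class_1.json"},  # placeholder path
    output_masks={"output1": 1, "output2": 0},
    default_displayed=["output1-polygon"],
)

# Serialize; write this string to a json file inside your metadata directory.
metadata_json = json.dumps(metadata, indent=2)
print(metadata_json)
```

Building the dictionary in code keeps the required keys in one place, which may help when several instances of the layout are generated from the same script.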
Multiple Format Support
DocVisor is compatible with files in the MS-COCO segmentation format. To use a json file in the COCO format, add dataFormat to your metadata file with the value "coco". An example file is shown below:
{
"metaData":{
"pageLayout": "Box-supervised Region Parsing",
"pageName": "BoundaryNet",
"dataPaths": {
"Class 1":"/path/to/class_1.json",
"Class 2":"/path/to/class_2.json",
.
.
.
"Class N":"/path/to/class_N.json"
},
"outputMasks": {"output1":1,"output2":0, ... , "outputN":1},
"defaultDisplayed": ["output1-polygon","output2-polygon", ... , "outputM-polygon"],
"dataFormat": "coco"
}
}
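If a metadata dictionary already exists in a script, switching it to COCO input only requires the extra dataFormat key. The sketch below illustrates this under that assumption; use_coco_format is a hypothetical helper name and the path is a placeholder.

```python
import copy


def use_coco_format(metadata):
    """Return a copy of a metadata dict with "dataFormat" set to "coco"."""
    updated = copy.deepcopy(metadata)  # leave the caller's dict untouched
    updated["metaData"]["dataFormat"] = "coco"
    return updated


original = {
    "metaData": {
        "pageLayout": "Box-supervised Region Parsing",
        "pageName": "BoundaryNet",
        "dataPaths": {"Class 1": "/path/to/class_1.json"},  # placeholder path
    }
}
coco_metadata = use_coco_format(original)
print(coco_metadata["metaData"]["dataFormat"])
```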
Step 1.2: Setup Directory
Create a directory that contains only metadata (config) files, with one json file per instance of the layout. For example, to visualize data for two box supervised models, the following is a possible directory structure:
metaData/
- boxSupervised_model1.json
- boxSupervised_model2.json
Each of the json files follows the format described above. Note that no two instances of the layout should have the same pageName key.
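The uniqueness requirement on pageName can be checked before launching the tool. The following is a rough sketch, not a DocVisor utility: find_duplicate_page_names is a hypothetical helper that scans every json file in a directory, so it is slightly stricter than the rule above because it compares names across all layouts. The demonstration writes two deliberately clashing files into a temporary directory.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def find_duplicate_page_names(metadata_dir):
    """Map each pageName used by more than one file to the sorted file names."""
    files_by_name = defaultdict(list)
    for path in sorted(Path(metadata_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            page_name = json.load(handle)["metaData"]["pageName"]
        files_by_name[page_name].append(path.name)
    return {name: files for name, files in files_by_name.items() if len(files) > 1}


# Demonstration on a throwaway directory with two clashing files.
with tempfile.TemporaryDirectory() as tmp:
    for file_name in ("boxSupervised_model1.json", "boxSupervised_model2.json"):
        content = {
            "metaData": {
                "pageLayout": "Box-supervised Region Parsing",
                "pageName": "BoundaryNet",  # same name in both files on purpose
                "dataPaths": {},
            }
        }
        (Path(tmp) / file_name).write_text(json.dumps(content), encoding="utf-8")
    duplicates = find_duplicate_page_names(tmp)

print(duplicates)
```

An empty result means every file in the directory declares a distinct pageName.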
Multiple Layouts
To also visualize results for the OCR or Fully Automatic layouts, create similar metadata files for them as described in their respective documentation, and place all of them in this metaData directory.
Example of directory containing all three layouts:
metaData/
- ocr_handwritten.json
- fullyAutomatic.json
- boxSupervised.json
The DocVisor tool will launch a single instance each of the OCR, Fully Automatic and Box Supervised layouts.
Multiple Instances of the Same Layout
The DocVisor tool allows the user to view multiple instances of the same layout in a single session. For example, to visualize outputs for two box supervised models as well as a Fully Automatic layout and an OCR layout, create a separate json file with a unique pageName key for each instance and place the files in the metaData directory.
The metaData directory will look like:
metaData/
- ocr_handwritten.json
- fullyAutomatic.json
- boxSupervised_model1.json
- boxSupervised_model2.json
For the above directory structure, the DocVisor tool will load two instances of the Box Supervised layout and a single instance each of the OCR and Fully Automatic layouts.
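To preview how many instances of each layout a set of metadata files would produce, the pageLayout values can be tallied. This is only an illustrative sketch: count_layout_instances is a hypothetical helper, and the pageLayout strings for the OCR and Fully Automatic layouts are shown as angle-bracket placeholders because their actual values are defined in those layouts' own documentation.

```python
from collections import Counter


def count_layout_instances(metadata_list):
    """Count how many metadata dicts request each pageLayout value."""
    return Counter(meta["metaData"]["pageLayout"] for meta in metadata_list)


# In-memory stand-ins for the four files in the directory listing above.
metadata_list = [
    {"metaData": {"pageLayout": "<ocr-layout-value>", "pageName": "ocr"}},
    {"metaData": {"pageLayout": "<fully-automatic-layout-value>", "pageName": "fa"}},
    {"metaData": {"pageLayout": "Box-supervised Region Parsing", "pageName": "model1"}},
    {"metaData": {"pageLayout": "Box-supervised Region Parsing", "pageName": "model2"}},
]

counts = count_layout_instances(metadata_list)
for layout, instances in counts.items():
    print(f"{layout}: {instances}")
```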
Step 1.3: Formatting Data Files
The data files are json files which contain the data to be visualized. Shown below is an example data file:
[
{
"imagePath": "/home/user/new_jpg_data/Bhoomi_data/images/AAVARNI VYAKHYA/GOML/991/19.jpg",
"outputs": {
"ground_truth": [],
"gcn_output": [],
"encoder_output": []
},
"metrics": {
"iou": 0.6072467801928089,
"hd": 115.10864433221339
},
"regionLabel": "Physical Degradation",
"bbox": [256, 7, 302, 165],
"collection": "bhoomi"
}
]
The json file with the data for the box-supervised layout is a list of dictionaries. Each dictionary represents the data of a single region of an image. The structure of each dictionary is as follows:
- imagePath: string (required). imagePath is the path to the image associated with the outputs to be visualized.
- outputs: dict (required). outputs is a dictionary of lists. Each list stores the points of an output polygon to be visualized. In the above example, there are three outputs to be visualized: ground_truth, gcn_output and encoder_output. Note that the coordinates should be relative to the bounding box of each region: for each point, subtract the x and y coordinates of the top-left corner of the bounding box.
- metrics: dict. metrics is a dictionary containing the values of all the metrics computed for the data. In the above example, there are two metrics present for the data: iou and hd.
- regionLabel: string (required). regionLabel contains the label of the region class for the data. In the above example, the regionLabel is Physical Degradation.
- bbox: list (required). bbox is the user-annotated bounding box for the current region. The bounding box should be of the form [x_0, y_0, w, h], where x_0 is the x-coordinate of the top-left corner, y_0 is the y-coordinate of the top-left corner, and w and h are the width and height of the bounding box respectively.
- collection: string. collection is additional information, such as the collection in the dataset that the current data belongs to.
Example instances of the json data file can be found here
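The requirement that polygon coordinates be relative to the bounding box can be sketched as a small conversion step. The example below assumes each point is stored as an [x, y] pair, which the description above does not state explicitly; to_bbox_relative, the polygon values and the image path are hypothetical and chosen only for illustration.

```python
import json


def to_bbox_relative(points, bbox):
    """Shift absolute (x, y) points so the bbox top-left corner becomes (0, 0)."""
    x_0, y_0, _w, _h = bbox
    return [[x - x_0, y - y_0] for x, y in points]


bbox = [256, 7, 302, 165]  # [x_0, y_0, w, h], as in the example data file
# Hypothetical polygon given in full-image coordinates.
absolute_polygon = [[260, 10], [550, 12], [548, 170], [258, 168]]

entry = {
    "imagePath": "/path/to/image.jpg",  # placeholder path
    "outputs": {"ground_truth": to_bbox_relative(absolute_polygon, bbox)},
    "regionLabel": "Physical Degradation",
    "bbox": bbox,
}

# The data file is a list of such entries.
data_file_json = json.dumps([entry], indent=2)
print(entry["outputs"]["ground_truth"])
# prints [[4, 3], [294, 5], [292, 163], [2, 161]]
```

The optional metrics and collection keys are omitted here; they can be added to the entry in the same way.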
Step 1.4: Updating the Config File
In the tool/config.py file, change metaDataDir to point to the metaData directory that you have created.
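As a rough illustration, and assuming metaDataDir is a plain string variable in tool/config.py (check the actual file, since its exact form is not shown here), the change might look like this, with a placeholder path:

```python
# tool/config.py (sketch; assumes metaDataDir is a plain string variable)
# Placeholder path: replace it with the directory created in Step 1.2.
metaDataDir = "/path/to/metaData"
```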
Step 2: Launch the Tool
To launch the tool, run the ./run.sh file. The tool will load on localhost at port 8501. If port 8501 is already in use, check the terminal to see which port the tool has been loaded on.
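Whether port 8501 is already taken can be checked before launching. The following is a minimal sketch using Python's standard socket module; port_in_use is a hypothetical helper and is not part of DocVisor.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds.
        return probe.connect_ex((host, port)) == 0


if port_in_use(8501):
    print("Port 8501 is taken; check the terminal for the port actually used.")
else:
    print("Port 8501 is free.")
```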
Help and Feedback
For feedback or queries, you can either visit the GitHub repo and create an issue or use the discussion forum. For more details, you can mail docvisor.iiith@gmail.com.