Step 1: Configuring Box Supervised Layout
Step 1.1 Generation of Meta Data File
The metadata file for the box supervised tool acts as a configuration file. In it, the user specifies the paths of the json data for the layout and the layout type the data belongs to (Box-supervised Region Parsing in this case). The user can also specify other configuration for the rendered layout, such as whether certain outputs should be shown in the plot with a mask.
Shown below is an example metadata file:
{
"metaData":{
"pageLayout": "Box-supervised Region Parsing",
"pageName": "BoundaryNet",
"dataPaths": {
"Class 1":"/path/to/class_1.json",
"Class 2":"/path/to/class_2.json",
.
.
.
"Class N":"/path/to/class_N.json"
},
"outputMasks": {"output1":1,"output2":0, ... , "outputN":1},
"defaultDisplayed": ["output1-polygon","output2-polygon", ... , "outputM-polygon"]
}
}
- metaData: dict (required). All the relevant data is within the metaData attribute of the json data.
- pageLayout: string (required). pageLayout tells the app which layout the data should be rendered in. "Box-supervised Region Parsing" is the expected value for the data to be rendered in the box supervised page layout.
- pageName: string (required). pageName is the name of the instance of the page layout for the data. Since there can be multiple instances of the same layout for different data, the pageName attribute is essential to identify the particular page of interest.
- dataPaths: dict (required). dataPaths is a dictionary whose keys are the names of the groups of data (train/test/val for example) and whose values are the paths to the corresponding json data.
- outputMasks: dict. outputMasks is a dictionary whose keys are output names (for example groundTruth and modelPrediction) and whose values are booleans (written as 1 or 0 in the example above) indicating whether the output should have a mask in the visualization plot.
- defaultDisplayed: list. defaultDisplayed is a list of all the outputs the user wishes to be displayed and locked by default. The user should enter the exact output they wish to display. As shown in the example above, the user must provide not just the output to be shown, but also which of the pts, polygon or mask outputs should be shown.
An example instance of the metadata file can be found here
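The field descriptions above can be turned into a quick sanity check before launching the tool. The following is a minimal sketch, not part of DocVisor: the function name check_metadata and the wording of the messages are hypothetical, and it only checks the keys marked as required above.

```python
# Hypothetical helper (not part of DocVisor): checks the keys marked
# "required" in the metadata description above.
REQUIRED_KEYS = ("pageLayout", "pageName", "dataPaths")


def check_metadata(config):
    """Return a list of problems found in a metadata dictionary (empty if none)."""
    meta = config.get("metaData")
    if not isinstance(meta, dict):
        return ["missing required 'metaData' dictionary"]
    problems = []
    for key in REQUIRED_KEYS:
        if key not in meta:
            problems.append(f"missing required key '{key}'")
    # dataPaths maps a group name to the path of its json data file.
    if "dataPaths" in meta and not isinstance(meta["dataPaths"], dict):
        problems.append("'dataPaths' must be a dictionary")
    return problems


example = {
    "metaData": {
        "pageLayout": "Box-supervised Region Parsing",
        "pageName": "BoundaryNet",
        "dataPaths": {"Class 1": "/path/to/class_1.json"},
    }
}
print(check_metadata(example))  # prints []
```

An empty list means the required keys are present; it says nothing about whether the paths exist or whether the tool accepts the values.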
Multiple Format Support
DocVisor is compatible with files of the MS-COCO segmentation format. To use a json file which is of the COCO format, simply add dataFormat to your metadata file with the value "coco". An example file is shown below:
{
"metaData":{
"pageLayout": "Box-supervised Region Parsing",
"pageName": "BoundaryNet",
"dataPaths": {
"Class 1":"/path/to/class_1.json",
"Class 2":"/path/to/class_2.json",
.
.
.
"Class N":"/path/to/class_N.json"
},
"outputMasks": {"output1":1,"output2":0, ... , "outputN":1},
"defaultDisplayed": ["output1-polygon","output2-polygon", ... , "outputM-polygon"],
"dataFormat": "coco"
}
}
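If you prefer to generate the metadata file from a script rather than write it by hand, something like the sketch below may help. The helper build_metadata and its parameter names are assumptions made for illustration; only the json keys come from the examples above.

```python
import json


def build_metadata(page_name, data_paths, output_masks=None,
                   default_displayed=None, data_format=None):
    """Assemble a metadata dictionary for the box supervised layout.

    Hypothetical helper: optional keys are added only when a value is given.
    """
    meta = {
        "pageLayout": "Box-supervised Region Parsing",
        "pageName": page_name,
        "dataPaths": dict(data_paths),
    }
    if output_masks is not None:
        meta["outputMasks"] = dict(output_masks)
    if default_displayed is not None:
        meta["defaultDisplayed"] = list(default_displayed)
    if data_format is not None:
        meta["dataFormat"] = data_format  # "coco" for MS-COCO style files
    return {"metaData": meta}


config = build_metadata(
    "BoundaryNet",
    {"Class 1": "/path/to/class_1.json"},
    output_masks={"output1": 1},
    default_displayed=["output1-polygon"],
    data_format="coco",
)
text = json.dumps(config, indent=4)
print(text)
```

To save the result, write text to a json file, or call json.dump(config, handle, indent=4) on an open file handle.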
Step 1.2 Setup Directory
Create a directory containing only config files. Each instance of the layout should have a json file. For example, if we are trying to visualize data for two box supervised models, the following is a possible directory structure:
metaData/
- boxSupervised_model1.json
- boxSupervised_model2.json
Each of the json files follows the format described above. Note that no two instances of the layout should have the same pageName value.
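Because every instance needs a distinct pageName, it can be worth scanning the directory for clashes before launching. The sketch below is a hypothetical helper (find_duplicate_page_names is not part of DocVisor) and assumes every .json file in the directory is a metadata file of the form shown earlier.

```python
import json
from collections import defaultdict
from pathlib import Path


def find_duplicate_page_names(metadata_dir):
    """Map each pageName used by more than one file to the files using it."""
    files_by_name = defaultdict(list)
    for path in sorted(Path(metadata_dir).glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            config = json.load(handle)
        # Files without a pageName are grouped under the key None.
        page_name = config.get("metaData", {}).get("pageName")
        files_by_name[page_name].append(path.name)
    return {name: files for name, files in files_by_name.items()
            if len(files) > 1}
```

Calling find_duplicate_page_names("metaData") returns an empty dictionary when every file has its own pageName; otherwise each clashing name is listed with the files that share it.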
Multiple Layouts
If you also want to visualize results for the OCR or Fully Automatic layouts, create similar metadata files for them as described in their corresponding documentation, and place all of them in this metaData directory.
Example of directory containing all three layouts:
metaData/
- ocr_handwritten.json
- fullyAutomatic.json
- boxSupervised.json
The DocVisor tool will launch a single instance each of the OCR, Fully Automatic and Box Supervised layouts.
Multiple Instances of the Same Layout
The DocVisor tool allows the user to view multiple instances of the same layout in a single session. For example, if you would like to visualize outputs for two box supervised models, as well as a fully automatic layout and an OCR layout, create a separate json file for each instance, each with a unique pageName value, and place the files in the metaData directory.
The metaData directory will look like:
metaData/
- ocr_handwritten.json
- fullyAutomatic.json
- boxSupervised_model1.json
- boxSupervised_model2.json
For the above directory structure, the DocVisor tool will load two instances of the Box Supervised layout and a single instance each of the OCR and Fully Automatic layouts.
Step 1.3 Formatting Data Files
The data files are json files which contain the data that is to be visualized. Shown below is an example data file:
[
{
"imagePath": "/home/user/new_jpg_data/Bhoomi_data/images/AAVARNI VYAKHYA/GOML/991/19.jpg",
"outputs": {
"ground_truth": [],
"gcn_output": [],
"encoder_output": []
},
"metrics": {
"iou": 0.6072467801928089,
"hd": 115.10864433221339
},
"regionLabel": "Physical Degradation",
"bbox": [256, 7, 302, 165],
"collection": "bhoomi"
}
]
The json file with the data for the box-supervised layout is a list of dictionaries. Each dictionary represents the data of a single region of an image. The structure of each dictionary is as follows:
- imagePath: string (required). imagePath is the path to the image associated with the outputs to be visualized.
- outputs: dict (required). outputs is a dictionary of lists. Each list in turn stores the points of the output polygon to be visualized. In the above example, there are three outputs to be visualized: ground_truth, gcn_output and encoder_output. Note that the coordinates should be relative to the bounding box for each region: subtract the x and y coordinates of the top-left corner of the bounding box from each point.
- metrics: dict. metrics is a dictionary containing the values of all the metrics computed for the data. In the above example, there are two metrics present for the data: iou and hd.
- regionLabel: string (required). regionLabel contains the label of the region class for the data. In the above example, the regionLabel is Physical Degradation.
- bbox: list (required). bbox is the user-annotated bounding box for the current region. The bounding box should be of the form [x_0, y_0, w, h], where x_0 is the x-coordinate of the top-left corner, y_0 is the y-coordinate of the top-left corner, and w and h are the width and height of the bounding box respectively.
- collection: string. collection is additional information, such as the collection the current data belongs to in the dataset.
Example instances of the json data file can be found here
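The bounding-box-relative coordinates required for outputs can be computed with a small helper. The sketch below is illustrative only: the function name to_bbox_relative is hypothetical, the polygon points are made up, and it assumes each polygon is stored as a list of [x, y] pairs (the example data file above leaves the lists empty, so check the example instances for the exact point format).

```python
def to_bbox_relative(points, bbox):
    """Shift absolute image coordinates so the bbox top-left corner is (0, 0).

    points: list of [x, y] pairs in image coordinates (assumed format).
    bbox:   [x_0, y_0, w, h] as described above.
    """
    x0, y0, _w, _h = bbox
    return [[x - x0, y - y0] for x, y in points]


bbox = [256, 7, 302, 165]
# Made-up polygon tracing the corners of the bounding box itself.
absolute_polygon = [[256, 7], [558, 7], [558, 172], [256, 172]]
relative = to_bbox_relative(absolute_polygon, bbox)
print(relative)  # [[0, 0], [302, 0], [302, 165], [0, 165]]
```

The same shift would be applied to every output of a region (ground_truth, gcn_output and encoder_output in the example) before writing the data file.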
Step 1.4 Updating the Config File
In the tool/config.py file, change metaDataDir to point to the metaData directory that you have created.
Step 2: Launch the Tool
To launch the tool, run the ./run.sh file. The tool will load on localhost at port 8501. If port 8501 is already in use, check the terminal output to see which port the tool has been loaded on.
Help and Feedback
For feedback or queries, you can either visit the GitHub repo and create an issue or use the discussion forum. For more details, you can mail docvisor.iiith@gmail.com.