What is Image to Text Mapping
In the process of learning to map a line image to its corresponding text, an intermediate stage of the model produces an alignment matrix via the attention component. We leverage this attention/alignment matrix to introduce a cool feature: Image to Text mapping and Text to Image mapping.
The attention produced at each time step can be viewed as a probability distribution that tells the decoder which features of the image to pay attention to. Using this matrix, we find the location the model is paying attention to at each time step. To see how this is done, visit this page:
Step 1: In the Visualize Settings of the OCR-Layout, select the Image Selection Component.
Step 2: Select any sub-portion of the image.
Step 3: View the corresponding text highlighted on all the models that have attentions provided.
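The mapping behind these steps can be sketched roughly as follows. This is only an illustration, not the tool's actual code: it assumes the attention matrix has one row per decoded character and one column per horizontal feature position of the line image, and the function names (attention_peaks, selected_char_indices) are hypothetical.

```python
import numpy as np


def attention_peaks(attention, image_width):
    """Return the x-coordinate (in pixels) each decoding step attends to most.

    attention: array of shape (time_steps, feature_columns); each row is a
    probability distribution over horizontal positions (assumed layout).
    """
    attention = np.asarray(attention, dtype=float)
    n_cols = attention.shape[1]
    # Most probable feature column at every time step.
    peak_cols = attention.argmax(axis=1)
    # Convert the column index to the pixel centre of that column.
    return (peak_cols + 0.5) * (image_width / n_cols)


def selected_char_indices(attention, image_width, x_start, x_end):
    """Indices of predicted characters whose attention peak lies in the selection."""
    peaks = attention_peaks(attention, image_width)
    return [i for i, x in enumerate(peaks) if x_start <= x <= x_end]


# Toy example: 3 predicted characters, 4 feature columns, 40-pixel-wide image.
attention = np.array([
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.6, 0.2, 0.1],
    [0.0, 0.1, 0.2, 0.7],
])
predicted = "cat"
hits = selected_char_indices(attention, image_width=40, x_start=0, x_end=20)
print("".join(predicted[i] for i in hits))  # the first two characters, "ca"
```

Taking the argmax of each row is the simplest choice; a weighted average of column positions would give a smoother estimate when the attention is spread over several columns.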
The following gif should give you an idea of the Image Selection Feature and how cool it is.
Text Font Size
For easier visualization, users might want to align the predicted text with the characters in the image. To do so, follow the steps below:
- Open the sidebar (if it is not open)
- Expand the Render component under Settings
- Go to Text Font Size and choose the font-size value that best suits the image/dataset.
The following gif should help you understand the feature.
Notice that the words of the predicted text are almost in line with the words in the line image, which helps with visualization.