Revision as of 15:32, 7 July 2016

EvaLTex (Evaluating Text Localization) is a unified evaluation framework used to measure the performance of text detection and text segmentation algorithms. It takes as input text objects represented either by rectangle coordinates or by irregular masks. The output consists of a complex set of scores, at local and global levels, and a visual representation of the behavior of the analysed algorithm through quality histograms.

For more details on the evaluation protocol, read the scientific paper published in the Image and Vision Computing (IVC) journal. Details on the visual representation of the evaluation can be found in the article published in the Proceedings of the International Conference on Document Analysis and Recognition (ICDAR).

Please cite the IVC paper in all publications that use the EvaLTex tool and the ICDAR paper in all publications that use the histogram representation.

Performance measurements

Local evaluation

For each matched GT object we assign two quality measures, Coverage (Cov) and Accuracy (Acc):

Cov computes the rate of the matched area with respect to the GT object area
Acc computes the rate of the matched area with respect to the detection area
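For bounding boxes, the two local measures can be sketched as below. This is only an illustration for a single GT box matched to a single detection box; the `Box` type, the function names, and the reading of (x, y) as the top-left corner are assumptions, and the full matching protocol (including one-to-many matches) is the one described in the IVC paper.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned rectangle; (x, y) is assumed to be the top-left corner."""
    x: int
    y: int
    width: int
    height: int

    @property
    def area(self) -> int:
        return self.width * self.height


def intersection_area(a: Box, b: Box) -> int:
    """Area shared by two boxes (0 when they do not overlap)."""
    w = min(a.x + a.width, b.x + b.width) - max(a.x, b.x)
    h = min(a.y + a.height, b.y + b.height) - max(a.y, b.y)
    return max(w, 0) * max(h, 0)


def coverage_accuracy(gt: Box, det: Box) -> Tuple[float, float]:
    """Cov = matched area / GT area, Acc = matched area / detection area."""
    matched = intersection_area(gt, det)
    cov = matched / gt.area if gt.area else 0.0
    acc = matched / det.area if det.area else 0.0
    return cov, acc


# Boxes borrowed from the "kills" entries of the examples on this page.
gt = Box(275, 264, 390, 186)
det = Box(272, 264, 392, 186)
cov, acc = coverage_accuracy(gt, det)
print(f"Cov = {cov:.4f}, Acc = {acc:.4f}")
```

Here the detection is slightly wider than the GT box but ends one pixel earlier, so both Cov and Acc fall just below 1.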

Global evaluation

The Recall ($R$) computes the amount of detected text. We compute 3 measures: a global $R$, a quantitative $R$ that measures the number of detected objects (regardless of the matched area), and a qualitative $R$ that corresponds to the rate of the detected text area with respect to the number of true positives ($TP$). $G$ denotes the number of GT objects.

$R_{G}={\frac {\sum Cov}{G}}$
$R_{quant}={\frac {TP}{G}}$
$R_{qual}={\frac {\sum Cov}{TP}}$
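The three recall formulas can be written as a short function. This is a sketch under stated assumptions: `recall_scores` is a hypothetical helper, `num_gt` stands for $G$ (taken to be the number of GT objects), and $TP$ is taken to be the number of GT objects that received a match. The coverage values in the usage lines are made up for illustration.

```python
from typing import Dict, Sequence


def recall_scores(coverages: Sequence[float], num_gt: int) -> Dict[str, float]:
    """Recall measures from the Cov values of the matched GT objects.

    coverages: one Cov value per matched GT object (assumed to be the TPs).
    num_gt:    G, the number of GT objects taken into account.
    """
    tp = len(coverages)
    total_cov = sum(coverages)
    return {
        "global": total_cov / num_gt if num_gt else 0.0,  # R_G
        "quant": tp / num_gt if num_gt else 0.0,          # R_quant
        "qual": total_cov / tp if tp else 0.0,            # R_qual
    }


# Illustrative values: 3 of 4 GT objects matched.
scores = recall_scores([1.0, 0.8, 0.6], num_gt=4)
print(scores)
```

With these definitions the global recall is the product of the quantitative and qualitative recalls, since $\frac{TP}{G}\cdot\frac{\sum Cov}{TP}=\frac{\sum Cov}{G}$.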

The Precision ($P$) computes the rate of detections that have a match in the GT. Similarly to $R$, we compute 3 measures: a global $P$, a quantitative $P$ that measures the number of valid detections (regardless of the matched area), and a qualitative $P$ that corresponds to the rate of the matched detection area with respect to the number of true positives ($TP$). The total number of detections is computed as the sum of $TP$ and $FP$.

$P_{G}={\frac {\sum Acc}{TP+FP}}$
$P_{quant}={\frac {TP}{TP+FP}}$
$P_{qual}={\frac {\sum Acc}{TP}}$
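The precision measures can be sketched in the same way. The helper `precision_scores` is hypothetical, and it assumes that the qualitative precision is normalized by $TP$, by analogy with $R_{qual}$. Under that assumption, feeding it the Accuracy values of the 18 matched GT objects from the sample report in the Output section, together with its single false positive, gives values that agree with the Precision figures printed in that report.

```python
from typing import Dict, Sequence


def precision_scores(accuracies: Sequence[float], num_fp: int) -> Dict[str, float]:
    """Precision measures from the Acc values of the true positives.

    accuracies: one Acc value per true positive.
    num_fp:     number of false positives (detections without a GT match).
    """
    tp = len(accuracies)
    detections = tp + num_fp
    total_acc = sum(accuracies)
    return {
        "global": total_acc / detections if detections else 0.0,  # P_G
        "quant": tp / detections if detections else 0.0,          # P_quant
        "qual": total_acc / tp if tp else 0.0,                    # P_qual (assumed / TP)
    }


# Accuracy values of the matched GT objects (1-14, 39, 40, 42, 43) in the sample report.
sample_acc = [
    0.991792, 0.994543, 0.954386, 0.967474, 0.993222, 0.960362,
    0.987906, 0.977346, 0.999885, 0.986737, 0.977986, 0.944269,
    0.991489, 1.0, 0.998527, 0.965439, 1.0, 0.822794,
]
scores = precision_scores(sample_acc, num_fp=1)
for name, value in scores.items():
    print(f"P_{name} = {value:.6f}")
```

The quantitative value is simply 18/19, and the global value equals the product of the other two.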

Input format

The framework takes as input .txt files containing the coordinates of the bounding boxes surrounding the text objects and binary images corresponding to the text object masks.

Ground truth (GT)

The GT files contain the reference against which the detection and segmentation results are compared. For text detection tasks using bounding boxes, a .txt file is enough. If the text objects are represented by irregular masks, an additional labeled image is needed.

Text detection

The GT format contains the following attributes:

img name
image height, image width
text object

ID: unique text object ID
region ID: ID of the region to which the object belongs
"transcription": can be empty
text reject: option that decides whether a text object should be counted or not; can be set to f (default, counted) or t (not taken into account)
x: x coordinate of the bounding box
y: y coordinate of the bounding box
width: width of the bounding box
height: height of the bounding box

e.g.

img_1
960,1280
1,1,"Tiredness",f,38,43,882,172
2,2,"kills",f,275,264,390,186
3,3,"A",f,0,699,77,131
4,3,"short",f,128,705,355,134
5,3,"break",f,542,710,396,131
6,4,"could",f,87,884,370,137
7,4,"save",f,517,919,314,105
8,5,"your",f,166,1095,302,136
9,5,"life",f,530,1069,213,137
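A GT file in this format can be read with a small regular expression. The sketch below is not part of EvaLTex: `parse_gt_objects` and the field names are hypothetical, and the pattern assumes that transcriptions never contain a double quote. Because it searches for object records anywhere in the text, it does not depend on whether records are separated by spaces or by line breaks.

```python
import re
from typing import Dict, List

# ID, region ID, "transcription", reject flag, x, y, width, height
GT_OBJECT = re.compile(
    r'(?P<id>\d+),(?P<region>\d+),"(?P<text>[^"]*)",(?P<reject>[ft]),'
    r'(?P<x>\d+),(?P<y>\d+),(?P<width>\d+),(?P<height>\d+)'
)


def parse_gt_objects(gt_text: str) -> List[Dict[str, object]]:
    """Return one dictionary per text object found in a GT file's content."""
    objects = []
    for match in GT_OBJECT.finditer(gt_text):
        fields = match.groupdict()
        obj: Dict[str, object] = {
            key: int(fields[key])
            for key in ("id", "region", "x", "y", "width", "height")
        }
        obj["text"] = fields["text"]
        obj["reject"] = fields["reject"] == "t"  # True means "do not count"
        objects.append(obj)
    return objects


sample = (
    'img_1 960,1280 '
    '1,1,"Tiredness",f,38,43,882,172 '
    '2,2,"kills",f,275,264,390,186 '
    '3,3,"A",f,0,699,77,131'
)
for obj in parse_gt_objects(sample):
    print(obj["id"], obj["region"], obj["text"], obj["x"], obj["y"])
```

Objects whose reject flag is set could then be filtered out before counting GT objects.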

Text segmentation

To evaluate text segmentation we use, in addition to the .txt file, a labeled image in which each character is labeled differently. Each GT object is represented by a character. Character-level GT objects cannot be grouped into regions, and consequently each text object has a different region tag. The x, y, width and height define the bounding box of each character.

e.g.

img_1
960,1280
1,1,"",f,384,43,101,166
2,2,"",f,142,44,46,164
3,3,"",f,38,47,106,163
4,4,"",f,192,80,71,126
5,5,"",f,269,80,100,131
6,6,"",f,501,81,97,126
7,7,"",f,721,81,97,131

Detection/Segmentation

The detection .txt file format differs slightly from the GT one:

no image size
no region tag
no reject option

e.g.

img_1
1,"",272,264,392,186
2,"",34,40,886,175
3,"",168,1082,300,148
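The three differences listed above can be made concrete by converting GT records into detection-style records, which drops the image size, the region tag and the reject flag. The function `gt_to_detection_text` is a hypothetical helper, not an EvaLTex utility, and the separator between records (a newline here) is an assumption. Such a conversion could serve as a sanity check, since evaluating the GT against itself would be expected to give perfect scores.

```python
import re

# Captures ID, transcription and box; skips the region ID and the reject flag.
GT_OBJECT = re.compile(
    r'(\d+),\d+,"([^"]*)",[ft],(\d+),(\d+),(\d+),(\d+)'
)


def gt_to_detection_text(gt_text: str) -> str:
    """Rewrite the content of a GT file in the detection format."""
    image_name = gt_text.split()[0]  # first token is the image name
    records = [
        '{},"{}",{},{},{},{}'.format(*match.groups())
        for match in GT_OBJECT.finditer(gt_text)
    ]
    return "\n".join([image_name] + records)


gt_sample = (
    'img_1 960,1280 '
    '1,1,"Tiredness",f,38,43,882,172 '
    '2,2,"kills",f,275,264,390,186'
)
print(gt_to_detection_text(gt_sample))
```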

Output

The evaluation results are given in two forms:

local evaluation .txt file for each image

EvaLTex statistics - image img_1
General
    Number of GTs =43
    Number of detections = 19
    Number of false positives =1
    Number of true positives =18
Global results
    Recall=0.414803
    Recall_noSplit=0.414803
    Precision=0.921798
    Split=0.428571
    FScore=0.572144
    FScore_noSplit=0.572144
Quantity results
    Recall=0.967873
    Precision=0.947368
Quality results
    Recall=0.428571
    Recall_noSplit=0.967873
    Precision=0.973009
    Coverage histogram = {0.571429, 0, 0, 0, 0, 0, 0.0238095, 0, 0.0238095, 0.380952}
    Accuracy histogram = {0.288977, 0.00137028, 0.00091352, 0.0022838, 0.00365408, 0.00471985, 0.00517661, 0.0103532, 0.0235993, 0.658952}
EMD results
    Recall=0.420952
    Recall_noSplit=0.420952
    Precision=0.926316
    FScore=0.578853
    FScore_noSplit=0.578853
Local evaluation
GT object 1
    Coverage = 1    Accuracy = 0.991792    Split = 1
GT object 2
    Coverage = 0.809862    Accuracy = 0.994543    Split = 1
GT object 3
    Coverage = 1    Accuracy = 0.954386    Split = 1
GT object 4
    Coverage = 0.998092    Accuracy = 0.967474    Split = 1
GT object 5
    Coverage = 1    Accuracy = 0.993222    Split = 1
GT object 6
    Coverage = 1    Accuracy = 0.960362    Split = 1
GT object 7
    Coverage = 1    Accuracy = 0.987906    Split = 1
GT object 8
    Coverage = 1    Accuracy = 0.977346    Split = 1
GT object 9
    Coverage = 0.99977    Accuracy = 0.999885    Split = 1
GT object 10
    Coverage = 1    Accuracy = 0.986737    Split = 1
GT object 11
    Coverage = 1    Accuracy = 0.977986    Split = 1
GT object 12
    Coverage = 1    Accuracy = 0.944269    Split = 1
GT object 13
    Coverage = 1    Accuracy = 0.991489    Split = 1
GT object 14
    Coverage = 1    Accuracy = 1    Split = 1
GT object 15
    Coverage = 0    Accuracy = 0    Split = 0
GT object 16
    Coverage = 0    Accuracy = 0    Split = 0
GT object 17
    Coverage = 0    Accuracy = 0    Split = 0
GT object 18
    Coverage = 0    Accuracy = 0    Split = 0
GT object 19
    Coverage = 0    Accuracy = 0    Split = 0
GT object 20
    Coverage = 0    Accuracy = 0    Split = 0
GT object 21
    Coverage = 0    Accuracy = 0    Split = 0
GT object 22
    Coverage = 0    Accuracy = 0    Split = 0
GT object 23
    Coverage = 0    Accuracy = 0    Split = 0
GT object 24
    Coverage = 0    Accuracy = 0    Split = 0
GT object 25
    Coverage = 0    Accuracy = 0    Split = 0
GT object 26
    Coverage = 0    Accuracy = 0    Split = 0
GT object 27
    Coverage = 0    Accuracy = 0    Split = 0
GT object 28
    Coverage = 0    Accuracy = 0    Split = 0
GT object 29
    Coverage = 0    Accuracy = 0    Split = 0
GT object 30
    Coverage = 0    Accuracy = 0    Split = 0
GT object 31
    Coverage = 0    Accuracy = 0    Split = 0
GT object 32
    Coverage = 0    Accuracy = 0    Split = 0
GT object 33
    Coverage = 0    Accuracy = 0    Split = 0
GT object 34
    Coverage = 0    Accuracy = 0    Split = 0
GT object 35
    Coverage = 0    Accuracy = 0    Split = 0
GT object 36
    Coverage = 0    Accuracy = 0    Split = 0
GT object 37
    Coverage = 0    Accuracy = 0    Split = 0
GT object 38
    Coverage = 0    Accuracy = 0    Split = 0
GT object 39
    Coverage = 0.977941    Accuracy = 0.998527    Split = 1
GT object 40
    Coverage = 0.99478    Accuracy = 0.965439    Split = 1
GT object 41
    Coverage = 0    Accuracy = 0    Split = 0
GT object 42
    Coverage = 0.661783    Accuracy = 1    Split = 1
GT object 43
    Coverage = 0.979484    Accuracy = 0.822794    Split = 1
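The per-object part of such a report is regular enough to be read back with a regular expression, for example to list the GT objects that were missed. The sketch below is an illustration only: `parse_local_results` is a hypothetical helper, and it assumes the labels appear exactly as in the sample above and that values are printed in plain decimal notation.

```python
import re
from typing import Dict, Tuple

NUMBER = r'([0-9]*\.?[0-9]+)'
GT_RESULT = re.compile(
    r'GT object\s+(\d+)\s+Coverage\s*=\s*' + NUMBER
    + r'\s+Accuracy\s*=\s*' + NUMBER
    + r'\s+Split\s*=\s*' + NUMBER
)


def parse_local_results(report: str) -> Dict[int, Tuple[float, float, float]]:
    """Map each GT object ID to its (Coverage, Accuracy, Split) values."""
    return {
        int(m.group(1)): (float(m.group(2)), float(m.group(3)), float(m.group(4)))
        for m in GT_RESULT.finditer(report)
    }


# Excerpt of the sample report above.
report = (
    "GT object 1 Coverage = 1 Accuracy = 0.991792 Split = 1 "
    "GT object 2 Coverage = 0.809862 Accuracy = 0.994543 Split = 1 "
    "GT object 15 Coverage = 0 Accuracy = 0 Split = 0"
)
results = parse_local_results(report)
missed = [obj_id for obj_id, (cov, _acc, _split) in results.items() if cov == 0]
print("missed GT objects:", missed)
```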

global evaluation for an entire dataset

Run the evaluation

Parameters to run the tool

Datasets

ICDAR 2013

Born-digital

ground truth .txt
labeled images

Natural scene

ground truth .txt
labeled images

Downloads

Credits

EvaLTex was written by Ana Stefania CALARASANU. Please send any suggestions, comments or bug reports to calarasanu@lrde.epita.fr

Please cite the IVC paper in all publications that use the EvaLTex tool and the ICDAR paper in all publications that use the histogram representation.

