Resources for the DAS 2014 submission

From LRDE

Abstract

Mathematical morphology, when used in the field of document image analysis and processing, is often limited to some classical yet basic tools. The domain however features a lesser-known class of powerful operators, called connected filters. These operators present an important property: they do not shift nor create contours. Most connected filters are linked to a tree-based representation of an image's contents, where nodes represent connected components while edges express an inclusion relation. By computing attributes for each node of the tree from the corresponding connected component, then selecting nodes according to an attribute-based criterion, one can either filter or recognize objects in an image. This strategy is very intuitive, efficient, easy to implement, and actually well-suited to processing images of magazines. Examples of applications include image simplification, smart binarization, and object identification.

Documents


This page gathers resources related to the article "Planting, Growing and Pruning Trees: Connected Filters Applied to Document Image Analysis", submitted to the 11th IAPR International Workshop on Document Analysis Systems (DAS 2014).


Full Resolution Illustrations from the Paper

Click on image thumbnails to see the full resolution images.

Fig. 8: Comparison of an opening based on a structuring element and an algebraic opening.

Lena.png
Lena-opening.png
Lena-area-opening.png

Initial image.

Structural opening (with a disc).

Algebraic (area) opening.
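
The area opening of Fig. 8 can be illustrated with a minimal sketch. It assumes a small gray-level image stored as a list of lists and 4-connectivity, and it uses a naive threshold decomposition rather than the tree-based computation discussed in the paper. The function names `components` and `area_opening` are hypothetical and do not come from the authors' software. Each connected component of an upper threshold set plays the role of a tree node, its area is the attribute, and the criterion keeps components whose area reaches `min_area`.

```python
from collections import deque


def components(mask):
    """Yield the 4-connected components of a binary mask as lists of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[r][c] = True
            queue, comp = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            yield comp


def area_opening(image, min_area):
    """Gray-level area opening by threshold decomposition (illustrative, not optimized).

    Each output pixel receives the highest level h such that its connected
    component in the upper threshold set {image >= h} has at least min_area pixels.
    """
    levels = sorted({v for row in image for v in row})
    out = [[levels[0]] * len(image[0]) for _ in image]
    for h in levels:  # ascending, so higher surviving levels overwrite lower ones
        mask = [[v >= h for v in row] for row in image]
        for comp in components(mask):
            if len(comp) >= min_area:  # attribute-based criterion on the component
                for y, x in comp:
                    out[y][x] = h
    return out


image = [
    [0, 0, 0, 0, 0],
    [0, 9, 0, 5, 5],
    [0, 0, 0, 5, 5],
    [0, 0, 0, 5, 7],
]
filtered = area_opening(image, min_area=3)
```

With `min_area=3`, the isolated bright pixel is removed and the single-pixel peak inside the block is flattened to the level of the block, while the outline of the block itself is left untouched: contours are removed or kept, never moved.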


Fig. 9: Sample uses of connected operators.

Mp00215c 50p.png
Mp00215c 50p cartouche.png

(a) Filtering out everything but boxes.

Mp00032c 50p.png
Mp00032c 50p lines.png

(b) Showing filtered lines.

Mp00550c 50p.png
Mp00550c 50p text.png

(c) An image featuring almost only text.


Additional Illustrations

Zero-crossing contours of different Laplace operators

Lap4 0i.png
Lap17 0i.png

(a) Δ4

(b) LoG 17x17

Lapm51 0i.png
Lapm17 0i.png

(c) Δ□51

(d) Δ□17

Sample applications computed from the simplified tree structure

234.png
234lap.png

(a) Color input image.

(b) Δ□51

234mean.png
234smart.png

(c) Simplification.

(d) Binarization.


Additional Results

Several morphological methods were run on 63 document images from the PRImA Layout Analysis Dataset. The results are grouped by method below.



Simplification

Mp00032c mean color.png

Simplification of color images.
Download all results (40MB)


Binarization

Mp00032c bin smart.png

"Smart" binarization taking into account reverse video.
Download all results (9MB)

Identification

Mp00032c superposed.png

Images combining the results of several identifications: background (black), non-specific objects (light gray contours), object holes (yellow contours), thin line separators (green contours), text boxes (light blue contours), noise (dark red contours) and spurious shapes (dark gray contours).
Download all results (85MB)

Show-through extraction

Mp00076c 50p fg.png
Mp00076c 50p bg.png


Download all results (157MB)

Box filtering

Mp00290c 50p filtering boxes.png


Download all results (26MB)

Line filtering

Mp00252c 50p filtering lines.png


Download all results (300KB)

Text filtering

Mp00076c 50p filtering text.png


Download all results (109MB)

Bibtex (lrde.bib)

@InProceedings{	  lazzara.14.das,
  author	= {Guillaume Lazzara and Thierry G\'eraud and Roland
		  Levillain},
  title		= {Planting, Growing and Pruning Trees: Connected Filters
		  Applied to Document Image Analysis},
  booktitle	= {Proceedings of the 11th IAPR International Workshop on
		  Document Analysis Systems (DAS)},
  year		= 2014,
  address	= {Tours, France},
  pages		= {36--40},
  month		= apr,
  organization	= {IAPR},
  abstract	= {Mathematical morphology, when used in the field of
		  document image analysis and processing, is often limited to
		  some classical yet basic tools. The domain however features
		  a lesser-known class of powerful operators, called
		  connected filters. These operators present an important
		  property: they do not shift nor create contours. Most
		  connected filters are linked to a tree-based representation
		  of an image's contents, where nodes represent connected
		  components while edges express an inclusion relation. By
		  computing attributes for each node of the tree from the
		  corresponding connected component, then selecting nodes
		  according to an attribute-based criterion, one can either
		  filter or recognize objects in an image. This strategy is
		  very intuitive, efficient, easy to implement, and actually
		  well-suited to processing images of magazines. Examples of
		  applications include image simplification, smart
		  binarization, and object identification. }
}