Our collaboration with SWT (now EMC Captiva) between 1999 and 2006 focused on recognizing the type of scanned documents. The aim was to process incoming surface mail (letters, invoices, forms) once it had been scanned. Using similarity measures, we first developed a classifier that automatically discovers the different types of documents received. A statistical study then extracts descriptors for each document type, in the form of relevant sub-sections (thumbnails). Finally, we developed a recognition engine based on the theory of evidence to sort incoming mail on the fly.
This work resulted in two patents and received a European award for innovation (IST 2004).