Improving OCR k-NN classifier's training set

From LRDE

Author: Anthony Seure. Year: 2015.

Abstract

One step of an OCR toolchain is classifying detected characters, which can be lowercase letters, capital letters, or digits. To do so, our OCR computes a wavelet-based descriptor for each character image, and this descriptor is then classified. The classification step currently relies on a multiclass k-NN classifier. Since the testing step depends heavily on the number of samples in the training set, the training set can be modified to improve the scores. Our work focuses on possible improvements to the training set.
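To illustrate why the testing step is tied to the training set, here is a minimal sketch of a brute-force multiclass k-NN classifier. It is not the toolchain's actual implementation: the function name `knn_classify`, the choice of Euclidean distance, and the toy 2-D vectors standing in for wavelet-based descriptors are all assumptions made for the example.

```python
from collections import Counter
import math


def knn_classify(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train: list of (descriptor, label) pairs (hypothetical layout).
    query: descriptor of the character to classify.
    """
    # Brute force: the query is compared against every training sample,
    # so the cost of one classification grows with the training set size.
    neighbours = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy 2-D vectors standing in for real descriptors (assumption).
train = [
    ((0.0, 0.1), "a"), ((0.1, 0.0), "a"), ((0.2, 0.1), "a"),
    ((1.0, 1.1), "B"), ((0.9, 1.0), "B"), ((1.1, 0.9), "B"),
    ((2.0, 0.0), "7"), ((2.1, 0.1), "7"), ((1.9, 0.2), "7"),
]

print(knn_classify(train, (0.05, 0.05)))
```

In this sketch every query scans the whole training set, so removing redundant samples would speed up testing, while the samples that remain still determine which label wins the vote.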