CSI Seminar 2017-01-31
10h30 Automatic detection of text zones on identity documents. – Anne-Claire Berthet
Our goal is to detect text zones on any type of identity document, from any country, recorded by smartphone cameras. Hence, we may have to segment Latin letters, Cyrillic letters, or ideograms. Moreover, some identity papers have watermarks or letters as a background pattern that should be filtered out. Furthermore, the context of the video (luminosity, background, ...) has to be taken into account, in order to ensure that the binarized image has a minimal number of components to filter. To this end, the proposed processing uses morphological operators to segment the text and limit the number of components to process.
11h00 K shortest-paths in Vcsn – Sébastien Piat
The K shortest paths computation can be very time-consuming, especially when applied to the enormous automata used in linguistics. Hence, after having implemented one of the state-of-the-art solutions to the problem (namely Yen's algorithm) in Vcsn, the next step was to implement the best known solution for automata with cycles: Eppstein's algorithm. This work will describe our different implementations and compare their performance.
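For readers unfamiliar with Yen's algorithm, the sketch below illustrates the general idea on a plain weighted directed graph. It is not Vcsn code and does not use the Vcsn API: the dictionary-based graph and the function names `dijkstra` and `yen_k_shortest_paths` are assumptions made for this illustration only. Note that this variant returns loopless paths, whereas Eppstein's algorithm also allows paths that go through cycles.

```python
import heapq


def dijkstra(graph, source, target, banned_nodes=frozenset(), banned_edges=frozenset()):
    """Return (cost, path) for a cheapest path, or None if target is unreachable.

    `graph` maps each node to a dict {successor: non-negative weight}.
    """
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for succ, weight in graph.get(node, {}).items():
            if succ in settled or succ in banned_nodes or (node, succ) in banned_edges:
                continue
            heapq.heappush(queue, (cost + weight, succ, path + [succ]))
    return None


def yen_k_shortest_paths(graph, source, target, k):
    """Return up to k loopless paths as (cost, path) pairs, cheapest first."""
    first = dijkstra(graph, source, target)
    if first is None:
        return []
    accepted = [first]
    candidates = []  # min-heap of (cost, path) deviations not yet accepted
    while len(accepted) < k:
        _, last_path = accepted[-1]
        for i in range(len(last_path) - 1):
            spur_node = last_path[i]
            root = last_path[:i + 1]
            # Forbid the edges that accepted paths sharing this root take next,
            # so that the spur path is forced to deviate from all of them.
            banned_edges = {
                (p[i], p[i + 1])
                for _, p in accepted
                if len(p) > i + 1 and p[:i + 1] == root
            }
            # Forbid the root's earlier nodes to keep the path loopless.
            banned_nodes = set(root[:-1])
            spur = dijkstra(graph, spur_node, target, banned_nodes, banned_edges)
            if spur is None:
                continue
            root_cost = sum(graph[u][v] for u, v in zip(root, root[1:]))
            candidate = (root_cost + spur[0], root[:-1] + spur[1])
            if candidate not in candidates and candidate not in accepted:
                heapq.heappush(candidates, candidate)
        if not candidates:
            break
        accepted.append(heapq.heappop(candidates))
    return accepted


# Small illustrative graph (made up for this example).
example_graph = {
    "C": {"D": 3, "E": 2},
    "D": {"F": 4},
    "E": {"D": 1, "F": 2, "G": 3},
    "F": {"G": 2, "H": 1},
    "G": {"H": 2},
}
three_best = yen_k_shortest_paths(example_graph, "C", "H", 3)
```

Each new path costs one shortest-path search per node of the previously accepted path, which gives a sense of why the computation becomes expensive on very large automata.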
11h30 Time Delay Neural Networks-Based Universal Background Model for Speaker Recognition – Valentin Iovene
In speaker recognition, deep neural networks (DNN) have recently proved to be more efficient than traditional Gaussian mixture models (GMM) for collecting Baum-Welch statistics that can be used for i-vector extraction. However, this type of architecture can be too slow at evaluation time, requiring a GPU to achieve real-time performance. We show how triphone posteriors produced by a time delay neural network (TDNN) can be used to create a more lightweight supervised GMM serving as a universal background model (UBM) inside the i-vector framework. The equal error rate (EER) obtained with this approach is compared to those obtained with traditional GMM-based UBMs.
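As a reminder of what the EER measures, the sketch below approximates it from two lists of verification scores: it sweeps a decision threshold and reports the point where the false acceptance rate and the false rejection rate are closest. It is a generic illustration, not the evaluation code used in this work; the function name `equal_error_rate` and the sample scores are assumptions made for the example.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    A trial is accepted when its score is >= the threshold. The returned value
    is the mean of the false acceptance and false rejection rates at the
    threshold where the two rates are closest to each other.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, eer = None, None
    for threshold in thresholds:
        # Impostor trials wrongly accepted at this threshold.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        # Target trials wrongly rejected at this threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Made-up scores: one target trial scores below one impostor trial.
overlapping = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
# Made-up scores: the two classes are perfectly separated.
separated = equal_error_rate([0.9, 0.8], [0.2, 0.1])
```

A lower EER means the system separates target and impostor trials better, which is why it serves as the single figure of comparison between the two kinds of UBM.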