I-Vectors distance learning with Convolutional Neural Networks



In this work, we apply Convolutional Neural Networks (CNNs) to the task of speaker recognition. The CNN is trained to approximate a distance measure between two i-vectors, where each i-vector is a fixed-length representation of a speaker. Unlike the commonly used cosine similarity measure, the function approximated by a CNN can be non-linear. We compare the performance of this model with that of cosine similarity scoring and PLDA classification.
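To make the contrast between the two scoring functions concrete, the following minimal sketch computes the cosine similarity baseline and a toy convolutional score for a pair of i-vectors. It is an illustration under stated assumptions, not the architecture used in this work: the function names, the two-channel stacking of the pair, the single 1-D convolution layer, the pooling, and the randomly initialised (untrained) weights are all hypothetical choices.

```python
import numpy as np


def cosine_similarity(a, b):
    """Baseline score: cosine of the angle between two i-vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def cnn_score(a, b, kernels, bias, w_out, b_out):
    """Toy non-linear score for an i-vector pair (hypothetical architecture).

    kernels: (n_filters, 2, width) 1-D convolution filters over the stacked pair
    bias:    (n_filters,) per-filter bias
    w_out:   (n_filters,) weights of the final linear layer
    b_out:   scalar bias of the final linear layer
    """
    x = np.stack([a, b])                       # shape (2, dim): one channel per i-vector
    n_filters, _, width = kernels.shape
    n_pos = x.shape[1] - width + 1             # "valid" convolution, stride 1
    feats = np.empty((n_filters, n_pos))
    for f in range(n_filters):
        for p in range(n_pos):
            feats[f, p] = np.sum(kernels[f] * x[:, p:p + width]) + bias[f]
    feats = np.maximum(feats, 0.0)             # ReLU: the source of non-linearity
    pooled = feats.mean(axis=1)                # global average pooling per filter
    logit = float(pooled @ w_out + b_out)
    return 1.0 / (1.0 + np.exp(-logit))        # sigmoid maps the score into (0, 1)


# Example with random data; the dimension and weights are placeholders.
rng = np.random.default_rng(0)
dim, n_filters, width = 50, 4, 5
a, b = rng.standard_normal(dim), rng.standard_normal(dim)
kernels = 0.1 * rng.standard_normal((n_filters, 2, width))
bias = np.zeros(n_filters)
w_out = 0.1 * rng.standard_normal(n_filters)

baseline = cosine_similarity(a, b)
learned = cnn_score(a, b, kernels, bias, w_out, 0.0)
```

In practice the weights would be learned from labelled same-speaker and different-speaker pairs; with random weights, as here, the convolutional score carries no speaker information and only demonstrates the shape of the computation.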