Comparison Between Factor Analysis and GMM Support Vector Machines for Speaker Verification

Authors: Najim Dehak, Réda Dehak, Patrick Kenny, Pierre Dumouchel
Where: Proceedings of the Speaker and Language Recognition Workshop (IEEE-Odyssey 2008)
Place: Stellenbosch, South Africa
Type: inproceedings
Date: 2007-09-25

Abstract

We present a comparison between speaker verification systems based on factor analysis modeling and support vector machines using GMM supervectors as features. All systems used the same acoustic features and they were trained and tested on the same data sets. We test two types of kernel (one linear, the other non-linear) for the GMM support vector machines. The results show that factor analysis using speaker factors gives the best results on the core condition of the NIST 2006 speaker recognition evaluation. The difference is particularly marked on the English language subset. Fusion of all systems gave an equal error rate of 4.2% (all trials) and 3.2% (English trials only).

Bibtex (lrde.bib)

@InProceedings{	  dehak.08.odysseyb,
  author	= {Najim Dehak and R{\'e}da Dehak and Patrick Kenny and Pierre
		  Dumouchel},
  title		= {Comparison Between Factor Analysis and {GMM} Support
		  Vector Machines for Speaker Verification},
  booktitle	= {Proceedings of the Speaker and Language Recognition
		  Workshop (IEEE-Odyssey 2008)},
  year		= 2008,
  address	= {Stellenbosch, South Africa},
  month		= jan,
  abstract	= {We present a comparison between speaker verification
		  systems based on factor analysis modeling and support
		  vector machines using GMM supervectors as features. All
		  systems used the same acoustic features and they were
		  trained and tested on the same data sets. We test two types
		  of kernel (one linear, the other non-linear) for the GMM
		  support vector machines. The results show that factor
		  analysis using speaker factors gives the best results on
		  the core condition of the NIST 2006 speaker recognition
		  evaluation. The difference is particularly marked on the
		  English language subset. Fusion of all systems gave an
		  equal error rate of 4.2\% (all trials) and 3.2\% (English
		  trials only).}
}
