GMM Weights Adaptation Based on Subspace Approaches for Speaker Verification
From LRDE
- Authors
- Najim Dehak, O Plchot, M H Bahari, L Burget, H Van hamme, Réda Dehak
- Where
- Odyssey 2014, The Speaker and Language Recognition Workshop
- Place
- Joensuu, Finland
- Type
- inproceedings
- Projects
- SpeakerId
- Date
- 2014-06-16
Abstract
In this paper, we explored the use of Gaussian Mixture Model (GMM) weights adaptation for speaker verification. We compared two different subspace weight adaptation approaches: Subspace Multinomial Model (SMM) and Non-Negative factor Analysis (NFA). Both techniques achieved similar results and seemed to outperform the retraining maximum likelihood (ML) weight adaptation. However, the training process for the NFA approach is substantially faster than the SMM technique. The i-vector fusion between each weight adaptation approach and the classical i-vector yielded slight improvements on the telephone part of the NIST 2010 Speaker Recognition Evaluation dataset.
Bibtex (lrde.bib)
@InProceedings{ dehak.14.odyssey,
  author    = {Najim Dehak and O. Plchot and M.H. Bahari and L. Burget and
               H. Van hamme and R\'eda Dehak},
  title     = {{GMM} Weights Adaptation Based on Subspace Approaches for
               Speaker Verification},
  booktitle = {Odyssey 2014, The Speaker and Language Recognition Workshop},
  year      = 2014,
  address   = {Joensuu, Finland},
  month     = jun,
  abstract  = {In this paper, we explored the use of Gaussian Mixture Model
               (GMM) weights adaptation for speaker verification. We compared
               two different subspace weight adaptation approaches: Subspace
               Multinomial Model (SMM) and Non-Negative factor Analysis (NFA).
               Both techniques achieved similar results and seemed to
               outperform the retraining maximum likelihood (ML) weight
               adaptation. However, the training process for the NFA approach
               is substantially faster than the SMM technique. The i-vector
               fusion between each weight adaptation approach and the
               classical i-vector yielded slight improvements on the telephone
               part of the NIST 2010 Speaker Recognition Evaluation dataset.},
  pages     = {48--53}
}