Speaker Diarization based on the Mel Frequency Cepstral Coefficients

From LRDE

Abstract

Speaker diarization has emerged as an increasingly important and dedicated domain of speech research. It addresses the problem of determining "who spoke when?", that is, finding the intervals during which each speaker is active. By computing Mel Frequency Cepstral Coefficient (MFCC) features from a given speech signal and applying Independent Component Analysis (ICA) to these features, we can segment the speech with the help of a Hidden Markov Model (HMM). We will use this algorithm for speaker diarization in a verification system with multi-speaker audio data, such as the interview or microphone segments of the NIST Speaker Recognition Evaluation.
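
To make the first stage of this pipeline concrete, the following is a minimal sketch of MFCC extraction using only NumPy. It is not the implementation behind this work: the function names (`mfcc`, `mel_filterbank`) and every parameter value (16 kHz sampling, 25 ms frames, 10 ms hop, 26 mel filters, 13 coefficients) are assumptions chosen because they are common defaults. The ICA and HMM stages are not shown.

```python
import numpy as np


def hz_to_mel(f):
    # Common analytic approximation of the mel scale.
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, centre, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (centre - left)
        falling = (right - fft_freqs) / (right - centre)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    """Return an array of shape (n_frames, n_ceps). All defaults are assumed."""
    signal = np.asarray(signal, dtype=float)
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum of each (zero-padded) frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # Log energy in each mel band; the small constant avoids log(0).
    bank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_energy = np.log(power @ bank.T + 1e-10)
    # Unnormalised DCT-II across the mel bands, keeping the first n_ceps terms.
    n = np.arange(n_filters)
    k = np.arange(n_ceps)
    basis = np.cos(np.pi * k[:, None] * (2 * n[None, :] + 1) / (2 * n_filters))
    return log_energy @ basis.T
```

Calling `mfcc(x)` on a one-dimensional array `x` of audio samples yields one feature vector per frame. In a pipeline like the one described above, these per-frame vectors would be the input to the ICA step.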