DocumentCode
290097
Title
Segmentation of speech using speaker identification
Author
Wilcox, Lynn ; Chen, Francine ; Kimber, Don ; Balasubramanian, Vijay
Author_Institution
Xerox PARC, Palo Alto, CA, USA
Volume
i
fYear
1994
fDate
19-22 Apr 1994
Abstract
This paper describes techniques for segmentation of conversational speech based on speaker identity. Speaker segmentation is performed using Viterbi decoding on a hidden Markov model network consisting of interconnected speaker sub-networks. Speaker sub-networks are initialized using Baum-Welch training on data labeled by speaker, and are iteratively retrained based on the previous segmentation. If data labeled by speaker is not available, agglomerative clustering is used to approximately segment the conversational speech according to speaker prior to Baum-Welch training. The distance measure for the clustering is a likelihood ratio in which speakers are modeled by Gaussian distributions. The distance between merged segments is recomputed at each stage of the clustering, and a duration model is used to bias the likelihood ratio. Segmentation accuracy using agglomerative clustering initialization matches accuracy using initialization with speaker-labeled data.
Keywords
Gaussian distribution; Gaussian processes; Viterbi decoding; hidden Markov models; speaker recognition; speech processing; Baum-Welch training; Gaussian distributions; agglomerative clustering; conversational speech segmentation; distance measure; duration model; hidden Markov model network; initialization; interconnected speaker sub-networks; likelihood ratio; segmentation accuracy; speaker identification; speaker labeled data; speaker segmentation; Cepstral analysis; Indexing; Iterative algorithms; Iterative decoding; Speech; Statistical distributions; Streaming media; Viterbi algorithm
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
Conference_Location
Adelaide, SA
ISSN
1520-6149
Print_ISBN
0-7803-1775-0
Type
conf
DOI
10.1109/ICASSP.1994.389330
Filename
389330