DocumentCode :
2176555
Title :
Incorporating alignments into Conditional Random Fields for grapheme to phoneme conversion
Author :
Lehnen, Patrick ; Hahn, Stefan ; Guta, Andreas ; Ney, Hermann
Author_Institution :
Comput. Sci. Dept., RWTH Aachen Univ., Aachen, Germany
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
4916
Lastpage :
4919
Abstract :
Conditional Random Fields (CRFs) are a state-of-the-art approach to natural language processing tasks like grapheme-to-phoneme (g2p) conversion, which is used to produce pronunciations or pronunciation variants for almost all ASR pronunciation lexica. One drawback of CRFs is that training requires an alignment between graphemes and phonemes, usually even a 1-to-1 alignment. The quality of the g2p result depends heavily on this alignment. Since these alignments are usually not annotated within the corpora, external models have to be used to produce such an alignment in a preprocessing step. In this work, we propose two approaches to integrate the alignment generation directly and efficiently into the CRF training process. Whereas the first approach relies on linear segmentation as a starting point, the second approach considers all possible alignments given certain constraints. Both methods have been evaluated on two English g2p tasks, namely NETtalk and Celex, on which state-of-the-art results have been reported in the literature. The proposed approaches lead to results comparable to the state of the art.
Keywords :
speech recognition; ASR pronunciation lexica; CRF training process; Celex; G2P conversion; NETtalk; conditional random fields; grapheme-to-phoneme conversion; linear segmentation; natural language processing tasks; Automata; Biological system modeling; Hidden Markov models; Joints; Manuals; Mathematical model; Training; Alignments; CRF; EM Algorithm; G2P
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5947458
Filename :
5947458