DocumentCode :
3485487
Title :
Efficient determinization of tagged word lattices using categorial and lexicographic semirings
Author :
Shafran, Izhak ; Sproat, Richard ; Yarmohammadi, Mahsa ; Roark, Brian
Author_Institution :
Center for Spoken Language Understanding, USA
fYear :
2011
fDate :
11-15 Dec. 2011
Firstpage :
283
Lastpage :
288
Abstract :
Speech and language processing systems routinely need to apply finite-state operations (e.g., POS tagging) to results from intermediate stages (e.g., ASR output) that are naturally represented in a compact lattice form. Currently, such needs are met by converting the lattices into linear sequences (n-best scoring sequences) before and after applying the finite-state operations. In this paper, we eliminate the need for this conversion by addressing the problem of picking only the single-best scoring output labels for every input sequence. For this purpose, we define a categorial semiring that allows determinization over strings and incorporate it into a 〈Tropical, Categorial〉 lexicographic semiring. Through examples and empirical evaluations, we show how determinization in this lexicographic semiring produces the desired output. The proposed solution is general in nature and can be applied to multi-tape weighted transducers that arise in many applications.
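As an illustration of the idea in the abstract, the following is a minimal Python sketch of a lexicographic pair weight: a tropical cost paired with a tag sequence. It is a simplified assumption, not the paper's definition. In particular, the categorial semiring's division operation, which determinization relies on, is omitted, and the names `Weight`, `plus`, `times`, `path_weight`, and `best_tagging` are hypothetical. The sketch only shows how comparing costs first and tag strings second selects a single best tagging among competing paths for the same word sequence.

```python
from functools import reduce
from typing import List, NamedTuple, Tuple


class Weight(NamedTuple):
    """Hypothetical pair weight: (tropical cost, tag sequence)."""
    cost: float
    tags: Tuple[str, ...]


INF = float("inf")
ZERO = Weight(INF, ())   # additive identity: an unreachable path
ONE = Weight(0.0, ())    # multiplicative identity: free path, no tags


def plus(a: Weight, b: Weight) -> Weight:
    # Lexicographic choice: lower cost wins; ties are broken by
    # comparing the tag sequences (ordinary tuple ordering).
    return min(a, b)


def times(a: Weight, b: Weight) -> Weight:
    # Extending a path: costs add (tropical), tag sequences concatenate.
    if a.cost == INF or b.cost == INF:
        return ZERO
    return Weight(a.cost + b.cost, a.tags + b.tags)


def path_weight(arcs: List[Weight]) -> Weight:
    # Weight of one path is the product of its arc weights.
    return reduce(times, arcs, ONE)


def best_tagging(paths: List[List[Weight]]) -> Weight:
    # Sum over all paths that share the same input word sequence.
    return reduce(plus, (path_weight(p) for p in paths), ZERO)


# Three competing taggings of the same two-word input (made-up costs).
paths = [
    [Weight(1.0, ("NN",)), Weight(1.0, ("VBZ",))],   # total cost 2.0
    [Weight(0.5, ("VB",)), Weight(2.0, ("NNS",))],   # total cost 2.5
    [Weight(1.5, ("NN",)), Weight(1.0, ("NNS",))],   # total cost 2.5
]
best = best_tagging(paths)
print(best)
```

Because `plus` always returns one of its arguments, combining all paths for one input keeps exactly one tag sequence, which mirrors the single-best selection described in the abstract.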
Keywords :
natural language processing; sequences; speech recognition; transducers; categorial semiring; language processing; lexicographic semiring; multitape weighted transducer; single best scoring output label; speech processing; tagged word lattice; Acoustics; Complexity theory; Grammar; Lattices; Speech recognition; Tagging; Transducers;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Conference_Location :
Waikoloa, HI
Print_ISBN :
978-1-4673-0365-1
Electronic_ISBN :
978-1-4673-0366-8
Type :
conf
DOI :
10.1109/ASRU.2011.6163945
Filename :
6163945