DocumentCode
2769082
Title
Maximum entropy model parameterization with TF*IDF weighted vector space model
Author
Wang, Ye-Yi ; Acero, Alex
Author_Institution
Microsoft Res., Redmond
fYear
2007
fDate
9-13 Dec. 2007
Firstpage
213
Lastpage
218
Abstract
Maximum entropy (MaxEnt) models have been used in many spoken language tasks. The training of a MaxEnt model often involves an iterative procedure that starts from an initial parameterization and gradually updates it towards the optimum. Due to the convexity of its objective function (hence a global optimum on a training set), little attention has been paid to model initialization in MaxEnt training. However, MaxEnt model training often ends early before convergence to the global optimum, and prior distributions with hyper-parameters are often added to the objective function to prevent over-fitting. This paper shows that the initialization and regularization hyper-parameter setting may significantly affect the test set accuracy. It investigates the MaxEnt initialization/regularization based on an n-gram classifier and a TF*IDF weighted vector space model. The theoretically motivated TF*IDF initialization/regularization has achieved significant improvements over the baseline flat initialization/regularization, especially when training data are sparse. In contrast, the n-gram based initialization/regularization does not exhibit significant improvements.
Keywords
convergence of numerical methods; iterative methods; learning (artificial intelligence); speech processing; statistical distributions; MaxEnt model training; TF*IDF weighted vector space model; conditional probability distribution; convergence; initialization hyper-parameter setting; iterative procedure; maximum entropy model parameterization; n-gram classifier; regularization hyper-parameter setting; spoken language tasks; text classification; Convergence; Entropy; Iterative algorithms; Natural languages; Random variables; Space technology; Stochastic processes; Testing; Text categorization; Training data; Maximum entropy model; TF*IDF; model initialization; model regularization; n-gram classification model; vector space model
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
Conference_Location
Kyoto
Print_ISBN
978-1-4244-1746-9
Electronic_ISBN
978-1-4244-1746-9
Type
conf
DOI
10.1109/ASRU.2007.4430111
Filename
4430111