Title of article :
The Sichel model and the mixing and truncation order
Author/Authors :
Xavier Puig, Josep Ginebra & Marti Font
Issue Information :
Journal with serial issue number, year 2010
Abstract :
The analysis of word frequency count data can be very useful in authorship attribution problems. Zero-truncated
generalized inverse Gaussian–Poisson mixture models are very helpful in the analysis of these
kinds of data because their model-mixing density estimates can be used as estimates of the density of the
word frequencies of the vocabulary. It is found that this model provides excellent fits for the word frequency
counts of very long texts, where the truncated inverse Gaussian–Poisson special case fails because it does
not allow for the large degree of over-dispersion in the data. The role played by the three parameters of
this truncated GIG-Poisson model is also explored. Our second goal is to compare the fit of the truncated
GIG-Poisson mixture model with the fit of the model that results from switching the order of the mixing
and truncation stages. A heuristic interpretation of the mixing distribution estimates obtained under this
alternative GIG-truncated Poisson mixture model is also provided.
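The difference between the two orderings can be illustrated numerically. The sketch below is not the authors' code: it assumes the standard three-parameter GIG kernel lam^(p-1) * exp(-(a*lam + b/lam)/2), approximates the mixing density on a midpoint grid, and uses illustrative parameter values rather than estimates from the paper. All function names are hypothetical. It computes the probabilities under "mix, then truncate" and under "truncate, then mix", and also checks the general identity that a zero-truncated Poisson mixture with mixing density psi equals a mixture of zero-truncated Poissons with mixing density proportional to (1 - exp(-lam)) * psi(lam).

```python
import math

def gig_kernel(lam, p, a, b):
    """Unnormalised GIG density (assumed parametrisation): lam^(p-1) * exp(-(a*lam + b/lam)/2)."""
    return lam ** (p - 1.0) * math.exp(-0.5 * (a * lam + b / lam))

def poisson_pmf(r, lam):
    """Poisson probability of count r at rate lam, computed on the log scale."""
    return math.exp(r * math.log(lam) - lam - math.lgamma(r + 1))

def grid_weights(kernel, upper=60.0, n=20000):
    """Midpoint-rule nodes and normalised weights approximating a density on (0, upper)."""
    h = upper / n
    nodes = [h * (i + 0.5) for i in range(n)]
    raw = [kernel(x) for x in nodes]
    total = sum(raw)
    return nodes, [w / total for w in raw]

def truncated_mixture_pmf(r, nodes, weights):
    """Mix the Poisson over lam first, then truncate the mixture at zero."""
    numerator = sum(w * poisson_pmf(r, x) for x, w in zip(nodes, weights))
    prob_zero = sum(w * math.exp(-x) for x, w in zip(nodes, weights))
    return numerator / (1.0 - prob_zero)

def mixture_of_truncated_pmf(r, nodes, weights):
    """Truncate each Poisson at zero first, then mix over lam."""
    return sum(
        w * poisson_pmf(r, x) / -math.expm1(-x)  # -expm1(-x) = 1 - exp(-x)
        for x, w in zip(nodes, weights)
    )

# Illustrative parameter values only (assumption, not taken from the paper).
p, a, b = -0.5, 1.0, 1.0
nodes, psi = grid_weights(lambda x: gig_kernel(x, p, a, b))
# Reweighted mixing density, proportional to (1 - exp(-lam)) * psi(lam).
_, psi_star = grid_weights(lambda x: -math.expm1(-x) * gig_kernel(x, p, a, b))

for r in range(1, 6):
    print(
        r,
        truncated_mixture_pmf(r, nodes, psi),        # mix, then truncate
        mixture_of_truncated_pmf(r, nodes, psi),     # truncate, then mix (same psi)
        mixture_of_truncated_pmf(r, nodes, psi_star) # truncate, then mix (reweighted psi)
    )
```

With the same mixing density, the first two columns differ, which is why the two models can fit the same data differently. The first and third columns agree, which suggests one way to relate mixing density estimates obtained under the two models.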
Keywords :
Poisson mixture, stylometry, truncated mixture, truncated model, word frequency, categorical data, mixture model, generalized inverse Gaussian
Journal title :
JOURNAL OF APPLIED STATISTICS