• DocumentCode
    3585010
  • Title
    Unsupervised lexical clustering of speech segments using fixed-dimensional acoustic embeddings
  • Author
    Kamper, Herman ; Jansen, Aren ; King, Simon ; Goldwater, Sharon
  • Author_Institution
    CSTR, Univ. of Edinburgh, Edinburgh, UK
  • fYear
    2014
  • Firstpage
    100
  • Lastpage
    105
  • Abstract
    Unsupervised speech processing methods are essential for applications ranging from zero-resource speech technology to modelling child language acquisition. One challenging problem is discovering the word inventory of the language: the lexicon. Lexical clustering is the task of grouping unlabelled acoustic word tokens according to type. We propose a novel lexical clustering model: variable-length word segments are embedded in a fixed-dimensional acoustic space in which clustering is then performed. We evaluate several clustering algorithms and find that the best methods produce clusters with wide variation in sizes, as observed in natural language. The best probabilistic approach is an infinite Gaussian mixture model (IGMM), which automatically chooses the number of clusters. Performance is comparable to that of non-probabilistic Chinese Whispers and average-linkage hierarchical clustering. We conclude that IGMM clustering of fixed-dimensional embeddings holds promise as the lexical clustering component in unsupervised speech processing systems.
  • Keywords
    Gaussian processes; linguistics; mixture models; natural language processing; pattern clustering; speech processing; unsupervised learning; IGMM clustering; child language acquisition modelling; fixed-dimensional acoustic space; fixed-dimensional embeddings; infinite Gaussian mixture model; lexical clustering component; lexical clustering model; natural language; probabilistic approach; unlabelled acoustic word tokens; unsupervised speech processing method; unsupervised speech processing systems; variable-length word segments; word inventory; zero-resource speech technology; Acoustics; Clustering algorithms; Gaussian mixture model; Probabilistic logic; Speech; Standards; Vectors; Lexical clustering; fixed-dimensional embeddings; lexical discovery; unsupervised learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language Technology Workshop (SLT), 2014 IEEE
  • Type
    conf
  • DOI
    10.1109/SLT.2014.7078557
  • Filename
    7078557