• DocumentCode
    423547
  • Title
    Linguistic feature extraction using independent component analysis
  • Author
    Honkela, Timo ; Hyvarinen, Aapo
  • Author_Institution
    Neural Networks Res. Center, Helsinki Univ. of Technol., Finland
  • Volume
    1
  • fYear
    2004
  • fDate
    25-29 July 2004
  • Lastpage
    284
  • Abstract
    Our aim is to find syntactic and semantic relationships between words based on the analysis of corpora. We propose the application of independent component analysis, which seems to have clear advantages over two classic methods: latent semantic analysis and self-organizing maps. Latent semantic analysis is a simple method for the automatic generation of concepts that are useful, e.g., in encoding documents for information retrieval purposes. However, these concepts cannot easily be interpreted by humans. Self-organizing maps can be used to generate an explicit diagram that characterizes the relationships between words. The resulting map reflects syntactic categories in its overall organization and semantic categories at the local level. The self-organizing map does not, however, provide any explicit, distinct categories for the words. Independent component analysis applied to word context data gives distinct features that reflect syntactic and semantic categories. Thus, independent component analysis gives features or categories that are both explicit and easily interpretable by humans. This result can be obtained without any human supervision or tagged corpora containing predetermined morphological, syntactic or semantic information.
  • Keywords
    feature extraction; independent component analysis; linguistics; self-organising feature maps; latent semantic analysis; linguistic feature extraction; self-organizing maps; Feature extraction; Frequency; Humans; Independent component analysis; Information analysis; Information retrieval; Laboratories; Matrix decomposition; Neural networks; Self organizing feature maps
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on
  • ISSN
    1098-7576
  • Print_ISBN
    0-7803-8359-1
  • Type
    conf
  • DOI
    10.1109/IJCNN.2004.1379914
  • Filename
    1379914