• DocumentCode
    2282487
  • Title

    Corpus-Based Extraction of Collocations in Chinese

  • Author

    Hui, Wang ; Donghong, Ji

  • Author_Institution
    Nat. Univ. of Singapore, Singapore
  • Volume
    3
  • fYear
    2008
  • fDate
    9-12 Dec. 2008
  • Firstpage
    330
  • Lastpage
    333
  • Abstract
    Collocations, i.e. sequences of words that habitually co-occur, play an essential part in human language. The present study aims to identify the detailed classification and typical features of collocations in Chinese, and to explore a new computer-assisted method for extracting and representing Chinese collocations. The investigation is based on the Singapore Chinese Corpus (SCC), the largest and only corpus of Singapore Chinese, of which 20 million words have been analysed. The central novel idea of this research is the combination of dictionaries, linguistic rules and statistical data in automatic collocation extraction. This combined method has not previously been proposed.
  • Keywords
    dictionaries; information retrieval; natural language processing; Chinese collocations; Chinese language; corpus-based extraction; human language; Dairy products; Data mining; Dictionaries; Displays; Humans; Intelligent agent; Large-scale systems; Natural languages; Rain; Statistics; Chinese Language; Collocation; Computer-assisted Extraction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
  • Conference_Location
    Sydney, NSW
  • Print_ISBN
    978-0-7695-3496-1
  • Type

    conf

  • DOI
    10.1109/WIIAT.2008.72
  • Filename
    4740791