• DocumentCode
    1619091
  • Title
    New resources trigger new technologies
  • Author
    Dong, Zhendong
  • Author_Institution
    Comput. & Language Inf. Res. Centre, Chinese Acad. of Sci., China
  • fYear
    2010
  • Abstract
    Two decades ago, large-scale corpora, as new language resources, brought forth a paradigm shift marked by the revival of empiricism. Now, however, some researchers, including the initiator of that revival, have begun to rethink: “what should they (next generation students) do when most of the low hanging fruit has been pretty much picked over?”, or to predict that the weird state of computational linguistics without general linguistics should be brought to an end. The author anticipates that a new adjustment of the paradigm is approaching. Once again, new language resources will trigger new NLP technologies. What will the new language resources be like? Resources like HowNet will soon be brought into full play. Corpora helped us achieve shallow practice; HowNet, by contrast, will take us deeper and thus may help us reach the high-hanging fruit. After a brief overview of HowNet, the author will give an overall demonstration of three HowNet-based application tools, all closely related to immediate potential demands: (1) Text-CT (vs. Text-X-ray), which can show all the senses of each word and expression in a text, rather than merely word strings or at most their POS; (2) Sense-Colony-Tester, which works on the basis of a sense colony activator and is able to measure the sense colony testing value of each sense in the text; (3) Morphological Decomposer, which can be used to deal with various types of OOVs by decomposing their morphological formation in both English and Chinese and extracting their meanings.
  • Keywords
    computational linguistics; computer aided instruction; natural language processing; HowNet; NLP technologies; OOV; Sense-Colony-Tester; Text-CT; empiricism; language resources; morphological decomposer
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Universal Communication Symposium (IUCS), 2010 4th International
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-7821-7
  • Type
    conf
  • DOI
    10.1109/IUCS.2010.5666778
  • Filename
    5666778