• DocumentCode
    3439290
  • Title

    Robust Language Learning via Efficient Budgeted Online Algorithms

  • Author

    Filice, S.; Castellucci, Giuseppe; Croce, Daniele; Basili, Roberto

  • Author_Institution
    Univ. of Roma Tor Vergata, Rome, Italy
  • fYear
    2013
  • fDate
    7-10 Dec. 2013
  • Firstpage
    913
  • Lastpage
    920
  • Abstract
    In many Natural Language Processing tasks, kernel learning makes it possible to define robust and effective systems. At the same time, Online Learning Algorithms are appealing for their incremental and continuous learning capability: they can track a target problem by constantly adapting to a dynamic environment. The drawback of using kernels in online settings is the continuous growth in complexity, in terms of time and memory usage, experienced in both the learning and classification phases. In this paper, we extend a state-of-the-art Budgeted Online Learning Algorithm that efficiently constrains the overall complexity. We introduce the principles of Fairness and Weight Adjustment: the former mitigates the effect of unbalanced datasets, while the latter improves the stability of the resulting models. The use of robust semantic kernel functions in Sentiment Analysis on Twitter improves the results with respect to the standard budgeted formulation. Performance is comparable with that of one of the most efficient Support Vector Machine implementations, while preserving all the advantages of online methods. The results are straightforward, considering that the task has been tackled without manually coded resources (e.g., WordNet or a polarity lexicon), mainly by exploiting distributional analysis of unlabeled corpora.
  • Keywords
    learning (artificial intelligence); natural language processing; social networking (online); support vector machines; Twitter; budgeted online learning algorithms; continuous complexity growth; continuous learning capability; fairness principle; incremental learning capability; natural language processing tasks; robust semantic kernel functions; sentiment analysis; support vector machine implementation; unbalanced dataset effect mitigation; weight adjustment principle; Algorithm design and analysis; Complexity theory; Kernel; Robustness; Semantics; Support vector machines; Training; online learning; sentiment analysis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on
  • Conference_Location
    Dallas, TX
  • Print_ISBN
    978-1-4799-3143-9
  • Type

    conf

  • DOI
    10.1109/ICDMW.2013.87
  • Filename
    6754019