• DocumentCode
    3765455
  • Title

    An improved system for sentence-level novelty detection in textual streams

  • Author

    Xinyu Fu;Eugene Ch'ng;Uwe Aickelin;Lanyun Zhang

  • Author_Institution
    International Doctoral Innovation Centre, The University of Nottingham, Ningbo, China
  • fYear
    2015
  • fDate
    7/1/2015 12:00:00 AM
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Novelty detection in news events has long been a difficult problem. A number of models perform well on specific data streams, but certain issues are far from solved, particularly in large data streams from the WWW, where the unpredictability of new terms requires adaptation of the vector space model. We present a novel event detection system based on incremental Term Frequency-Inverse Document Frequency (TF-IDF) weighting combined with Locality Sensitive Hashing (LSH). Our system adapts efficiently and effectively to new terms in the data streams by continually updating the vector space model. In terms of miss probability, our proposed novelty detection framework outperforms a recognised baseline system by approximately 16% when evaluated on a benchmark dataset from Google News.
  • Publisher
    IET
  • Conference_Titel
    Smart and Sustainable City and Big Data (ICSSC), 2015 International Conference on
  • Type

    conf

  • DOI
    10.1049/cp.2015.0250
  • Filename
    7446433