• DocumentCode
    3134869
  • Title
    Discovery of serial episodes from streams of events
  • Author
    Mielikäinen, Taneli
  • Author_Institution
    Dept. of Comput. Sci., Helsinki Univ., Finland
  • fYear
    2004
  • fDate
    21-23 June 2004
  • Firstpage
    447
  • Lastpage
    448
  • Abstract
    A very important problem in data mining is finding patterns in sequential data. There is a vast number of sources of sequential data, such as biological sequences, text documents, telecommunication alarm sequences, click streams, market basket data, Web logs, and other time series. Among the most popular patterns mined from sequential data are episodes, i.e., directed acyclic graphs with labeled nodes (Mannila et al., 1997). An important subclass of episodes is the serial episodes, which are essentially sequences. Serial episodes are useful in many applications, including network monitoring and molecular biology. Currently, there are many situations where so much sequential data is produced that it cannot even be stored without great difficulty. Such sequential sources are called data streams. In this paper we focus on finding serial episodes from data streams. To the best of our knowledge, the problem of mining serial episodes from data streams has been studied in depth only for length-1 episodes (Karp et al., 2003).
  • Keywords
    data analysis; data mining; directed graphs; pattern recognition; time series; Web logs; biological sequences; click streams; data mining; data streams; directed acyclic graphs; event streams; labeled nodes; market basket data; molecular biology; network monitoring; pattern mining; sequential data; sequential sources; serial episode discovery; serial episodes; telecommunication alarm sequences; text documents; time series; Computer science; Data mining; Frequency; Monitoring; Sequences;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Scientific and Statistical Database Management, 2004. Proceedings. 16th International Conference on
  • ISSN
    1099-3371
  • Print_ISBN
    0-7695-2146-0
  • Type
    conf
  • DOI
    10.1109/SSDM.2004.1311253
  • Filename
    1311253