Title :
Automating sequence dataset generating by using SeqGen
Author :
Reshamwala, Alpa ; Mahajan, Sunita
Author_Institution :
Comput. Eng. Dept., SVKM's NMIMS Univ., Mumbai, India
Abstract :
Data preprocessing is any processing performed on raw data to prepare it for a subsequent processing procedure. Commonly used as a preliminary data mining step, it transforms the data into a format that can be processed more easily and effectively. Sequential pattern mining finds interesting sequential patterns in large databases. However, data acquired from a dataset may not be in sequential form. In this paper, we propose the SeqGen algorithm as a preprocessing step for sequential pattern mining. The main objective of the algorithm is to generate sequences with timestamps for user personalization. The reference attribute is supplied as a parameter for generating the sequences. Experimental results show that raw data in any form can be easily transformed into a sequence dataset once the reference attribute is given.
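The abstract does not give the steps of SeqGen, so the following is only a minimal sketch of the general idea it describes: grouping raw records by a caller-supplied reference attribute and ordering each group by timestamp. The function name generate_sequences, its parameters, and the sample field names (user, time, page) are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def generate_sequences(records, reference_attr, timestamp_attr, item_attr):
    """Group raw records by a reference attribute and order each group by time.

    Illustrative only: this is NOT the authors' SeqGen algorithm, just one
    plausible reading of "generate sequences with timestamps once the
    reference attribute is given".
    """
    groups = defaultdict(list)
    for row in records:
        # Each event keeps its timestamp next to the item it refers to.
        groups[row[reference_attr]].append((row[timestamp_attr], row[item_attr]))
    # sorted() is stable, so events with equal timestamps keep input order.
    return {
        key: sorted(events, key=lambda event: event[0])
        for key, events in groups.items()
    }


# Hypothetical raw data: unordered page visits from two users.
raw = [
    {"user": "u1", "time": 3, "page": "C"},
    {"user": "u2", "time": 1, "page": "A"},
    {"user": "u1", "time": 1, "page": "A"},
    {"user": "u1", "time": 2, "page": "B"},
]

sequences = generate_sequences(raw, "user", "time", "page")
for user, events in sequences.items():
    print(user, events)
```

Passing a different reference attribute (for example "page" instead of "user") would regroup the same raw records into a different sequence dataset, which is one way to read the role the abstract assigns to that parameter.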
Keywords :
data mining; SeqGen algorithm; data preprocessing; sequence dataset generation automation; sequential pattern mining; user personalization; Computer crime; Computers; Databases; Transforms; Web pages; KDD Cup 1999; KDD Cup 2010; KDD Cup 2011; Learning Management System; Preprocessing; Raw data; Sequence data; Time stamp
Conference_Title :
Communication, Information & Computing Technology (ICCICT), 2015 International Conference on
Conference_Location :
Mumbai
Print_ISBN :
978-1-4799-5521-3
DOI :
10.1109/ICCICT.2015.7045717