Title :
A review paper on data preprocessing: A critical phase in web usage mining process
Author :
Sanjay Kumar Dwivedi;Bhupesh Rawat
Author_Institution :
Computer Science Department, BBAU, Central University, Lucknow, India
Abstract :
Web usage mining is the process of discovering user access patterns from a website's log. The web log usually contains unstructured, noisy, and irrelevant data. To make this data suitable for pattern mining and pattern analysis, it must pass through a data preprocessing phase. Data preprocessing not only improves the quality of the data but also reduces the size of the web log file. It involves several steps, including data collection, data cleaning, session identification, user identification, and path completion. This paper reviews several data preprocessing techniques for preparing raw log data for mining and analysis tasks.
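As an illustration of the cleaning, user identification, and session identification steps named in the abstract, the following is a minimal Python sketch. It is not taken from the paper. It assumes the log is in the Apache combined log format, that a user can be approximated by the pair (IP address, user agent), and that a gap of more than 30 minutes starts a new session. The names LOG_PATTERN, IGNORED_SUFFIXES, SESSION_TIMEOUT, SAMPLE_LOG, and the sample entries are hypothetical. Path completion is omitted.

```python
import re
from datetime import datetime, timedelta

# Assumption: entries follow the Apache combined log format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: requests for these resources are irrelevant to usage patterns.
IGNORED_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")
# Assumption: a commonly used heuristic threshold, not a value from the paper.
SESSION_TIMEOUT = timedelta(minutes=30)

# Hypothetical log entries, made up for this sketch.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2015:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [10/Oct/2015:13:55:37 +0000] "GET /style.css HTTP/1.1" 200 512 "/index.html" "Mozilla/5.0"',
    '10.0.0.1 - - [10/Oct/2015:13:58:00 +0000] "GET /products.html HTTP/1.1" 200 4100 "/index.html" "Mozilla/5.0"',
    '10.0.0.1 - - [10/Oct/2015:15:10:00 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [10/Oct/2015:13:56:00 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "ExampleBot/1.0"',
    '10.0.0.3 - - [10/Oct/2015:13:57:00 +0000] "GET /missing.html HTTP/1.1" 404 0 "-" "Mozilla/5.0"',
]


def parse_line(line):
    """Turn one raw log line into a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    return record


def is_relevant(record):
    """Data cleaning: keep successful GET page requests from non-robot agents."""
    if record["method"] != "GET" or not 200 <= record["status"] < 300:
        return False
    if record["url"].lower().endswith(IGNORED_SUFFIXES):
        return False
    # Crude robot check: a real system would use a maintained robot list.
    return "bot" not in record["agent"].lower()


def sessionize(lines):
    """Group cleaned records by user, then split each user's records on the timeout."""
    records = [r for r in map(parse_line, lines) if r and is_relevant(r)]
    records.sort(key=lambda r: r["time"])
    sessions = {}
    for record in records:
        user = (record["ip"], record["agent"])  # user identification heuristic
        user_sessions = sessions.setdefault(user, [])
        if (not user_sessions
                or record["time"] - user_sessions[-1][-1]["time"] > SESSION_TIMEOUT):
            user_sessions.append([])  # first request, or gap exceeded: new session
        user_sessions[-1].append(record)
    return sessions


if __name__ == "__main__":
    for user, user_sessions in sessionize(SAMPLE_LOG).items():
        for number, session in enumerate(user_sessions, start=1):
            print(user, number, [r["url"] for r in session])
```

In this sketch the stylesheet request, the robot request, and the failed request are dropped during cleaning, which also shows how preprocessing shrinks the log before any pattern mining takes place.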
Keywords :
"IP networks","Data preprocessing","Web mining","Data collection","Web servers"
Conference_Title :
Green Computing and Internet of Things (ICGCIoT), 2015 International Conference on
DOI :
10.1109/ICGCIoT.2015.7380517