DocumentCode :
559641
Title :
Text mining: Finding right documents from large collection of unstructured documents
Author :
Amarakoon, Savidu ; Caldera, Amitha
Author_Institution :
Sch. of Comput., Univ. of Colombo, Colombo, Sri Lanka
fYear :
2011
fDate :
24-26 Oct. 2011
Firstpage :
5
Lastpage :
10
Abstract :
In day-to-day life we come across unstructured data in many forms. These include books, journals, audio and video files, and unstructured text such as emails, web pages and documents. Such data can be a vital source for making informed decisions. For example, in any company there is a set of people who can be identified as the most outstanding members of its workforce. Identifying what they have in common, and identifying others like them, would undoubtedly improve the output of the company. This is the basis on which this research was carried out. The central aspect of the research was to use text mining techniques to mine the data in a set of documents, identify the characteristics they have in common, and then identify other documents that contain these characteristics.
Keywords :
data mining; text analysis; right document finding; text mining techniques; unstructured document large collection; Indexing; Java; Libraries; Portable document format; Text mining; Data Mining; Document-based Searching; Lucene; Text Mining; Unstructured Data
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining and Intelligent Information Technology Applications (ICMiA), 2011 3rd International Conference on
Conference_Location :
Macao
Print_ISBN :
978-1-4673-0231-9
Type :
conf
Filename :
6108390