Title :
Text mining Wikipedia to discover alternative destinations
Author_Institution :
Comput. Eng. Dept., Chiang Mai Univ., Chiang Mai, Thailand
Abstract :
This paper applies statistical Natural Language Processing algorithms to a set of Wikipedia articles about top tourist destinations. The objective is to automatically identify the key features of each destination and then discover other destinations that share similar sets of features. In doing so, the paper demonstrates a method by which metadata about each article can be extracted from the unstructured text and then used to answer complex discovery-type queries. It also compares automatic clustering of similar destinations with a more user-driven, feature-focused technique.
Keywords :
Web sites; data mining; meta data; natural language processing; query processing; text analysis; travel industry; alternative destinations; complex discovery type query; statistical natural language processing algorithms; text mining Wikipedia; tourist destinations; unstructured text; user driven feature focused technique; Cities and towns; Electronic publishing; Encyclopedias; Government; Internet; Corpus Linguistics; Information Retrieval; Text Mining
Conference_Title :
Computer Science and Software Engineering (JCSSE), 2013 10th International Joint Conference on
Conference_Location :
Maha Sarakham
Print_ISBN :
978-1-4799-0805-9
DOI :
10.1109/JCSSE.2013.6567317