Title :
A Unified Framework for Thai Metadata Extraction Using Case-Based Reasoning
Author :
Khankasikam, Krisda ; Chakpitak, Nopasit
Author_Institution :
Sch. of Inf. Commun. & Technol., Naresuan Univ. Phayao, Phayao
Abstract :
Metadata has become a prominent topic in information technology because it helps users distinguish significant documents from non-significant ones. With the growth of the Internet and related tools, the number of online resources has grown rapidly. However, the lack of metadata for these resources hinders their discovery and dissemination over the Internet. Manual metadata extraction is time-consuming, costly, and labor-intensive. This paper describes a framework for automatic metadata extraction from electronic Thai documents. The system consists of three main components: a case retrieval module that compares the problem case with stored cases using a nearest-neighbor retrieval technique, a metadata creation module that automatically extracts metadata from electronic Thai documents using Thai information extraction techniques, and a metadata verification module that corrects errors in the extracted metadata. The experimental results show that the proposed framework can reduce the manual labor involved in the Thai metadata creation process.
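The case retrieval step named in the abstract can be illustrated with a minimal sketch of weighted nearest-neighbor retrieval. The abstract does not give the paper's actual features, weights, or similarity measure, so everything below is an assumption made for illustration: the `Case` class, the feature names, the `template` label, and the weighted similarity formula are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Case:
    """A stored case (hypothetical structure, not taken from the paper)."""
    name: str
    features: Dict[str, float]  # assumed to be normalised to the range [0, 1]
    template: str               # hypothetical label of the extraction rules to reuse


def similarity(problem: Dict[str, float],
               stored: Dict[str, float],
               weights: Dict[str, float]) -> float:
    """Weighted nearest-neighbor similarity in [0, 1]; 1.0 means identical."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    score = 0.0
    for key, weight in weights.items():
        # Missing features default to 0.0; distance is capped at 1.0.
        diff = abs(problem.get(key, 0.0) - stored.get(key, 0.0))
        score += weight * (1.0 - min(diff, 1.0))
    return score / total_weight


def retrieve(problem: Dict[str, float],
             case_base: List[Case],
             weights: Dict[str, float],
             k: int = 1) -> List[Tuple[float, Case]]:
    """Return the k stored cases most similar to the problem case."""
    ranked = sorted(
        ((similarity(problem, case.features, weights), case) for case in case_base),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return ranked[:k]
```

In a pipeline like the one described, the best-matching case would presumably guide the metadata creation module, and the verification module would then correct any remaining errors; that hand-off is not modelled here.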
Keywords :
case-based reasoning; document handling; information retrieval; meta data; natural language processing; Internet; Thai metadata extraction; case-based reasoning; electronic Thai documents; metadata verification module; nearest neighbor retrieval technique; online resources; Artificial intelligence; Automatic speech recognition; Communications technology; Data mining; Face recognition; Information retrieval; Internet; Natural language processing; Speech recognition; Text recognition; Case-based Reasoning; Metadata Extraction;
Conference_Title :
Advanced Computer Theory and Engineering, 2008. ICACTE '08. International Conference on
Conference_Location :
Phuket
Print_ISBN :
978-0-7695-3489-3
DOI :
10.1109/ICACTE.2008.164