DocumentCode :
1152930
Title :
Using an Ant Colony Metaheuristic to Optimize Automatic Word Segmentation for Ancient Greek
Author :
Tambouratzis, George
Author_Institution :
Inst. for Language & Speech Process., Athens, Greece
Volume :
13
Issue :
4
fYear :
2009
Firstpage :
742
Lastpage :
753
Abstract :
Given a text or collection of texts involving unconstrained language, a basic task in a multitude of applications is the identification of stems and endings for each word form, which is termed morphological analysis. In this paper, the use of an ant colony optimization (ACO) metaheuristic is proposed for a linguistic task that involves the automated morphological segmentation of Ancient Greek word forms into stem and ending. The task of morphological analysis is essential for implementing text-processing applications such as semantic analysis and information retrieval. The difficulty of the morphological analysis task differs depending on the language chosen, being hardest in the case of highly-inflectional languages, where each stem may be associated with a large number of different endings. In this paper, focus is placed on the morphological analysis of Ancient Greek, which has been shown to be a particularly hard task. To perform this task, a system for automated morphological processing has been proposed, which implements the morphological analysis of words by coupling an iterative pattern-recognition algorithm with a modest amount of linguistic knowledge, expressed via a set of interactions associated with weights. In an earlier version of the system, these weights were determined by combining the input from specialized scientists with a lengthy manual optimization process. In this paper, the ACO metaheuristic is applied to the task of defining near-optimal system weights using an automated process based on a set of training data. The experiments performed indicate that the segmentation quality achieved by ACO is equivalent to or, in several cases, substantially higher than that achieved using manually optimized weights.
Keywords :
iterative methods; linguistics; natural language processing; optimisation; pattern recognition; text analysis; ancient Greek; ant colony metaheuristic; automated morphological processing; highly-inflectional languages; information retrieval; iterative pattern-recognition algorithm; linguistic knowledge; manual optimization process; morphological analysis; near-optimal system; optimize automatic word segmentation; segmentation quality; semantic analysis; text processing application; Ancient Greek; ant colony optimization (ACO) metaheuristic; automated morphological analysis; heuristic function; text processing;
fLanguage :
English
Journal_Title :
Evolutionary Computation, IEEE Transactions on
Publisher :
IEEE
ISSN :
1089-778X
Type :
jour
DOI :
10.1109/TEVC.2009.2014363
Filename :
5175425