DocumentCode :
2751213
Title :
Bootstrapping the Albanian Information Retrieval
Author :
Karanikolas, Nikitas N.
Author_Institution :
Dept. of Inf., Technol. Educ. Inst. (TEI) of Athens, Athens, Greece
fYear :
2009
fDate :
17-19 Sept. 2009
Firstpage :
231
Lastpage :
235
Abstract :
In this paper we investigate the Albanian language and try to uncover the characteristics of the language that will permit the information retrieval (IR) community to develop IR systems adapted to it. As a consequence of our study we provide a naive-single-step (rudimentary) stemming algorithm for the Albanian language. A stopword list is also created. Human experts are contacted for the evaluation of the provided stemming algorithm. The evaluation method used and the observation of the method's results uncover more rules, which could improve the capabilities of the rudimentary stemming algorithm. We believe that our approach for this specific language could become a standard way of building information retrieval functionalities (tools, functions, etc.) for less-studied languages, such as the language studied in this paper.
Keywords :
computer bootstrapping; information retrieval; Albanian information retrieval; Albanian language; bootstrapping; naive-single-step stemming algorithm; rudimentary stemming algorithm; Educational technology; Electronic mail; Encoding; Humans; ISO standards; Informatics; Natural languages; Search engines; Web search; stemming algorithm; stopword list
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Informatics, 2009. BCI '09. Fourth Balkan Conference in
Conference_Location :
Thessaloniki
Print_ISBN :
978-0-7695-3783-2
Type :
conf
DOI :
10.1109/BCI.2009.16
Filename :
5359163