Title :
Looking for new words out there
Author :
Filip Graliński;Marcin Walas
Author_Institution :
Adam Mickiewicz University, Faculty of Mathematics and Computer Science, ul. Umultowska 87, 61-614 Poznan, Poland
Abstract :
This paper presents methods for the automatic extraction of new lexemes from Web corpora, with the aim of obtaining a comprehensive list of Polish words. The methods are: Reverse Derivation, Compound Formation, List Extraction, extraction of adjectives from addresses, and Polonisation of English words. We then describe the process of correcting the errors that arise from applying these automated methods. Finally, we give a quantitative evaluation of the project and present its results.
Keywords :
"Dictionaries","Computer science","Helium","Productivity","Natural languages","Information technology","Mathematics","Data mining","Error correction","Cultural differences"
Conference_Titel :
Computer Science and Information Technology, 2009. IMCSIT '09. International Multiconference on
Print_ISBN :
978-1-4244-5314-6
DOI :
10.1109/IMCSIT.2009.5352725