DocumentCode
3634291
Title
Looking for new words out there
Author
Filip Graliński;Marcin Walas
Author_Institution
Adam Mickiewicz University, Faculty of Mathematics and Computer Science, ul. Umultowska 87, 61-614 Poznan, Poland
fYear
2009
Firstpage
213
Lastpage
218
Abstract
This paper presents methods for the automatic extraction of new lexemes from Web corpora, with the aim of obtaining a comprehensive list of Polish words. The methods are: Reverse Derivation, Compound Formation, List Extraction, extraction of adjectives from addresses, and Polonisation of English words. We then describe the process of correcting errors that arise from applying these automated methods. A quantitative evaluation of the project and a presentation of its results are given.
Keywords
"Dictionaries","Computer science","Helium","Productivity","Natural languages","Information technology","Mathematics","Data mining","Error correction","Cultural differences"
Publisher
IEEE
Conference_Titel
Computer Science and Information Technology, 2009. IMCSIT '09. International Multiconference on
ISSN
2157-5525
Print_ISBN
978-1-4244-5314-6
Type
conf
DOI
10.1109/IMCSIT.2009.5352725
Filename
5352725