DocumentCode
3302919
Title
Identification of Multiword Expressions in Technical Domains: Investigating Statistical and Alignment-Based Approaches
Author
Villavicencio, Aline ; de Medeiros Caseli, Helena ; Machado, Andre
fYear
2009
fDate
8-11 Sept. 2009
Firstpage
27
Lastpage
35
Abstract
Multiword Expressions (MWEs) are one of the stumbling blocks for more precise Natural Language Processing (NLP) systems. The lack of coverage of MWEs in resources can negatively affect the performance of tasks and applications and can lead to loss of information or communication errors, especially in technical domains where MWEs are frequent. This paper investigates some approaches to the identification of MWEs in technical corpora based on association measures, part-of-speech information, and lexical alignment information. We examine the influence of some factors on their performance, such as the sources of information used for identification and evaluation. While the association measures emphasize recall, the alignment method focuses on precision.
Keywords
Application software; Computer science; Global warming; Humans; Informatics; Information resources; Natural language processing; Natural languages; Performance loss; Vocabulary; Lexical Acquisition; Multiword Expressions; Natural Language Processing
fLanguage
English
Publisher
IEEE
Conference_Titel
Information and Human Language Technology (STIL), 2009 Seventh Brazilian Symposium in
Conference_Location
Sao Carlos, Brazil
Print_ISBN
978-1-4244-6008-3
Type
conf
DOI
10.1109/STIL.2009.33
Filename
5532435