DocumentCode :
1908879
Title :
Towards Shallow Semantics: The OntoNotes Project
Author :
Hovy, Eduard
Author_Institution :
Inf. Sci. Inst., Univ. of Southern California, Los Angeles, CA
fYear :
2007
fDate :
Aug. 30 2007-Sept. 1 2007
Firstpage :
2
Lastpage :
3
Abstract :
Summary form only given. Many natural language processing (NLP) applications could benefit from a richer model of text meaning than the bag-of-words and n-gram models that currently predominate. Despite theoretical interest since the 1960s, however, no large-scale model exists; in fact, it is not even clear what such a model should minimally include. However, the introduction of large-scale public resources such as the Penn TreeBank and WordNet has generated a great deal of progress in the NLP community, and so it seems increasingly important to create some kind of meaning-oriented model and build a corresponding corpus that is large enough to support adequate machine learning. This talk argues for the necessity of (even shallow) semantics-based NLP, describes the contents and operation of the OntoNotes project, and in so doing introduces and explains the general issues facing annotation projects. Our hope is that other people not only try to use the OntoNotes corpus in their own work, but also create their own annotations on the same material, so that more layers of shallow semantics can be included in OntoNotes.
Keywords :
computational linguistics; learning (artificial intelligence); natural language processing; text analysis; OntoNotes annotation corpus project; Penn TreeBank; WordNet; bag-of-words model; machine learning; meaning-oriented model; n-gram model; semantics-based natural language processing; shallow semantics; text meaning; Broadcasting; Intersymbol interference; Large-scale systems; Machine learning; Natural language processing; Ontologies;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-1610-3
Electronic_ISBN :
978-1-4244-1611-0
Type :
conf
DOI :
10.1109/NLPKE.2007.4367998
Filename :
4367998