Regional Information Center for Science and Technology - Subject classification in the Oxford English Dictionary

DocumentCode :

2334892

Title :

Subject classification in the Oxford English Dictionary

Author :

Langari, Zarrin ; Tompa, Frank Wm

Author_Institution :

Dept. of Comput. Sci., Waterloo Univ., Ont., Canada

fYear :

2001

fDate :

2001

Firstpage :

329

Lastpage :

336

Abstract :

The Oxford English Dictionary is a valuable source of lexical information and a rich testing ground for mining highly structured text. Each entry is organized into a hierarchy of senses, which include definitions, labels and cited quotations. Subject labels distinguish the subject classification of a sense; for example, they signal how a word may be used in anthropology, music or computing. Unfortunately, subject labeling in the dictionary is incomplete. To overcome this incompleteness, we attempt to classify the senses (i.e., definitions) in the dictionary by their subjects, using the citations as an information guide. We report on four different approaches: k nearest neighbors, a standard classification technique; term weighting, an information retrieval method dealing with text; naive Bayes, a probabilistic method; and expectation maximization, an iterative probabilistic method. Experimental performance of these methods is compared based on standard classification metrics.

Keywords :

Bayes methods; classification; data mining; dictionaries; Oxford English Dictionary; cited quotations; classification metrics; definitions; expectation maximization; highly structured text mining; information guide; information retrieval method; iterative probabilistic method; k nearest neighbors; labels; lexical information; naive Bayes; probabilistic method; senses; subject classification; subject labels; term weighting; Computer science; Data mining; Dictionaries; Information retrieval; Iterative methods; Labeling; Multiple signal classification; Nearest neighbor searches; Speech; Testing

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on

Conference_Location :

San Jose, CA

Print_ISBN :

0-7695-1119-8

Type :

conf

DOI :

10.1109/ICDM.2001.989536

Filename :

989536

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2334892