Title of article :
Learning to construct knowledge bases from the World Wide Web
Author/Authors :
Mitchell, Tom; Craven, Mark; DiPasquo, Dan; Freitag, Dayne; McCallum, Andrew; Nigam, Kamal; Slattery, Seán
Issue Information :
Periodical with serial issue number, year 2000
Abstract :
Morphology is the area of linguistics concerned with the internal structure of words. Information retrieval has generally not paid much attention to word structure, other than to account for some of the variability in word forms via the use of stemmers. We report on our experiments to determine the importance of morphology, and the effect that it has on performance. We found that grouping morphological variants makes a significant improvement in retrieval performance. Improvements are seen by grouping inflectional as well as derivational variants. We also found that performance was enhanced by recognizing lexical phrases. We describe the interaction between morphology and lexical ambiguity, and how resolving that ambiguity will lead to further improvements in performance.
Keywords :
Machine learning , Knowledge bases , Text classification , Relational learning , Web spider , Information extraction , World Wide Web
Journal title :
ARTIFICIAL INTELLIGENCE (NON MEMBERS) (AI)