• DocumentCode
    658356
  • Title

    Automatic Class Labeling for CiteSeerX

  • Author

    Kashireddy, Surya Dhairya ; Gauch, Susan ; Billah, Syed Masum

  • Author_Institution
    Comput. Sci. & Comput. Eng., Univ. of Arkansas, Fayetteville, AR, USA
  • Volume
    1
  • fYear
    2013
  • fDate
    17-20 Nov. 2013
  • Firstpage
    241
  • Lastpage
    245
  • Abstract
    The CiteSeerX project at the University of Arkansas uses a browsing interface based on the Association for Computing Machinery's Computing Classification System (ACM CCS). CCS contains just 369 categories, whereas the CiteSeerX database contains over 2 million documents. This results in more than 6500 documents per category, far too many to browse. To address this problem, we are exploring ways to automatically expand the CCS ontology. Previous work has focused on using clustering to automatically identify the new classes. This work focuses on how to label the subclasses in a semantically meaningful way so that they can support user browsing. We develop methods based on text mining from the subclass members to extract class labels. We evaluate three methods by comparing the suggested labels with human-assigned labels for existing categories.
  • Keywords
    data analysis; data mining; database management systems; online front-ends; ontologies (artificial intelligence); pattern classification; text analysis; ACM CCS; Association for Computing Machinery Computing Classification System; CCS ontology; CiteSeerX project; CiteSeerX database; University of Arkansas; automatic class labeling; browsing interface; human-assigned labels; subclass members; text mining; user browsing; Clustering algorithms; Encyclopedias; Labeling; Ontologies; Programming; Semantic Web; Text mining; labeling; ontologies; text mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2013 IEEE/WIC/ACM International Joint Conferences on
  • Conference_Location
    Atlanta, GA
  • Print_ISBN
    978-1-4799-2902-3
  • Type

    conf

  • DOI
    10.1109/WI-IAT.2013.35
  • Filename
    6690021