• DocumentCode
    2580832
  • Title

    Learning Term Spaces Based on Visual Feedback

  • Author

    Granitzer, M. ; Neidhart, T. ; Lux, M.

  • Author_Institution
    Dept. of Knowledge Discovery, Know-Center Graz
  • fYear
    2006
  • fDate
    4-8 Sept. 2006
  • Firstpage
    176
  • Lastpage
    180
  • Abstract
    Extracting and visualizing concepts and relationships between text documents depends strongly on the similarity measure used. To provide meaningful visualizations and to extract useful knowledge from document collections, user needs must be captured by the internal representation of documents and by the similarity measure. Most applications use the vector space model with cosine similarity, which serves as a good approximation. Nevertheless, influencing similarities between documents is rather hard, since parameter tuning relies heavily on expert knowledge of the underlying algorithms, and the influence of different weighting schemes and similarity measures is not known in advance. In this paper we present an approach for adapting the vector space representation of documents by giving visual feedback to the system. Our approach starts by clustering a corpus of text documents and visualizing the results using multidimensional scaling techniques. Afterwards, a 2D landscape visualization is shown, which the user can manipulate. Based on these manipulations, the high-dimensional representation of the documents is adapted to fit the user's needs more precisely. Our experiments show that iterating these steps results in an adapted representation of documents and similarities, generates layouts as intended by the user, and furthermore increases clustering accuracy. While this paper only investigates the influence on clustering and visualization, the method itself may also be used to increase classification and retrieval performance, since it adapts to the user's notion of similarity.
  • Keywords
    information retrieval; learning (artificial intelligence); pattern clustering; text analysis; 2D landscape visualization; classification; cosine similarity measure; document collection; document representation; multidimensional scaling; term space learning; text document clustering; vector space model; vector space representation; visual feedback; Clustering algorithms; Current measurement; Feedback; Information retrieval; Knowledge management; Navigation; Performance analysis; Space technology; Vectors; Visualization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Database and Expert Systems Applications, 2006. DEXA '06. 17th International Workshop on
  • Conference_Location
    Krakow
  • ISSN
    1529-4188
  • Print_ISBN
    0-7695-2641-1
  • Type

    conf

  • DOI
    10.1109/DEXA.2006.82
  • Filename
    1698330
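
The abstract describes adapting a vector space representation so that document similarities follow user feedback. As a minimal sketch of that idea, and not the authors' actual method, the Python below computes cosine similarity under per-term weights and applies a simple, hypothetical feedback step that raises the weight of terms shared by two documents the user has moved together. The function names, the `rate` parameter, and the toy three-term vectors are all assumptions made for illustration.

```python
import math


def weighted_cosine(a, b, w):
    """Cosine similarity of term vectors a and b under non-negative term weights w."""
    dot = sum(wi * x * y for wi, x, y in zip(w, a, b))
    norm_a = math.sqrt(sum(wi * x * x for wi, x in zip(w, a)))
    norm_b = math.sqrt(sum(wi * y * y for wi, y in zip(w, b)))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def boost_shared_terms(a, b, w, rate=0.5):
    """Hypothetical feedback step (an assumption, not from the paper):
    multiply the weight of every term that occurs in both documents by (1 + rate)."""
    return [
        wi * (1.0 + rate) if x > 0 and y > 0 else wi
        for wi, x, y in zip(w, a, b)
    ]


# Toy term-frequency vectors over a three-term vocabulary (illustrative data).
doc_a = [1.0, 1.0, 0.0]
doc_b = [1.0, 0.0, 1.0]
weights = [1.0, 1.0, 1.0]

before = weighted_cosine(doc_a, doc_b, weights)

# Simulate the user dragging doc_a and doc_b closer together in the 2D layout.
weights = boost_shared_terms(doc_a, doc_b, weights)
after = weighted_cosine(doc_a, doc_b, weights)

print(f"similarity before feedback: {before:.2f}")
print(f"similarity after feedback:  {after:.2f}")
```

In this toy case the shared first term gains weight, so the two documents become more similar after the feedback step. A full system along the lines of the abstract would re-cluster and re-run the layout with the updated weights and repeat the loop.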