• DocumentCode
    3259099
  • Title

    Clustering Workflow Requirements Using Compression Dissimilarity Measure

  • Author

    Li Wei ; Handley, John ; Martin, Nathaniel ; Tong Sun ; Keogh, Eamonn

  • Author_Institution
    Dept. of Comput. Sci., California Univ., Riverside, CA
  • fYear
    2006
  • fDate
    Dec. 2006
  • Firstpage
    50
  • Lastpage
    54
  • Abstract
    Xerox offers a bewildering array of printers and software configurations to satisfy the needs of production print shops. A configuration tool in the hands of sales analysts elicits requirements from customers and recommends a list of product configurations. This tool generates special question-and-answer case logs that provide useful historical data. Given the unusual semi-structured question-and-answer format, this data is not amenable to any standard document clustering method. The authors discovered that a hierarchical agglomerative approach using a compression-based dissimilarity measure (CDM) provided readily interpretable clusters. The authors compared this method empirically to two reasonable alternatives, latent semantic analysis and probabilistic latent semantic analysis, and concluded that CDM offers an accurate and easily implemented approach to validate and augment the configuration tool.
  • Keywords
    configuration management; digital printing; printers; workflow management software; answer case logs; bewildering array; clustering workflow requirements; compression dissimilarity measure; document clustering; hierarchical agglomerative; printers configurations; probabilistic latent semantic analysis; software configurations; unusual semistructured question; Books; Clustering methods; Computer science; Finishing; Marketing and sales; Presses; Printers; Printing; Production; Publishing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops, 2006. ICDM Workshops 2006. Sixth IEEE International Conference on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    0-7695-2702-7
  • Type

    conf

  • DOI
    10.1109/ICDMW.2006.44
  • Filename
    4063597
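
The abstract's approach (hierarchical agglomerative clustering driven by a compression-based dissimilarity measure) can be illustrated with a minimal sketch. It uses the usual definition CDM(x, y) = C(xy) / (C(x) + C(y)), where C is the compressed size in bytes. The record does not say which compressor or linkage rule the authors used, so `zlib`, average linkage, and the helper names `compressed_size`, `cdm`, and `agglomerate` are assumptions made for illustration only.

```python
import zlib
from itertools import combinations


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed text (zlib is an assumed compressor)."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def cdm(x: str, y: str) -> float:
    """Compression-based dissimilarity: C(xy) / (C(x) + C(y)).

    Values near 0.5 suggest the two strings share most of their structure;
    values near 1 suggest they share little.
    """
    return compressed_size(x + y) / (compressed_size(x) + compressed_size(y))


def agglomerate(docs, n_clusters):
    """Naive average-linkage agglomerative clustering over CDM (assumed linkage).

    Returns a list of clusters, each a list of indices into docs.
    """
    clusters = [[i] for i in range(len(docs))]
    # Precompute CDM once per unordered pair of documents.
    dist = {
        (i, j): cdm(docs[i], docs[j])
        for i, j in combinations(range(len(docs)), 2)
    }

    def linkage(a, b):
        # Average pairwise dissimilarity between two clusters.
        pairs = [(min(i, j), max(i, j)) for i in a for j in b]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the two closest clusters; a < b, so deleting b is safe.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


if __name__ == "__main__":
    # Hypothetical question-and-answer logs, invented for this example.
    logs = [
        "Q: Do you need color printing? A: Yes. Q: Monthly volume? A: 500000 pages.",
        "Q: Do you need color printing? A: No. Q: Monthly volume? A: 500000 pages.",
        "Q: Which finishing options? A: Saddle stitch and perfect binding for books.",
    ]
    print(agglomerate(logs, 2))
```

Because CDM only needs an off-the-shelf compressor, it requires no tokenization or feature engineering, which is plausibly why it suits semi-structured logs; short inputs make the compressed sizes noisy, so longer logs should give more stable dissimilarities.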