• DocumentCode
    3590235
  • Title

    An axiomatic inspection of the behavior of topic models with data aggregation

  • Author

    Deolalikar, Vinay

  • fYear
    2014
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Topic modeling has various applications in organizing and retrieving textual data in document collections. In enterprises, such collections are often distributed across various sites, and collaboratively aggregated at the time of processing. Therefore, the problem of topic modeling over aggregations of data is important. We study the behavior of a standard topic modeling technique, the hierarchical Dirichlet process (HDP), as the underlying data is aggregated. We formulate three axioms that reflect the assumptions that users frequently make when dealing with aggregated data. We empirically demonstrate that HDP does not necessarily satisfy these axioms. We discuss the ramifications of this for applications in enterprise settings.
  • Keywords
    document handling; information retrieval; HDP; axiomatic inspection; data aggregation; document collections; enterprise settings; hierarchical Dirichlet process; standard topic modeling technique; textual data retrieval; topic model behavior; Analytical models; Computational modeling; Data mining; Data models; Distributed databases; Information management; Standards
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data (Big Data), 2014 IEEE International Conference on
  • Type

    conf

  • DOI
    10.1109/BigData.2014.7111660
  • Filename
    7111660