  • DocumentCode
    734222
  • Title
    Incremental Sorting for Large Dynamic Data Sets
  • Author
    Aydin, Ahmet Arif; Anderson, Kenneth M.
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Colorado, Boulder, CO, USA
  • fYear
    2015
  • fDate
    March 30 2015-April 2 2015
  • Firstpage
    170
  • Lastpage
    175
  • Abstract
    In today's world of pervasive computing, it is straightforward for organizations to generate large amounts of data in support of a variety of business needs. For this reason, it is important to build tools that allow analysts to manage and investigate these data sets quickly and efficiently. One feature needed by these tools is the ability to sort large amounts of data along a number of dimensions to facilitate the search for useful information. In this paper, we describe a new method for incrementally sorting large, multi-dimensional, dynamic data sets. Our particular use case involves sorting large Twitter data sets but our technique can be applied more generally across a variety of data types. Our approach is evaluated with respect to its scalability and by comparing it to several alternatives. It is currently able to efficiently sort data sets consisting of tens of millions of tweets along a variety of dimensions even when the data set is under active collection and new tweets are being added each day. The approach incrementally integrates the new tweets and provides sorted views of all tweets along various dimensions without having to re-sort the previously sorted tweets. The paper presents the benefits of the technique, discusses its limitations, and describes its software engineering contributions.
  • Keywords
    business data processing; social networking (online); software engineering; sorting; ubiquitous computing; Twitter data; business needs; incremental sorting; large dynamic data sets; multidimensional dynamic data sets; organizations; pervasive computing; software engineering contributions; sorted tweets; Browsers; Data analysis; Indexes; Scalability; Sorting; Twitter; big data; dynamic data sets; incremental sorting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data Computing Service and Applications (BigDataService), 2015 IEEE First International Conference on
  • Conference_Location
    Redwood City, CA
  • Type
    conf
  • DOI
    10.1109/BigDataService.2015.35
  • Filename
    7184878