• DocumentCode
    2345822
  • Title

    The Ultimate Debian Database: Consolidating bazaar metadata for Quality Assurance and data mining

  • Author

    Nussbaum, Lucas ; Zacchiroli, Stefano

  • Author_Institution
    LORIA, Nancy-Univ., Nancy, France
  • fYear
    2010
  • fDate
    2-3 May 2010
  • Firstpage
    52
  • Lastpage
    61
  • Abstract
    FLOSS distributions like RedHat and Ubuntu require a lot more complex infrastructures than most other FLOSS projects. In the case of community-driven distributions like Debian, the development of such an infrastructure is often not very organized, leading to new data sources being added in an impromptu manner while hackers set up new services that gain acceptance in the community. Mixing and matching data is then harder than should be, albeit being badly needed for Quality Assurance and data mining. Massive refactoring and integration is not a viable solution either, due to the constraints imposed by the bazaar development model. This paper presents the Ultimate Debian Database (UDD), which is the countermeasure adopted by the Debian project to the above "data hell". UDD gathers data from various data sources into a single, central SQL database, turning Quality Assurance needs that could not be easily implemented before into simple SQL queries. The paper also discusses the customs that have contributed to the data hell, the lessons learnt while designing UDD, and its applications and potentialities for data mining on FLOSS distributions.
  • Keywords
    SQL; data mining; marketing data processing; meta data; public domain software; quality assurance; Debian project; FLOSS distribution; FLOSS project; RedHat; SQL database; SQL query; Ubuntu; bazaar development model; bazaar metadata; community driven distribution; data mining; open source software; quality assurance; Computer bugs; Computer hacking; Data mining; Data warehouses; Databases; Open source software; Packaging; Quality assurance; Standardization; Turning; data mining; data warehouse; distribution; open source; quality assurance;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Mining Software Repositories (MSR), 2010 7th IEEE Working Conference on
  • Conference_Location
    Cape Town
  • Print_ISBN
    978-1-4244-6802-7
  • Electronic_ISBN
    978-1-4244-6803-4
  • Type

    conf

  • DOI
    10.1109/MSR.2010.5463277
  • Filename
    5463277