• DocumentCode
    650719
  • Title
    Latent Co-development Analysis Based Semantic Search for Large Code Repositories
  • Author
    Venkataramani, Rahul ; Asadullah, Allahbaksh ; Bhat, Vasudev ; Muddu, Basavaraju
  • Author_Institution
    Int. Inst. of Inf. Technol., Bangalore, India
  • fYear
    2013
  • fDate
    22-28 Sept. 2013
  • Firstpage
    372
  • Lastpage
    375
  • Abstract
    Distributed and collaborative software development has increased the popularity of source code repositories like GitHub. With such repositories now hosting millions of projects, it is important to identify the domains to which the projects belong. A domain is a concept or a hierarchy of concepts used to categorize a project. We propose a model that clusters the projects in a code repository by mining the latent co-development network. The identified clusters are mapped to domains with the help of a taxonomy that we constructed from the metadata of an online Question and Answer (Q&A) website. To demonstrate the validity of the model, we built a prototype for semantic search on source code repositories. In this paper, we outline the proposed model and present early results.
  • Keywords
    Web sites; data mining; distributed databases; groupware; meta data; question answering (information retrieval); software engineering; ubiquitous computing; GitHub; collaborative software development; concept hierarchy; distributed software development; latent co-development analysis based semantic search; latent co-development network mining; metadata; online question-and-answer Web site; project categorization; semantic search; source code repositories; Communities; Data mining; Electronic publishing; Information services; Semantics; Software; Taxonomy; Human aspects of software evolution; Software repository analysis and mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Maintenance (ICSM), 2013 29th IEEE International Conference on
  • Conference_Location
    Eindhoven
  • ISSN
    1063-6773
  • Type
    conf
  • DOI
    10.1109/ICSM.2013.50
  • Filename
    6676910