• DocumentCode
    3453575
  • Title

    A Data Set for Social Diversity Studies of GitHub Teams

  • Author

    Vasilescu, Bogdan ; Serebrenik, Alexander ; Filkov, Vladimir

  • Author_Institution
    Univ. of California, Davis, Davis, CA, USA
  • fYear
    2015
  • fDate
    16-17 May 2015
  • Firstpage
    514
  • Lastpage
    517
  • Abstract
    Like any other team-oriented activity, the software development process is affected by social diversity in programmer teams. The effect of team diversity can be significant, but also complex, especially in decentralized teams. Discerning the precise contribution of diversity to teams' effectiveness requires quantitative studies of large data sets. Here we present for the first time a large data set of social diversity attributes of programmers in GitHub teams. Using alias resolution, location data, and gender inference techniques, we collected a team social diversity data set of 23,493 GitHub projects. We illustrate how the data set can be used in practice with a series of case studies, and we hope its availability will foster more interest in studying diversity issues in software teams.
  • Keywords
    social aspects of automation; software engineering; GitHub teams; alias resolution; decentralized teams; gender inference techniques; location data; programmer teams; social diversity attributes; social diversity study; software development process; software teams; team diversity effect; team oriented activity; team social diversity data set; Cultural differences; Diversity reception; Electronic mail; History; Indexes; Productivity; Software;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on
  • Conference_Location
    Florence
  • Type

    conf

  • DOI
    10.1109/MSR.2015.77
  • Filename
    7180131