• DocumentCode
    3686476
  • Title

    Investigating Samples Representativeness for an Online Experiment in Java Code Search

  • Author

    Rafael M. de Mello;Kathryn T. Stolee;Guilherme H. Travassos

  • Author_Institution
    Fed. Univ. of Rio de Janeiro, Rio de Janeiro, Brazil
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    Context: The results of large-scale studies in software engineering can be significantly impacted by samples' representativeness. Diverse population sources can be used to support sampling for such studies. Goal: To compare two samples, one from the crowdsourcing platform Mechanical Turk and another from the professional social network LinkedIn, in an online experiment for evaluating the relevance of Java code snippets to programming tasks. Method: To compare the samples (subjects' experience, programming habits) and experimental results concerned with three experimental trials. Results: LinkedIn's subjects present significantly higher levels of experience in Java programming and programming in general than Mechanical Turk's subjects. The experimental results revealed a significant difference between samples and suggested that LinkedIn's subjects were more pessimistic than Mechanical Turk's subjects despite a high level consistency in the experimental results. Conclusion: The combined use of sources of sampling can bring benefits to large scale studies in software engineering, especially when heterogeneity is desired in the population. Thus, it can be useful to investigate and characterize alternative sources of sampling for performing large-scale studies in software engineering.
  • Keywords
    "LinkedIn","Sociology","Statistics","Programming profession","Context","Java"
  • Publisher
    ieee
  • Conference_Titel
    Empirical Software Engineering and Measurement (ESEM), 2015 ACM/IEEE International Symposium on
  • Type

    conf

  • DOI
    10.1109/ESEM.2015.7321205
  • Filename
    7321205