Title :
Mining Open Source Software data using regular expressions
Author :
Li, Qifeng ; Li, Bing
Author_Institution :
State Key Lab. of Software Eng., Wuhan Univ., Wuhan, China
Abstract :
Open Source Software (OSS) management has attracted considerable attention in the last few years. Project management for effective software process improvement must be based on quantitative data. However, data collection for measurement requires high costs and collaboration with developers, and data dumps may require a huge effort to understand their schemas and tables, so it is difficult to collect coherent, quantitative data continuously and to use that data for software process improvement. In this paper, we report our results of mining data acquired from SourceForge.net, the largest open source software hosting website. In the process, we describe the Mailing list Crawler (MC), which automatically collects mailing list repositories in widely used software development support systems. By presenting integrated measurement results graphically, MC can help developers and managers keep projects under control in real time.
Keywords :
Web sites; data mining; project management; public domain software; software management; software process improvement; SourceForge.net; mailing list crawler; mailing lists repository; open source software data; open source software hosting Website; open source software management; software development support system; Algorithm design and analysis; Crawlers; Databases; Programming; Software; Software engineering; Mailing list; Open Source Software; Regular Expressions
Conference_Title :
Cloud Computing and Intelligence Systems (CCIS), 2011 IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-61284-203-5
DOI :
10.1109/CCIS.2011.6045129