Title :
"The 100 Most Influential Persons in History": A Data Mining Perspective
Author :
Al-Naimi, Noora Mohammad ; Shaban, Khaled Bashir
Author_Institution :
Comput. Sci. & Eng. Dept., Qatar Univ., Doha, Qatar
Abstract :
Data mining has been widely applied in various domains; however, few studies have sought to discover hidden knowledge from factual data about selected groups of people with special characteristics. Mining data about such groups of individuals is important because it can yield insights that lead to a better understanding of their personalities, as well as to further sociological conclusions. This paper presents the application and outcome of data mining techniques, namely data clustering and association rules extraction, to find common features and relations among social, environmental, and socioeconomic factors in the lives of known influential individuals in history. First, a dataset was constructed by defining, extracting, and retrieving important known facts about these individuals from selected, reliable sources. Second, association rules discovery algorithms were applied to reveal interesting patterns and highlight relations between attributes. Finally, the data were clustered into groups, and each cluster was further analyzed to identify its most strongly defining attributes. The extracted association rules showed how some factors are related, such as the effect of environment type and birth order in the family on the age at which the individual first engaged with their domain of influence. The clustering exercise demonstrated that influential people who grew up in families of similar size and financial status share many characteristics.
Keywords :
data mining; pattern clustering; social sciences computing; association rules extraction; data clustering; environmental factors; factual data; hidden knowledge discovery; people group; social factors; socioeconomic factors; sociological conclusions; Art; Association rules; Clustering algorithms; Educational institutions; Feature extraction; History; mining of socioeconomic data; the 100 most influential people
Conference_Title :
Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
978-1-4673-0005-6
DOI :
10.1109/ICDMW.2011.1