• DocumentCode
    121203
  • Title
    Misleading Generalized Itemset Mining in the Cloud
  • Author
    Baralis, Elena ; Cagliero, Luca ; Cerquitelli, Tania ; Chiusano, Silvia ; Garza, Paolo ; Grimaudo, Luigi ; Pulvirenti, Fabio
  • Author_Institution
    Dipt. di Autom. e Inf., Politec. di Torino, Turin, Italy
  • fYear
    2014
  • fDate
    26-28 Aug. 2014
  • Firstpage
    211
  • Lastpage
    216
  • Abstract
    In the era of smart cities, huge data volumes are continuously generated and collected, prompting the need for efficient, distributed data mining approaches. Generalized itemset mining is an established data mining technique that discovers multiple-level patterns hidden in the analyzed data by exploiting analyst-provided taxonomies. Among the generalized itemsets, the most peculiar high-level patterns are those with many contrasting correlations among items at different abstraction levels. They represent misleading situations that are worth analyzing separately by experts during manual inspection. This paper proposes a novel cloud-based service, named MGI-CLOUD, to efficiently mine misleading multiple-level patterns, i.e., the Misleading Generalized Itemsets, in a distributed computing environment. MGI-CLOUD consists of a set of distributed MapReduce jobs running in the cloud. As a case study, the system has been contextualized in a real-life scenario, i.e., the analysis of traffic law infractions committed in a smart city environment. The experiments, performed on real datasets, demonstrate the efficiency and effectiveness of MGI-CLOUD.
  • Keywords
    cloud computing; data mining; MGI-CLOUD; analyst-provided taxonomies; cloud-based service; contrasting correlations; data volumes; distributed MapReduce jobs; distributed data mining approaches; misleading generalized itemset mining; multiple-level patterns; smart cities; traffic law infractions; Big data; Cities and towns; Correlation; Data mining; Distributed databases; Itemsets; Taxonomy; Generalized itemset mining; cloud-based service; distributed computing model; smart city
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing with Applications (ISPA), 2014 IEEE International Symposium on
  • Conference_Location
    Milan
  • Type
    conf
  • DOI
    10.1109/ISPA.2014.36
  • Filename
    6924449