مرکز منطقه ای اطلاع رساني علوم و فناوري - Variable-grain and dynamic work generation for Minimal Unique Itemset mining

DocumentCode :

3247451

Title :

Variable-grain and dynamic work generation for Minimal Unique Itemset mining

Author :

Yiapanis, Paraskevas ; Haglin, David J. ; Manning, Anna M. ; Mayes, Ken ; Keane, John

Author_Institution :

Sch. of Comput. Sci., Univ. of Manchester, Manchester

fYear :

2008

fDate :

Sept. 29 2008-Oct. 1 2008

Firstpage :

Lastpage :

Abstract :

SUDA2 is a recursive search algorithm for minimal unique itemset detection. Such sets of items are formed via combinations of non-obvious attributes enabling individual record identification. The nature of SUDA2 allows work to be divided into non-overlapping tasks enabling parallel execution. Earlier work developed a parallel implementation for SUDA2 on an SMP cluster, and this was found to be several orders of magnitude faster than sequential SUDA2. However, if fixed-granularity parallel tasks are scheduled naively in the order of their generation, the system load tends to be imbalanced with little work at the beginning and end of the search. This paper investigates the effectiveness of variable-grained and dynamic work generation strategies for parallel SUDA2. These methods restrict the number of sub-tasks to be generated, based on the criterion of probable work size. The further we descend in the search recursion tree, the smaller the tasks become, thus we only select the largest tasks at each level of recursion as being suitable for scheduling. The revised algorithm runs approximately twice as fast as the existing parallel SUDA2 for finer levels of granularity when variable-grained work generation is applied. The dynamic method, performing level-wise task selection based on size, outperforms the other techniques investigated.

Keywords :

data mining; parallel processing; scheduling; search problems; trees (mathematics); SMP cluster; dynamic work generation; fixed-granularity parallel task scheduling; individual record identification; level-wise task selection based; minimal unique itemset mining; parallel SUDA2; recursive search algorithm; search recursion tree; variable-grained work generation; Bioinformatics; Clustering algorithms; Computer science; Data mining; Detection algorithms; Genetics; Itemsets; Iterative algorithms; NP-hard problem; Testing;

fLanguage :

English

Publisher :

ieee

Conference_Titel :

Cluster Computing, 2008 IEEE International Conference on

Conference_Location :

Tsukuba

ISSN :

1552-5244

Print_ISBN :

978-1-4244-2639-3

Electronic_ISBN :

1552-5244

Type :

conf

DOI :

10.1109/CLUSTR.2008.4663753

Filename :

4663753

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3247451