Title :
A New Technology for Combining Small Samples Based on Clustering and Its Applications
Author :
Zonglei, Lu ; Jiandong, Wang ; Yunfeng, Zai
Author_Institution :
Coll. of Inf. Sci. & Technol., Nanjing Univ. of Aeronaut. & Astronaut., Nanjing
Abstract :
Samples are important research objects in data mining. The underlying theory of data mining requires that the sample size not be too small, yet in some applications it is difficult to collect enough data. Sometimes strict requirements for sample collection lead to many small sample sets with similar characteristics. If the constraints on data collection are relaxed, these similar samples can be combined into a larger sample set. Combining small samples is essentially a clustering process, since both involve grouping data by similarity. This paper presents a new clustering algorithm that is independent of the similarity. With this algorithm, 1516 samples of flight records are reduced to 4 large sample sets. The experiments show that combining the samples helps determine their probability distribution, which is useful for a flight delay early warning system.
Keywords :
data mining; learning (artificial intelligence); pattern clustering; probability; data mining; intelligent information processing; machine learning; probability distribution; sample clustering algorithm; sample collection; Aircraft; Clustering algorithms; Data mining; Delay effects; Large-scale systems; Learning systems; Machine learning; Probability distribution; Space technology; Statistical distributions; Clustering; Data Mining; Flights Delay; Sample Combining;
Conference_Title :
Knowledge Acquisition and Modeling, 2008. KAM '08. International Symposium on
Conference_Location :
Wuhan
Print_ISBN :
978-0-7695-3488-6