Article title (translated from Persian):
A new hybrid feature subset selection algorithm for analyzing laser mass spectrometry data of ovarian cancer
Title in another language:
A New Hybrid Feature Subset Selection Algorithm for the Analysis of
Ovarian Cancer Data Using Laser Mass Spectrum
Authors:
Montazery Kordy, Hossein (Tarbiat Modares University, Tehran); Miran-Beigi, Mohammad Hossein (Department of Biomedical Engineering, Tarbiat Modares University, Tehran); Muradi, Mohammad Hassan (Amirkabir University of Technology, Tehran)
Issue information:
Quarterly, year 1386 (Solar Hijri), issue 14
Keywords:
Ovarian cancer, Feature subset selection algorithm, Laser mass spectrum, Proteomics, Biomarkers
Abstract (English):
Introduction: A major problem in the treatment of cancer is the lack of an appropriate method for the
early diagnosis of the disease. The chemical reactions within an organ may be reflected in the form of
proteomic patterns in the serum, sputum, or urine. Laser mass spectrometry is a valuable tool for
extracting proteomic patterns from biological samples. A major challenge in extracting such patterns
is the optimal selection of a feature subset from the mass spectrum data.
Materials and Methods: In this research, data corresponding to proteomic patterns of serum from
patients with ovarian cancer were analyzed in two independent groups. Using a mathematical model, the
baseline and electrical noise were eliminated in the preprocessing stage, and the mass spectra were
then normalized. The proposed method uses a hybrid algorithm based on a statistical test and the
Bhattacharyya distance measure. Using the final prediction error criterion, the best feature subset was
selected from 15154 data points while maintaining the resolution and the valuable information. The
selected feature subset was then used for the detection of biomarkers within the mass spectrum.
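The two-stage idea described above (a statistical test followed by a Bhattacharyya distance ranking) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the statistical test, so a Welch-type statistic with a normal approximation is assumed; each class is assumed to be Gaussian at every m/z point; and the hypothetical `top_k` parameter stands in for the final prediction error criterion, which is not reproduced here. The function names are likewise invented for the example.

```python
import math
from statistics import NormalDist, mean, variance

EPS = 1e-12  # guards against zero variance; an assumption of this sketch


def approx_p_value(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b) + EPS)
    z = abs(mean(a) - mean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(z))


def bhattacharyya_distance(a, b):
    """Bhattacharyya distance between univariate Gaussian fits of a and b."""
    va, vb = variance(a) + EPS, variance(b) + EPS
    mean_term = 0.25 * (mean(a) - mean(b)) ** 2 / (va + vb)
    var_term = 0.5 * math.log((va + vb) / (2.0 * math.sqrt(va * vb)))
    return mean_term + var_term


def select_features(cancer, control, alpha=0.05, top_k=2):
    """Return indices of the top_k points that pass the test, best first.

    cancer, control: lists of spectra, each spectrum a list of intensities.
    """
    scored = []
    for j in range(len(cancer[0])):
        a = [row[j] for row in cancer]
        b = [row[j] for row in control]
        if approx_p_value(a, b) < alpha:  # stage 1: statistical filter
            scored.append((bhattacharyya_distance(a, b), j))  # stage 2: score
    scored.sort(reverse=True)  # largest class separation first
    return [j for _, j in scored[:top_k]]


# Toy spectra with three m/z points; point 0 does not separate the classes.
cancer = [[1.0, 5.0, 2.0], [1.1, 5.2, 2.1], [0.9, 4.9, 1.9], [1.0, 5.1, 2.0]]
control = [[1.0, 1.0, 2.5], [1.1, 1.2, 2.6], [0.9, 0.9, 2.4], [1.0, 1.1, 2.5]]
selected = select_features(cancer, control, top_k=2)
```

On the toy data, point 0 is dropped by the significance filter and the remaining points are ordered by class separation. A real pipeline would replace the fixed `top_k` with a model-order criterion such as the one named in the abstract.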
Results: Using k-fold cross-validation, the samples under study were divided into two sets,
the learning set and the test set. Using the least threshold value, the points showing a significant difference (p-value
< 0.05) were selected. The best subset was then extracted from the remaining points such that it
would have the maximum information content. By doing so, the number of input variables was reduced
from 15154 to 80 points. In the next step, 16 and 6 biomarkers were selected for the two independent
datasets. The obtained results show accuracy, specificity, and sensitivity of 100%.
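For reference, the three reported figures are standard confusion-matrix ratios. The sketch below shows how they are conventionally computed from true and predicted labels; the function name and the label encoding (1 for cancer, 0 for control) are assumptions made for illustration, and the toy labels are not data from the study.

```python
def diagnostic_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity from paired label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Toy labels only (1 = cancer, 0 = control).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
```

In a k-fold setting these ratios would be computed on each held-out test fold and then averaged or pooled.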
Discussion and Conclusion: Diagnosing a disease in medicine is an example of pattern recognition in
engineering and physical science. In this paper, a filter approach is introduced for feature subset selection
that extracts appropriate features in the input space by combining a statistical method with a
distance measure based on information criteria. The results of this study emphasize that the use of a
combined approach to feature extraction and selection in high-dimensional data can appropriately
separate the pattern classes while maintaining the information content.
Issue information:
Quarterly, serial issue number 15, year 1386 (Solar Hijri)