Title :
Classifying breast cancer types based on fine needle aspiration biopsy data using random forest classifier
Author :
Ahmad, Farzana Kabir ; Yusoff, Nooraini
Author_Institution :
Sch. of Comput., Univ. Utara Malaysia, Sintok, Malaysia
Abstract :
Breast cancer is a complex and heterogeneous disease because of its diverse morphological features and differing clinical outcomes. As a result, breast cancer patients may respond to different therapeutic options. Currently, difficulties in recognizing the breast cancer type lead to inefficient treatment. Generally, there are two types of breast tumor, malignant and benign. It is therefore necessary to devise a clinically meaningful classification of the disease that can accurately assign breast cancer tissues to the relevant classes. This study aims to classify breast cancer lesions obtained from the fine needle aspiration (FNA) procedure using random forest. Random forest is a classifier built from a combination of decision trees and has been reported to perform well in comparison with other machine learning techniques. The method was tested on approximately 700 instances, comprising 458 benign cases and 241 malignant cases. The performance of the proposed method is measured in terms of sensitivity, specificity and accuracy. The experimental results show that random forest achieved a sensitivity of 75%, a specificity of 70% and an accuracy of about 72%. Thus, it can be concluded that random forest can accurately classify breast cancer types given a small number of features, and that it is a promising tool for differentiating malignant from benign tumors at an early stage.
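The three performance measures named in the abstract can be illustrated with a minimal sketch. It assumes that malignant is treated as the positive class, which the abstract does not state, and the function name `binary_metrics` and the counts in the usage example are hypothetical; they are not the confusion matrix from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    Assumption: "positive" means malignant, "negative" means benign.
    tp: malignant cases predicted malignant
    fn: malignant cases predicted benign
    tn: benign cases predicted benign
    fp: benign cases predicted malignant
    """
    positives = tp + fn
    negatives = tn + fp
    total = positives + negatives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    sensitivity = tp / positives      # true positive rate
    specificity = tn / negatives      # true negative rate
    accuracy = (tp + tn) / total      # overall fraction correct
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    sens, spec, acc = binary_metrics(tp=90, fn=10, tn=160, fp=40)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Because accuracy is a weighted average of sensitivity and specificity, with weights given by the class proportions, it always lies between the two on any single test set.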
Keywords :
cancer; decision trees; image classification; learning (artificial intelligence); medical image processing; tumours; FNA procedure; benign tumor; breast cancer lesions; breast cancer patients; breast cancer tissues; breast cancer type classification; clinical outcome; diverse morphological features; fine needle aspiration biopsy data; heterogeneous disease; machine learning techniques; malignant; random forest classifier; sensitivity; specificity; therapeutic options; Breast cancer; Current measurement; Diseases; Feature extraction; Needles; fine needle aspiration; random forest
Conference_Title :
Intelligent Systems Design and Applications (ISDA), 2013 13th International Conference on
Conference_Location :
Bangi
Print_ISBN :
978-1-4799-3515-4
DOI :
10.1109/ISDA.2013.6920720