MACHINE LEARNING APPROACH FOR BREAST CANCER CLASSIFICATION

Aimufua, Gilbert Imuetinyan Osaze; Mbanaso, Uche M.; Abdullahi, M.U.

Files

28. June (2022).pdf (2.08 MB)

Date

2022-06-20

Authors

Aimufua, Gilbert Imuetinyan Osaze

Mbanaso, Uche M.

Abdullahi, M.U.

Publisher

Department of Computer Science, Nasarawa State University Keffi

Abstract

Breast cancer is the most common cancer among women in Africa. This fact has led researchers to continue studying how to detect and treat breast cancer in women, especially older women, who are at higher risk. Achieving satisfactory cancer classification accuracy with the complete set of genes remains a great challenge, especially with microarray datasets, because of the high dimensionality, small sample size, and noise in gene expression data. Feature reduction is therefore critical to the classification task. One of the major challenges in cancer studies is recognizing the informative genes (features) among the thousands of others in the dataset. A large number of features (genes) relative to a small sample size, and redundancy in the expression data, are the two main reasons for poor classification accuracy in machine learning and data mining processes. Dimensionality reduction is consequently an active research area in pattern recognition, machine learning, data mining, and statistics. Its purpose is to improve classification performance by removing redundant or irrelevant features. Feature selection is also useful for reducing computation time and memory complexity, which have always been challenges in big data tasks. High dimensionality, noise, and outliers not only increase memory and time complexity but also adversely affect the performance of the algorithms. This paper aims to improve the low overall accuracy and to reduce the memory usage and execution time of machine learning classification models; to that end, the proposed system employs InfoGain for dimensionality reduction and the Random Forest algorithm for classification.
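
The abstract names InfoGain as the feature-reduction step but does not describe its implementation. The following is a minimal sketch of information-gain feature ranking, assuming the gene expression values have already been discretised into categories. The function names, gene names, and toy data are hypothetical and are not taken from the paper; the retained features would then be passed to a Random Forest classifier, which is not shown here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the entropy left after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(columns, labels):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain(values, labels)
             for name, values in columns.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: discretised expression levels of three genes, six samples.
labels = ["malignant", "malignant", "malignant", "benign", "benign", "benign"]
columns = {
    "gene_a": ["high", "high", "high", "low", "low", "low"],     # separates the classes
    "gene_b": ["high", "low", "high", "low", "high", "low"],     # weakly informative
    "gene_c": ["high", "high", "high", "high", "high", "high"],  # constant, no information
}

ranking = rank_features(columns, labels)
top_k = [name for name, _ in ranking[:2]]  # keep the two highest-scoring features
```

In this toy example the constant feature scores zero gain and is dropped, which illustrates how a filter of this kind removes irrelevant features before classification. How many features to keep is a tuning choice that the abstract does not specify.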

Keywords

Breast cancer, random forest algorithm, machine learning, database, information gain algorithm, confusion matrix, feature selection, filtering technique.

Citation

Aimufua, G.I.O. et al. (2022). MACHINE LEARNING APPROACH FOR BREAST CANCER CLASSIFICATION.

URI

https://keffi.nsuk.edu.ng/handle/20.500.14448/5573

Collections

Articles
