A PRUNED VGG16 WITH HYBRID PREPROCESSING AND DATA BALANCING FOR ROBUST AND INTERPRETABLE LUNG CANCER CLASSIFICATION

Marwa Salih Ramadhan
Mohammed Ahmed Shakir

Abstract

Lung cancer is the most common and deadliest cancer worldwide, creating a critical need for diagnostic tools that are both accurate and practical for clinical integration. This study introduces a robust, computationally efficient, and interpretable deep learning framework for Computed Tomography (CT) images that addresses three limitations of existing models: high computational cost, poor data quality, and a lack of transparency. The approach uses a VGG16 architecture streamlined through structured pruning, which reduced the parameter count from 138.3M to 26.6M (about 81% fewer parameters) without compromising performance. A hybrid preprocessing pipeline with dual filtering and adaptive CLAHE (contrast-limited adaptive histogram equalization) enhances image quality, while limited data diversity and class imbalance are mitigated with hybrid augmentation and SMOTE (Synthetic Minority Over-sampling Technique). The model was trained with four-fold cross-validation and dual-phase fine-tuning under a dynamic learning rate, which ensured stable convergence. On the primary single-source dataset, the model achieved a test accuracy of 0.9910 and a Matthews Correlation Coefficient (MCC) of 0.9845. To validate real-world applicability, the framework was also tested on a large multi-source dataset, where it generalized well, with a balanced accuracy of 0.9693 and an MCC of 0.9427. Model interpretability was examined with Grad-CAM visualizations, which highlighted clinically relevant regions. The framework offers an accurate, computationally efficient, and generalizable solution with significant potential for clinical deployment as a reliable diagnostic aid.
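For readers less familiar with the reported metrics, the following is a minimal sketch of how balanced accuracy, the multiclass MCC, and the relative parameter reduction can be computed. It is not the authors' evaluation code: the function names and the toy labels are hypothetical and chosen only for illustration, and the three-class setup is an assumption. Only the parameter counts (138.3M and 26.6M) come from the abstract.

```python
import math


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def balanced_accuracy(cm):
    """Mean of per-class recall, ignoring classes with no true samples."""
    recalls = [row[k] / sum(row) for k, row in enumerate(cm) if sum(row) > 0]
    return sum(recalls) / len(recalls)


def mcc(cm):
    """Multiclass Matthews Correlation Coefficient from a confusion matrix."""
    n = len(cm)
    s = sum(sum(row) for row in cm)                  # total samples
    c = sum(cm[k][k] for k in range(n))              # correctly classified
    t = [sum(cm[k]) for k in range(n)]               # true count per class
    p = [sum(cm[r][k] for r in range(n)) for k in range(n)]  # predicted count
    num = c * s - sum(pk * tk for pk, tk in zip(p, t))
    den = math.sqrt((s * s - sum(pk * pk for pk in p)) *
                    (s * s - sum(tk * tk for tk in t)))
    return num / den if den else 0.0


def param_reduction(before, after):
    """Fraction of parameters removed by pruning."""
    return (before - after) / before


# Hypothetical toy labels for three classes (not data from the study).
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)

print("balanced accuracy:", round(balanced_accuracy(cm), 4))
print("MCC:", round(mcc(cm), 4))
# Parameter counts in millions, as stated in the abstract.
print("parameter reduction:", round(param_reduction(138.3, 26.6), 4))
```

Balanced accuracy and MCC both account for class imbalance, which is why they are informative alongside plain accuracy when the classes in a dataset are unevenly represented.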

Article Details

Section

Science Journal of University of Zakho

How to Cite

Salih, M., & Ahmed Shakir, M. (2025). A PRUNED VGG16 WITH HYBRID PREPROCESSING AND DATA BALANCING FOR ROBUST AND INTERPRETABLE LUNG CANCER CLASSIFICATION. Science Journal of University of Zakho, 13(4), 599-618. https://doi.org/10.25271/sjuoz.2025.13.4.1597
