OPTIMIZED CROSS-DOMAIN MEDICAL IMAGE RETRIEVAL VIA MULTIMODAL FEATURE FUSION AND DEEP LEARNING-DRIVEN COMPUTATIONAL INTELLIGENCE
DOI: https://doi.org/10.71146/kjmr453

Keywords: Digital Image, Machine Learning, Edge Histogram Descriptor, Scale-Invariant Feature Transform

Abstract
Content-Based Image Retrieval (CBIR) has emerged as a pivotal research domain in computer vision, driven by the exponential growth of visual data and the demand for efficient retrieval systems that transcend traditional text-based methods. While existing tools leverage audio-visual content to navigate large-scale media databases, challenges persist in handling ever-expanding datasets, particularly in specialized fields like medical imaging. Medical images, acquired through diverse modalities (e.g., MRI, CT, X-ray), require precise modality identification to refine diagnostic workflows and enhance search accuracy. This study addresses this need by proposing a robust framework for modality classification and retrieval of medical images, leveraging advanced visual feature fusion and machine learning. Our approach integrates seven complementary visual features to capture texture, edge, and color characteristics: Scale-Invariant Feature Transform (SIFT), Local Binary Patterns (LBP), Local Ternary Patterns (LTP), Edge Histogram Descriptor (EHD), Color and Edge Directivity Descriptor (CEDD), wavelet-based color edge detection, and color histograms. These features are fused into a unified feature vector, enabling comprehensive representation of image content. The framework was evaluated on the ImageCLEF2012 modality classification dataset, a benchmark comprising 31 distinct modality classes. For classification, a Support Vector Machine (SVM) with a chi-square kernel was employed, chosen for its effectiveness in handling high-dimensional, non-linear data. The proposed system achieved an overall accuracy of 72.2%, surpassing the top ImageCLEF2012 visual feature-based result by 2.6%. This improvement underscores the efficacy of feature fusion in capturing discriminative patterns across modalities. 
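The fusion step described above concatenates per-descriptor histograms into a single feature vector. The sketch below is a minimal numpy-only illustration of that idea, assuming a colour histogram plus a coarse gradient-orientation histogram as a stand-in for the edge descriptor; it is not the paper's actual SIFT/LBP/EHD/CEDD pipeline, and the bin counts are arbitrary.

```python
import numpy as np

def color_histogram(img, bins=8):
    """Per-channel intensity histogram, L1-normalised (toy colour feature)."""
    hists = [np.histogram(img[..., c], bins=bins, range=(0, 256))[0]
             for c in range(img.shape[-1])]
    h = np.concatenate(hists).astype(float)
    return h / max(h.sum(), 1.0)

def edge_histogram(img, bins=8):
    """Coarse stand-in for an edge descriptor: gradient-orientation histogram."""
    gray = img.mean(axis=-1)
    gy, gx = np.gradient(gray)
    ang = np.arctan2(gy, gx)  # orientation in [-pi, pi]
    h = np.histogram(ang, bins=bins, range=(-np.pi, np.pi))[0].astype(float)
    return h / max(h.sum(), 1.0)

def fused_vector(img):
    """Concatenate the individual descriptors into one unified feature vector."""
    return np.concatenate([color_histogram(img), edge_histogram(img)])

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3))  # placeholder image
v = fused_vector(img)
print(v.shape)  # (32,): 3 channels * 8 colour bins + 8 edge bins
```

In the full system each of the seven descriptors would contribute its own normalised sub-vector to the concatenation in `fused_vector`.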
Key innovations include the integration of wavelet-derived edge features with texture descriptors and the strategic use of a chi-square kernel to optimize similarity measurement in SVM decision boundaries. This work advances CBIR in medical imaging by demonstrating that hybrid feature engineering, combined with tailored classifier design, can significantly enhance modality identification. Future directions include exploring deep learning-based feature extraction and expanding the framework to multi-label classification scenarios.
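A chi-square kernel SVM of the kind described can be sketched with scikit-learn by precomputing the kernel matrix; the toy data, label count, and train/test split below are placeholders, not the ImageCLEF2012 setup.

```python
import numpy as np
from sklearn.metrics.pairwise import chi2_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Stand-in for fused histogram features: non-negative, L1-normalised rows,
# as the chi-square kernel expects non-negative inputs.
X = rng.random((60, 32))
X /= X.sum(axis=1, keepdims=True)
y = rng.integers(0, 3, size=60)  # 3 placeholder modality labels

X_train, X_test = X[:40], X[40:]
y_train = y[:40]

# Precompute chi-square kernel matrices and hand them to the SVM.
K_train = chi2_kernel(X_train, X_train)
K_test = chi2_kernel(X_test, X_train)

clf = SVC(kernel="precomputed").fit(K_train, y_train)
pred = clf.predict(K_test)
print(pred.shape)  # (20,)
```

Precomputing the kernel keeps the chi-square similarity explicit, which is the measurement the paper credits for sharpening the SVM decision boundaries on histogram-style features.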
License
Copyright (c) 2025 Hareem Ayesha, Muhammad Ans Khalid, Muhammad Tanveer Meeran, Sami Ullah (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.