Abstract
Histopathological image classification (HIC) plays a pivotal role in computer-aided diagnosis, enabling lesion characterization (e.g., tumor grading) and survival outcome prediction. Despite recent advances in HIC, existing methods still face challenges in integrating domain-specific knowledge, addressing class imbalance, and ensuring computational efficiency. To address these challenges, we propose AMLPF-CLIP, an enhanced CLIP-based framework for HIC featuring three key innovations. First, we introduce an Adaptive Multi-Level Prompt Fusion (AMLPF) strategy that fuses three levels of textual prompts (class labels, basic descriptions, and GPT-4o-generated detailed pathological features) to enrich semantic representation and cross-modal alignment. Second, we design a class-balanced resampling method that dynamically adjusts sampling weights based on both data imbalance and classification performance, targeting underrepresented, low-confidence classes. Third, we develop a Knowledge Distillation (KD) technique that aligns teacher and student outputs via an L2 loss, transferring knowledge from a large Vision Transformer (ViT-L/16) to a lightweight ResNet-50-based CLIP model. Extensive experiments on three public datasets demonstrate that AMLPF-CLIP consistently outperforms eleven state-of-the-art methods, achieving accuracy improvements of 1.19% on Chaoyang, 2.64% on BreaKHis, and 0.90% on LungHist700. AMLPF-CLIP also exhibits improved robustness and efficiency, highlighting its practical applicability.
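The abstract's second and third components (performance-aware class-balanced resampling and output-level KD with an L2 loss) can be illustrated with a minimal sketch. This is not the authors' released code; the function names, the `alpha` mixing hyperparameter, and the toy numbers below are assumptions used only to make the two ideas concrete.

```python
# Hypothetical sketch: (1) sampling weights that combine class frequency with
# per-class accuracy, so rare and poorly classified classes are sampled more;
# (2) output-level distillation as an L2 (MSE) alignment between teacher and
# student logits. All names and hyperparameters here are illustrative.
import torch
import torch.nn.functional as F

def resampling_weights(class_counts, class_accuracies, alpha=0.5):
    """Mix inverse-frequency and (1 - accuracy) terms into per-class sampling
    weights; alpha (assumed hyperparameter) balances the two terms."""
    counts = torch.tensor(class_counts, dtype=torch.float)
    acc = torch.tensor(class_accuracies, dtype=torch.float)
    inv_freq = counts.sum() / counts          # rarer classes -> larger weight
    low_conf = 1.0 - acc                      # low-confidence classes -> larger weight
    w = alpha * inv_freq / inv_freq.sum() \
        + (1.0 - alpha) * low_conf / low_conf.sum().clamp(min=1e-8)
    return w / w.sum()

def kd_l2_loss(student_logits, teacher_logits):
    """Output-level KD: L2 alignment of the lightweight student's outputs with
    the (frozen) large-teacher outputs."""
    return F.mse_loss(student_logits, teacher_logits.detach())

# Toy usage: 3 classes with imbalanced counts and uneven per-class accuracy.
weights = resampling_weights([500, 120, 40], [0.92, 0.75, 0.60])
labels = torch.tensor([0, 0, 1, 2])            # class index of each training sample
per_sample_weights = weights[labels]           # could feed a WeightedRandomSampler
student_out, teacher_out = torch.randn(8, 4), torch.randn(8, 4)
loss = kd_l2_loss(student_out, teacher_out)
```

In a training loop, the per-class weights could be recomputed each epoch from the latest validation accuracies, and the KD term added to the task loss with a weighting coefficient; both choices are sketched assumptions rather than details taken from the paper.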
| Original language | English |
|---|---|
| Pages (from-to) | 1-12 |
| Number of pages | 12 |
| Journal | IEEE Journal of Biomedical and Health Informatics |
| DOIs | |
| Publication status | E-pub ahead of print - 9 Oct 2025 |
Keywords
- CLIP
- Histopathological image classification
- Imbalanced classification
- Knowledge distillation
- Multimodal learning
ASJC Scopus subject areas
- Computer Science Applications
- Health Informatics
- Electrical and Electronic Engineering
- Health Information Management