Sensors, Vol. 26, Pages 209: Knowledge Distillation Meets Reinforcement Learning: A Cluster-Driven Approach to Image Processing
Sensors doi: 10.3390/s26010209
Authors:
Titinunt Kitrungrotsakul
Yingying Xu
Preeyanuch Srichola
Knowledge distillation (KD) enables the training of lightweight yet effective models, particularly in the visual domain. Meanwhile, reinforcement learning (RL) facilitates adaptive learning through environment-driven interactions, addressing the limitations of KD in handling dynamic and complex tasks. We propose a novel two-stage framework integrating Knowledge Distillation with Reinforcement Learning (KDRL) to enhance model adaptability to complex data distributions, such as those in remote sensing and medical imaging. In the first stage, supervised fine-tuning guides the student model using logit- and feature-based distillation. The second stage refines the model via RL, leveraging confidence-based and cluster-alignment rewards while dynamically reducing reliance on the task loss. By combining the strengths of supervised knowledge distillation and reinforcement learning, KDRL provides a comprehensive approach to the dual challenges of model efficiency and domain heterogeneity. A key innovation is the introduction of auxiliary layers within the student encoder that evaluate and reward the alignment of student features with the teacher’s cluster centers, promoting robust feature learning. Our framework demonstrates superior performance and computational efficiency across diverse tasks, establishing a scalable design for efficient model training. Across remote sensing benchmarks, KDRL boosts the lightweight CLIP/ViT-B-32 student to 69.51% zero-shot accuracy on AID and 80.08% on RESISC45; achieves state-of-the-art cross-modal retrieval on RSITMD with 67.44% (I→T) and 74.76% (T→I) at R@10; and improves DIOR-RSVG visual-grounding precision to 64.21% at Pr@0.9. These gains matter in real deployments by reducing missed targets and speeding up analyst search on resource-constrained platforms.
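To make the two-stage recipe concrete, the sketch below outlines how the pieces named in the abstract could fit together in PyTorch: a Stage 1 loss combining logit and feature distillation, and Stage 2 rewards based on student confidence and alignment with the teacher's cluster centers, with the task-loss weight decayed over training. The temperature, weighting coefficients, decay schedule, and nearest-center cosine similarity are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, student_feat, teacher_feat,
                      temperature=4.0, alpha=0.5):
    """Stage 1: supervised fine-tuning with logit + feature distillation.
    Temperature, alpha, and the MSE feature term are illustrative choices."""
    # Soft-label (logit) distillation: KL divergence between tempered softmaxes
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Feature-based distillation: match intermediate representations
    feat = F.mse_loss(student_feat, teacher_feat)
    return alpha * kd + (1.0 - alpha) * feat

def cluster_alignment_reward(student_feat, teacher_centers):
    """Stage 2: reward alignment of student features with the teacher's
    cluster centers (cosine similarity to the nearest center; assumed form)."""
    sims = F.cosine_similarity(
        student_feat.unsqueeze(1),      # (B, 1, D)
        teacher_centers.unsqueeze(0),   # (1, K, D)
        dim=-1,
    )                                   # (B, K)
    return sims.max(dim=1).values.mean()

def confidence_reward(student_logits):
    """Confidence-based reward: mean maximum softmax probability (assumed form)."""
    return F.softmax(student_logits, dim=-1).max(dim=-1).values.mean()

def rl_stage_objective(student_logits, student_feat, teacher_centers,
                       task_loss, step, total_steps, beta=1.0):
    """Combine the rewards while dynamically reducing reliance on the task loss;
    the linear decay schedule and beta weighting are assumptions."""
    task_weight = max(0.0, 1.0 - step / total_steps)  # decays toward zero
    reward = confidence_reward(student_logits) + beta * cluster_alignment_reward(
        student_feat, teacher_centers)
    # Minimize the decayed task loss minus the reward signal
    return task_weight * task_loss - reward
```

In this reading, the auxiliary layers attached to the student encoder would supply `student_feat` in the space where the teacher's cluster centers live; how those layers and centers are constructed is left to the paper itself.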
