Latest Knowledge Distillation Research Papers

The newest Knowledge Distillation papers from across the field — arXiv, NeurIPS, CVPR, Nature, and more — refreshed daily and ranked by relevance. Distill AI tracks Knowledge Distillation so you don’t have to: get the standout work delivered to your inbox every morning, with 2-sentence summaries and the option to chat with any paper.

Recent papers

X$^3$-OPD: Distilling Reasoning into Large Audio-Language Models via On-Policy Alignment
Dongjie Fu, Di Cao, Xize Cheng, Zihan Zhang et al. · arXiv · Jul 23, 2026
While large audio-language models have achieved remarkable progress in auditory perception, they still lag behind text-based large language models in deep logical reasoning, primarily due to the scarcity of high-quality audio reasoning data…
An End-to-End Trajectory Prediction Method for Unmanned Ground Vehicles via Multimodal Fusion
Y T Li, Erming Tian, Fuhe Yang, Huiyan Han et al. · Sensors · Jul 22, 2026
To enhance unmanned ground vehicle (UGV) intelligence in smart cities, disaster rescue, and infrastructure inspection, this paper investigates the collaborative optimization of multimodal fusion end-to-end architectures. Through dynamic ali…
CircuitKIT: Circuit Discovery, Evaluation, and Application Toolkit for Mechanistic Interpretability
Pratinav Seth, Hem Gosalia, Aditya Kasliwal, Vinay Kumar Sankarapu · arXiv · Jul 21, 2026
Circuit analysis can support not only model explanation but also downstream interventions such as pruning, editing, steering, and selective fine-tuning. However, conducting such analyses currently requires stitching together separate implem…
A Lightweight Graph Knowledge Distillation Framework for Real-Time Equipment Health Prediction on Edge Devices
Qiang Fei, Li Ping, Wei Zhang, Fan Li · International Journal of Co... · Jul 21, 2026
Real-time equipment health prediction on industrial edge devices requires predictive models that remain accurate under tight memory, energy, and latency constraints. Graph neural networks are attractive for this task because degradation sig…
A digital twin modeling for complex systems with multi-scale fault behaviors using active knowledge transposed distillation
Jinhan Zhou, Jinsong Yu, Diyin Tang, Pengcheng Zhang et al. · Applied Soft Computing · Jul 20, 2026
A Lightweight Universal Machine-Learning Interatomic Potential via Knowledge Distillation for Scalable Atomistic Simulations
Sangmin Oh, J.H. You, Jaesun Kim, Jiho Lee et al. · Journal of Chemical Informa... · Jul 20, 2026
We introduce a lightweight universal machine-learning interatomic potential (uMLIP), SevenNet-Nano, based on the graph neural network architecture SevenNet and enabled by a knowledge-distillation framework. The model inherits the broad gene…
Enhancing crack detection via memory-aware dynamic knowledge distillation
Zhao Liang, Jiao Yutong, Chen Dengfeng, Shipeng Liu · Multimedia Systems · Jul 20, 2026
A Lightweight Student Network with Dynamic Multi-Teacher Distillation for Optical Remote Sensing Object Detection
Jiarui Cai, Xudong Su, Hu Deng, Jun Deng · Sensors · Jul 20, 2026
Optical remote sensing object detection faces challenges such as large variations in scale, slender and direction-sensitive targets, complex backgrounds, and limited deployment resources. This paper proposes a lightweight geometrically deco…
Personalized Data-Free Knowledge Distillation for Federated Learning under Heterogeneous Models and Data
Jingke Tu, Lei Yang, Chao Ma, Weigang Wu · ACM Transactions on Knowled... · Jul 18, 2026
Knowledge Distillation (KD) is considered as an efficient way to replace the parameter averaging in federated learning, aiming to handle the clients with heterogeneous model architectures. Relying on the prepared distillation datasets acros…
Real-Time PAUT Defect Classification with Class-Weighted Knowledge Distillation
Minsu Jeon, Robin Guyon, Clément Fisher, Duhwan Mun et al. · e-Journal of Nondestructive... · Jul 17, 2026
Phased array ultrasonic testing (PAUT) provides high-resolution subsurface imaging through electronic beam steering and focusing and is widely used for internal defect diagnosis in structures. However, effectively interpreting PAUT data req…
On-Policy Delta Distillation
Byeongho Heo, Jaehui Hwang, Sangdoo Yun, Dongyoon Han · arXiv · Jul 16, 2026
On-policy distillation is an alternative post-training method in reinforcement learning that alleviates the constraints imposed by reward models by providing token-level supervision from a teacher model. Although on-policy distillation has …
Efficient power quality identification algorithm based on knowledge distillation and improved GhostNet
Zhenguan Cao, Zhian Luo, Yue Wang, Tingxiang Fan et al. · Electric Power Systems Rese... · Jul 16, 2026
Lightweight Semantic Perception from UAV-Borne Visual Sensors via Conflict-Suppressed Heterogeneous Expert Distillation
Feng Ouyang, Yongpeng Ding, Miao Qin, Weiting Xie et al. · Sensors · Jul 16, 2026
UAV-borne visual sensors provide high-resolution aerial observations for low-altitude scene understanding, urban monitoring, traffic observation, emergency inspection, and infrastructure assessment. However, semantic perception from UAV vis…
Lightweight mmWave radar human action recognition via knowledge distillation and physically enhanced representation
F M Liu, Yue Wang, Zuoheng Liu, Hanbo Liu et al. · Measurement Science and Tec... · Jul 16, 2026
Human activity recognition (HAR) based on millimeter-wave (mmWave) radar has attracted significant attention due to its advantages in non-contact sensing and privacy preservation. However, extracting robust fine-grained features …
A future-guided knowledge distillation framework for online degradation prognostics of proton exchange membrane fuel cells
Jing He, Teng Xu, Kai Zhang, Yong Li et al. · Applied Energy · Jul 16, 2026
An informed regression-based knowledge distillation framework for simultaneous prediction of physical and mechanical properties of thermoset epoxy polymers
B.S. Sindu, Jan Hamaekers · Scientific Reports · Jul 16, 2026
Epoxy polymers are widely used due to their multifunctional properties, however their complex 3D molecular structure, multi-component nature, and lack of curated datasets have limited the application of machine learning (ML) for these mater…
Bioactive potential of Ecuadorian Lippia dulcis essential oil: accelerated wound healing and chemical profiling
Chabaco Armijos, Jorge Ramírez, Gaby Cevallos, Santiago Ballaz et al. · Scientific Reports · Jul 16, 2026
This study describes the results of a phytochemical and pharmacological study of Lippia dulcis Trevir, grown in southern Ecuador, where the plant, commonly known by the name buscapina, is widely used by the local population for its therapeu…
Gait-semantic relational knowledge distillation for ground reaction force estimation
Huisu Lim, Jisoo Lee, Noah Kettner, Omik M. Save et al. · Advanced Engineering Inform... · Jul 15, 2026
Noise-robust iterative knowledge distillation for MIL-based weakly supervised histopathology segmentation
Yinsheng He, Roger J. Zemp, Xingyu Li · Biomedical Signal Processin... · Jul 15, 2026
A Lightweight Mining-Area Remote Sensing Scene Classification Framework via Knowledge Distillation and Channel-Aware Non-Local Attention
Wenxi He, Zi Li, Liangjun Wang, Weitao Chen · Land · Jul 15, 2026
Mining-area remote sensing scene classification plays an important role in mineral resource monitoring and ecological environment assessment. However, high-accuracy deep learning models usually exhibit complex architectures, large parameter…
Requential Coding: Pushing the Limits of Model Compression with Self-Generated Training Data
Shikai Qiu, Marc Finzi, Yujia Zheng, Kun Zhang et al. · arXiv · Jul 13, 2026
Compression is fundamental to intelligence. A model that can represent its training data as a short code has discovered regularities that enable generalization. Large neural networks may learn functions far simpler than their parameter coun…
Generic saliency-guided image fusion GAN based on reconstruction knowledge distillation
Mohamed Kas, Ibrahim Kajo, Abderrazak Chahi, Yassine Ruichek · Multimedia Tools and Applic... · Jul 13, 2026
Research on a lightweight point-line feature extraction network based on knowledge distillation
Shizu Wei, Bo Xie, Pengju Hou, Xiaochun Yang et al. · OpenAlex · Jul 13, 2026
Achieving efficient and robust point-line feature extraction remains a key challenge for real-time visual simultaneous localization and mapping (Visual SLAM) systems, particularly on resource-constrained platforms. Existing deep learning-ba…
A Transformer-VAE Framework with Knowledge Distillation for Fast Prediction of Nuclear Power Plant Accident Transient Response
Bo Pang, Yuanfeng Lin, Guoxu Qin, Siyuan Zhang et al. · Processes · Jul 13, 2026
Conventional analysis of nuclear power plant accident transient responses heavily relies on physical simulation programs, whose computational time significantly exceeds the actual accident response duration, thereby severely hindering real-…
Lightweight visual generation model for mobile video creation
Z Y Li · OpenAlex · Jul 13, 2026
To enhance the real-time performance and deployment efficiency of image generation models in mobile video creation scenarios, we construct the lightweight visual generation network LIGN. This incorporates cross-modal conditional fusion and …
Crafting Medicine: Artisans, Knowledge and the Common Man in Hieronymus Brunschwig’s Books on Surgery and Distillation
Holly Fletcher · Ambix · Jul 13, 2026
Crafting Medicine is the first sustained study of the Strasbourg craftsman, surgeon, and author Hieronymus Brunschwig (ca.1450–ca.1530). As Taape states at the outset, Brunschwig is a somewhat obsc…
Device-specific lightweight color constancy via knowledge distillation and fuzzy PID-guided training
Chongbao Zhao, Yunhui Luo, Mingyu Shang, Haiming Qu · OpenAlex · Jul 13, 2026
In color constancy tasks, deep learning methods perform well but are computationally expensive, limiting deployment on resource-constrained devices. General models perform poorly on specific devices, and obtaining labeled data for these dev…
Decentralized federated distillation for privacy-preserving cross-league basketball data collaboration
S Liu, H X Guan, Q Y Wang · Scientific Reports · Jul 12, 2026
Cross-league basketball analytics promises richer, more transferable performance models, yet competitive sensitivities and data-protection regulations make raw data sharing across leagues impractical. We propose a decentralized federated di…
Exploring the Practice Elements of Attachment-Based Interventions: A Distillation-and-Matching Review
Ahmed Riaz Mohamed, P.S. Sterkenburg, Esmé van Rensburg, Carlo Schuengel · Child & Youth Care Forum · Jul 11, 2026
Background: Attachment-based interventions enjoy a growing evidence-base for their effectiveness. These interventions may be decomposed into practice elements and studied for how they may cluster differentially for different childre…
Object Embedding-Based Knowledge Distillation for Enhanced Visual Question Answering
Himel Das Gupta, Victor S. Sheng · Neural Processing Letters · Jul 11, 2026
With the rapid advancement of AI, multi-modal tasks have become key components in enhancing machine intelligence. We can observe their presence in everyday technology. A prominent example is Visual Question Answering (VQA), where users inte…

