Latest Machine Learning Research Papers
The newest Machine Learning papers from across the field — arXiv, NeurIPS, CVPR, Nature, and more — refreshed daily and ranked by relevance. Distill AI tracks Machine Learning so you don’t have to: get the standout work delivered to your inbox every morning, with 2-sentence summaries and the option to chat with any paper.
Recent papers
- LGAN: An Efficient High-Order Graph Neural Network via the Line Graph Aggregation
Lin Du, Lu Bai, Jincheng Li, Lixin Cui et al. · AAAI 2026 · Dec 31, 2026
Graph Neural Networks (GNNs) have emerged as a dominant paradigm for graph classification. Specifically, most existing GNNs mainly rely on the message passing strategy between neighbor nodes, where the expressivity is limited by the 1-dimen…
- When to Align, When to Predict: A Phase Diagram for Multimodal Learning
Ilay Kamai, Hugues Van Assel, Aviv Regev, Hagai B. Perets et al. · arXiv · Jun 9, 2026
Cross-modal alignment (CA) and cross-modal prediction (CP) are the dominant paradigms for multimodal representation learning, yet there is no systematic understanding of when each succeeds, when each fails, and when cross-modal training hel…
- A Unifying Lens on Supervised Fine-Tuning Through Target Distribution Design
Tong Xie, Yuanhao Ban, Yunqi Hong, Sohyun An et al. · arXiv · Jun 9, 2026
Supervised fine-tuning (SFT) typically maximizes the likelihood of every token in a demonstrated trajectory. However, an observed token can be non-unique, noisy, or misaligned with the model prior. Strictly fitting toward this one-hot targe…
- EEVEE: Towards Test-time Prompt Learning in the Real World for Self-Improving Agents
Weixian Xu, Shilong Liu, Mengdi Wang · arXiv · Jun 9, 2026
In this paper, we propose EEVEE, the first multi-dataset test-time prompt learning framework for LLM agents, enabling test-time prompt learning under real-world task streams. Existing methods are largely designed for single-dataset settings…
- The Role of Feedback Alignment in Self-Distillation
Semih Kara, Oğuzhan Ersoy · arXiv · Jun 9, 2026
Conditioning a language model on additional context, such as feedback on a previous attempt, typically improves its response. Self-distillation trains the model to retain this improvement when the context is not present. The method works by…
- Predicting Future Behaviors in Reasoning Models Enables Better Steering
Evgenii Kortukov, Piotr Komorowski, Florian Klein, Paula Engl et al. · arXiv · Jun 9, 2026
Deployed large reasoning models (LRMs) often behave unexpectedly. Test-time steering controls LRM outputs by intervening on their hidden representations, but it can degrade output quality. We argue that prior steering work implicitly relies…
- Algorithmic and Minimax Complexities in Kernel Bandits
Yunbei Xu · arXiv · Jun 9, 2026
Gaussian-process upper confidence bound (GP-UCB) and decision-estimation-coefficient (DEC) methods may appear, at first sight, to belong to different theories. This paper places the two viewpoints in a common algorithmic-information languag…
- COGENT: Continuous Graph Emulators with Neural Ordinary Differential Equations for Long-Term Physical Forecasting
Zesheng Liu, Maryam Rahnemoonfar · arXiv · Jun 9, 2026
In this work, we present COGENT, a continuous graph emulator with Neural Ordinary Differential Equations for long-term physical forecasting on irregular geospatial meshes. COGENT encodes a finite history of system states and associated forc…
- Itô maps for any-step SDEs
Zhengkai Pan, Peter Potaptchik, Wenxi Yao, Michael S. Albergo et al. · arXiv · Jun 9, 2026
Recent one-step generative models accelerate sampling by learning deterministic flow maps of the underlying dynamics. These methods rely on learning from ordinary differential equations, leaving open how to define an exact distillation proc…
- Efficiently Learning Drifting Halfspaces with Massart Noise
Mingchen Ma, Guyang Cao, Jelena Diakonikolas, Ilias Diakonikolas · arXiv · Jun 9, 2026
We study the problem of learning a drifting concept in the presence of Massart noise. In this framework, an online learner has access to a history of independent samples whose labels are noisy versions of a target concept that may change fr…
- OncoTraj: a public benchmark for longitudinal resistance prediction in EGFR-mutant non-small-cell lung cancer on osimertinib
Abhijoy Sarkar, Aarchi Singh Thakur · arXiv · Jun 9, 2026
Resistance to first-line osimertinib in EGFR-mutant non-small-cell lung cancer (NSCLC) is the canonical example of predictable clonal evolution under therapeutic pressure, yet no public benchmark exists for training or evaluating computatio…
- Data assimilation for subsurface flow using latent diffusion model parameterization: performance of ensemble-Kalman and Monte Carlo techniques
Guido Di Federico, Wenchao Teng, Louis J. Durlofsky · arXiv · Jun 9, 2026
Data assimilation (DA) in subsurface flow entails calibrating model parameters to match observed data, typically at wells, while preserving geological realism. Latent diffusion models (LDMs) provide efficient mappings from high-dimensional …
- First-Order Trajectory Matching: Fast Ensemble Predictions of Chaotic, Turbulent, Stochastic Systems
Shreya Jha, Timo Schorlepp, Nicholas Geissler, Jules Berman et al. · arXiv · Jun 9, 2026
We introduce First-Order Trajectory Matching (FTM), a surrogate-modeling method that learns the first-order local transport of probability mass from trajectories of stochastic systems. By matching the symmetric first-order motion of traject…
- Robust Regression of General ReLUs with Queries
Ilias Diakonikolas, Daniel M. Kane, Mingchen Ma · arXiv · Jun 9, 2026
We study the task of agnostically learning general (as opposed to homogeneous) ReLUs under the Gaussian distribution with respect to the squared loss. In the passive learning setting, recent work gave a computationally efficient algorithm t…
- DMT: Demographic Conditioning, Morphology-Enhanced Transformer for Cuffless Blood Pressure Estimation from PPG Signals
Yidan Shen, Neville Mathew, Maham Rahimi, Deependra Dhakal et al. · arXiv · Jun 9, 2026
Blood pressure (BP) is a key marker for cardiovascular risk assessment and therapeutic decision-making, and Photoplethysmography (PPG) enables low-cost, wearable-friendly cuffless BP estimation. However, even with recent progress, many PPG-…
- Overcoming Rank Collapse in Feedback Alignment
Gauthier Boeshertz, Razvan Pascanu, Claudia Clopath · arXiv · Jun 9, 2026
Backpropagation (BP) is widely viewed as biologically implausible, in part because it requires feedback weights to be the transpose of forward weights for error propagation. Interestingly, when training a network with fixed random feedback …
- TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning
Heming Zou, Qi Wang, Yun Qu, Yuhang Jiang et al. · arXiv · Jun 9, 2026
Reinforcement learning with verifiable rewards (RLVR) is a promising approach for enhancing reasoning and agentic behavior in large language models. However, rollout-intensive policy optimization is often limited by insufficient reward cont…
- Data-Driven Dynamic Assortment in Online Platforms: Learning about Two Sides
Rahul Roy, Nur Sunar, Jayashankar M. Swaminathan · arXiv · Jun 9, 2026
We study a dynamic assortment problem on a two-sided service platform with incomplete information and heterogeneous customers in a discrete-time setting. In each period, a customer arrives seeking service, and the platform chooses an assort…
- Multimodal Brain Tumour Classification Using Feature Fusion
Wajih ul Islam, Muhammad Yaqoob, Javed Ali Khan, Volker Steuber · arXiv · Jun 9, 2026
Clinicians diagnose brain tumors by synthesizing patient symptoms, medical history, and quantitative imaging data from modalities such as MRI and CT scans into a unified clinical judgement. However, most deep learning models rely on MRI/CT …
- Limitations of Learning Tanh Neural Networks with Finite Precision
Philipp Grohs, Matěj Trödler · arXiv · Jun 9, 2026
We investigate limitations of learning $\tanh$ neural networks from point evaluations under finite-precision computations and $L^p$ accuracy guarantees, building on Berner, Grohs, and Voigtländer (2023). Our approach is based on a novel con…
- Do Transformers Actually Help Intrusion Detection? A Temporal Sequence Evaluation on CIC-IDS2017
Zach Moczkodan, Hany Ragab · arXiv · Jun 9, 2026
Recent deep learning approaches for network intrusion detection increasingly incorporate temporal architectures such as recurrent networks and Transformers, often reporting near-perfect performance on CIC-IDS2017. However, many existing stu…
- Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning
Zhiyuan Zhou, Andy Peng, Charles Xu, Qiyang Li et al. · arXiv · Jun 9, 2026
Expressive continuous control policies, such as diffusion and flow models, form the backbone of recent advances in scaling imitation learning for simulated and real robot control. While they are known to scale stably in the supervised imita…
- An Agency-Transferring Model-Free Policy Enhancement Technique
Anton Bolychev, Georgiy Malaniya, Sinan Ibrahim, Pavel Osinenko · arXiv · Jun 8, 2026
Training reinforcement learning (RL) policies from scratch is costly: it requires careful reward and environment design, extensive tuning, and substantial computation. Yet many control problems already have a functional but suboptimal polic…
- Rethinking the Divergence Regularization in LLM RL
Jiarui Yao, Xiangxin Zhou, Penghui Qi, Wee Sun Lee et al. · arXiv · Jun 8, 2026
Reinforcement learning (RL) has become a key component of post-training large language models (LLMs). In practice, LLM RL is often off-policy because of training-inference mismatch and policy staleness, making trust-region control essential…
- Weighted universal approximation of differentiable maps on infinite-dimensional manifolds
Philipp Schmocker, Josef Teichmann · arXiv · Jun 8, 2026
We generalize the universal approximation theorem for functional input neural networks (FNN) to differentiable maps by including the approximation of the derivatives. A FNN maps the input from a possibly infinite-dimensional weighted manifo…
- Topological Neural Operators
Lennart Bastian, Samuel Leventhal, Mustafa Hajij, Tolga Birdal · arXiv · Jun 8, 2026
We introduce Topological Neural Operators (TNOs), a principled framework for operator learning on cell complexes that lifts neural operators (NOs) from functions on points and/or edges to topological domains. TNOs represent data as features…
- Echo-Memory: A Controlled Study of Memory in Action World Models
Wayne King, Zeyue Xue, Yuxuan Bian, Jie Huang et al. · arXiv · Jun 8, 2026
We present Echo-Memory, a controlled study of memory mechanisms in action-conditioned world models. These models generate multi-segment videos from a first frame, text prompt, and camera-action sequence, but their central failure i…
- Bandits for Efficient Experimentation: Adapting to Control Group, Preferences, and Context Drifts
Udvas Das, Waris Radji, Debabrota Basu, Odalric-Ambrym Maillard · arXiv · Jun 8, 2026
We consider a variant of the linear contextual stochastic multi-armed bandits, where the learner must provide recommendations to a group of users, each having its personalized preference vector, and in the presence of context distributions …
- Zero Touch Predictive Orchestration: Automating Time-Series Models for the Cloud-Edge Continuum
Abd Elghani Meliani, Arora Sagar, Adlen Ksentini, Raymond Knopp · arXiv · Jun 8, 2026
The Cloud-Edge Continuum (CEC) enables latency-critical applications by distributing resources to the far edge, but its extreme volatility makes proactive Zero Touch Management via time-series forecasting essential. However, orchestrators f…
- Discovering Functionally Selective Brain Regions with a Deep Topographic Multimodal Model
Badr AlKhamissi, Johannes Mehrer, Lara Marinov, Ahmed Abdelaal et al. · arXiv · Jun 8, 2026
Nearby neurons in cortex share similar response profiles, producing systematic spatial organization across sensory and cognitive systems. Recent topographic models reproduce aspects of this structure but remain unimodal and spatially constr…