research · en · weight 1.1

  1. arXiv cs.LG/arxiv.org/
    A Discordance-Aware Multimodal Framework with Multi-Agent Clinical Reasoning

    arXiv:2604.16333v1 Announce Type: new Abstract: Knee osteoarthritis frequently exhibits discordance between structural damage observed in imaging and patient-reported symptoms such as pain. This mismatch complicates clinical interpretation and patient stratification and remains insufficiently modeled in existing decision support systems. We propose a discordance aware multimodal framework that co…

    1.0#cs.lg#cs.ai
  2. arXiv cs.LG/arxiv.org/
    Annotation Entropy Predicts Per-Example Learning Dynamics in LoRA Fine-Tuning

    arXiv:2604.16332v1 Announce Type: new Abstract: We find that LoRA fine-tuning exhibits un-learning on contested examples: items with high annotator disagreement show increasing loss during training, a qualitatively distinct pattern largely absent under full fine-tuning and consistent across all six models tested (four encoder, two decoder-only). This discovery emerges from correlating annotation …

    1.0#cs.lg#cs.cl
  3. arXiv cs.LG/arxiv.org/
    Reasoning on the Manifold: Bidirectional Consistency for Self-Verification in Diffusion Language Models

    arXiv:2604.16565v1 Announce Type: new Abstract: While Diffusion Large Language Models (dLLMs) offer structural advantages for global planning, efficiently verifying that they arrive at correct answers via valid reasoning traces remains a critical challenge. In this work, we propose a geometric perspective: Reasoning on the Manifold. We hypothesize that valid generation trajectories reside as stab…

    1.3#cs.lg#cs.ai
  4. arXiv cs.LG/arxiv.org/
    An Interpretable Framework Applying Protein Words to Predict Protein-Small Molecule Complementary Pairing Rules

    arXiv:2604.16550v1 Announce Type: new Abstract: Despite the high accuracy of 'black box' deep learning models, drug discovery still relies on protein-ligand interaction principles and heuristics. To improve interpretability of protein-small molecule binding predictions, we developed the PWRules framework, which applies binding affinity data to identify privileged small molecule fragments and subs…

    1.0#cs.lg#cs.ai
  5. arXiv cs.LG/arxiv.org/
    Multi-Label Phase Diagram Prediction in Complex Alloys via Physics-Informed Graph Attention Networks

    arXiv:2604.16468v1 Announce Type: new Abstract: Accurate phase equilibria are foundational to alloy design because they encode the underlying thermodynamics governing stability, transformations, and processing windows. However, while the CALculation of Phase Diagrams (CALPHAD) provides a rigorous thermodynamic framework, exploring multicomponent composition-temperature space remains computational…

    1.0#cs.lg#cond-mat.mtrl-sci
  6. arXiv cs.LG/arxiv.org/
    (Sparse) Attention to the Details: Preserving Spectral Fidelity in ML-based Weather Forecasting Models

    arXiv:2604.16429v1 Announce Type: new Abstract: We introduce Mosaic, a probabilistic weather forecasting model that addresses two principal sources of spectral degradation in ML-based weather prediction: (1) deterministic training against ensemble means and (2) compressive encoding creating an information bottleneck. Mosaic generates ensemble members through learned functional perturbations and o…

    1.0#cs.lg#cs.ai
  7. arXiv cs.LG/arxiv.org/
    Matched-Learning-Rate Analysis of Attention Drift and Transfer Retention in Fine-Tuned CLIP

    arXiv:2604.16410v1 Announce Type: new Abstract: CLIP adaptation can improve in-domain accuracy while degrading out-of-domain transfer, but comparisons between Full Fine-Tuning (Full FT) and LoRA are often confounded by different learning-rate conventions. We study how adaptation method and optimization scale jointly shape attention drift and transfer retention in CLIP using a controlled matched-l…

    1.0#cs.lg
  8. arXiv cs.LG/arxiv.org/
    In Search of Lost DNA Sequence Pretraining

    arXiv:2604.16570v1 Announce Type: new Abstract: DNA sequence encoding is fundamental to gene function prediction, protein synthesis, and diverse downstream biological tasks. Despite the substantial progress achieved by large-scale DNA sequence pretraining, existing studies have overwhelmingly emphasized pretraining scale and custom downstream evaluation datasets, while neglecting some essential c…

    1.0#cs.lg#cs.ai
  9. arXiv cs.LG/arxiv.org/
    The Global Neural World Model: Spatially Grounded Discrete Topologies for Action-Conditioned Planning

    arXiv:2604.16585v1 Announce Type: new Abstract: We present the Global Neural World Model (GNWM), a self-stabilizing framework that achieves topological quantization through balanced continuous entropy constraints. Operating as a continuous, action-conditioned Joint-Embedding Predictive Architecture (JEPA), the GNWM maps environments onto a discrete 2D grid, enforcing translational equivariance wi…

    1.0#cs.lg#cs.ai
  10. arXiv cs.LG/arxiv.org/
    POLAR: Online Learning for LoRA Adapter Caching and Routing in Edge LLM Serving

    arXiv:2604.16583v1 Announce Type: new Abstract: Edge deployment of large language models (LLMs) increasingly relies on libraries of lightweight LoRA adapters, yet GPU/DRAM can keep only a small resident subset at a time. Serving a request through a non-resident adapter requires paging its weights from storage, incurring measurable latency. This creates a two-timescale online control problem: on a…

    1.3#cs.lg#cs.ai
  11. arXiv cs.LG/arxiv.org/
    SetFlow: Generating Structured Sets of Representations for Multiple Instance Learning

    arXiv:2604.16362v1 Announce Type: new Abstract: Data scarcity and weak supervision continue to limit the performance of machine learning models in many real-world applications, such as mammography, where Multiple Instance Learning (MIL) often offers the best formulation. While recent foundation models provide strong semantic representations out of the box, effective augmentation of such represent…

    1.0#cs.lg#cs.ai
  12. arXiv cs.LG/arxiv.org/
    UniMamba: A Unified Spatial-Temporal Modeling Framework with State-Space and Attention Integration

    arXiv:2604.16325v1 Announce Type: new Abstract: Multivariate time series forecasting is fundamental to numerous domains such as energy, finance, and environmental monitoring, where complex temporal dependencies and cross-variable interactions pose enduring challenges. Existing Transformer-based methods capture temporal correlations through attention mechanisms but suffer from quadratic computatio…

    1.0#cs.lg#cs.ai
  13. arXiv cs.LG/arxiv.org/
    FedLLM: A Privacy-Preserving Federated Large Language Model for Explainable Traffic Flow Prediction

    arXiv:2604.16612v1 Announce Type: new Abstract: Traffic prediction plays a central role in intelligent transportation systems (ITS) by supporting real-time decision-making, congestion management, and long-term planning. However, many existing approaches face practical limitations. Most spatio-temporal models are trained on centralized data, rely on numerical representations, and offer limited exp…

    1.3#cs.lg
  14. arXiv cs.LG/arxiv.org/
    Randomized Antipodal Search Done Right for Data Pareto Improvement of LLM Unlearning

    arXiv:2604.16591v1 Announce Type: new Abstract: Large language models (LLMs) sometimes memorize undesirable knowledge, which must be removed after deployment. Prior work on machine unlearning has focused largely on optimization methods that adjust parameters to enforce forgetting while preserving retention. However, these approaches assume that the forget and retain sets are readily available, wh…

    1.3#cs.lg#cs.ai
  15. arXiv cs.LG/arxiv.org/
    Global Attention with Linear Complexity for Exascale Generative Data Assimilation in Earth System Prediction

    arXiv:2604.16590v1 Announce Type: new Abstract: Accurate weather and climate prediction relies on data assimilation (DA), which estimates the Earth system state by integrating observations with models. While exascale computing has significantly advanced earth simulation, scalable and accurate inference of the Earth system state remains a fundamental bottleneck, limiting uncertainty quantification…

    1.0#cs.lg#cs.ai
  16. arXiv cs.LG/arxiv.org/
    Hybrid Spectro-Temporal Fusion Framework for Structural Health Monitoring

    arXiv:2604.16589v1 Announce Type: new Abstract: Structural health monitoring plays a critical role in ensuring structural safety by analyzing vibration responses from engineering systems. This paper proposes a Spectro-Temporal Alignment framework and a Hybrid Spectro-Temporal Fusion framework that integrate arrival-time interval descriptors with spectral features to capture both fine-scale and co…

    1.0#cs.lg#cs.ai
  17. arXiv cs.LG/arxiv.org/
    A Systematic Survey and Benchmark of Deep Learning for Molecular Property Prediction in the Foundation Model Era

    arXiv:2604.16586v1 Announce Type: new Abstract: Molecular property prediction integrates quantum chemistry, cheminformatics, and deep learning to connect molecular structure with physicochemical and biological behavior. This survey traces four complementary paradigms, including Quantum, Descriptor Machine Learning, Geometric Deep Learning, and Foundation Models, and outlines a unified taxonomy li…

    1.0#cs.lg#cs.ai
  18. arXiv cs.LG/arxiv.org/
    Towards Trustworthy Depression Estimation via Disentangled Evidential Learning

    arXiv:2604.16579v1 Announce Type: new Abstract: Automated depression estimation is highly vulnerable to signal corruption and ambient noise in real-world deployment. Prevailing deterministic methods produce uncalibrated point estimates, exposing safety-critical clinical systems to the severe risk of overconfident misdiagnoses. To establish a highly resilient and trustworthy assessment paradigm, w…

    1.0#cs.lg#cs.ai
  19. arXiv cs.LG/arxiv.org/
    NCO4CVRP: Neural Combinatorial Optimization for the Capacitated Vehicle Routing Problem

    arXiv:2604.16581v1 Announce Type: new Abstract: Neural Combinatorial Optimization (NCO) has emerged as a powerful framework for solving combinatorial optimization problems by integrating deep learning-based models. This work focuses on improving existing inference techniques to enhance solution quality and generalization. Specifically, we modify the Random Re-Construct (RRC) approach of the Light…

    1.0#cs.lg#cs.ai
  20. arXiv cs.LG/arxiv.org/
    Continuous ageing trajectory representations for knee-aware lifetime prediction of lithium-ion batteries across heterogeneous dataset

    arXiv:2604.16580v1 Announce Type: new Abstract: Accurate assessment of lithium-ion battery ageing is challenged by cell-to-cell variability, heterogeneous cycling protocols, and limited transferability of data-driven models across datasets. In particular, robust identification of degradation transitions, such as the knee point, and reliable early-life prediction of remaining useful life (RUL) rem…

    1.0#cs.lg#cs.ai
  21. arXiv cs.LG/arxiv.org/
    Evaluating Temporal and Structural Anomaly Detection Paradigms for DDoS Traffic

    arXiv:2604.16575v1 Announce Type: new Abstract: Unsupervised anomaly detection is widely used to detect Distributed Denial-of-Service (DDoS) attacks in cloud-native 5G networks, yet most studies assume a fixed traffic representation, either temporal or structural, without validating which feature space best matches the data. We propose a lightweight decision framework that prioritizes temporal or…

    1.0#cs.lg#cs.ai
  22. arXiv cs.LG/arxiv.org/
    From User Recognition to Activity Counting: An Identity-Agnostic Approach to Multi-User WiFi Sensing

    arXiv:2604.16572v1 Announce Type: new Abstract: Wi-Fi Channel State Information (CSI) enables device-free human activity recognition, but existing multi-user approaches assume a fixed set of known users during both training and inference. This closed-set assumption limits deployment, as models trained on a specific user set degrade when applied to new individuals or environments. We reformulate m…

    1.0#cs.lg
  23. arXiv cs.LG/arxiv.org/
    Positive-Only Drifting Policy Optimization

    arXiv:2604.16519v1 Announce Type: new Abstract: In the field of online reinforcement learning (RL), traditional Gaussian policies and flow-based methods are often constrained by their unimodal expressiveness, complex gradient clipping, or stringent trust-region requirements. Moreover, they all rely on post-hoc penalization of negative samples to correct erroneous actions. This paper introduces Po…

    1.0#cs.lg#cs.ro
  24. arXiv cs.LG/arxiv.org/
    Cross-Modal Generation: From Commodity WiFi to High-Fidelity mmWave and RFID Sensing

    arXiv:2604.16558v1 Announce Type: new Abstract: AIGC has shown remarkable success in CV and NLP, and has recently demonstrated promising potential in the wireless domain. However, significant data imbalance exists across RF modalities, with abundant WiFi data but scarce mmWave and RFID data due to high acquisition cost. This makes it difficult to train high-quality generative models for these dat…

    1.0#cs.lg
  25. arXiv cs.LG/arxiv.org/
    S-GRPO: Unified Post-Training for Large Vision-Language Models

    arXiv:2604.16557v1 Announce Type: new Abstract: Current post-training methodologies for adapting Large Vision-Language Models (LVLMs) generally fall into two paradigms: Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL). Despite their prevalence, both approaches suffer from inefficiencies when applied in isolation. SFT forces the model's generation along a single expert trajectory, ofte…

    1.0#cs.lg#cs.cl
  26. arXiv cs.LG/arxiv.org/
    LLM as a Tool, Not an Agent: Code-Mined Tree Transformations for Neural Architecture Search

    arXiv:2604.16555v1 Announce Type: new Abstract: Neural Architecture Search (NAS) aims to automatically discover high-performing deep neural network (DNN) architectures. However, conventional algorithm-driven NAS relies on carefully hand-crafted search spaces to ensure executability, which restricts open-ended exploration. Recent coding-based agentic approaches using large language models (LLMs) r…

    1.3#cs.lg#cs.ai
  27. arXiv cs.LG/arxiv.org/
    Towards Reliable Testing of Machine Unlearning

    arXiv:2604.16536v1 Announce Type: new Abstract: Machine learning components are now central to AI-infused software systems, from recommendations and code assistants to clinical decision support. As regulations and governance frameworks increasingly require deleting sensitive data from deployed models, machine unlearning is emerging as a practical alternative to full retraining. However, unlearnin…

    1.0#cs.lg#cs.ai
  28. arXiv cs.LG/arxiv.org/
    SCATR: Simple Calibrated Test-Time Ranking

    arXiv:2604.16535v1 Announce Type: new Abstract: Test-time scaling (TTS) improves large language models (LLMs) by allocating additional compute at inference time. In practice, TTS is often achieved through parallel scaling: generating multiple candidate responses and selecting the best via a Best-of-N (BoN) strategy. Its effectiveness therefore hinges on the scoring function. Learned scorers such …

    1.3#cs.lg#cs.ai
  29. arXiv cs.LG/arxiv.org/
    BASIS: Balanced Activation Sketching with Invariant Scalars for "Ghost Backpropagation"

    arXiv:2604.16324v1 Announce Type: new Abstract: The activation memory required for exact backpropagation scales linearly with network depth, context length, and feature dimensionality, forming an O(L * BN) spatial bottleneck (where B is the sequence-batch cardinality and N is the feature dimension). This constraint historically throttles the scaling of deep neural networks. While randomized auto…

    1.0#cs.lg
  30. arXiv cs.LG/arxiv.org/
    Sampling for Quality: Training-Free Reward-Guided LLM Decoding via Sequential Monte Carlo

    arXiv:2604.16453v1 Announce Type: new Abstract: We introduce a principled probabilistic framework for reward-guided decoding in large language models, addressing the limitations of standard decoding methods that optimize token-level likelihood rather than sequence-level quality. Our method defines a reward-augmented target distribution over complete sequences by combining model transition probabi…

    1.3#cs.lg#cs.ai
  31. arXiv cs.LG/arxiv.org/
    Dimensional Criticality at Grokking Across MLPs and Transformers

    arXiv:2604.16431v1 Announce Type: new Abstract: Abrupt transitions between distinct dynamical regimes are a hallmark of complex systems. Grokking in deep neural networks provides a striking example -- an abrupt transition from memorization to generalization long after training accuracy saturates -- yet robust macroscopic signatures of this transition remain elusive. Here we introduce TDU-…

    1.3#cs.lg#cond-mat.dis-nn
  32. arXiv cs.LG/arxiv.org/
    Non-Stationarity in the Embedding Space of Time Series Foundation Models

    arXiv:2604.16428v1 Announce Type: new Abstract: Time series foundation models (TSFMs) are widely used as generic feature extractors, yet the notion of non-stationarity in their embedding spaces remains poorly understood. Recent work often conflates non-stationarity with distribution shift, blurring distinctions fundamental to classical time-series analysis and long-standing methodologies such as …

    1.0#cs.lg#cs.ai
  33. arXiv cs.LG/arxiv.org/
    Functional Similarity Metric for Neural Networks: Overcoming Parametric Ambiguity via Activation Region Analysis

    arXiv:2604.16426v1 Announce Type: new Abstract: As modern deep learning architectures grow in complexity, representational ambiguity emerges as a critical barrier to their interpretability and reliable merging. For ReLU networks, identical functional mappings can be achieved through entirely different weight configurations due to algebraic symmetries: neuron permutation and positive diagonal scal…

    1.0#cs.lg
  34. arXiv cs.LG/arxiv.org/
    FedOBP: Federated Optimal Brain Personalization through Cloud-Edge Element-wise Decoupling

    arXiv:2604.16574v1 Announce Type: new Abstract: Federated Learning (FL) faces challenges from client data heterogeneity and resource-constrained mobile devices, which can degrade model accuracy. Personalized Federated Learning (PFL) addresses this issue by adapting shared global knowledge to local data distributions. A promising approach in PFL is model decoupling, which separates the model into …

    1.0#cs.lg#cs.ai
  35. arXiv cs.LG/arxiv.org/
    Shifting the Gradient: Understanding How Defensive Training Methods Protect Language Model Integrity

    arXiv:2604.16423v1 Announce Type: new Abstract: Defensive training methods such as positive preventative steering (PPS) and inoculation prompting (IP) offer surprising results through seemingly similar processes: both add trait-inducing objects to large language models (LLMs) during training, and both defend the LLM against acquiring the trait. The surprising success of these methods comes with t…

    1.3#cs.lg#cs.ai
  36. arXiv cs.LG/arxiv.org/
    G-PARC: Graph-Physics Aware Recurrent Convolutional Neural Networks for Spatiotemporal Dynamics on Unstructured Meshes

    arXiv:2604.16533v1 Announce Type: new Abstract: Physics-aware recurrent convolutional networks (PARC) have demonstrated strong performance in predicting nonlinear spatiotemporal dynamics by embedding differential operators directly into the computational graph of a neural network. However, pixel-based convolutions are restricted to static, uniform Cartesian grids, making them ill-suited to follow…

    1.0#cs.lg#cs.ai
  37. arXiv cs.LG/arxiv.org/
    CGCMA: Conditionally-Gated Cross-Modal Attention for Event-Conditioned Asynchronous Fusion

    arXiv:2604.16411v1 Announce Type: new Abstract: We study asynchronous alignment, a first-class multimodal learning setting in which a dense primary stream must be fused with sporadic external context whose value depends on when it arrives. Unlike standard multimodal benchmarks that assume structural synchrony, this setting requires models to reason explicitly about freshness and trust. We focus o…

    1.0#cs.lg
  38. arXiv cs.LG/arxiv.org/
    SaFeR-Steer: Evolving Multi-Turn MLLMs via Synthetic Bootstrapping and Feedback Dynamics

    arXiv:2604.16358v1 Announce Type: new Abstract: MLLMs are increasingly deployed in multi-turn settings, where attackers can escalate unsafe intent through the evolving visual-text history and exploit long-context safety decay. Yet safety alignment is still dominated by single-turn data and fixed-template dialogues, leaving a mismatch between training and deployment. To bridge this gap, we propose …

    1.3#cs.lg#cs.cl
  39. arXiv cs.LG/arxiv.org/
    Beyond Verifiable Rewards: Rubric-Based GRM for Reinforced Fine-Tuning SWE Agents

    arXiv:2604.16335v1 Announce Type: new Abstract: Despite recent progress in Large Language Model (LLM) Agents for Software Engineering (SWE) tasks, end-to-end fine-tuning typically relies on verifiable terminal rewards such as whether all unit tests pass. While these binary signals reflect whether the final solution is correct, they provide little guidance for shaping intermediate behaviors during…

    1.3#cs.lg#cs.ai
  40. arXiv cs.LG/arxiv.org/
    Preventing overfitting in deep learning using differential privacy

    arXiv:2604.16334v1 Announce Type: new Abstract: The use of Deep Neural Network based systems in the real world is growing. They have achieved state-of-the-art performance on many image, speech and text datasets. They have been shown to be powerful systems that are capable of learning detailed relationships and abstractions from the data. This is a double-edged sword which makes such systems vulne…

    1.0#cs.lg#cs.ai
  41. arXiv cs.LG/arxiv.org/
    PINNACLE: An Open-Source Computational Framework for Classical and Quantum PINNs

    arXiv:2604.15645v1 Announce Type: new Abstract: We present PINNACLE, an open-source computational framework for physics-informed neural networks (PINNs) that integrates modern training strategies, multi-GPU acceleration, and hybrid quantum-classical architectures within a unified modular workflow. The framework enables systematic evaluation of PINN performance across benchmark problems including …

    0.7#cs.lg#physics.comp-ph
  42. arXiv cs.LG/arxiv.org/
    NK-GAD: Neighbor Knowledge-Enhanced Unsupervised Graph Anomaly Detection

    arXiv:2604.15668v1 Announce Type: new Abstract: Graph anomaly detection aims to identify irregular patterns in graph-structured data. Most unsupervised GNN-based methods rely on the homophily assumption that connected nodes share similar attributes. However, real-world graphs often exhibit attribute-level heterophily, where connected nodes have dissimilar attributes. Our analysis of attribute-lev…

    0.7#cs.lg
  43. arXiv cs.LG/arxiv.org/
    Optimizing Stochastic Gradient Push under Broadcast Communications

    arXiv:2604.15549v1 Announce Type: new Abstract: We consider the problem of minimizing the convergence time for decentralized federated learning (DFL) in wireless networks under broadcast communications, with focus on mixing matrix design. The mixing matrix is a critical hyperparameter for DFL that simultaneously controls the convergence rate across iterations and the communication demand per iter…

    0.7#cs.lg#cs.dc
  44. arXiv cs.LG/arxiv.org/
    Predicting Where Steering Vectors Succeed

    arXiv:2604.15557v1 Announce Type: new Abstract: Steering vectors work for some concepts and layers but fail for others, and practitioners have no way to predict which setting applies before running an intervention. We introduce the Linear Accessibility Profile (LAP), a per-layer diagnostic that repurposes the logit lens as a predictor of steering vector effectiveness. The key measure, $A_{\mathrm…

    0.7#cs.lg#cs.cl
  45. arXiv cs.LG/arxiv.org/
    Reward Weighted Classifier-Free Guidance as Policy Improvement in Autoregressive Models

    arXiv:2604.15577v1 Announce Type: new Abstract: Consider an auto-regressive model that produces outputs x (e.g., answers to questions, molecules) each of which can be summarized by an attribute vector y (e.g., helpfulness vs. harmlessness, or bio-availability vs. lipophilicity). An arbitrary reward function r(y) encodes tradeoffs between these properties. Typically, tilting the model's sampling d…

    0.7#cs.lg#cs.ai
  46. arXiv cs.LG/arxiv.org/
    PAWN: Piece Value Analysis with Neural Networks

    arXiv:2604.15585v1 Announce Type: new Abstract: Predicting the relative value of any given chess piece in a position remains an open challenge, as a piece's contribution depends on its spatial relationships with every other piece on the board. We demonstrate that incorporating the state of the full chess board via latent position representations derived using a CNN-based autoencoder significantly…

    0.7#cs.lg#cs.ai
  47. arXiv cs.LG/arxiv.org/
    Adapting in the Dark: Efficient and Stable Test-Time Adaptation for Black-Box Models

    arXiv:2604.15609v1 Announce Type: new Abstract: Test-Time Adaptation (TTA) for black-box models accessible only via APIs remains a largely unexplored challenge. Existing approaches such as post-hoc output refinement offer limited adaptive capacity, while Zeroth-Order Optimization (ZOO) enables input-space adaptation but faces high query costs and optimization challenges in the unsupervised TTA se…

    0.7#cs.lg#cs.cv
  48. arXiv cs.LG/arxiv.org/
    VoodooNet: Achieving Analytic Ground States via High-Dimensional Random Projections

    arXiv:2604.15613v1 Announce Type: new Abstract: We present VoodooNet, a non-iterative neural architecture that replaces the stochastic gradient descent (SGD) paradigm with a closed-form analytic solution via Galactic Expansion. By projecting input manifolds into a high-dimensional, high-entropy "Galactic" space ($d \gg 784$), we demonstrate that complex features can be untangled without the therm…

    0.7#cs.lg#cs.ai
  49. arXiv cs.LG/arxiv.org/
    Flexible Empowerment at Reasoning with Extended Best-of-N Sampling

    arXiv:2604.15614v1 Announce Type: new Abstract: This paper proposes a novel method that incorporates empowerment when reasoning actions in reinforcement learning (RL), thereby achieving the flexibility of exploration-exploitation dilemma (EED). In previous methods, empowerment for promoting exploration has been provided as a bonus term to the task-specific reward function as an intrinsically-moti…

    0.7#cs.lg
  50. arXiv cs.LG/arxiv.org/
    Majority Voting for Code Generation

    arXiv:2604.15618v1 Announce Type: new Abstract: We investigate Functional Majority Voting (FMV), a method based on functional consensus for code generation with Large Language Models, which identifies a representative solution from multiple generations using their runtime execution signatures on test inputs. We find that FMV is an effective test-time inference strategy, substantially boosting per…

    0.7#cs.lg
  51. arXiv cs.LG/arxiv.org/
    Graph self-supervised learning based on frequency corruption

    arXiv:2604.15699v1 Announce Type: new Abstract: Graph self-supervised learning can reduce the need for labeled graph data and has been widely used in recommendation, social networks, and other web applications. However, existing methods often underuse high-frequency signals and may overfit to specific local patterns, which limits representation quality and generalization. We propose Frequency-Cor…

    0.7#cs.lg#cs.si
  52. arXiv cs.LG/arxiv.org/
    Towards Robust Endogenous Reasoning: Unifying Drift Adaptation in Non-Stationary Tuning

    arXiv:2604.15705v1 Announce Type: new Abstract: Reinforcement Fine-Tuning (RFT) has established itself as a critical paradigm for the alignment of Multi-modal Large Language Models (MLLMs) with complex human values and domain-specific requirements. Nevertheless, current research primarily focuses on mitigating exogenous distribution shifts arising from data-centric factors, the non-stationarity i…

    0.9#cs.lg
  53. arXiv cs.LG/arxiv.org/
    Reasoning-targeted Jailbreak Attacks on Large Reasoning Models via Semantic Triggers and Psychological Framing

    arXiv:2604.15725v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) have demonstrated strong capabilities in generating step-by-step reasoning chains alongside final answers, enabling their deployment in high-stakes domains such as healthcare and education. While prior jailbreak attack studies have focused on the safety of final answers, little attention has been given to the safety of …

    0.7#cs.lg#cs.ai
  54. arXiv cs.LG/arxiv.org/
    M3R: Localized Rainfall Nowcasting with Meteorology-Informed MultiModal Attention

    arXiv:2604.15377v1 Announce Type: new Abstract: Accurate and timely rainfall nowcasting is crucial for disaster mitigation and water resource management. Despite recent advances in deep learning, precipitation prediction remains challenging due to limitations in effectively leveraging diverse multimedia data sources. We introduce M3R, a Meteorology-informed MultiModal attention-based architecture…

    0.7#cs.lg#cs.cv
  55. arXiv cs.LG/arxiv.org/
    FineSteer: A Unified Framework for Fine-Grained Inference-Time Steering in Large Language Models

    arXiv:2604.15488v1 Announce Type: new Abstract: Large language models (LLMs) often exhibit undesirable behaviors, such as safety violations and hallucinations. Although inference-time steering offers a cost-effective way to adjust model behavior without updating its parameters, existing methods often fail to be simultaneously effective, utility-preserving, and training-efficient due to their rigi…

    0.9#cs.lg#cs.ai
  56. arXiv cs.LG/arxiv.org/
    Lightweight Geometric Adaptation for Training Physics-Informed Neural Networks

    arXiv:2604.15392v1 Announce Type: new Abstract: Physics-Informed Neural Networks (PINNs) often suffer from slow convergence, training instability, and reduced accuracy on challenging partial differential equations due to the anisotropic and rapidly varying geometry of their loss landscapes. We propose a lightweight curvature-aware optimization framework that augments existing first-order optimize…

    0.7#cs.lg#cs.ai
  57. arXiv cs.LG/arxiv.org/
    StoSignSGD: Unbiased Structural Stochasticity Fixes SignSGD for Training Large Language Models

    arXiv:2604.15416v1 Announce Type: new Abstract: Sign-based optimization algorithms, such as SignSGD, have garnered significant attention for their remarkable performance in distributed learning and training large foundation models. Despite their empirical superiority, SignSGD is known to diverge on non-smooth objectives, which are ubiquitous in modern machine learning due to ReLUs, max-pools, and…

    0.7#cs.lg#cs.ai
  58. arXiv cs.LG/arxiv.org/
    Sequential KV Cache Compression via Probabilistic Language Tries: Beyond the Per-Vector Shannon Limit

    arXiv:2604.15356v1 Announce Type: new Abstract: Recent work on KV cache quantization, culminating in TurboQuant, has approached the Shannon entropy limit for per-vector compression of transformer key-value caches. We observe that this limit applies to a strictly weaker problem than the one that actually matters: compressing the KV cache as a sequence. The tokens stored in a KV cache are not arbit…

    0.7#cs.lg#cs.ai
  59. arXiv cs.LG/arxiv.org/
    The Spectral Geometry of Thought: Phase Transitions, Instruction Reversal, Token-Level Dynamics, and Perfect Correctness Prediction in How Transformers Reason

    arXiv:2604.15350v1 Announce Type: new Abstract: We discover that large language models exhibit spectral phase transitions in their hidden activation spaces when engaging in reasoning versus factual recall. Through systematic spectral analysis across 11 models spanning 5 architecture families (Qwen, Pythia, Phi, Llama, DeepSeek-R1), we identify seven core phenomen…

    1.3#cs.lg
  60. arXiv cs.LG/arxiv.org/
    Aletheia: Gradient-Guided Layer Selection for Efficient LoRA Fine-Tuning Across Architectures

    arXiv:2604.15351v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) has become the dominant parameter-efficient fine-tuning method for large language models, yet standard practice applies LoRA adapters uniformly to all transformer layers regardless of their relevance to the downstream task. We introduce Aletheia, a gradient-guided layer selection method that identifies the most task-releva…

    0.7#cs.lg#cs.cl
  61. arXiv cs.LG/arxiv.org/
    Mapping High-Performance Regions in Battery Scheduling across Data Uncertainty, Battery Design, and Planning Horizons

    arXiv:2604.15360v1 Announce Type: new Abstract: This study presents a triadic analysis of energy storage operation under multi-stage model predictive control, investigating the interplay between data characteristics, forecast uncertainty, planning horizon, and battery c-rate. Synthetic datasets are generated to systematically explore variations in data profiles and uncertainty, enabling parametri…

    0.7#cs.lg#cs.sy
  62. arXiv cs.LG/arxiv.org/
    PRL-Bench: A Comprehensive Benchmark Evaluating LLMs' Capabilities in Frontier Physics Research

    arXiv:2604.15411v1 Announce Type: new Abstract: The paradigm of agentic science requires AI systems to conduct robust reasoning and engage in long-horizon, autonomous exploration. However, current scientific benchmarks remain confined to domain knowledge comprehension and complex reasoning, failing to evaluate the exploratory nature and procedural complexity of real-world research. In this work, …

    0.9#cs.lg#cs.ai
  63. arXiv cs.LG/arxiv.org/
    Python library supporting Discrete Variational Formulations and training solutions with Collocation-based Robust Variational Physics Informed Neural Networks (DVF-CRVPINN)

    arXiv:2604.15398v1 Announce Type: new Abstract: We explore the possibility of solving Partial Differential Equations (PDEs) using discrete weak formulations. We propose a programming environment for defining a discrete computational domain, introducing discrete functions defined over a set of points, constructing discrete inner products, and introducing discrete weak formulations employing Kronec…

    0.7#cs.lg#cs.na
  64. arXiv cs.LG/arxiv.org/
    Hallucination as Trajectory Commitment: Causal Evidence for Asymmetric Attractor Dynamics in Transformer Generation

    arXiv:2604.15400v1 Announce Type: new Abstract: We present causal evidence that hallucination in autoregressive language models is an early trajectory commitment governed by asymmetric attractor dynamics. Using same-prompt bifurcation, in which we repeatedly sample identical inputs to observe spontaneous divergence, we isolate trajectory dynamics from prompt-level confounds. On Qwen2.5-1.5B acros…

    0.9#cs.lg#cs.ai
  65. arXiv cs.LG/arxiv.org/
    Dispatch-Aware Ragged Attention for Pruned Vision Transformers

    arXiv:2604.15408v1 Announce Type: new Abstract: Token pruning methods for Vision Transformers (ViTs) promise quadratic reductions in attention FLOPs by dropping uninformative patches. Yet when pruned sequences are executed with state-of-the-art variable-length attention APIs -- including FlashAttention-2's varlen and PyTorch's NestedTensor SDPA -- the wall-clock attention latency doesn't scale accor…

    0.7#cs.lg#cs.ai
  66. arXiv cs.LG/arxiv.org/
    The Illusion of Equivalence: Systematic FP16 Divergence in KV-Cached Autoregressive Inference

    arXiv:2604.15409v1 Announce Type: new Abstract: KV caching is a ubiquitous optimization in autoregressive transformer inference, long presumed to be numerically equivalent to cache-free computation. This assumption fails under standard FP16 precision: cache-ON and cache-OFF execution paths employ different floating-point accumulation orderings which, due to FP16 non-associativity, produce a deter…

    0.7#cs.lg#cs.ai
  67. arXiv cs.LG/arxiv.org/
    Beyond Single-Model Optimization: Preserving Plasticity in Continual Reinforcement Learning

    arXiv:2604.15414v1 Announce Type: new Abstract: Continual reinforcement learning must balance retention with adaptation, yet many methods still rely on single-model preservation, committing to one evolving policy as the main reusable solution across tasks. Even when a previously successful policy is retained, it may no longer provide a reliable starting point for rapid adaptation after int…

    0.7#cs.lg#cs.ai
  68. arXiv cs.LG/arxiv.org/
    Neural Continuous-Time Markov Chain: Discrete Diffusion via Decoupled Jump Timing and Direction

    arXiv:2604.15694v1 Announce Type: new Abstract: Discrete diffusion models based on continuous-time Markov chains (CTMCs) have shown strong performance on language and discrete data generation, yet existing approaches typically parameterize the reverse rate matrix as a single object -- via concrete scores, clean-data predictions ($x_0$-parameterization), or denoising distributions -- rather than a…

    0.7#cs.lg#math.pr
  69. arXiv cs.LG/arxiv.org/
    Transfer Learning from Foundational Optimization Embeddings to Unsupervised SAT Representations

    arXiv:2604.15448v1 Announce Type: new Abstract: Foundational optimization embeddings have recently emerged as powerful pre-trained representations for mixed-integer programming (MIP) problems. These embeddings were shown to enable cross-domain transfer and reduce reliance on solver-generated labels. In this work, we investigate whether such representations generalize beyond optimization to decisi…

    0.7#cs.lg#cs.ai
  70. arXiv cs.LG/arxiv.org/
    Evaluating LLM Simulators as Differentially Private Data Generators

    arXiv:2604.15461v1 Announce Type: new Abstract: LLM-based simulators offer a promising path for generating complex synthetic data where traditional differentially private (DP) methods struggle with high-dimensional user profiles. But can LLMs faithfully reproduce statistical distributions from DP-protected inputs? We evaluate this using PersonaLedger, an agentic financial simulator, seeded with D…

    0.9#cs.lg#cs.cl
  71. arXiv cs.LG/arxiv.org/
    ${\pi}_{0.7}$: a Steerable Generalist Robotic Foundation Model with Emergent Capabilities

    arXiv:2604.15483v1 Announce Type: new Abstract: We present a new robotic foundation model, called ${\pi}_{0.7}$, that can enable strong out-of-the-box performance in a wide range of scenarios. ${\pi}_{0.7}$ can follow diverse language instructions in unseen environments, including multi-stage tasks with various kitchen appliances, provide zero-shot cross-embodiment generalization, for example ena…

    0.7#cs.lg#cs.ro
  72. arXiv cs.LG/arxiv.org/
    ProtoTTA: Prototype-Guided Test-Time Adaptation

    arXiv:2604.15494v1 Announce Type: new Abstract: Deep networks that rely on prototypes -- interpretable representations that can be related to the model input -- have gained significant attention for balancing high accuracy with inherent interpretability, which makes them suitable for critical domains such as healthcare. However, these models are limited by their reliance on training data, which hampers…

    0.7#cs.lg#cs.cv
  73. arXiv cs.LG/arxiv.org/
    Natural gradient descent with momentum

    arXiv:2604.15554v1 Announce Type: new Abstract: We consider the problem of approximating a function by an element of a nonlinear manifold which admits a differentiable parametrization, typical examples being neural networks with differentiable activation functions or tensor networks. Natural gradient descent (NGD) for the optimization of a loss function can be seen as a preconditioned gradient de…

    0.7#cs.lg#cs.ai
  74. arXiv cs.LG/arxiv.org/
    Why Colors Make Clustering Harder: Global Integrality Gaps, the Price of Fairness, and Color-Coupled Algorithms in Chromatic Correlation Clustering

    arXiv:2604.15738v1 Announce Type: new Abstract: Chromatic Correlation Clustering (CCC) extends Correlation Clustering by assigning semantic colors to edges and requiring each cluster to receive a single color label. Unlike standard CC, whose LP relaxation has integrality gap 2 on complete graphs and admits a 2.06-approximation, the analogous LP for CCC has a strict lower bound of 2.11, and the be…

    0.7#cs.lg
  75. arXiv cs.LG/arxiv.org/
    Harmonizing Multi-Objective LLM Unlearning via Unified Domain Representation and Bidirectional Logit Distillation

    arXiv:2604.15482v1 Announce Type: new Abstract: Large Language Model (LLM) unlearning is crucial for removing hazardous or privacy-leaking information from the model. Practical LLM unlearning demands satisfying multiple challenging objectives simultaneously: removing undesirable knowledge, preserving general utility, avoiding over-refusal of neighboring concepts, and, crucially, ensuring robust…

    0.9#cs.lg#cs.ai
  76. arXiv cs.LG/arxiv.org/
    Learning Affine-Equivariant Proximal Operators

    arXiv:2604.15556v1 Announce Type: new Abstract: Proximal operators are fundamental across many applications in signal processing and machine learning, including solving ill-posed inverse problems. Recent work has introduced Learned Proximal Networks (LPNs), providing parametric functions that compute exact proximals for data-driven and potentially non-convex regularizers. However, in many setting…

    0.7#cs.lg#cs.cv
  77. arXiv cs.LG/arxiv.org/
    Stargazer: A Scalable Model-Fitting Benchmark Environment for AI Agents under Astrophysical Constraints

    arXiv:2604.15664v1 Announce Type: new Abstract: The rise of autonomous AI agents suggests that dynamic benchmark environments with built-in feedback on scientifically grounded tasks are needed to evaluate the capabilities of these agents in research work. We introduce Stargazer, a scalable environment for evaluating AI agents on dynamic, iterative physics-grounded model-fitting tasks using infere…

    0.7#cs.lg
  78. arXiv cs.LG/arxiv.org/
    Collective Kernel EFT for Pre-activation ResNets

    arXiv:2604.15742v1 Announce Type: new Abstract: In finite-width deep neural networks, the empirical kernel $G$ evolves stochastically across layers. We develop a collective kernel effective field theory (EFT) for pre-activation ResNets based on a $G$-only closure hierarchy and diagnose its finite validity window. Exploiting the exact conditional Gaussianity of residual increments, we derive an ex…

    0.7#cs.lg#hep-th
  79. arXiv cs.LG/arxiv.org/
    Faster LLM Inference via Sequential Monte Carlo

    arXiv:2604.15672v1 Announce Type: new Abstract: Speculative decoding (SD) accelerates language model inference by drafting tokens from a cheap proposal model and verifying them against an expensive target model via rejection sampling. Because rejection truncates the draft block at the first error, throughput degrades when draft and target diverge. Rather than rejecting draft tokens outright, we p…

    0.9#cs.lg#cs.cl
  80. arXiv cs.LG/arxiv.org/
    Hierarchical Active Inference using Successor Representations

    arXiv:2604.15679v1 Announce Type: new Abstract: Active inference, a neurally-inspired model for inferring actions based on the free energy principle (FEP), has been proposed as a unifying framework for understanding perception, action, and learning in the brain. Active inference has previously been used to model ecologically important tasks such as navigation and planning, but scaling it to solve…

    0.7#cs.lg#cs.ai