Your Causal Self-Attentive Recommender Hosts a Lonely Neighborhood
- URL: http://arxiv.org/abs/2406.02048v3
- Date: Wed, 01 Jan 2025 15:12:52 GMT
- Title: Your Causal Self-Attentive Recommender Hosts a Lonely Neighborhood
- Authors: Yueqi Wang, Zhankui He, Zhenrui Yue, Julian McAuley, Dong Wang
- Abstract summary: We study the comparative analysis between bi-directional/auto-encoding (AE) and uni-directional/auto-regressive (AR) attention mechanisms. To support our theoretical analysis, we conduct extensive empirical experiments comparing AE/AR attention on five popular benchmarks. We shed light on future design choices for performant self-attentive recommenders.
- Score: 25.74765016730563
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the context of sequential recommendation, a pivotal issue is the comparative analysis between bi-directional/auto-encoding (AE) and uni-directional/auto-regressive (AR) attention mechanisms, where conclusions regarding architectural and performance superiority remain inconclusive. Previous comparisons primarily summarize existing works to identify a consensus or conduct ablation studies on peripheral modeling techniques, such as choices of loss functions. Far fewer efforts have been made in (1) theoretical and (2) extensive empirical analysis of the self-attention module, the pivotal structure on which performance and design insights should be anchored. In this work, we first provide a comprehensive theoretical analysis of the AE/AR attention matrix with respect to (1) sparse local inductive bias, a.k.a. neighborhood effects, and (2) low-rank approximation. Analytical metrics reveal that AR attention exhibits sparse neighborhood effects suitable for generally sparse recommendation scenarios. Second, to support our theoretical analysis, we conduct extensive empirical experiments comparing AE/AR attention on five popular benchmarks, with AR performing better overall. Reported empirical results are based on our experimental pipeline, Modularized Design Space for Self-Attentive Recommender (ModSAR), which supports adaptive hyperparameter tuning, a modularized design space, and HuggingFace plug-ins. We invite the recommendation community to utilize and contribute to ModSAR to (1) conduct more module- and model-level examination beyond the AE/AR comparison and (2) accelerate state-of-the-art model design. Lastly, we shed light on future design choices for performant self-attentive recommenders. We make our pipeline implementation and data available at https://github.com/yueqirex/SAR-Check.
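The AE/AR contrast in the abstract lends itself to a toy illustration. The following NumPy sketch (illustrative only; this is not the paper's ModSAR pipeline, and all helper names are hypothetical) applies a bidirectional (AE) full mask and a causal (AR) lower-triangular mask to the same raw attention scores, then probes two quantities the abstract analyzes: locality (attention mass near the diagonal, i.e., neighborhood effects) and stable rank (a smooth proxy used in low-rank analyses):

```python
import numpy as np

def attention_weights(scores: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Row-wise softmax over scores; positions where mask is False get zero weight."""
    masked = np.where(mask, scores, -np.inf)
    e = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def neighborhood_mass(attn: np.ndarray, width: int = 1) -> float:
    """Mean attention mass within +/- `width` of the diagonal (a crude locality probe)."""
    idx = np.arange(attn.shape[0])
    near = np.abs(idx[:, None] - idx[None, :]) <= width
    return float((attn * near).sum(axis=-1).mean())

def stable_rank(m: np.ndarray) -> float:
    """||M||_F^2 / ||M||_2^2, a smooth proxy for matrix rank."""
    s = np.linalg.svd(m, compute_uv=False)
    return float((s ** 2).sum() / s[0] ** 2)

L = 6
rng = np.random.default_rng(0)
scores = rng.normal(size=(L, L))                 # shared raw attention scores

ae_mask = np.ones((L, L), dtype=bool)            # AE: attend to every position
ar_mask = np.tril(np.ones((L, L), dtype=bool))   # AR: causal, past positions only

ae_attn = attention_weights(scores, ae_mask)
ar_attn = attention_weights(scores, ar_mask)
# On random scores, the AR matrix typically concentrates more mass near the
# diagonal (higher neighborhood_mass), mirroring the sparse local inductive
# bias the abstract attributes to causal attention.
```

The structural difference is visible without training: every strictly-upper-triangular entry of the AR attention matrix is exactly zero, while the AE matrix is dense.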
Related papers
- Reasoning and Tool-use Compete in Agentic RL:From Quantifying Interference to Disentangled Tuning [26.401906729658688]
Agentic Reinforcement Learning (ARL) focuses on training large language models to interleave reasoning with external tool execution to solve complex tasks. Most existing ARL methods train a single shared set of model parameters to support both reasoning and tool-use behaviors, implicitly assuming that joint training leads to improved overall agent performance. We show that these two capabilities often induce misaligned gradient directions, leading to training interference that undermines the effectiveness of joint optimization. We propose Disentangled Action Reasoning Tuning (DART), a simple and efficient framework that explicitly decouples parameter updates for reasoning and tool-use via separate low-rank
arXiv Detail & Related papers (2026-02-01T03:19:22Z) - How to Set the Learning Rate for Large-Scale Pre-training? [73.03133634525635]
We formalize this investigation into two distinct research paradigms: Fitting and Transfer. Within the Fitting Paradigm, we introduce a Scaling Law for the search factor, effectively reducing the search complexity from O(n^3) to O(n*C_D*C_) via predictive modeling. We extend the principles of Transfer to the Mixture of Experts (MoE) architecture, broadening its applicability to encompass model depth, weight decay, and token horizons.
arXiv Detail & Related papers (2026-01-08T15:55:13Z) - Evaluating Parameter Efficient Methods for RLVR [38.45552186628944]
Reinforcement Learning with Verifiable Rewards (RLVR) incentivizes language models to enhance their reasoning capabilities through verifiable feedback. While methods like LoRA are commonly used, the optimal PEFT architecture for RLVR remains unidentified. We conduct the first comprehensive evaluation of over 12 PEFT methodologies across the DeepSeek-R1-Distill families on mathematical reasoning benchmarks.
arXiv Detail & Related papers (2025-12-29T03:13:08Z) - From Harm to Help: Turning Reasoning In-Context Demos into Assets for Reasoning LMs [58.02809208460186]
We revisit this paradox using high-quality traces from DeepSeek-R1 as demonstrations. We find that adding more exemplars consistently degrades accuracy, even when demonstrations are optimal. We introduce Insight-to-solve (I2S), a sequential test-time procedure that turns demonstrations into explicit, reusable insights.
arXiv Detail & Related papers (2025-09-27T08:59:31Z) - Unlocking the Essence of Beauty: Advanced Aesthetic Reasoning with Relative-Absolute Policy Optimization [63.169050703903515]
We propose Aes-R1, a comprehensive aesthetic reasoning framework with reinforcement learning (RL). Aes-R1 integrates a pipeline, AesCoT, to construct and filter high-quality chain-of-thought aesthetic reasoning data. Experiments demonstrate that Aes-R1 improves the backbone's average PLCC/SRCC by 47.9%/34.8%.
arXiv Detail & Related papers (2025-09-26T04:55:00Z) - STARec: An Efficient Agent Framework for Recommender Systems via Autonomous Deliberate Reasoning [54.28691219536054]
We introduce STARec, a slow-thinking augmented agent framework that endows recommender systems with autonomous deliberative reasoning capabilities. We develop anchored reinforcement training, a two-stage paradigm combining structured knowledge distillation from advanced reasoning models with preference-aligned reward shaping. Experiments on MovieLens 1M and Amazon CDs benchmarks demonstrate that STARec achieves substantial performance gains compared with state-of-the-art baselines.
arXiv Detail & Related papers (2025-08-26T08:47:58Z) - Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs [69.10441885629787]
Retrieval-Augmented Generation (RAG) lifts the factuality of Large Language Models (LLMs) by injecting external knowledge. It falls short on problems that demand multi-step inference; conversely, purely reasoning-oriented approaches often hallucinate or mis-ground facts. This survey synthesizes both strands under a unified reasoning-retrieval perspective.
arXiv Detail & Related papers (2025-07-13T03:29:41Z) - $\text{R}^2\text{ec}$: Towards Large Recommender Models with Reasoning [50.291998724376654]
We propose $\text{R}^2\text{ec}$, a unified large recommender model with intrinsic reasoning capabilities. RecPO is a corresponding reinforcement learning framework that optimizes both the reasoning and recommendation capabilities of $\text{R}^2\text{ec}$ simultaneously in a single policy update. Experiments on three datasets with various baselines verify the effectiveness of $\text{R}^2\text{ec}$, showing relative improvements of 68.67% in Hit@5 and 45.21% in NDCG@20.
arXiv Detail & Related papers (2025-05-22T17:55:43Z) - LARES: Latent Reasoning for Sequential Recommendation [96.26996622771593]
We present LARES, a novel and scalable LAtent REasoning framework for Sequential recommendation. Our proposed approach employs a recurrent architecture that allows flexible expansion of reasoning depth without increasing parameter complexity. Our framework exhibits seamless compatibility with existing advanced models, further improving their recommendation performance.
arXiv Detail & Related papers (2025-05-22T16:22:54Z) - DGRO: Enhancing LLM Reasoning via Exploration-Exploitation Control and Reward Variance Management [18.953750405635393]
Decoupled Group Reward Optimization (DGRO) is a general RL algorithm for Large Language Model (LLM) reasoning. We show that DGRO achieves state-of-the-art performance on the Logic dataset with an average accuracy of 96.9%, and demonstrates strong generalization across mathematical benchmarks.
arXiv Detail & Related papers (2025-05-19T10:44:49Z) - Retrieval is Not Enough: Enhancing RAG Reasoning through Test-Time Critique and Optimization [58.390885294401066]
Retrieval-augmented generation (RAG) has become a widely adopted paradigm for enabling knowledge-grounded large language models (LLMs). RAG pipelines often fail to ensure that model reasoning remains consistent with the evidence retrieved, leading to factual inconsistencies or unsupported conclusions. We propose AlignRAG, a novel iterative framework grounded in Critique-Driven Alignment (CDA). We introduce AlignRAG-auto, an autonomous variant that dynamically terminates refinement, removing the need to pre-specify the number of critique iterations.
arXiv Detail & Related papers (2025-04-21T04:56:47Z) - Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection [71.92083784393418]
Inference-time methods such as Best-of-N (BON) sampling offer a simple yet effective alternative to improve performance.
We propose Iterative Agent Decoding (IAD) which combines iterative refinement with dynamic candidate evaluation and selection guided by a verifier.
arXiv Detail & Related papers (2025-04-02T17:40:47Z) - ExpertRAG: Efficient RAG with Mixture of Experts -- Optimizing Context Retrieval for Adaptive LLM Responses [0.0]
ExpertRAG is a novel theoretical framework that integrates Mixture-of-Experts (MoE) architectures with Retrieval Augmented Generation (RAG).
We propose a dynamic retrieval gating mechanism coupled with expert routing, enabling the model to selectively consult an external knowledge store or rely on specialized internal experts.
We derive formulae to quantify the expected computational cost savings from selective retrieval and the capacity gains from sparse expert utilization.
arXiv Detail & Related papers (2025-03-23T17:26:23Z) - A Differentiable Alignment Framework for Sequence-to-Sequence Modeling via Optimal Transport [12.835774667953187]
We propose a novel differentiable alignment framework based on one-dimensional optimal transport.
We show that our method considerably improves alignment performance, though with a trade-off in ASR performance when compared to CTC.
arXiv Detail & Related papers (2025-02-03T18:20:29Z) - Optimizing Sequential Recommendation Models with Scaling Laws and Approximate Entropy [104.48511402784763]
The Performance Law for SR models aims to theoretically investigate and model the relationship between model performance and data quality.
We propose Approximate Entropy (ApEn) to assess data quality, presenting a more nuanced approach compared to traditional data quantity metrics.
arXiv Detail & Related papers (2024-11-30T10:56:30Z) - A Survey on All-in-One Image Restoration: Taxonomy, Evaluation and Future Trends [67.43992456058541]
Image restoration (IR) refers to the process of improving visual quality of images while removing degradation, such as noise, blur, weather effects, and so on.
Traditional IR methods typically target specific types of degradation, which limits their effectiveness in real-world scenarios with complex distortions.
The all-in-one image restoration (AiOIR) paradigm has emerged, offering a unified framework that adeptly addresses multiple degradation types.
arXiv Detail & Related papers (2024-10-19T11:11:09Z) - REFINE on Scarce Data: Retrieval Enhancement through Fine-Tuning via Model Fusion of Embedding Models [14.023953508288628]
Retrieval augmented generation (RAG) pipelines are commonly used in tasks such as question-answering (QA).
We propose REFINE, a novel technique that generates synthetic data from available documents and then uses a model fusion approach to fine-tune embeddings.
arXiv Detail & Related papers (2024-10-16T08:43:39Z) - Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling [15.013242103936625]
We introduce a new paradigm for AutoRegressive (AR) image generation, termed Set AutoRegressive Modeling (SAR).
SAR generalizes the conventional AR to the next-set setting, i.e., splitting the sequence into arbitrary sets containing multiple tokens.
We explore the properties of SAR by analyzing the impact of sequence order and output intervals on performance.
arXiv Detail & Related papers (2024-10-14T13:49:06Z) - Beyond Exact Match: Semantically Reassessing Event Extraction by Large Language Models [65.8478860180793]
Event extraction has gained extensive research attention due to its broad range of applications.
The current evaluation method for event extraction relies on token-level exact match.
We propose a reliable and semantic evaluation framework for event extraction, named RAEE.
arXiv Detail & Related papers (2024-10-12T07:54:01Z) - Long-Sequence Recommendation Models Need Decoupled Embeddings [49.410906935283585]
We identify and characterize a neglected deficiency in existing long-sequence recommendation models.
A single set of embeddings struggles with learning both attention and representation, leading to interference between these two processes.
We propose the Decoupled Attention and Representation Embeddings (DARE) model, where two distinct embedding tables are learned separately to fully decouple attention and representation.
arXiv Detail & Related papers (2024-10-03T15:45:15Z) - Fine-grained Analysis of In-context Linear Estimation: Data, Architecture, and Beyond [44.154393889313724]
Transformers with linear attention are capable of in-context learning (ICL) by implementing a linear gradient estimator through descent steps.
We develop a stronger characterization of the optimization and generalization landscape of ICL through contributions on architectures, low-rank parameterization, and correlated designs.
arXiv Detail & Related papers (2024-07-13T21:13:55Z) - A Critical Evaluation of AI Feedback for Aligning Large Language Models [60.42291111149438]
We show that simple supervised fine-tuning with GPT-4 as the teacher outperforms existing RLAIF pipelines.
More generally, we find that the gains from RLAIF vary substantially across base model families, test-time evaluation protocols, and critic models.
arXiv Detail & Related papers (2024-02-19T18:53:54Z) - SLEM: Machine Learning for Path Modeling and Causal Inference with Super Learner Equation Modeling [3.988614978933934]
Causal inference is a crucial goal of science, enabling researchers to arrive at meaningful conclusions using observational data.
Path models, Structural Equation Models (SEMs) and Directed Acyclic Graphs (DAGs) provide a means to unambiguously specify assumptions regarding the causal structure underlying a phenomenon.
We propose Super Learner Equation Modeling, a path modeling technique integrating machine learning Super Learner ensembles.
arXiv Detail & Related papers (2023-08-08T16:04:42Z) - Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z) - Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression [53.15502562048627]
Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator.
This work delves into a statistical analysis of augmentation-based pretraining.
arXiv Detail & Related papers (2023-06-01T15:18:55Z) - Rethinking Missing Data: Aleatoric Uncertainty-Aware Recommendation [59.500347564280204]
We propose a new Aleatoric Uncertainty-aware Recommendation (AUR) framework.
AUR consists of a new uncertainty estimator along with a normal recommender model.
As the chance of mislabeling reflects the potential of a pair, AUR makes recommendations according to the uncertainty.
arXiv Detail & Related papers (2022-09-22T04:32:51Z) - You Only Need One Model for Open-domain Question Answering [26.582284346491686]
Recent works for Open-domain Question Answering refer to an external knowledge base using a retriever model.
We propose casting the retriever and the reranker as hard-attention mechanisms applied sequentially within the transformer architecture.
We evaluate our model on Natural Questions and TriviaQA open datasets and our model outperforms the previous state-of-the-art model by 1.0 and 0.7 exact match scores.
arXiv Detail & Related papers (2021-12-14T13:21:11Z) - Adversarial and Contrastive Variational Autoencoder for Sequential Recommendation [25.37244686572865]
We propose a novel method called Adversarial and Contrastive Variational Autoencoder (ACVAE) for sequential recommendation.
We first introduce the adversarial training for sequence generation under the Adversarial Variational Bayes framework, which enables our model to generate high-quality latent variables.
Besides, when encoding the sequence, we apply a recurrent and convolutional structure to capture global and local relationships in the sequence.
arXiv Detail & Related papers (2021-03-19T09:01:14Z) - Hierarchical Variational Autoencoder for Visual Counterfactuals [79.86967775454316]
Conditional Variational Autoencoders (VAEs) are gathering significant attention as an Explainable Artificial Intelligence (XAI) tool.
In this paper we show how relaxing the effect of the posterior leads to successful counterfactuals.
We introduce VAEX, a Hierarchical VAE designed for this approach that can visually audit a classifier in applications.
arXiv Detail & Related papers (2021-02-01T14:07:11Z) - Self-Supervised Reinforcement Learning for Recommender Systems [77.38665506495553]
We propose self-supervised reinforcement learning for sequential recommendation tasks.
Our approach augments standard recommendation models with two output layers: one for self-supervised learning and the other for RL.
Based on such an approach, we propose two frameworks, namely Self-Supervised Q-learning (SQN) and Self-Supervised Actor-Critic (SAC).
arXiv Detail & Related papers (2020-06-10T11:18:57Z) - Model-based actor-critic: GAN (model generator) + DRL (actor-critic) => AGI [0.0]
We propose adding an (generative/predictive) environment model to the actor-critic (model-free) architecture.
The proposed AI model is similar to (model-free) DDPG and is therefore called model-based DDPG.
Our initial limited experiments show that combining DRL and GAN in a model-based actor-critic results in the incremental goal-driven intelligence required to solve each task, with performance similar to (model-free) DDPG.
arXiv Detail & Related papers (2020-04-04T02:05:54Z) - A Distributionally Robust Area Under Curve Maximization Model [1.370633147306388]
We propose two new distributionally robust AUC models (DR-AUC).
DR-AUC models rely on the Kantorovich metric and approximate the AUC with the hinge loss function.
Numerical experiments show that the proposed DR-AUC models perform better in general and, in particular, improve the worst-case out-of-sample performance.
arXiv Detail & Related papers (2020-02-18T02:50:45Z)
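The DR-AUC entry above rests on approximating the (non-differentiable) AUC with the hinge loss. The following NumPy sketch shows only that standard pairwise hinge surrogate, not the paper's Kantorovich-based distributionally robust formulation; the function names are hypothetical:

```python
import numpy as np

def empirical_auc(scores_pos: np.ndarray, scores_neg: np.ndarray) -> float:
    """Empirical AUC: fraction of (positive, negative) pairs ranked correctly,
    counting ties as half-correct."""
    diffs = scores_pos[:, None] - scores_neg[None, :]
    return float((diffs > 0).mean() + 0.5 * (diffs == 0).mean())

def auc_hinge_surrogate(scores_pos: np.ndarray, scores_neg: np.ndarray,
                        margin: float = 1.0) -> float:
    """Pairwise hinge surrogate: mean over (pos, neg) pairs of
    max(0, margin - (s_pos - s_neg)). It is convex and differentiable a.e.,
    and driving it to zero forces every positive above every negative
    by at least `margin`, i.e. AUC = 1."""
    diffs = scores_pos[:, None] - scores_neg[None, :]
    return float(np.maximum(0.0, margin - diffs).mean())
```

For perfectly separated scores the surrogate vanishes and the empirical AUC is 1; for an inverted ranking the surrogate grows with the violation while the AUC drops to 0, which is why minimizing the surrogate is a usable stand-in for maximizing AUC.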
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.