Related papers: GPT Meets Graphs and KAN Splines: Testing Novel Frameworks on Multitask Fine-Tuned GPT-2 with LoRA

Related papers

Knowledge Regularized Negative Feature Tuning of Vision-Language Models for Out-of-Distribution Detection [54.433899174017185]
Out-of-distribution (OOD) detection is crucial for building reliable machine learning models.<n>We propose a novel method called Knowledge Regularized Negative Feature Tuning (KR-NFT)<n>NFT applies distribution-aware transformations to pre-trained text features, effectively separating positive and negative features into distinct spaces.<n>When trained with few-shot samples from ImageNet dataset, KR-NFT not only improves ID classification accuracy and OOD detection but also significantly reduces the FPR95 by 5.44%.
arXiv Detail & Related papers (2025-07-26T07:44:04Z)
SCoRE: Streamlined Corpus-based Relation Extraction using Multi-Label Contrastive Learning and Bayesian kNN [0.2812395851874055]
We introduce SCoRE, a modular and cost-effective sentence-level relation extraction system.<n>SCoRE enables easy PLM switching, requires no finetuning, and adapts smoothly to diverse corpora and KGs.<n>We show that SCoRE matches or surpasses state-of-the-art methods while significantly reducing energy consumption.
arXiv Detail & Related papers (2025-07-09T14:33:07Z)
Enhancing Spatial Reasoning in Vision-Language Models via Chain-of-Thought Prompting and Reinforcement Learning [0.42855555838080844]
This study investigates the spatial reasoning capabilities of vision-language models (VLMs) through Chain-of-Thought prompting and reinforcement learning.<n>We find that simple CoT formats, where the model generates a reasoning step before the answer, can harm the model's original performance.<n>In contrast, structured multi-stage prompting based on scene graphs (SceneGraph CoT) significantly improves spatial reasoning accuracy.
arXiv Detail & Related papers (2025-07-06T10:51:12Z)
GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning [53.894789613838654]
We introduce SEED-Bench-R1, a benchmark with complex real-world videos requiring balanced perception and reasoning.<n>Using SEED-Bench-R1, we find that standard GRPO, while improving answer accuracy, often reduces logical coherence between reasoning steps and answers, with only a 57.9% consistency rate.<n>We propose GRPO-CARE, a consistency-aware RL framework optimizing both answer correctness and reasoning coherence without explicit supervision.
arXiv Detail & Related papers (2025-06-19T08:49:13Z)
MetaGMT: Improving Actionable Interpretability of Graph Multilinear Networks via Meta-Learning Filtration [6.102559098873098]
We present MetaGMT, a meta-learning framework that enhances explanation fidelity through a novel bi-level optimization approach.<n>We demonstrate that MetaGMT significantly improves both explanation quality (AUC-ROC, Precision@K) and robustness to spurious patterns.<n>Our work contributes to building more trustworthy and actionable GNN systems for real-world applications.
arXiv Detail & Related papers (2025-05-26T03:07:58Z)
T2I-Eval-R1: Reinforcement Learning-Driven Reasoning for Interpretable Text-to-Image Evaluation [60.620408007636016]
We propose T2I-Eval-R1, a novel reinforcement learning framework that trains open-source MLLMs using only coarse-grained quality scores.<n>Our approach integrates Group Relative Policy Optimization into the instruction-tuning process, enabling models to generate both scalar scores and interpretable reasoning chains.
arXiv Detail & Related papers (2025-05-23T13:44:59Z)
Token-Efficient RL for LLM Reasoning [0.02488650627593658]
We propose reinforcement learning strategies tailored for reasoning in large language models (LLMs) under strict memory and compute limits.<n>Building on early policy gradient methods with baseline subtraction, we design critic-free methods that operate on a small, informative subset of output tokens.<n>We show that our methods raise accuracy on the SVAMP benchmark from 46% to over 70% and show strong performance on multi-digit multiplication.
arXiv Detail & Related papers (2025-04-29T14:58:43Z)
TWSSenti: A Novel Hybrid Framework for Topic-Wise Sentiment Analysis on Social Media Using Transformer Models [0.0]
This study explores a hybrid framework combining transformer-based models to improve sentiment classification accuracy and robustness.<n>The framework addresses challenges such as noisy data, contextual ambiguity, and generalization across diverse datasets.<n>This research highlights its applicability to real-world tasks such as social media monitoring, customer sentiment analysis, and public opinion tracking.
arXiv Detail & Related papers (2025-04-14T05:44:11Z)
Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection [71.92083784393418]
Inference-time methods such as Best-of-N (BON) sampling offer a simple yet effective alternative to improve performance.<n>We propose Iterative Agent Decoding (IAD) which combines iterative refinement with dynamic candidate evaluation and selection guided by a verifier.
arXiv Detail & Related papers (2025-04-02T17:40:47Z)
SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning [73.93639228235622]
Continual Learning with foundation models has emerged as a promising paradigm to exploit abundant knowledge acquired during pre-training for tackling sequential tasks.<n>Existing prompt-based and Low-Rank Adaptation-based (LoRA-based) methods often require expanding a prompt/LoRA pool or retaining samples of previous tasks.<n>We propose Scalable Decoupled LoRA (SD-LoRA) for class incremental learning, which continually separates the learning of the magnitude and direction of LoRA components without rehearsal.
arXiv Detail & Related papers (2025-01-22T20:00:41Z)
Prompt-Efficient Fine-Tuning for GPT-like Deep Models to Reduce Hallucination and to Improve Reproducibility in Scientific Text Generation Using Stochastic Optimisation Techniques [0.0]
Large Language Models (LLMs) are increasingly adopted for complex scientific text generation tasks. They often suffer from limitations in accuracy, consistency, and hallucination control. This thesis introduces a. ‘Fine-Tuning’ approach tailored for GPT-like models, aiming to mitigate hallucinations and enhance adapters.
arXiv Detail & Related papers (2024-11-10T12:28:09Z)
Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation [58.288682735160585]
Low-Rank Adaptation (LoRA) is a popular technique for finetuning models. LoRA often under performs when compared to full- parameter fine-tuning. We present a framework that rigorously analyzes the adaptation rates of LoRA methods.
arXiv Detail & Related papers (2024-10-10T18:51:53Z)
UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation [93.38604803625294]
We present UncertaintyRAG, a novel approach for long-context Retrieval-Augmented Generation (RAG) We use Signal-to-Noise Ratio (SNR)-based span uncertainty to estimate similarity between text chunks. UncertaintyRAG outperforms baselines by 2.03% on LLaMA-2-7B, achieving state-of-the-art results.
arXiv Detail & Related papers (2024-10-03T17:39:38Z)
A Critical Evaluation of AI Feedback for Aligning Large Language Models [60.42291111149438]
We show that simple supervised fine-tuning with GPT-4 as the teacher outperforms existing RLAIF pipelines. More generally, we find that the gains from RLAIF vary substantially across base model families, test-time evaluation protocols, and critic models.
arXiv Detail & Related papers (2024-02-19T18:53:54Z)
Generative Parameter-Efficient Fine-Tuning [8.481707805559589]
GIFT learns to generate the fine-tuned weights for a layer directly from its pretrained weights. We show this formulation bridges parameter-efficient fine-tuning and representation fine-tuning.
arXiv Detail & Related papers (2023-12-01T16:33:57Z)
Sparse Low-rank Adaptation of Pre-trained Language Models [79.74094517030035]
We introduce sparse low-rank adaptation (SoRA) that enables dynamic adjustments to the intrinsic rank during the adaptation process. Our approach strengthens the representation power of LoRA by initializing it with a higher rank, while efficiently taming a temporarily increased number of parameters. Our experimental results demonstrate that SoRA can outperform other baselines even with 70% retained parameters and 70% training time.
arXiv Detail & Related papers (2023-11-20T11:56:25Z)
Stay on topic with Classifier-Free Guidance [57.28934343207042]
We show that CFG can be used broadly as an inference-time technique in pure language modeling. We show that CFG improves the performance of Pythia, GPT-2 and LLaMA-family models across an array of tasks.
arXiv Detail & Related papers (2023-06-30T17:07:02Z)
Quantifying the Optimization and Generalization Advantages of Graph Neural Networks Over Multilayer Perceptrons [50.33260238739837]
Graph networks (GNNs) have demonstrated remarkable capabilities in learning from graph-structured data.<n>There remains a lack of analysis comparing GNNs and generalizations from an optimization and generalization perspective.
arXiv Detail & Related papers (2023-06-24T10:21:11Z)
Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images [26.51970603200391]
This paper investigates optimizing the detection head based on the sparse convolution. It suffers from inadequate integration of contextual information of tiny objects. We propose a novel global context-enhanced adaptive sparse convolutional network.
arXiv Detail & Related papers (2023-03-25T14:42:50Z)
Adaptive Depth Graph Attention Networks [19.673509341792606]
The graph attention networks (GAT) is considered the most advanced learning architecture for graph representation. We find that the main factor limiting the accuracy of the GAT model as the number of layers increases is the oversquashing phenomenon. We propose a GAT variant model-ADGAT that adaptively selects the number of layers based on the sparsity of the graph.
arXiv Detail & Related papers (2023-01-16T05:22:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.