Advancing Sentiment Analysis: A Novel LSTM Framework with Multi-head Attention
- URL: http://arxiv.org/abs/2503.08079v1
- Date: Tue, 11 Mar 2025 06:21:49 GMT
- Title: Advancing Sentiment Analysis: A Novel LSTM Framework with Multi-head Attention
- Authors: Jingyuan Yi, Peiyang Yu, Tianyi Huang, Xiaochuan Xu
- Abstract summary: This work proposes an LSTM-based sentiment classification model with a multi-head attention mechanism and TF-IDF optimization. Experimental results on public data sets demonstrate that the new method achieves substantial improvements in the most critical metrics like accuracy, recall, and F1-score.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work proposes an LSTM-based sentiment classification model with a multi-head attention mechanism and TF-IDF optimization. Through the integration of TF-IDF feature extraction and multi-head attention, the model significantly improves text sentiment analysis performance. Experimental results on public data sets demonstrate that the new method achieves substantial improvements in the most critical metrics, such as accuracy, recall, and F1-score, compared to baseline models. Specifically, the model achieves an accuracy of 80.28% on the test set, an improvement of about 12% over standard LSTM models. Ablation experiments also support the necessity and effectiveness of all modules, with multi-head attention contributing the most to the performance improvement. This research provides a practical approach to sentiment analysis that can be applied to public opinion monitoring, product recommendation, and similar tasks.
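The abstract describes the architecture only at a high level. The PyTorch sketch below is a minimal illustration (not the authors' released code) of the overall idea: token embeddings, a single LSTM layer, multi-head self-attention over the hidden states, mean pooling, and a sentiment classification head. All layer sizes, the pooling choice, and the way TF-IDF weights are injected (scaling token embeddings) are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of an LSTM sentiment classifier with multi-head attention.
# Hyperparameters and the TF-IDF integration strategy are assumptions.
import torch
import torch.nn as nn

class LSTMAttentionClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128,
                 num_heads=4, num_classes=2, dropout=0.3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Multi-head self-attention over the LSTM hidden states.
        self.attention = nn.MultiheadAttention(hidden_dim, num_heads,
                                               batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids, tfidf_weights=None):
        # token_ids: (batch, seq_len); tfidf_weights: (batch, seq_len)
        x = self.embedding(token_ids)
        if tfidf_weights is not None:
            # One plausible way to integrate TF-IDF features: scale each
            # token embedding by its TF-IDF weight (an assumption; the paper
            # only states that TF-IDF extraction is integrated).
            x = x * tfidf_weights.unsqueeze(-1)
        h, _ = self.lstm(x)                    # (batch, seq_len, hidden_dim)
        attn_out, _ = self.attention(h, h, h)  # self-attention over time steps
        pooled = attn_out.mean(dim=1)          # average pooling over time
        return self.classifier(self.dropout(pooled))

# Usage example with dummy inputs.
model = LSTMAttentionClassifier(vocab_size=20000)
dummy_ids = torch.randint(1, 20000, (8, 50))
logits = model(dummy_ids)                      # (8, 2) sentiment logits
```

In this sketch, self-attention lets every position attend to every other hidden state, so the pooled representation is not dominated by the final LSTM state, which is one common motivation for adding attention to recurrent sentiment models.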
Related papers
- A Deep Learning Framework for Sequence Mining with Bidirectional LSTM and Multi-Scale Attention [11.999319439383918]
This paper addresses the challenges of mining latent patterns and modeling contextual dependencies in complex sequence data.
A sequence pattern mining algorithm is proposed by integrating Bidirectional Long Short-Term Memory (BiLSTM) with a multi-scale attention mechanism.
BiLSTM captures both forward and backward dependencies in sequences, enhancing the model's ability to perceive global contextual structures.
arXiv Detail & Related papers (2025-04-21T16:53:02Z) - Revitalizing Saturated Benchmarks: A Weighted Metric Approach for Differentiating Large Language Model Performance [3.666887868385651]
Existing benchmarks are saturated and struggle to separate model performances due to factors like data contamination. This paper introduces the Enhanced Model Differentiation Metric, a novel weighted metric that revitalizes benchmarks by enhancing model separation.
arXiv Detail & Related papers (2025-03-07T16:25:09Z) - FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models [79.41859481668618]
Large Language Models (LLMs) have significantly advanced the fact-checking studies.
Existing automated fact-checking evaluation methods rely on static datasets and classification metrics.
We introduce FACT-AUDIT, an agent-driven framework that adaptively and dynamically assesses LLMs' fact-checking capabilities.
arXiv Detail & Related papers (2025-02-25T07:44:22Z) - CDS: Data Synthesis Method Guided by Cognitive Diagnosis Theory [38.32540433374892]
Large Language Models (LLMs) have achieved significant advancements, but the increasing complexity of tasks and higher performance demands highlight the need for continuous improvement. Some approaches utilize synthetic data generated by advanced LLMs based on evaluation results to train models. In this paper, we introduce the Cognitive Diagnostic Synthesis (CDS) method, which incorporates a diagnostic process inspired by Cognitive Diagnosis Theory (CDT) to refine evaluation results and characterize model profiles at the knowledge component level.
arXiv Detail & Related papers (2025-01-13T20:13:59Z) - EACO: Enhancing Alignment in Multimodal LLMs via Critical Observation [58.546205554954454]
We propose Enhancing Alignment in MLLMs via Critical Observation (EACO). EACO aligns MLLMs by self-generated preference data using only 5k images economically. EACO reduces the overall hallucinations by 65.6% on HallusionBench and improves the reasoning ability by 21.8% on MME-Cognition.
arXiv Detail & Related papers (2024-12-06T09:59:47Z) - Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning [71.2981957820888]
We propose a novel Star-Agents framework, which automates the enhancement of data quality across datasets.
The framework initially generates diverse instruction data with multiple LLM agents through a bespoke sampling method.
The generated data undergo a rigorous evaluation using a dual-model method that assesses both difficulty and quality.
arXiv Detail & Related papers (2024-11-21T02:30:53Z) - MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
arXiv Detail & Related papers (2024-10-12T07:02:33Z) - Revisiting BPR: A Replicability Study of a Common Recommender System Baseline [78.00363373925758]
We study the features of the BPR model, indicating their impact on its performance, and investigate open-source BPR implementations.
Our analysis reveals inconsistencies between these implementations and the original BPR paper, leading to a significant decrease in performance of up to 50% for specific implementations.
We show that the BPR model can achieve performance levels close to state-of-the-art methods on the top-n recommendation tasks and even outperform them on specific datasets.
arXiv Detail & Related papers (2024-09-21T18:39:53Z) - CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration [59.48235003469116]
We show that data augmentation consistently enhances OOD performance.
We also show that CF augmented models which are easier to calibrate also exhibit much lower entropy when assigning importance.
arXiv Detail & Related papers (2023-09-14T16:16:40Z) - DiversiGATE: A Comprehensive Framework for Reliable Large Language Models [2.616506436169964]
We introduce DiversiGATE, a unified framework that consolidates diverse methodologies for LLM verification.
We propose a novel 'SelfLearner' model that conforms to the DiversiGATE framework and refines its performance over time.
Our results demonstrate that our approach outperforms traditional LLMs, achieving a considerable improvement on the GSM8K benchmark, from 54.8% to 61.8%.
arXiv Detail & Related papers (2023-06-22T22:29:40Z) - The DONUT Approach to Ensemble Combination Forecasting [0.0]
This paper presents an ensemble forecasting method that shows strong results on the M4 Competition dataset.
Our assumption reductions, consisting mainly of auto-generated features and a more diverse model pool, significantly outperform the statistical-feature-based ensemble method FFORMA.
We also present a formal ex-post-facto analysis of optimal combination and selection for ensembles, quantifying differences through linear optimization on the M4 dataset.
arXiv Detail & Related papers (2022-01-02T22:19:26Z)