Enhancing Long-Range Dependency with State Space Model and Kolmogorov-Arnold Networks for Aspect-Based Sentiment Analysis
- URL: http://arxiv.org/abs/2407.10347v3
- Date: Thu, 26 Dec 2024 11:47:56 GMT
- Authors: Adamu Lawan, Juhua Pu, Haruna Yunusa, Aliyu Umar, Muhammad Lawan
- Abstract summary: We present MambaForGCN, a novel approach to enhance long-range dependencies between aspect and opinion words in ABSA.
Experimental results on three benchmark datasets demonstrate MambaForGCN's effectiveness, outperforming state-of-the-art (SOTA) baseline models.
- Score: 0.6885635732944716
- Abstract: Aspect-based Sentiment Analysis (ABSA) evaluates sentiments toward specific aspects of entities within the text. However, attention mechanisms and neural network models struggle with syntactic constraints. The quadratic complexity of attention mechanisms also limits their adoption for capturing long-range dependencies between aspect and opinion words in ABSA. This complexity can lead to the misinterpretation of irrelevant contextual words, restricting their effectiveness to short-range dependencies. To address the above problem, we present a novel approach to enhance long-range dependencies between aspect and opinion words in ABSA (MambaForGCN). This approach incorporates syntax-based Graph Convolutional Network (SynGCN) and MambaFormer (Mamba-Transformer) modules to encode input with dependency relations and semantic information. The Multihead Attention (MHA) and Selective State Space model (Mamba) blocks in the MambaFormer module serve as channels to enhance the model with short and long-range dependencies between aspect and opinion words. We also introduce the Kolmogorov-Arnold Networks (KANs) gated fusion, an adaptive feature representation system that integrates SynGCN and MambaFormer and captures non-linear, complex dependencies. Experimental results on three benchmark datasets demonstrate MambaForGCN's effectiveness, outperforming state-of-the-art (SOTA) baseline models.
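The gated fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' KAN-based fusion: it swaps in a plain sigmoid gate with one weight per feature, and the names `gated_fusion`, `h_syn`, `h_sem`, `w_syn`, `w_sem`, and `bias` are hypothetical. It only shows how a learned gate could blend a syntax-based representation (standing in for SynGCN output) with a semantic one (standing in for MambaFormer output).

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic function, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(h_syn, h_sem, w_syn, w_sem, bias):
    """Blend two equal-length feature vectors with an element-wise gate.

    Assumption: a simple per-feature sigmoid gate, NOT the KAN gate
    described in the paper.
      gate_i  = sigmoid(w_syn_i * h_syn_i + w_sem_i * h_sem_i + bias_i)
      fused_i = gate_i * h_syn_i + (1 - gate_i) * h_sem_i
    """
    fused = []
    for s, m, ws, wm, b in zip(h_syn, h_sem, w_syn, w_sem, bias):
        gate = sigmoid(ws * s + wm * m + b)
        fused.append(gate * s + (1.0 - gate) * m)
    return fused


# Toy features for a single token (illustrative values only).
h_syn = [0.2, -1.0, 0.5, 1.5]   # hypothetical syntax-branch features
h_sem = [1.0, 0.4, -0.3, 1.5]   # hypothetical semantic-branch features

# Randomly initialised gate parameters; in a real model these are learned.
rng = random.Random(0)
w_syn = [rng.uniform(-1.0, 1.0) for _ in h_syn]
w_sem = [rng.uniform(-1.0, 1.0) for _ in h_syn]
bias = [0.0] * len(h_syn)

fused = gated_fusion(h_syn, h_sem, w_syn, w_sem, bias)
print(fused)
```

Because the gate lies strictly between 0 and 1, each fused feature is a convex combination of the two inputs, so the model can lean on syntax for some features and on semantics for others.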
Related papers
- Multi-View Attention Syntactic Enhanced Graph Convolutional Network for Aspect-based Sentiment Analysis [33.68786386700902]
Aspect-based Sentiment Analysis (ABSA) is the task aimed at predicting the sentiment polarity of aspect words within sentences.
Recently, incorporating graph neural networks (GNNs) to capture additional syntactic structure information in the dependency tree has proven to be an effective paradigm for boosting ABSA.
We propose a new multi-view attention syntactic enhanced graph convolutional network (MASGCN) that weighs different syntactic information of views using attention mechanisms.
arXiv Detail & Related papers (2025-01-27T11:26:13Z) - Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement [54.427965535613886]
Mamba, as a novel state-space model (SSM), has gained widespread application in natural language processing and computer vision.
In this work, we introduce Mamba-SEUNet, an innovative architecture that integrates Mamba with U-Net for SE tasks.
arXiv Detail & Related papers (2024-12-21T13:43:51Z) - PPMamba: A Pyramid Pooling Local Auxiliary SSM-Based Model for Remote Sensing Image Semantic Segmentation [1.5136939451642137]
This paper proposes a novel network called Pyramid Pooling Mamba (PPMamba), which integrates CNN and Mamba for semantic segmentation tasks.
PPMamba achieves competitive performance compared to state-of-the-art models.
arXiv Detail & Related papers (2024-09-10T08:08:50Z) - DualKanbaFormer: Kolmogorov-Arnold Networks and State Space Model Transformer for Multimodal Aspect-based Sentiment Analysis [0.6498237940960344]
Multimodal aspect-based sentiment analysis (MABSA) enhances sentiment detection by combining text with other data types like images.
We propose DualKanbaFormer, a Kolmogorov-Arnold Networks (KANs) and Selective State Space model (Mamba) transformer.
Our model outperforms some state-of-the-art (SOTA) studies on two public datasets.
arXiv Detail & Related papers (2024-08-27T19:33:15Z) - SIGMA: Selective Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction.
We introduce a new framework named Selective Gated Mamba (SIGMA) for Sequential Recommendation.
Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z) - Mamba-Spike: Enhancing the Mamba Architecture with a Spiking Front-End for Efficient Temporal Data Processing [4.673285689826945]
Mamba-Spike is a novel neuromorphic architecture that integrates a spiking front-end with the Mamba backbone to achieve efficient temporal data processing.
The architecture consistently outperforms state-of-the-art baselines, achieving higher accuracy, lower latency, and improved energy efficiency.
arXiv Detail & Related papers (2024-08-04T14:10:33Z) - SPMamba: State-space model is all you need in speech separation [20.168153319805665]
CNN-based speech separation models face local receptive field limitations and cannot effectively capture long time dependencies.
We introduce an innovative speech separation method called SPMamba.
This model builds upon the robust TF-GridNet architecture, replacing its traditional BLSTM modules with bidirectional Mamba modules.
arXiv Detail & Related papers (2024-04-02T16:04:31Z) - MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models [56.37780601189795]
We propose a framework named MamMIL for WSI analysis.
We represent each WSI as an undirected graph.
To address the problem that Mamba can only process 1D sequences, we propose a topology-aware scanning mechanism.
arXiv Detail & Related papers (2024-03-08T09:02:13Z) - A Novel Energy based Model Mechanism for Multi-modal Aspect-Based
Sentiment Analysis [85.77557381023617]
We propose a novel framework called DQPSA for multi-modal sentiment analysis.
The PDQ module uses the prompt as both a visual query and a language query to extract prompt-aware visual information.
The EPE module models the boundary pairing of the analysis target from the perspective of an Energy-based Model.
arXiv Detail & Related papers (2023-12-13T12:00:46Z) - SpatioTemporal Focus for Skeleton-based Action Recognition [66.8571926307011]
Graph convolutional networks (GCNs) are widely adopted in skeleton-based action recognition.
We argue that the performance of recently proposed skeleton-based action recognition methods is limited by the following factors.
Inspired by the recent attention mechanism, we propose a multi-grain contextual focus module, termed MCF, to capture the action associated relation information.
arXiv Detail & Related papers (2022-03-31T02:45:24Z) - Multi-Scale Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition [140.18376685167857]
A simple yet effective multi-scale semantics-guided neural network is proposed for skeleton-based action recognition.
MS-SGN achieves the state-of-the-art performance on the NTU60, NTU120, and SYSU datasets.
arXiv Detail & Related papers (2021-11-07T03:50:50Z)