Related papers: MambaForGCN: Enhancing Long-Range Dependency with State Space Model and Kolmogorov-Arnold Networks for Aspect-Based Sentiment Analysis

MambaForGCN: Enhancing Long-Range Dependency with State Space Model and Kolmogorov-Arnold Networks for Aspect-Based Sentiment Analysis

URL: http://arxiv.org/abs/2407.10347v1
Date: Sun, 14 Jul 2024 22:23:07 GMT
Title: MambaForGCN: Enhancing Long-Range Dependency with State Space Model and Kolmogorov-Arnold Networks for Aspect-Based Sentiment Analysis
Authors: Adamu Lawan, Juhua Pu, Haruna Yunusa, Aliyu Umar, Muhammad Lawan,
Abstract summary: We present MambaForGCN, a novel approach to enhance short and long-range dependencies between aspect and opinion words in ABSA. Experimental results on three benchmark datasets demonstrate MambaForGCN's effectiveness, outperforming state-of-the-art (SOTA) baseline models.
Score: 0.6885635732944716
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Aspect-based sentiment Analysis (ABSA) identifies and evaluates sentiments toward specific aspects of entities within text, providing detailed insights beyond overall sentiment. However, Attention mechanisms and neural network models struggle with syntactic constraints, and the quadratic complexity of attention mechanisms hinders their adoption for capturing long-range dependencies between aspect and opinion words in ABSA. This complexity can lead to the misinterpretation of irrelevant con-textual words, restricting their effectiveness to short-range dependencies. Some studies have investigated merging semantic and syntactic approaches but face challenges in effectively integrating these methods. To address the above problems, we present MambaForGCN, a novel approach to enhance short and long-range dependencies between aspect and opinion words in ABSA. This innovative approach incorporates syntax-based Graph Convolutional Network (SynGCN) and MambaFormer (Mamba-Transformer) modules to encode input with dependency relations and semantic information. The Multihead Attention (MHA) and Mamba blocks in the MambaFormer module serve as channels to enhance the model with short and long-range dependencies between aspect and opinion words. We also introduce the Kolmogorov-Arnold Networks (KANs) gated fusion, an adaptively integrated feature representation system combining SynGCN and MambaFormer representations. Experimental results on three benchmark datasets demonstrate MambaForGCN's effectiveness, outperforming state-of-the-art (SOTA) baseline models.

Related papers

Mamba-Driven Topology Fusion for Monocular 3-D Human Pose Estimation [32.185238802221576]
Recently, the Mamba model has substantially reduced computational overhead.<n>The ability of SSM to process sequential data is not suitable for 3-D joint sequences with topological structures.<n>We propose the Mamba-Driven Topology Fusion framework to address these issues.
arXiv Detail & Related papers (2025-05-27T01:21:57Z)
vGamba: Attentive State Space Bottleneck for efficient Long-range Dependencies in Visual Recognition [0.0]
State-space models (SSMs) offer an alternative, but their application in vision remains underexplored. This work introduces vGamba, a hybrid vision backbone that integrates SSMs with attention mechanisms to enhance efficiency and expressiveness. Tests on classification, detection, and segmentation tasks demonstrate that vGamba achieves a superior trade-off between accuracy and computational efficiency, outperforming several existing models.
arXiv Detail & Related papers (2025-03-27T08:39:58Z)
Multi-View Attention Syntactic Enhanced Graph Convolutional Network for Aspect-based Sentiment Analysis [33.68786386700902]
Aspect-based Sentiment Analysis (ABSA) is the task aimed at predicting the sentiment polarity of aspect words within sentences. Recent incorporating graph neural networks (GNNs) to capture additional syntactic structure information in the dependency tree has been proven to be an effective paradigm for boosting ABSA. We propose a new multi-view attention syntactic enhanced graph convolutional network (MASGCN) that weighs different syntactic information of views using attention mechanisms.
arXiv Detail & Related papers (2025-01-27T11:26:13Z)
Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement [54.427965535613886]
Mamba, as a novel state-space model (SSM), has gained widespread application in natural language processing and computer vision. In this work, we introduce Mamba-SEUNet, an innovative architecture that integrates Mamba with U-Net for SE tasks.
arXiv Detail & Related papers (2024-12-21T13:43:51Z)
Stuffed Mamba: Oversized States Lead to the Inability to Forget [69.36377985746878]
We show that Mamba-based models struggle to effectively forget earlier tokens even with built-in forgetting mechanisms.<n>We show that the minimum training length required for the model to learn forgetting scales linearly with the state size, and the maximum context length for accurate retrieval of a 5-digit passkey scales exponentially with the state size.<n>Our work suggests that future RNN designs must account for the interplay between state size, training length, and forgetting mechanisms to achieve robust performance in long-context tasks.
arXiv Detail & Related papers (2024-10-09T17:54:28Z)
Mamba in Vision: A Comprehensive Survey of Techniques and Applications [3.4580301733198446]
Mamba is emerging as a novel approach to overcome the challenges faced by Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) in computer vision. Mamba addresses these limitations by leveraging Selective Structured State Space Models to effectively capture long-range dependencies with linear computational complexity.
arXiv Detail & Related papers (2024-10-04T02:58:49Z)
PPMamba: A Pyramid Pooling Local Auxiliary SSM-Based Model for Remote Sensing Image Semantic Segmentation [1.5136939451642137]
This paper proposes a novel network called Pyramid Pooling Mamba (PPMamba), which integrates CNN and Mamba for semantic segmentation tasks. PPMamba achieves competitive performance compared to state-of-the-art models.
arXiv Detail & Related papers (2024-09-10T08:08:50Z)
DualKanbaFormer: Kolmogorov-Arnold Networks and State Space Model Transformer for Multimodal Aspect-based Sentiment Analysis [0.6498237940960344]
Multimodal aspect-based sentiment analysis (MABSA) enhances sentiment detection by combining text with other data types like images. We propose Kolmogorov-Arnold Networks (KANs) and Selective State Space model (Mamba) transformer (DualKanbaFormer) Our model outperforms some state-of-the-art (SOTA) studies on two public datasets.
arXiv Detail & Related papers (2024-08-27T19:33:15Z)
Bidirectional Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction. We introduce a new framework named Selective Gated Mamba ( SIGMA) for Sequential Recommendation. Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z)
Mamba-Spike: Enhancing the Mamba Architecture with a Spiking Front-End for Efficient Temporal Data Processing [4.673285689826945]
Mamba-Spike is a novel neuromorphic architecture that integrates a spiking front-end with the Mamba backbone to achieve efficient temporal data processing. The architecture consistently outperforms state-of-the-art baselines, achieving higher accuracy, lower latency, and improved energy efficiency.
arXiv Detail & Related papers (2024-08-04T14:10:33Z)
SPMamba: State-space model is all you need in speech separation [20.168153319805665]
CNN-based speech separation models face local receptive field limitations and cannot effectively capture long time dependencies. We introduce an innovative speech separation method called SPMamba. This model builds upon the robust TF-GridNet architecture, replacing its traditional BLSTM modules with bidirectional Mamba modules.
arXiv Detail & Related papers (2024-04-02T16:04:31Z)
MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models [56.37780601189795]
We propose a framework named MamMIL for WSI analysis. We represent each WSI as an undirected graph. To address the problem that Mamba can only process 1D sequences, we propose a topology-aware scanning mechanism.
arXiv Detail & Related papers (2024-03-08T09:02:13Z)
A Novel Energy based Model Mechanism for Multi-modal Aspect-Based Sentiment Analysis [85.77557381023617]
We propose a novel framework called DQPSA for multi-modal sentiment analysis. PDQ module uses the prompt as both a visual query and a language query to extract prompt-aware visual information. EPE module models the boundaries pairing of the analysis target from the perspective of an Energy-based Model.
arXiv Detail & Related papers (2023-12-13T12:00:46Z)
MGNNI: Multiscale Graph Neural Networks with Implicit Layers [53.75421430520501]
implicit graph neural networks (GNNs) have been proposed to capture long-range dependencies in underlying graphs. We introduce and justify two weaknesses of implicit GNNs: the constrained expressiveness due to their limited effective range for capturing long-range dependencies, and their lack of ability to capture multiscale information on graphs at multiple resolutions. We propose a multiscale graph neural network with implicit layers (MGNNI) which is able to model multiscale structures on graphs and has an expanded effective range for capturing long-range dependencies.
arXiv Detail & Related papers (2022-10-15T18:18:55Z)
SpatioTemporal Focus for Skeleton-based Action Recognition [66.8571926307011]
Graph convolutional networks (GCNs) are widely adopted in skeleton-based action recognition. We argue that the performance of recent proposed skeleton-based action recognition methods is limited by the following factors. Inspired by the recent attention mechanism, we propose a multi-grain contextual focus module, termed MCF, to capture the action associated relation information.
arXiv Detail & Related papers (2022-03-31T02:45:24Z)
Multi-Scale Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition [140.18376685167857]
A simple yet effective multi-scale semantics-guided neural network is proposed for skeleton-based action recognition. MS-SGN achieves the state-of-the-art performance on the NTU60, NTU120, and SYSU datasets.
arXiv Detail & Related papers (2021-11-07T03:50:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.