Redundancy-optimized Multi-head Attention Networks for Multi-View Multi-Label Feature Selection
- URL: http://arxiv.org/abs/2511.12462v1
- Date: Sun, 16 Nov 2025 05:16:46 GMT
- Title: Redundancy-optimized Multi-head Attention Networks for Multi-View Multi-Label Feature Selection
- Authors: Yuzhou Liu, Jiarui Liu, Wanfu Gao
- Abstract summary: Multi-view multi-label data offers richer perspectives for artificial intelligence. It presents significant challenges for feature selection due to the inherent complexity of interrelations among features, views and labels. We propose a novel method based on Redundancy-optimized Multi-head Attention Networks for Multi-view Multi-label Feature Selection.
- Score: 14.533409384742116
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Multi-view multi-label data offers richer perspectives for artificial intelligence, but simultaneously presents significant challenges for feature selection due to the inherent complexity of interrelations among features, views and labels. Attention mechanisms provide an effective way for analyzing these intricate relationships. They can compute importance weights for information by aggregating correlations between Query and Key matrices to focus on pertinent values. However, existing attention-based feature selection methods predominantly focus on intra-view relationships, neglecting the complementarity of inter-view features and the critical feature-label correlations. Moreover, they often fail to account for feature redundancy, potentially leading to suboptimal feature subsets. To overcome these limitations, we propose a novel method based on Redundancy-optimized Multi-head Attention Networks for Multi-view Multi-label Feature Selection (RMAN-MMFS). Specifically, we employ each individual attention head to model intra-view feature relationships and use the cross-attention mechanisms between different heads to capture inter-view feature complementarity. Furthermore, we design static and dynamic feature redundancy terms: the static term mitigates redundancy within each view, while the dynamic term explicitly models redundancy between unselected and selected features across the entire selection process, thereby promoting feature compactness. Comprehensive evaluations on six real-world datasets, compared against six multi-view multi-label feature selection methods, demonstrate the superior performance of the proposed method.
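The attention-then-select pipeline the abstract describes can be illustrated with a minimal numpy sketch. This is not the authors' RMAN-MMFS implementation: the random Query/Key projections stand in for learned attention heads, and the greedy correlation penalty is a simplified stand-in for the paper's static and dynamic redundancy terms.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_importance(X, d_k=8, seed=0):
    """Score features of X (n_samples, n_features) via scaled
    dot-product attention over feature columns.

    Illustrative only: random projections replace the learned
    Query/Key weights of an actual attention head.
    """
    rng = np.random.default_rng(seed)
    F = X.T                                  # one row per feature
    Q = F @ rng.normal(size=(F.shape[1], d_k))
    K = F @ rng.normal(size=(F.shape[1], d_k))
    A = softmax(Q @ K.T / np.sqrt(d_k))      # feature-feature attention
    return A.mean(axis=0)                    # attention received = importance

def greedy_select(X, scores, k, lam=2.0):
    """Greedily pick k features, trading attention importance against
    absolute correlation with already-selected features -- a toy
    analogue of a dynamic redundancy term."""
    C = np.abs(np.corrcoef(X, rowvar=False))
    selected, candidates = [], list(range(X.shape[1]))
    while len(selected) < k:
        best = max(candidates,
                   key=lambda j: scores[j]
                   - (lam * C[j, selected].mean() if selected else 0.0))
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy data: features 0 and 1 are near-duplicates, 2-4 are independent.
rng = np.random.default_rng(1)
base = rng.normal(size=(100, 1))
X = np.hstack([base,
               base + 0.01 * rng.normal(size=(100, 1)),
               rng.normal(size=(100, 3))])
scores = attention_importance(X)
picked = greedy_select(X, scores, k=3)
print(picked)   # the redundancy penalty keeps 0 and 1 from co-occurring
```

With the penalty active, at most one of the two near-duplicate features survives selection, which is the compactness effect the static/dynamic redundancy terms aim for.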
Related papers
- Orthogonalized Multimodal Contrastive Learning with Asymmetric Masking for Structured Representations [4.67724003380452]
Multimodal learning seeks to integrate information from heterogeneous sources, where signals may be shared across modalities, specific to individual modalities, or emerge only through their interaction. While self-supervised multimodal contrastive learning has achieved remarkable progress, most existing methods predominantly capture redundant cross-modal signals, often neglecting modality-specific (unique) and interaction-driven (synergistic) information. Recent extensions broaden this perspective, yet they either fail to explicitly model synergistic interactions or learn different information components in an entangled manner, leading to incomplete representations and potential information leakage. We introduce COrAL, a principled framework.
arXiv Detail & Related papers (2026-02-16T18:06:53Z) - Cross-Modal Attention Network with Dual Graph Learning in Multimodal Recommendation [12.802844514133255]
We propose the Cross-modal Recursive Attention Network with dual graph Embedding (CRANE). We design a core Recursive Cross-Modal Attention (RCA) mechanism that iteratively refines modality features based on cross-correlations in a joint latent space. For symmetric multimodal learning, we explicitly construct users' multimodal profiles by aggregating features of their interacted items.
arXiv Detail & Related papers (2026-01-16T10:09:39Z) - Multi-label Classification with Panoptic Context Aggregation Networks [61.82285737410154]
This paper introduces the Deep Panoptic Context Aggregation Network (PanCAN), a novel approach that hierarchically integrates multi-order geometric contexts. PanCAN learns multi-order neighborhood relationships at each scale by combining random walks with an attention mechanism. Experiments on NUS-WIDE, PASCAL VOC 2007, and MS-COCO benchmarks demonstrate that PanCAN consistently achieves competitive results.
arXiv Detail & Related papers (2025-12-29T14:16:21Z) - Embedded Multi-label Feature Selection via Orthogonal Regression [45.55795914923279]
State-of-the-art embedded multi-label feature selection algorithms based on least squares regression cannot preserve sufficient discriminative information in multi-label data.
A novel embedded multi-label feature selection method is proposed to facilitate the multi-label feature selection.
Extensive experimental results on ten multi-label data sets demonstrate the effectiveness of GRROOR.
arXiv Detail & Related papers (2024-03-01T06:18:40Z) - Interactive Multi-Head Self-Attention with Linear Complexity [60.112941134420204]
We show that the interactions between cross-heads of the attention matrix enhance the information flow of the attention operation.
We propose an effective method to decompose the attention operation into query- and key-less components.
arXiv Detail & Related papers (2024-02-27T13:47:23Z) - BiVRec: Bidirectional View-based Multimodal Sequential Recommendation [55.87443627659778]
We propose an innovative framework, BivRec, that jointly trains the recommendation tasks in both ID and multimodal views.
BivRec achieves state-of-the-art performance on five datasets and showcases various practical advantages.
arXiv Detail & Related papers (2024-02-27T09:10:41Z) - MvFS: Multi-view Feature Selection for Recommender System [7.0190343591422115]
We propose Multi-view Feature Selection (MvFS), which selects informative features for each instance more effectively.
MvFS employs a multi-view network consisting of multiple sub-networks, each of which learns to measure the feature importance of a part of data.
MvFS adopts an effective importance score modeling strategy which is applied independently to each field.
arXiv Detail & Related papers (2023-09-05T09:06:34Z) - Asymmetric double-winged multi-view clustering network for exploring Diverse and Consistent Information [28.300395619444796]
In unsupervised scenarios, deep contrastive multi-view clustering (DCMVC) is becoming a hot research spot.
We propose a novel multi-view clustering network termed CodingNet to explore the diverse and consistent information simultaneously.
Our framework's efficacy is validated through extensive experiments on six widely used benchmark datasets.
arXiv Detail & Related papers (2023-09-01T14:13:22Z) - Compositional Attention: Disentangling Search and Retrieval [66.7108739597771]
Multi-head, key-value attention is the backbone of the Transformer model and its variants.
Standard attention heads learn a rigid mapping between search and retrieval.
We propose a novel attention mechanism, called Compositional Attention, that replaces the standard head structure.
arXiv Detail & Related papers (2021-10-18T15:47:38Z) - Embedded Deep Bilinear Interactive Information and Selective Fusion for Multi-view Learning [70.67092105994598]
We propose a novel multi-view learning framework to make the multi-view classification better aimed at the above-mentioned two aspects.
In particular, we train different deep neural networks to learn various intra-view representations.
Experiments on six publicly available datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2020-07-13T01:13:23Z) - Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-based Person Re-identification [98.7585431239291]
Video-based person re-identification aims at matching the same person across video clips.
In this paper, we propose an attentive feature aggregation module, namely the Multi-Granularity Reference-aided Attentive Feature Aggregation module (MG-RAFA).
Our framework achieves state-of-the-art performance on three benchmark datasets.
arXiv Detail & Related papers (2020-03-27T03:49:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.