I$^3$-MRec: Invariant Learning with Information Bottleneck for Incomplete Modality Recommendation
- URL: http://arxiv.org/abs/2508.04247v1
- Date: Wed, 06 Aug 2025 09:29:50 GMT
- Title: I$^3$-MRec: Invariant Learning with Information Bottleneck for Incomplete Modality Recommendation
- Authors: Huilin Chen, Miaomiao Cai, Fan Liu, Zhiyong Cheng, Richang Hong, Meng Wang,
- Abstract summary: We introduce textbfI$3$-MRec, which uses textbfInvariant learning with textbfIn bottleneck principle for textbfModality textbfRecommendation.<n>I$3$-MRec consistently outperforms existing state-of-the-art MRS methods across various modality-missing scenarios.
- Score: 42.37107069331169
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal recommender systems (MRS) improve recommendation performance by integrating diverse semantic information from multiple modalities. However, the assumption of the availability of all modalities rarely holds in practice due to missing images, incomplete descriptions, or inconsistent user content. These challenges significantly degrade the robustness and generalization capabilities of current models. To address these challenges, we introduce a novel method called \textbf{I$^3$-MRec}, which uses \textbf{I}nvariant learning with \textbf{I}nformation bottleneck principle for \textbf{I}ncomplete \textbf{M}odality \textbf{Rec}ommendation. To achieve robust performance in missing modality scenarios, I$^3$-MRec enforces two pivotal properties: (i) cross-modal preference invariance, which ensures consistent user preference modeling across varying modality environments, and (ii) compact yet effective modality representation, which filters out task-irrelevant modality information while maximally preserving essential features relevant to recommendation. By treating each modality as a distinct semantic environment, I$^3$-MRec employs invariant risk minimization (IRM) to learn modality-specific item representations. In parallel, a missing-aware fusion module grounded in the Information Bottleneck (IB) principle extracts compact and effective item embeddings by suppressing modality noise and preserving core user preference signals. Extensive experiments conducted on three real-world datasets demonstrate that I$^3$-MRec consistently outperforms existing state-of-the-art MRS methods across various modality-missing scenarios, highlighting its effectiveness and robustness in practical applications. The code and processed datasets are released at https://github.com/HuilinChenJN/I3-MRec.
Related papers
- BiXFormer: A Robust Framework for Maximizing Modality Effectiveness in Multi-Modal Semantic Segmentation [55.486872677160015]
We reformulate multi-modal semantic segmentation as a mask-level classification task.<n>We propose BiXFormer, which integrates Unified Modality Matching (UMM) and Cross Modality Alignment (CMA)<n> Experiments on both synthetic and real-world multi-modal benchmarks demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2025-06-04T08:04:58Z) - Disentangling and Generating Modalities for Recommendation in Missing Modality Scenarios [21.73914052076956]
We propose Disentangling and Generating Modality Recommender (DGMRec) for missing modality scenarios.<n>DGMRec disentangles modality features into general and specific modality features from an information-based perspective.<n>It consistently outperforms state-of-the-art MRSs in challenging scenarios.
arXiv Detail & Related papers (2025-04-23T02:04:14Z) - Benchmarking Multi-modal Semantic Segmentation under Sensor Failures: Missing and Noisy Modality Robustness [61.87055159919641]
Multi-modal semantic segmentation (MMSS) addresses the limitations of single-modality data by integrating complementary information across modalities.<n>Despite notable progress, a significant gap persists between research and real-world deployment due to variability and uncertainty in multi-modal data quality.<n>We introduce a robustness benchmark that evaluates MMSS models under three scenarios: Entire-Missing Modality (EMM), Random-Missing Modality (RMM), and Noisy Modality (NM)
arXiv Detail & Related papers (2025-03-24T08:46:52Z) - MIFNet: Learning Modality-Invariant Features for Generalizable Multimodal Image Matching [54.740256498985026]
Keypoint detection and description methods often struggle with multimodal data.<n>We propose a modality-invariant feature learning network (MIFNet) to compute modality-invariant features for keypoint descriptions in multimodal image matching.
arXiv Detail & Related papers (2025-01-20T06:56:30Z) - Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning [27.867369806400834]
We propose RAGPT, a novel Retrieval-AuGmented dynamic Prompt Tuning framework.<n>RAGPT comprises three modules: (I) the multi-channel retriever, (II) the missing modality generator, and (III) the context-aware prompter.<n>Experiments conducted on three real-world datasets show that RAGPT consistently outperforms all competitive baselines in handling incomplete modality problems.
arXiv Detail & Related papers (2025-01-02T07:39:48Z) - MAGIC++: Efficient and Resilient Modality-Agnostic Semantic Segmentation via Hierarchical Modality Selection [20.584588303521496]
We introduce the MAGIC++ framework, which comprises two key plug-and-play modules for effective multi-modal fusion and hierarchical modality selection.<n>Our method achieves state-of-the-art performance on both real-world and synthetic benchmarks.<n>Our method is superior in the novel modality-agnostic setting, where it outperforms prior arts by a large margin.
arXiv Detail & Related papers (2024-12-22T06:12:03Z) - LLM-based Bi-level Multi-interest Learning Framework for Sequential Recommendation [54.396000434574454]
We propose a novel multi-interest SR framework combining implicit behavioral and explicit semantic perspectives.<n>It includes two modules: the Implicit Behavioral Interest Module and the Explicit Semantic Interest Module.<n>Experiments on four real-world datasets validate the framework's effectiveness and practicality.
arXiv Detail & Related papers (2024-11-14T13:00:23Z) - Centering the Value of Every Modality: Towards Efficient and Resilient Modality-agnostic Semantic Segmentation [7.797154022794006]
Recent endeavors regard RGB modality as the center and the others as the auxiliary, yielding an asymmetric architecture with two branches.
We propose a novel method, named MAGIC, that can be flexibly paired with various backbones, ranging from compact to high-performance models.
Our method achieves state-of-the-art performance while reducing the model parameters by 60%.
arXiv Detail & Related papers (2024-07-16T03:19:59Z) - Exploiting modality-invariant feature for robust multimodal emotion
recognition with missing modalities [76.08541852988536]
We propose to use invariant features for a missing modality imagination network (IF-MMIN)
We show that the proposed model outperforms all baselines and invariantly improves the overall emotion recognition performance under uncertain missing-modality conditions.
arXiv Detail & Related papers (2022-10-27T12:16:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.