BELT: Blockwise Missing Embedding Learning Transformer
- URL: http://arxiv.org/abs/2105.10360v1
- Date: Fri, 21 May 2021 13:55:30 GMT
- Title: BELT: Blockwise Missing Embedding Learning Transformer
- Authors: Doudou Zhou, Tianxi Cai, and Junwei Lu
- Abstract summary: We propose the Blockwise missing Embedding Learning Transformer (BELT) model to treat row-wise/column-wise missingness.
Specifically, our proposed method aims at efficient matrix recovery when every pair of matrices from multiple sources has an overlap.
- Score: 9.341699514447113
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Matrix completion has attracted a lot of attention in many fields including
statistics, applied mathematics, and electrical engineering. Most existing works focus
on independent sampling models, under which the individual observed entries
are sampled independently. Motivated by applications in the integration of
multiple point-wise mutual information (PMI) matrices, we propose the
Blockwise missing Embedding Learning Transformer (BELT) model
to treat row-wise/column-wise missingness. Specifically, our proposed method
aims at efficient matrix recovery when every pair of matrices from multiple
sources has an overlap. We provide theoretical justification for the proposed
BELT method. Simulation studies show that the method performs well in finite
samples under a variety of configurations. The method is applied to integrate
several PMI matrices built from EHR data and Chinese medical text data, which
enables us to construct a comprehensive, high-quality embedding set for CUIs and
Chinese medical terms.
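The blockwise-missing setting described in the abstract can be illustrated with a small numerical sketch. This is not the BELT algorithm itself, only a minimal NumPy demonstration (with made-up sizes) of how an overlap between sources lets one stitch per-source embeddings together, here via orthogonal Procrustes alignment on the shared rows, so that never-observed cross blocks are recovered:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 6, 2                      # hypothetical vocabulary size and embedding rank
U = rng.normal(size=(d, r))      # ground-truth word embeddings
M = U @ U.T                      # full PMI-like matrix (never fully observed)

# Two sources each observe one diagonal block; they overlap on words 1..3.
idx1, idx2 = np.arange(0, 4), np.arange(1, 6)
M1 = M[np.ix_(idx1, idx1)]
M2 = M[np.ix_(idx2, idx2)]

def embed(Mk, r):
    """Rank-r eigen-embedding of a symmetric PSD block: Mk = Vk @ Vk.T."""
    w, V = np.linalg.eigh(Mk)
    top = np.argsort(w)[::-1][:r]
    return V[:, top] * np.sqrt(np.maximum(w[top], 0))

V1, V2 = embed(M1, r), embed(M2, r)

# Each block's embedding is identified only up to an orthogonal transform.
# Align source 2 to source 1 via orthogonal Procrustes on the overlap rows.
ov1 = V1[1:4]                    # words 1..3 in source 1's local indexing
ov2 = V2[0:3]                    # the same words in source 2's local indexing
Us, _, Vt = np.linalg.svd(ov2.T @ ov1)
V2 = V2 @ (Us @ Vt)              # rotate source-2 coordinates into source 1's

# Stitch a global embedding: source 1 covers words 0..3, source 2 adds 4..5.
E = np.zeros((d, r))
E[idx1] = V1
E[4:6] = V2[3:5]

# The stitched embedding reproduces the full matrix, including the
# never-observed cross block between words {0} and {4, 5}.
assert np.allclose(E @ E.T, M, atol=1e-8)
```

With noisy blocks the recovery would of course only be approximate; the exact-recovery check above relies on the blocks being exactly low rank.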
Related papers
- Truncated Matrix Completion - An Empirical Study [7.912206996605676]
Low-rank Matrix Completion describes the problem where we wish to recover the missing entries of a partially observed low-rank matrix.
We consider various settings where the sampling mask is dependent on the underlying data values, motivated by applications in sensing, sequential decision-making, and recommender systems.
arXiv Detail & Related papers (2025-04-14T04:42:00Z)
- Partially Supervised Unpaired Multi-Modal Learning for Label-Efficient Medical Image Segmentation [53.723234136550055]
We term the new learning paradigm Partially Supervised Unpaired Multi-Modal Learning (PSUMML).
We propose a novel Decomposed partial class adaptation with snapshot Ensembled Self-Training (DEST) framework for it.
Our framework consists of a compact segmentation network with modality-specific normalization layers for learning from partially labeled unpaired multi-modal data.
arXiv Detail & Related papers (2025-03-07T07:22:42Z)
- Robust Multi-View Learning via Representation Fusion of Sample-Level Attention and Alignment of Simulated Perturbation [61.64052577026623]
Real-world multi-view datasets are often heterogeneous and imperfect.
We propose a novel robust MVL method (namely RML) with simultaneous representation fusion and alignment.
In experiments, we employ it in unsupervised multi-view clustering, noise-label classification, and as a plug-and-play module for cross-modal hashing retrieval.
arXiv Detail & Related papers (2025-03-06T07:01:08Z)
- Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems [102.36545569092777]
We propose Heterogeneous Swarms, an algorithm to design multi-LLM systems by jointly optimizing model roles and weights.
Experiments demonstrate that Heterogeneous Swarms outperforms 15 role- and/or weight-based baselines by 18.5% on average across 12 tasks.
arXiv Detail & Related papers (2025-02-06T21:27:11Z)
- Optimal Estimation of Shared Singular Subspaces across Multiple Noisy Matrices [3.3373545585860596]
This study focuses on estimating shared (left) singular subspaces across multiple matrices within a low-rank matrix denoising framework.
We establish that Stack-SVD achieves minimax rate-optimality when the true singular subspaces of the signal matrices are identical.
For various cases of partial sharing, we rigorously characterize the conditions under which Stack-SVD remains effective, achieves minimax optimality, or fails to deliver consistent estimates.
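The Stack-SVD idea above admits a short sketch. This is not the paper's exact procedure or regime, only a toy illustration (hypothetical dimensions and noise level): concatenate the noisy matrices column-wise and read the shared left singular subspace off the top singular vectors of the stacked matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, r, K = 20, 30, 2, 3        # hypothetical: p rows, K noisy n-column matrices
U, _ = np.linalg.qr(rng.normal(size=(p, r)))   # shared left singular subspace

# K observations sharing the same column space U, plus independent noise.
Ys = [U @ rng.normal(size=(r, n)) + 0.01 * rng.normal(size=(p, n))
      for _ in range(K)]

# Stack-SVD: concatenate column-wise, take the top-r left singular vectors.
stacked = np.concatenate(Ys, axis=1)           # p x (K * n)
U_hat = np.linalg.svd(stacked, full_matrices=False)[0][:, :r]

# Compare subspaces via the projection distance (rotation-invariant).
err = np.linalg.norm(U_hat @ U_hat.T - U @ U.T, 2)
```

The projection distance is used because `U_hat` is only identified up to an orthogonal transform within the subspace.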
arXiv Detail & Related papers (2024-11-26T02:49:30Z)
- Align$^2$LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation [56.75665429851673]
This paper introduces a novel instruction curation algorithm, derived from two unique perspectives, human and LLM preference alignment.
Experiments demonstrate that we can maintain or even improve model performance by compressing synthetic multimodal instructions by up to 90%.
arXiv Detail & Related papers (2024-09-27T08:20:59Z)
- Empirical Bayes Linked Matrix Decomposition [0.0]
We propose an empirical variational Bayesian approach to this problem.
We describe an associated iterative imputation approach that is novel for the single-matrix context.
We show that the method performs very well under different scenarios with respect to recovering the underlying low-rank signal.
arXiv Detail & Related papers (2024-08-01T02:13:11Z)
- Enhanced Latent Multi-view Subspace Clustering [25.343388834470247]
We propose an Enhanced Latent Multi-view Subspace Clustering (ELMSC) method for recovering latent space representation.
Our proposed ELMSC is able to achieve higher clustering performance than some state-of-the-art multi-view clustering methods.
arXiv Detail & Related papers (2023-12-22T15:28:55Z)
- Spectral Entry-wise Matrix Estimation for Low-Rank Reinforcement Learning [53.445068584013896]
We study matrix estimation problems arising in reinforcement learning (RL) with low-rank structure.
In low-rank bandits, the matrix to be recovered specifies the expected arm rewards, and for low-rank Markov Decision Processes (MDPs), it may for example characterize the transition kernel of the MDP.
We show that simple spectral-based matrix estimation approaches efficiently recover the singular subspaces of the matrix and exhibit nearly-minimal entry-wise error.
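A minimal sketch of the spectral approach mentioned above, with hypothetical sizes and a plain rank-r truncated SVD (the paper's estimators and entry-wise guarantees are considerably more refined):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r = 40, 50, 3              # hypothetical reward-matrix dimensions and rank
M = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))   # low-rank ground truth
Y = M + 0.05 * rng.normal(size=(m, n))                  # noisy observation

# Spectral estimator: truncate the SVD of the noisy matrix at rank r.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
M_hat = (U[:, :r] * s[:r]) @ Vt[:r]

# Relative Frobenius recovery error of the rank-r spectral estimate.
rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
```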
arXiv Detail & Related papers (2023-10-10T17:06:41Z)
- Large-scale gradient-based training of Mixtures of Factor Analyzers [67.21722742907981]
This article contributes both a theoretical analysis as well as a new method for efficient high-dimensional training by gradient descent.
We prove that MFA training and inference/sampling can be performed based on precision matrices, which does not require matrix inversions after training is completed.
Besides the theoretical analysis, we apply MFA to typical image datasets such as SVHN and MNIST, and demonstrate its ability to perform sample generation and outlier detection.
arXiv Detail & Related papers (2023-08-26T06:12:33Z)
- Multi-modal Multi-view Clustering based on Non-negative Matrix Factorization [0.0]
We propose a study on multi-modal clustering algorithms and present a novel method called multi-modal multi-view non-negative matrix factorization.
The experimental results show the value of the proposed approach, which was evaluated using a variety of data sets.
arXiv Detail & Related papers (2023-08-09T08:06:03Z)
- MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning [53.90744622542961]
Reasoning in mathematical domains remains a significant challenge for small language models (LMs).
We introduce a new method that exploits existing mathematical problem datasets with diverse annotation styles.
Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches.
arXiv Detail & Related papers (2023-07-16T05:41:53Z)
- Multi-view Data Visualisation via Manifold Learning [0.03222802562733786]
This manuscript proposes extensions of Student's t-distributed SNE, LLE and ISOMAP, to allow for dimensionality reduction and visualisation of multi-view data.
We show that by incorporating the low-dimensional embeddings obtained via the multi-view manifold learning approaches into the K-means algorithm, clusters of the samples are accurately identified.
arXiv Detail & Related papers (2021-01-17T19:54:36Z)
- Kernel learning approaches for summarising and combining posterior similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices.
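The positive semi-definiteness noted above is easy to see concretely: each MCMC sample contributes a co-clustering indicator matrix of the form $Z_t Z_t^\top$, which is PSD, and the PSM is their average. A small sketch with simulated (not real MCMC) label draws:

```python
import numpy as np

rng = np.random.default_rng(3)
n, T = 8, 200                    # hypothetical: 8 items, 200 posterior samples

# Fake MCMC output: each row is one posterior draw of cluster labels.
labels = rng.integers(0, 3, size=(T, n))

# Posterior similarity matrix: PSM[i, j] is the fraction of samples in
# which items i and j fall in the same cluster.
psm = np.mean(labels[:, :, None] == labels[:, None, :], axis=0)

# PSM is symmetric PSD with unit diagonal, hence a valid kernel matrix.
eigvals = np.linalg.eigvalsh(psm)
```

Because the PSM is a valid kernel, it can be plugged directly into kernel methods (e.g. kernel PCA or spectral clustering) for summarising the posterior.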
arXiv Detail & Related papers (2020-09-27T14:16:14Z)
- Federated Multi-view Matrix Factorization for Personalized Recommendations [53.74747022749739]
We introduce the federated multi-view matrix factorization method that extends the federated learning framework to matrix factorization with multiple data sources.
Our method is able to learn the multi-view model without transferring the user's personal data to a central server.
arXiv Detail & Related papers (2020-04-08T21:07:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.