Recovery of Sparse Signals from a Mixture of Linear Samples
- URL: http://arxiv.org/abs/2006.16406v2
- Date: Tue, 14 Jul 2020 16:38:36 GMT
- Title: Recovery of Sparse Signals from a Mixture of Linear Samples
- Authors: Arya Mazumdar and Soumyabrata Pal
- Abstract summary: Mixture of linear regressions is a popular learning theoretic model that is used widely to represent heterogeneous data.
Recent works focus on an experimental design setting of model recovery for this problem.
In this work, we address this query complexity problem and provide efficient algorithms that improve on the previously best known results.
- Score: 44.3425205248937
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mixture of linear regressions is a popular learning theoretic model that is
used widely to represent heterogeneous data. In the simplest form, this model
assumes that the labels are generated from either of two different linear
models and mixed together. Recent works of Yin et al. and Krishnamurthy et al.,
2019, focus on an experimental design setting of model recovery for this
problem. It is assumed that the features can be designed and queried to
obtain their labels. When queried, an oracle randomly selects one of the two
different sparse linear models and generates a label accordingly. How many such
oracle queries are needed to recover both of the models simultaneously? This
question can also be thought of as a generalization of the well-known
compressed sensing problem (Cand\`es and Tao, 2005, Donoho, 2006). In this
work, we address this query complexity problem and provide efficient algorithms
that improve on the previously best known results.
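As a rough illustration of the query model described in the abstract, the sketch below simulates the oracle: each designed query vector is labeled by one of two hidden sparse linear models, chosen uniformly at random. All names, dimensions, and the noiseless default are illustrative assumptions, not the authors' construction.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 100, 5  # ambient dimension and sparsity (illustrative values)

def make_sparse_model(d, k, rng):
    """Create a k-sparse coefficient vector in R^d."""
    beta = np.zeros(d)
    support = rng.choice(d, size=k, replace=False)
    beta[support] = rng.normal(size=k)
    return beta

beta1 = make_sparse_model(d, k, rng)
beta2 = make_sparse_model(d, k, rng)

def oracle(x, noise_std=0.0):
    """On each query x, pick one of the two hidden sparse models
    uniformly at random and return the corresponding label."""
    beta = beta1 if rng.random() < 0.5 else beta2
    return x @ beta + noise_std * rng.normal()

# A single designed query and its mixed label:
x = rng.normal(size=d)
y = oracle(x)
```

The recovery question in the paper is how few such queries suffice to identify both `beta1` and `beta2`; with a single model this reduces to compressed sensing.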
Related papers
- Addressing Multilabel Imbalance with an Efficiency-Focused Approach Using Diffusion Model-Generated Synthetic Samples [2.5399059426702575]
Multilabel learning (MLL) algorithms are used to classify patterns, rank labels, or learn the distribution of outputs.
The generation of new instances associated with minority labels, so that empty areas of the feature space are filled, helps to improve the obtained models.
In this paper, a diffusion model tailored to produce new instances for MLL data, called MLDM, is proposed.
arXiv Detail & Related papers (2025-01-18T16:56:50Z) - Agnostic Learning of Mixed Linear Regressions with EM and AM Algorithms [22.79595679373698]
Mixed linear regression is a well-studied problem in statistics and machine learning.
In this paper, we consider the more general problem of learning of mixed linear regression from samples.
We show that the AM and EM algorithms lead to learning in mixed linear regression by converging to the population loss minimizers.
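As a loose sketch of the EM approach this related paper analyzes, the snippet below alternates responsibilities and weighted least-squares refits for a balanced two-component mixed linear regression. The function name, fixed noise scale, and random initialization are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def em_mixed_linreg(X, y, iters=50, sigma=1.0, seed=0):
    """EM for a balanced two-component mixed linear regression.
    E-step: per-sample posterior responsibility of component 1;
    M-step: weighted least-squares refit of each component."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    b1, b2 = rng.normal(size=d), rng.normal(size=d)
    for _ in range(iters):
        # E-step: Gaussian likelihood of each sample under each component
        l1 = np.exp(-0.5 * ((y - X @ b1) / sigma) ** 2)
        l2 = np.exp(-0.5 * ((y - X @ b2) / sigma) ** 2)
        w = l1 / (l1 + l2 + 1e-12)  # responsibility of component 1
        # M-step: weights enter the least-squares fit as sqrt factors
        b1 = np.linalg.lstsq(np.sqrt(w)[:, None] * X,
                             np.sqrt(w) * y, rcond=None)[0]
        b2 = np.linalg.lstsq(np.sqrt(1 - w)[:, None] * X,
                             np.sqrt(1 - w) * y, rcond=None)[0]
    return b1, b2
```

The convergence guarantees in the paper concern the population versions of these updates; this finite-sample sketch only shows the shape of the iteration.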
arXiv Detail & Related papers (2024-06-03T09:43:24Z) - Mutual Exclusivity Training and Primitive Augmentation to Induce
Compositionality [84.94877848357896]
Recent datasets expose the lack of the systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z) - On Learning Mixture Models with Sparse Parameters [44.3425205248937]
We study mixtures with high dimensional sparse latent parameter vectors and consider the problem of support recovery of those vectors.
We provide efficient algorithms for support recovery that have a logarithmic sample complexity dependence on the dimensionality of the latent space.
arXiv Detail & Related papers (2022-02-24T07:44:23Z) - Towards a Better Understanding of Linear Models for Recommendation [28.422943262159933]
We derive and analyze the closed-form solutions for two basic regression and matrix factorization approaches.
We introduce a new learning algorithm in searching (hyper) parameters for the closed-form solution.
The experimental results demonstrate that the basic models and their closed-form solutions are indeed quite competitive against the state-of-the-art models.
arXiv Detail & Related papers (2021-05-27T04:17:04Z) - A Twin Neural Model for Uplift [59.38563723706796]
Uplift is a particular case of conditional treatment effect modeling.
We propose a new loss function defined by leveraging a connection with the Bayesian interpretation of the relative risk.
We show our proposed method is competitive with the state-of-the-art in simulation setting and on real data from large scale randomized experiments.
arXiv Detail & Related papers (2021-05-11T16:02:39Z) - Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z) - Two-step penalised logistic regression for multi-omic data with an
application to cardiometabolic syndrome [62.997667081978825]
We implement a two-step approach to multi-omic logistic regression in which variable selection is performed on each layer separately.
Our approach should be preferred if the goal is to select as many relevant predictors as possible.
Our proposed approach allows us to identify features that characterise cardiometabolic syndrome at the molecular level.
arXiv Detail & Related papers (2020-08-01T10:36:27Z) - Learning Gaussian Graphical Models via Multiplicative Weights [54.252053139374205]
We adapt an algorithm of Klivans and Meka based on the method of multiplicative weight updates.
The algorithm enjoys a sample complexity bound that is qualitatively similar to others in the literature.
It has a low runtime $O(mp^2)$ in the case of $m$ samples and $p$ nodes, and can trivially be implemented in an online manner.
arXiv Detail & Related papers (2020-02-20T10:50:58Z) - Unsupervised Pool-Based Active Learning for Linear Regression [29.321275647107928]
This paper studies unsupervised pool-based AL for linear regression problems.
We propose a novel AL approach that considers simultaneously the informativeness, representativeness, and diversity, three essential criteria in AL.
arXiv Detail & Related papers (2020-01-14T20:00:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.