A Distance Metric Learning Model Based On Variational Information
Bottleneck
- URL: http://arxiv.org/abs/2403.02794v1
- Date: Tue, 5 Mar 2024 09:08:20 GMT
- Title: A Distance Metric Learning Model Based On Variational Information
Bottleneck
- Authors: YaoDan Zhang, Zidong Wang, Ru Jia and Ru Li
- Abstract summary: This paper proposes a new metric learning model VIB-DML (Variational Information Bottleneck Distance Metric Learning) for rating prediction.
The results show that the generalization of VIB-DML is excellent. Compared with the general metric learning model MetricF, the prediction error is reduced by 7.29%.
- Score: 34.06440004780627
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, personalized recommendation technology has flourished and
become one of the hot research directions. The matrix factorization model and
the metric learning model which proposed successively have been widely studied
and applied. The latter measures latent space vectors with the Euclidean
distance instead of the dot product used by the former. While the Euclidean
distance avoids the shortcomings of the dot product, its underlying assumption
is neglected, which limits the recommendation quality of the model. To solve
this problem, this paper combines the Variational Information Bottleneck with
the metric learning model for the first time and proposes a new metric
learning model, VIB-DML (Variational Information Bottleneck Distance Metric
Learning), for rating prediction. VIB-DML limits the mutual information of the
latent space feature vector to improve the robustness of the model, and
satisfies the assumption of the Euclidean distance by decoupling the latent
space feature vector. Experimental results are compared in terms of root mean
square error (RMSE) on three public datasets. The results show that the
generalization ability of VIB-DML is excellent. Compared with the general
metric learning model MetricF, the prediction error is reduced by 7.29%.
Finally, the paper demonstrates the strong robustness of VIB-DML through
experiments.
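To make the approach concrete, here is a minimal PyTorch sketch of a variational-information-bottleneck metric learning model for rating prediction. It illustrates the general technique only: the layer sizes, the KL weight `beta`, and the convention of mapping small user-item distances to high ratings are assumptions, not details from the paper.

```python
# Minimal sketch of a variational-information-bottleneck metric learning
# model for rating prediction. Illustrative only: layer sizes, the KL
# weight `beta`, and the distance-to-rating mapping are assumptions, not
# the authors' actual VIB-DML architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VIBEncoder(nn.Module):
    """Maps an embedding to a Gaussian latent code (mean, log-variance)."""
    def __init__(self, dim_in, dim_latent):
        super().__init__()
        self.mu = nn.Linear(dim_in, dim_latent)
        self.logvar = nn.Linear(dim_in, dim_latent)

    def forward(self, x):
        mu, logvar = self.mu(x), self.logvar(x)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return z, mu, logvar

def kl_to_standard_normal(mu, logvar):
    # KL(q(z|x) || N(0, I)): this term upper-bounds the mutual
    # information carried by the latent feature vector.
    return -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1)

class VIBMetricModel(nn.Module):
    def __init__(self, n_users, n_items, dim=64, dim_latent=32, max_rating=5.0):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.item_emb = nn.Embedding(n_items, dim)
        self.user_enc = VIBEncoder(dim, dim_latent)
        self.item_enc = VIBEncoder(dim, dim_latent)
        self.max_rating = max_rating

    def forward(self, users, items):
        zu, mu_u, lv_u = self.user_enc(self.user_emb(users))
        zi, mu_i, lv_i = self.item_enc(self.item_emb(items))
        dist = (zu - zi).norm(dim=-1)     # Euclidean distance in latent space
        pred = self.max_rating - dist     # small distance -> high rating
        kl = kl_to_standard_normal(mu_u, lv_u) + kl_to_standard_normal(mu_i, lv_i)
        return pred, kl

def vib_loss(pred, kl, ratings, beta=1e-3):
    # Squared rating error plus the information-bottleneck penalty.
    return F.mse_loss(pred, ratings) + beta * kl.mean()
```

Training would minimize `vib_loss` over observed (user, item, rating) triples; the KL term is what bounds the mutual information of the latent feature vectors, playing the role of the information-bottleneck constraint described in the abstract.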
Related papers
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
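For context on why multiple observations help (a standard identity, not the paper's derivation): with n conditionally independent observations, the tall-data posterior factorizes as

```latex
p(\theta \mid x_{1:n})
  \;\propto\; p(\theta) \prod_{i=1}^{n} p(x_i \mid \theta)
  \;\propto\; p(\theta)^{1-n} \prod_{i=1}^{n} p(\theta \mid x_i)
```

so single-observation posterior approximations learned by SBI can in principle be recombined; keeping that recombination numerically stable is where competing methods differ.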
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Curvature Augmented Manifold Embedding and Learning [9.195829534223982]
A new dimensional reduction (DR) and data visualization method, Curvature-Augmented Manifold Embedding and Learning (CAMEL), is proposed.
The key novel contribution is to formulate the DR problem as a mechanistic/physics model.
Compared with many existing attractive-repulsive force-based methods, one unique contribution of the proposed method is to include a non-pairwise force.
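For intuition, a generic attractive-repulsive force-based layout step looks like the sketch below. It is a plain pairwise spring-and-repulsion update; CAMEL's distinctive non-pairwise force term is specific to the paper and is not reproduced here.

```python
# Generic force-directed embedding step (pairwise forces only).
# Illustrates the family of methods CAMEL extends; the paper's
# non-pairwise force term is NOT reproduced here.
import numpy as np

def force_step(Y, neighbors, lr=0.1, rep=0.01):
    """One update of a 2-D embedding Y (n, 2) given a neighbor-index list."""
    F = np.zeros_like(Y)
    # Attractive forces pull neighboring points together.
    for i, js in enumerate(neighbors):
        for j in js:
            F[i] += Y[j] - Y[i]
    # Repulsive forces push all pairs apart, decaying with distance.
    diff = Y[:, None, :] - Y[None, :, :]          # (n, n, 2)
    d2 = (diff ** 2).sum(-1) + 1e-9
    F += rep * (diff / d2[..., None]).sum(axis=1)
    return Y + lr * F
```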
arXiv Detail & Related papers (2024-03-21T19:59:07Z)
- Querying Easily Flip-flopped Samples for Deep Active Learning [63.62397322172216]
Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data.
One effective selection strategy is to base it on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is.
This paper proposes the least disagree metric (LDM) as the smallest probability of disagreement of the predicted label.
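One plausible way to approximate such a score (a hedged sketch, not necessarily the paper's estimator) is to perturb the trained parameters several times and measure how often each sample's predicted label flips:

```python
# Hedged sketch: approximate "how easily a sample's label flips" by
# sampling small parameter perturbations. This illustrates the idea of
# a disagreement-based acquisition score, not the paper's exact LDM.
import copy
import torch

@torch.no_grad()
def disagreement_scores(model, x, n_draws=10, sigma=0.01):
    base = model(x).argmax(dim=1)                # labels of the trained model
    flips = torch.zeros(len(x))
    for _ in range(n_draws):
        noisy = copy.deepcopy(model)
        for p in noisy.parameters():
            p.add_(sigma * torch.randn_like(p))  # perturbed hypothesis
        flips += (noisy(x).argmax(dim=1) != base).float()
    return flips / n_draws                       # higher = flips more easily

# Active learning would then query unlabeled points with the highest scores.
```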
arXiv Detail & Related papers (2024-01-18T08:12:23Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
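The core of such a protocol is simple: every model gets the same accelerator-hour budget, so a higher-throughput model trains on proportionally more tokens. A tiny illustration follows; the throughput numbers are hypothetical placeholders, not measurements from the benchmark.

```python
# Sketch of a compute-equivalent comparison: each model trains for the
# same accelerator-hour budget, so faster models see more tokens.
# Throughput numbers are hypothetical, not Languini measurements.
def token_budget(tokens_per_hour: float, accelerator_hours: float) -> int:
    """Tokens a model can consume within a fixed compute budget."""
    return int(tokens_per_hour * accelerator_hours)

throughputs = {"gpt2-style-ff": 1.0e8, "fast-lstm": 1.0e9}  # hypothetical
for name, tph in throughputs.items():
    print(name, "trains on", token_budget(tph, accelerator_hours=6), "tokens")
```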
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- Learning new physics efficiently with nonparametric methods [11.970219534238444]
We present a machine learning approach for model-independent new physics searches.
The corresponding algorithm is powered by recent large-scale implementations of kernel methods.
We show that our approach has dramatic advantages compared to neural network implementations in terms of training times and computational resources.
arXiv Detail & Related papers (2022-04-05T16:17:59Z)
- Hyperbolic Vision Transformers: Combining Improvements in Metric Learning [116.13290702262248]
We propose a new hyperbolic-based model for metric learning.
At the core of our method is a vision transformer with output embeddings mapped to hyperbolic space.
We evaluate the proposed model with six different formulations on four datasets.
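As background, mapping a network's Euclidean outputs into the Poincare ball and measuring hyperbolic distance is typically done with the standard formulas below; the curvature c = 1.0 and the clipping epsilons are illustrative choices, not details from the paper.

```python
# Standard Poincare-ball operations used by hyperbolic embedding models.
# The exponential map at the origin sends Euclidean vectors (e.g. a
# transformer's output embeddings) into the ball; distances are then
# hyperbolic. Curvature c=1.0 and the epsilons are illustrative.
import torch

def expmap0(v, c=1.0, eps=1e-6):
    """Exponential map at the origin of the Poincare ball with curvature c."""
    sqrt_c = c ** 0.5
    norm = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.tanh(sqrt_c * norm) * v / (sqrt_c * norm)

def poincare_dist(x, y, c=1.0, eps=1e-6):
    """Geodesic distance between points x, y inside the Poincare ball."""
    sqrt_c = c ** 0.5
    diff2 = ((x - y) ** 2).sum(-1)
    denom = (1 - c * (x ** 2).sum(-1)) * (1 - c * (y ** 2).sum(-1))
    arg = 1 + 2 * c * diff2 / denom.clamp_min(eps)
    return torch.acosh(arg.clamp_min(1.0 + eps)) / sqrt_c
```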
arXiv Detail & Related papers (2022-03-21T09:48:23Z)
- Benign-Overfitting in Conditional Average Treatment Effect Prediction with Linear Regression [14.493176427999028]
We study the benign overfitting theory in the prediction of the conditional average treatment effect (CATE) with linear regression models.
We show that the T-learner fails to achieve consistency except under random assignment, while the IPW-learner's risk converges to zero if the propensity score is known.
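For orientation, the two learners have standard textbook forms (the notation below is not taken from the paper): with outcome Y, binary treatment T, covariates X, and propensity score e(X) = P(T = 1 | X),

```latex
\hat{\tau}_{\mathrm{T}}(x) = \hat{\mu}_1(x) - \hat{\mu}_0(x),
\qquad
\hat{\tau}_{\mathrm{IPW}}(x)
  = \hat{E}\!\left[\frac{T\,Y}{e(X)} - \frac{(1-T)\,Y}{1-e(X)} \,\middle|\, X = x\right]
```

where \hat{\mu}_t is a regression of Y on X fit on the subsample with T = t, and the IPW-transformed outcome is likewise regressed on X (with linear regression, in the paper's setting).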
arXiv Detail & Related papers (2022-02-10T18:51:52Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
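A minimal version of such a bias-constrained training loss might look like the following; this is a sketch of the general idea, and the squared-bias penalty and its weight are assumptions rather than the paper's exact objective.

```python
# Sketch of a bias-constrained estimation loss: mean squared error plus
# a penalty on the batch-estimated bias of the estimator. The squared-
# bias penalty form and its weight `lam` are illustrative assumptions.
import torch

def bias_constrained_loss(theta_hat, theta, lam=1.0):
    mse = ((theta_hat - theta) ** 2).mean()
    # Empirical bias: the average error over a batch of problems; for an
    # unbiased estimator this should be close to zero.
    bias = (theta_hat - theta).mean(dim=0)
    return mse + lam * (bias ** 2).sum()
```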
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- A Data-driven feature selection and machine-learning model benchmark for the prediction of longitudinal dispersion coefficient [29.58577229101903]
An accurate prediction of the Longitudinal Dispersion (LD) coefficient can produce a performance leap in related simulations.
In this study, a globally optimal feature set was proposed through numerical comparison of the distilled local optima in performance with representative ML models.
Results show that the support vector machine has significantly better performance than other models.
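The benchmark pattern described is standard: compare regressors on a candidate feature subset by cross-validated error. A hedged sketch follows; the stand-in data, feature count, and model choices are placeholders, not the study's actual setup.

```python
# Hedged sketch of the benchmark pattern: compare regressors on a
# feature subset by cross-validated RMSE. Data and models below are
# placeholders, not the study's actual configuration.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor

X, y = np.random.rand(200, 5), np.random.rand(200)  # stand-in data
for name, model in {"SVR": SVR(), "RF": RandomForestRegressor()}.items():
    rmse = -cross_val_score(model, X, y, cv=5,
                            scoring="neg_root_mean_squared_error").mean()
    print(name, round(rmse, 3))
```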
arXiv Detail & Related papers (2021-07-16T09:50:38Z)
- Exploring Adversarial Robustness of Deep Metric Learning [25.12224002984514]
DML uses deep neural architectures to learn semantic embeddings of the input.
We tackle the primary challenge of the metric losses being dependent on the samples in a mini-batch.
Using experiments on three commonly-used DML datasets, we demonstrate 5- to 76-fold increases in adversarial accuracy.
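For context, adversarial training against a metric loss typically perturbs inputs to maximize that loss before each parameter update, e.g. a one-step attack on a triplet loss. The sketch below is generic; the paper's handling of mini-batch-dependent losses is more involved.

```python
# Generic one-step (FGSM-style) adversarial perturbation against a
# triplet metric loss. A simplified sketch of adversarial training for
# DML, not the paper's batch-level construction.
import torch
import torch.nn.functional as F

def adversarial_anchor(model, anchor, pos, neg, eps=8 / 255):
    anchor = anchor.clone().requires_grad_(True)
    loss = F.triplet_margin_loss(model(anchor), model(pos), model(neg))
    loss.backward()
    # Move the anchor in the direction that increases the metric loss.
    return (anchor + eps * anchor.grad.sign()).detach().clamp(0, 1)
```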
arXiv Detail & Related papers (2021-02-14T23:18:12Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
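As background (the standard doubly-robust/AIPW form, not notation from this paper): with outcome Y, treatment A, covariates W, outcome models \hat{m}_a(W), and propensity model \hat{\pi}(W), the ACE estimate is

```latex
\widehat{\mathrm{ACE}} = \frac{1}{n}\sum_{i=1}^{n}
\left[ \hat{m}_1(W_i) - \hat{m}_0(W_i)
 + \frac{A_i\,(Y_i - \hat{m}_1(W_i))}{\hat{\pi}(W_i)}
 - \frac{(1-A_i)\,(Y_i - \hat{m}_0(W_i))}{1-\hat{\pi}(W_i)} \right]
```

Cross-fitting estimates \hat{m} and \hat{\pi} on sample splits disjoint from the one where the estimator is evaluated, which is what restores good statistical behavior when the nuisance models are fit with flexible machine learning.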
arXiv Detail & Related papers (2020-04-21T23:09:55Z)