Mixing Deep Learning and Multiple Criteria Optimization: An Application to Distributed Learning with Multiple Datasets
- URL: http://arxiv.org/abs/2112.01358v1
- Date: Thu, 2 Dec 2021 16:00:44 GMT
- Title: Mixing Deep Learning and Multiple Criteria Optimization: An Application to Distributed Learning with Multiple Datasets
- Authors: Davide La Torre, Danilo Liuzzi, Marco Repetto, Matteo Rocca
- Abstract summary: The training phase is the most important stage of the machine learning process.
We develop a multiple criteria optimization model in which each criterion measures the distance between the output associated with a specific input and its label.
We propose a scalarization approach to implement this model and present numerical experiments on digit classification using MNIST data.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The training phase is the most important stage during the machine learning
process. In the case of labeled data and supervised learning, machine training
consists in minimizing the loss function subject to different constraints. In
an abstract setting, it can be formulated as a multiple criteria optimization
model in which each criterion measures the distance between the output
associated with a specific input and its label. Therefore, the fitting term is
a vector function and its minimization is intended in the Pareto sense. We
provide stability results of the efficient solutions with respect to
perturbations of input and output data. We then extend the same approach to the
case of learning with multiple datasets. The multiple-dataset environment is
relevant for reducing the bias due to the choice of a specific training set.
We propose a scalarization approach to implement this model and present
numerical experiments on digit classification using MNIST data.
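As a rough illustration of the scalarization idea, the sketch below (a minimal, assumed setup, not the paper's exact formulation) combines one empirical loss per dataset into a single weighted objective; each choice of non-negative weights targets a different Pareto-efficient solution. The small network, the weights, and the use of cross-entropy are illustrative assumptions.

```python
# A minimal weighted-sum scalarization sketch for multi-dataset training.
# Each per-dataset loss plays the role of one criterion; the weights and the
# tiny model below are illustrative assumptions, not the paper's exact setup.
import torch
import torch.nn as nn

def scalarized_loss(model, batches, weights):
    """Combine one cross-entropy loss per dataset into a single objective."""
    criterion = nn.CrossEntropyLoss()
    losses = [criterion(model(x), y) for x, y in batches]
    return sum(w * l for w, l in zip(weights, losses))

# Usage: one SGD step over three MNIST-shaped batches (random stand-ins).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
batches = [(torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,)))
           for _ in range(3)]
loss = scalarized_loss(model, batches, weights=[0.5, 0.3, 0.2])
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Sweeping the weight vector and retraining yields a family of efficient models, which is one way to probe the bias introduced by committing to any single training set.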
Related papers
- A CLIP-Powered Framework for Robust and Generalizable Data Selection [51.46695086779598]
Real-world datasets often contain redundant and noisy data, which negatively affects training efficiency and model performance.
Data selection has shown promise in identifying the most representative samples from the entire dataset.
We propose a novel CLIP-powered data selection framework that leverages multimodal information for more robust and generalizable sample selection.
arXiv Detail & Related papers (2024-10-15T03:00:58Z)
- Adapt-$\infty$: Scalable Lifelong Multimodal Instruction Tuning via Dynamic Data Selection [89.42023974249122]
Adapt-$\infty$ is a new multi-way and adaptive data selection approach for Lifelong Instruction Tuning.
We construct pseudo-skill clusters by grouping gradient-based sample vectors.
We select the best-performing data selector for each skill cluster from a pool of selector experts.
arXiv Detail & Related papers (2024-10-14T15:48:09Z)
- On minimizing the training set fill distance in machine learning regression [0.552480439325792]
We study a data selection approach that aims to minimize the fill distance of the selected set.
We show that selecting training sets with farthest point sampling (FPS) can also increase model stability in the specific case of Gaussian kernel regression.
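For intuition, here is a minimal FPS sketch: greedily adding the point farthest from the current selection shrinks the fill distance, i.e. the largest distance from any point to its nearest selected point. The Euclidean metric, first-point initialization, and brute-force fill-distance check are illustrative assumptions.

```python
# Farthest point sampling (FPS) in NumPy, with a brute-force fill-distance
# check. Metric and initialization are assumptions, not the paper's exact
# procedure.
import numpy as np

def farthest_point_sampling(X, k, start=0):
    """Select k rows of X by iterated farthest-point selection."""
    selected = [start]
    # Distance from every point to the nearest selected point so far.
    dists = np.linalg.norm(X - X[start], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))          # farthest from current selection
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(selected)

X = np.random.rand(1000, 8)                  # toy feature matrix
idx = farthest_point_sampling(X, k=50)
# Fill distance: max over points of the distance to the nearest selected one.
fill_distance = np.min(
    np.linalg.norm(X[:, None, :] - X[idx][None, :, :], axis=2), axis=1).max()
```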
arXiv Detail & Related papers (2023-07-20T16:18:33Z)
- MILO: Model-Agnostic Subset Selection Framework for Efficient Model Training and Tuning [68.12870241637636]
We propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training.
Our empirical results indicate that MILO can train models $3\times$-$10\times$ faster and tune hyperparameters $20\times$-$75\times$ faster than full-dataset training or tuning, without loss of performance.
arXiv Detail & Related papers (2023-01-30T20:59:30Z)
- A Penalty Approach for Normalizing Feature Distributions to Build Confounder-Free Models [11.818509522227565]
Metadata Normalization (MDN) estimates the linear relationship between the metadata and each feature based on a non-trainable closed-form solution.
We extend the MDN method by applying a penalty approach (referred to as PMDN).
We show improvement in model accuracy and greater independence from confounders using PMDN over MDN in a synthetic experiment and a multi-label, multi-site dataset of magnetic resonance images (MRIs).
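A rough sketch of the closed-form MDN step described above: regress each feature on the metadata by least squares and keep only the residual, removing the confounder-explained component. The intercept column and synthetic data are assumptions; the PMDN penalty itself is not reproduced here.

```python
# Closed-form metadata normalization: residualize features against metadata.
import numpy as np

def metadata_normalize(features, metadata):
    """Residualize features (n, d) against metadata (n, m) in closed form."""
    M = np.column_stack([np.ones(len(metadata)), metadata])  # add intercept
    beta, *_ = np.linalg.lstsq(M, features, rcond=None)      # (m+1, d) fit
    return features - M @ beta                               # keep residuals

rng = np.random.default_rng(0)
meta = rng.normal(size=(200, 2))          # e.g., site and scanner covariates
feats = rng.normal(size=(200, 16)) + meta @ rng.normal(size=(2, 16))
clean = metadata_normalize(feats, meta)   # linear metadata effect removed
```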
arXiv Detail & Related papers (2022-07-11T04:02:12Z)
- A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled.
We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples.
We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
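A hedged sketch of this constrained formulation: per-sample constraints $\ell_i(\theta) \le \epsilon$ on labeled data, handled by a primal-dual (descent/ascent) loop. The mean-loss objective, $\epsilon$, and step sizes are illustrative assumptions, not the paper's exact formulation; the dual signal is that samples whose multipliers grow large are the hardest to fit.

```python
# Primal-dual sketch of training under per-sample constraints
# loss_i(theta) <= eps on labeled data (assumed setup, see lead-in).
import torch

def primal_dual_step(model, x, y, lam, eps=0.1, lr_primal=0.1, lr_dual=0.01):
    per_sample = torch.nn.CrossEntropyLoss(reduction="none")(model(x), y)
    # Lagrangian: objective plus multiplier-weighted constraint violations.
    lagrangian = per_sample.mean() + (lam * (per_sample - eps)).sum()
    model.zero_grad()
    lagrangian.backward()
    with torch.no_grad():
        for p in model.parameters():                  # primal descent step
            p -= lr_primal * p.grad
        lam += lr_dual * (per_sample.detach() - eps)  # dual ascent step
        lam.clamp_(min=0.0)                           # multipliers stay >= 0
    return lam  # large multipliers flag the samples hardest to fit

model = torch.nn.Linear(8, 3)
x, y = torch.randn(32, 8), torch.randint(0, 3, (32,))
lam = torch.zeros(32)                                 # one multiplier per sample
for _ in range(10):
    lam = primal_dual_step(model, x, y, lam)
```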
arXiv Detail & Related papers (2022-02-08T19:18:49Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias in the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
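For reference, a minimal group DRO objective looks like the sketch below: minimize the worst mean loss over predefined groups rather than the overall average. The group labels are assumed given, and every group is assumed to appear in the batch; the point above is precisely that this offers no protection when the groups miss the real spurious correlations.

```python
# Minimal group DRO objective: optimize the hardest predefined group.
import torch

def worst_group_loss(logits, labels, group_ids, num_groups):
    per_sample = torch.nn.CrossEntropyLoss(reduction="none")(logits, labels)
    group_means = torch.stack([per_sample[group_ids == g].mean()
                               for g in range(num_groups)])
    return group_means.max()  # gradient flows only through the worst group
```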
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
- Finding High-Value Training Data Subset through Differentiable Convex Programming [5.5180456567480896]
In this paper, we study the problem of selecting high-value subsets of training data.
The key idea is to design a learnable framework for online subset selection.
Using this framework, we design an online alternating minimization-based algorithm for jointly learning the parameters of the selection model and ML model.
arXiv Detail & Related papers (2021-04-28T14:33:26Z)
- Balancing Constraints and Submodularity in Data Subset Selection [43.03720397062461]
We show that one can achieve accuracy similar to traditional deep-learning models while using less training data.
We propose a novel diversity-driven objective function and balancing constraints on class labels and decision boundaries using matroids.
arXiv Detail & Related papers (2021-04-26T19:22:27Z)
- Learning by Minimizing the Sum of Ranked Range [58.24935359348289]
We introduce the sum of ranked range (SoRR) as a general approach to form learning objectives.
A ranked range is a consecutive sequence of sorted values of a set of real numbers.
We explore two applications in machine learning of the minimization of the SoRR framework, namely the AoRR aggregate loss for binary classification and the TKML individual loss for multi-label/multi-class classification.
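A small sketch of the SoRR construction on individual losses: sort in descending order and aggregate the consecutive slice from rank $k$ to rank $m$; averaging that slice gives an AoRR-style loss that discards the $k-1$ largest values as potential outliers. The 1-based rank convention is an assumption.

```python
# Sum of ranked range (SoRR) over a set of values (e.g., per-sample losses).
import torch

def sorr(values, k, m):
    """Sum of the k-th through m-th largest entries, with 1 <= k <= m."""
    ranked, _ = torch.sort(values, descending=True)
    return ranked[k - 1:m].sum()

losses = torch.tensor([0.9, 0.1, 2.3, 0.5, 1.4])
top2 = sorr(losses, 1, 2)       # 2.3 + 1.4: a top-k style aggregate
aorr = sorr(losses, 2, 4) / 3   # mean of ranks 2..4, ignores the largest loss
```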
arXiv Detail & Related papers (2020-10-05T01:58:32Z)
- A Markov Decision Process Approach to Active Meta Learning [24.50189361694407]
In supervised learning, we fit a single statistical model to a given data set, assuming that the data is associated with a single task.
In meta-learning, the data is associated with numerous tasks, and we seek a model that may perform well on all tasks simultaneously.
arXiv Detail & Related papers (2020-09-10T15:45:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.