Distribution-Based Invariant Deep Networks for Learning Meta-Features
- URL: http://arxiv.org/abs/2006.13708v2
- Date: Sun, 18 Oct 2020 16:52:35 GMT
- Title: Distribution-Based Invariant Deep Networks for Learning Meta-Features
- Authors: Gwendoline De Bie, Herilalaina Rakotoarison, Gabriel Peyré,
Michèle Sebag
- Abstract summary: Recent advances in deep learning from probability distributions successfully achieve classification or regression from distribution samples, and are thus invariant under permutation of the samples.
The proposed architecture, called Dida, inherits the NN properties of universal approximation, and its robustness w.r.t. Lipschitz-bounded transformations of the input distribution is established.
The paper empirically and comparatively demonstrates the merits of the approach on two tasks defined at the dataset level.
- Score: 2.179313476241343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in deep learning from probability distributions successfully
achieve classification or regression from distribution samples, and are thus invariant
under permutation of the samples. The first contribution of the paper is to
extend these neural architectures to achieve invariance under permutation of
the features, too. The proposed architecture, called Dida, inherits the NN
properties of universal approximation, and its robustness w.r.t.
Lipschitz-bounded transformations of the input distribution is established. The
second contribution is to empirically and comparatively demonstrate the merits
of the approach on two tasks defined at the dataset level. On both tasks, Dida
learns meta-features supporting the characterization of a (labelled) dataset.
The first task consists of predicting whether two dataset patches are extracted
from the same initial dataset. The second task consists of predicting whether
the learning performance achieved by a hyper-parameter configuration under a
fixed algorithm (ranging in k-NN, SVM, logistic regression and linear
classifier with SGD) dominates that of another configuration, for a dataset
extracted from the OpenML benchmarking suite. On both tasks, Dida outperforms
the state of the art: DSS (Maron et al., 2020) and Dataset2Vec (Jomaa et al.,
2019) architectures, as well as the models based on the hand-crafted
meta-features of the literature.
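The core property claimed by the abstract, a dataset-level representation that is invariant under permutation of both samples and features, can be illustrated with a small numerical sketch. This is a generic DeepSets-style pooling toy, not the actual Dida architecture; `interaction_layer`, `meta_feature`, and the scalar weights are all illustrative assumptions:

```python
import numpy as np

def interaction_layer(X, w_self=0.7, w_ctx=0.3):
    """One toy layer: each entry combines its own value with the
    per-feature mean over samples, so row order cannot matter."""
    ctx = X.mean(axis=0, keepdims=True)        # (1, d) sample-pooled context
    return np.tanh(w_self * X + w_ctx * ctx)   # elementwise, shape preserved

def meta_feature(X):
    """Pool over samples AND features: the result is invariant to
    permuting the rows (samples) or the columns (features) of X."""
    return interaction_layer(X).mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))                    # a tiny "dataset patch"
Xp = X[rng.permutation(6)][:, rng.permutation(4)]
assert np.isclose(meta_feature(X), meta_feature(Xp))
```

Because the context is a symmetric pooling over samples and the final mean pools over both axes, reshuffling rows or columns leaves the output unchanged, which is the invariance the paper builds into a full learnable architecture.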
Related papers
- Interpretable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance.
We propose an MTL approach at the intersection between task clustering and feature transformation based on a two-phase iterative aggregation of targets and features.
In both phases, a key aspect is to preserve the interpretability of the reduced targets and features through the aggregation with the mean, which is motivated by applications to Earth science.
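The interpretability argument above rests on aggregating with the plain mean. A minimal sketch of that step, assuming the feature clustering is already given (`aggregate_by_cluster` is a hypothetical helper, not the paper's implementation):

```python
import numpy as np

def aggregate_by_cluster(F, clusters):
    # F: (n_samples, d) feature matrix; clusters: list of column-index lists.
    # Each reduced feature is the plain mean of its cluster, so it stays
    # readable as, e.g., "average of the rainfall-related columns".
    return np.column_stack([F[:, idx].mean(axis=1) for idx in clusters])

F = np.array([[0., 1., 2.],
              [3., 4., 5.]])
reduced = aggregate_by_cluster(F, [[0, 1], [2]])  # shape (2, 2)
```

A mean is chosen over a learned projection precisely because each reduced column remains a direct, unit-preserving summary of its original columns.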
arXiv Detail & Related papers (2024-06-12T08:30:16Z)
- Rethinking of Encoder-based Warm-start Methods in Hyperparameter Optimization [0.0]
This research introduces a novel encoder-based representation of datasets implemented within the liltab package.
We show that general representations may not suffice for some meta-tasks where requirements are not explicitly considered during extraction.
arXiv Detail & Related papers (2024-03-07T18:16:29Z)
- Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose the lack of the systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
- Ensemble Classifier Design Tuned to Dataset Characteristics for Network Intrusion Detection [0.0]
Two new algorithms are proposed to address the class overlap issue in the dataset.
The proposed design is evaluated for both binary and multi-category classification.
arXiv Detail & Related papers (2022-05-08T21:06:42Z)
- Merging Two Cultures: Deep and Statistical Learning [3.15863303008255]
Merging the two cultures of deep and statistical learning provides insights into structured high-dimensional data.
We show that prediction, optimisation and uncertainty can be achieved using probabilistic methods at the output layer of the model.
arXiv Detail & Related papers (2021-10-22T02:57:21Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
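One round of the message passing described above can be sketched as follows. This is a simplified mean-aggregation toy on the feature-data bipartite graph, not the paper's actual GNN; `extrapolate_new_feature` and its weighting scheme are illustrative assumptions:

```python
import numpy as np

def extrapolate_new_feature(X_obs, E_feat, x_new):
    # X_obs: (n, d) observed feature values; E_feat: (d, k) embeddings of
    # the d known features; x_new: (n,) values of an unseen feature.
    # Step 1: each data node averages the embeddings of its features.
    E_data = (X_obs @ E_feat) / X_obs.shape[1]           # (n, k)
    # Step 2: the new feature node aggregates from the data nodes,
    # weighted by where it is active.
    w = x_new / (np.abs(x_new).sum() + 1e-9)
    return w @ E_data                                    # (k,) new embedding

rng = np.random.default_rng(0)
emb = extrapolate_new_feature(rng.normal(size=(10, 4)),
                              rng.normal(size=(4, 3)),
                              rng.normal(size=10))
```

The point of the two-hop scheme is that an embedding for a feature never seen at training time can be composed entirely from embeddings of known features via the shared data nodes.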
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- An Optimization-Based Meta-Learning Model for MRI Reconstruction with Diverse Dataset [4.9259403018534496]
We develop a generalizable MRI reconstruction model in the meta-learning framework.
The proposed network learns regularization function in a learner adaptional model.
We test quick training on unseen tasks after meta-training, saving half of the training time.
arXiv Detail & Related papers (2021-10-02T03:21:52Z)
- Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning [96.75889543560497]
In many real-world problems, collecting a large number of labeled samples is infeasible.
Few-shot learning is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in presence of a limited number of samples.
We propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations.
arXiv Detail & Related papers (2021-03-01T21:14:33Z)
- Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function [106.69643619725652]
We develop a training strategy that allows even a simple BiLSTM model, when trained with cross-entropy loss, to achieve competitive results.
We report state-of-the-art results for text classification task on several benchmark datasets.
arXiv Detail & Related papers (2020-09-08T21:55:22Z)
- Meta-learning framework with applications to zero-shot time-series forecasting [82.61728230984099]
This work provides positive evidence using a broad meta-learning framework.
Residual connections act as a meta-learning adaptation mechanism.
We show that it is viable to train a neural network on a source TS dataset and deploy it on a different target TS dataset without retraining.
arXiv Detail & Related papers (2020-02-07T16:39:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.