Robust Feature Learning for Multi-Index Models in High Dimensions
- URL: http://arxiv.org/abs/2410.16449v1
- Date: Mon, 21 Oct 2024 19:20:34 GMT
- Title: Robust Feature Learning for Multi-Index Models in High Dimensions
- Authors: Alireza Mousavi-Hosseini, Adel Javanmard, Murat A. Erdogdu
- Abstract summary: We take the first steps towards understanding adversarially robust feature learning with neural networks.
We show that adversarially robust learning is just as easy as standard learning.
- Abstract: Recently, there have been numerous studies on feature learning with neural networks, specifically on learning single- and multi-index models where the target is a function of a low-dimensional projection of the input. Prior works have shown that in high dimensions, the majority of the compute and data resources are spent on recovering the low-dimensional projection; once this subspace is recovered, the remainder of the target can be learned independently of the ambient dimension. However, the implications of feature learning in adversarial settings remain unexplored. In this work, we take the first steps towards understanding adversarially robust feature learning with neural networks. Specifically, we prove that the hidden directions of a multi-index model offer a Bayes optimal low-dimensional projection for robustness against $\ell_2$-bounded adversarial perturbations under the squared loss, assuming that the multi-index coordinates are statistically independent of the remaining coordinates. Therefore, robust learning can be achieved by first performing standard feature learning, then robustly tuning a linear readout layer on top of the standard representations. In particular, we show that adversarially robust learning is just as easy as standard learning, in the sense that the additional number of samples needed to robustly learn multi-index models, compared to standard learning, does not depend on dimensionality.
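The two-stage recipe in the abstract (standard feature learning first, then robust tuning of only the linear readout) can be sketched numerically. The snippet below is a minimal illustration, not the paper's exact procedure: it assumes the hidden subspace has already been recovered by stage one (here we simply hand it the true directions `U`), and it approximates the inner maximization over $\ell_2$-bounded perturbations with a few projected gradient-ascent steps, a standard PGD heuristic. All names (`U`, `W`, `attack`, the step sizes) are illustrative choices, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical multi-index target in ambient dimension d = 20 with k = 2
# hidden directions: y depends on x only through the projection U^T x.
d, k, n = 20, 2, 200
U = np.linalg.qr(rng.standard_normal((d, k)))[0]   # orthonormal hidden directions
X = rng.standard_normal((n, d))
y = np.tanh(X @ U).sum(axis=1)

# Stage 1 (stand-in): assume standard feature learning recovered the subspace,
# giving a frozen representation phi(x) = tanh(W x) with W spanning it.
W = U.T                                            # (k, d) frozen "learned" features

# Stage 2: robustly tune a linear readout v under squared loss against
# l2-bounded perturbations ||delta|| <= eps.
eps, lr, attack_lr = 0.5, 0.05, 0.2
v = np.zeros(k)

def attack(x, yi, v, steps=5):
    """Approximate argmax over ||delta||<=eps of (v . phi(x+delta) - yi)^2 via PGD."""
    delta = np.zeros(d)
    for _ in range(steps):
        h = np.tanh(W @ (x + delta))
        r = v @ h - yi
        g = 2.0 * r * (W.T @ (v * (1.0 - h ** 2)))  # gradient of loss w.r.t. delta
        delta += attack_lr * g
        nrm = np.linalg.norm(delta)
        if nrm > eps:                               # project onto the l2 ball
            delta *= eps / nrm
    return delta

for epoch in range(30):
    grad_v = np.zeros(k)
    for i in range(n):
        delta = attack(X[i], y[i], v)
        h = np.tanh(W @ (X[i] + delta))
        grad_v += 2.0 * (v @ h - y[i]) * h          # gradient at the worst-case input
    v -= lr * grad_v / n

robust_loss = np.mean([(v @ np.tanh(W @ (X[i] + attack(X[i], y[i], v))) - y[i]) ** 2
                       for i in range(n)])
```

Note that only the k-dimensional readout `v` is trained robustly; the feature map stays fixed, which is exactly why the extra sample cost of robustness in the paper's result does not scale with the ambient dimension d.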
Related papers
- iNeMo: Incremental Neural Mesh Models for Robust Class-Incremental Learning [22.14627083675405]
We propose incremental neural mesh models that can be extended with new meshes over time.
We demonstrate the effectiveness of our method through extensive experiments on the Pascal3D and ObjectNet3D datasets.
Our work also presents the first incremental learning approach for pose estimation.
arXiv Detail & Related papers (2024-07-12T13:57:49Z)
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Learning to Continually Learn with the Bayesian Principle [36.75558255534538]
In this work, we adopt the meta-learning paradigm to combine the strong representational power of neural networks with the robustness to forgetting of simple statistical models.
Since the neural networks remain fixed during continual learning, they are protected from catastrophic forgetting.
arXiv Detail & Related papers (2024-05-29T04:53:31Z)
- Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions [20.036783417617652]
We investigate the training dynamics of two-layer shallow neural networks trained with gradient-based algorithms.
We show that a simple modification of the idealized single-pass gradient descent training scenario drastically improves its computational efficiency.
Our results highlight the ability of networks to learn relevant structures from data alone without any pre-processing.
arXiv Detail & Related papers (2024-05-24T11:34:31Z)
- FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC)
Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
- PointMoment: Mixed-Moment-based Self-Supervised Representation Learning for 3D Point Clouds [11.980787751027872]
We propose PointMoment, a novel framework for point cloud self-supervised representation learning.
Our framework does not require any special techniques such as asymmetric network architectures, gradient stopping, etc.
arXiv Detail & Related papers (2023-12-06T08:49:55Z)
- Learning Single-Index Models with Shallow Neural Networks [43.6480804626033]
We introduce a natural class of shallow neural networks and study its ability to learn single-index models via gradient flow.
We show that the corresponding optimization landscape is benign, which in turn leads to generalization guarantees that match the near-optimal sample complexity of dedicated semi-parametric methods.
arXiv Detail & Related papers (2022-10-27T17:52:58Z)
- Part-Based Models Improve Adversarial Robustness [57.699029966800644]
We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks.
Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to segment objects into parts and classify them.
Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations.
arXiv Detail & Related papers (2022-09-15T15:41:47Z)
- Transfer Learning with Deep Tabular Models [66.67017691983182]
We show that upstream data gives tabular neural networks a decisive advantage over GBDT models.
We propose a realistic medical diagnosis benchmark for tabular transfer learning.
We propose a pseudo-feature method for cases where the upstream and downstream feature sets differ.
arXiv Detail & Related papers (2022-06-30T14:24:32Z)
- What Makes Good Contrastive Learning on Small-Scale Wearable-based Tasks? [59.51457877578138]
We study contrastive learning on the wearable-based activity recognition task.
This paper presents an open-source PyTorch library, CL-HAR, which can serve as a practical tool for researchers.
arXiv Detail & Related papers (2022-02-12T06:10:15Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.