What shapes feature representations? Exploring datasets, architectures, and training
- URL: http://arxiv.org/abs/2006.12433v2
- Date: Thu, 22 Oct 2020 20:09:34 GMT
- Title: What shapes feature representations? Exploring datasets, architectures, and training
- Authors: Katherine L. Hermann and Andrew K. Lampinen
- Abstract summary: In naturalistic learning problems, a model's input contains a wide range of features, some useful for the task at hand, and others not.
These questions are important for understanding the basis of models' decisions.
We study these questions using synthetic datasets in which the task-relevance of input features can be controlled directly.
- Score: 14.794135558227682
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In naturalistic learning problems, a model's input contains a wide range of
features, some useful for the task at hand, and others not. Of the useful
features, which ones does the model use? Of the task-irrelevant features, which
ones does the model represent? Answers to these questions are important for
understanding the basis of models' decisions, as well as for building models
that learn versatile, adaptable representations useful beyond the original
training task. We study these questions using synthetic datasets in which the
task-relevance of input features can be controlled directly. We find that when
two features redundantly predict the labels, the model preferentially
represents one, and its preference reflects what was most linearly decodable
from the untrained model. Over training, task-relevant features are enhanced,
and task-irrelevant features are partially suppressed. Interestingly, in some
cases, an easier, weakly predictive feature can suppress a more strongly
predictive, but more difficult one. Additionally, models trained to recognize
both easy and hard features learn representations most similar to models that
use only the easy feature. Further, easy features lead to more consistent
representations across model runs than do hard features. Finally, models have
greater representational similarity to an untrained model than to models
trained on a different task. Our results highlight the complex processes that
determine which features a model represents.
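The abstract's notion of linear decodability from an untrained model can be illustrated with a small experiment. The sketch below is a minimal, hypothetical setup and not the paper's exact protocol (the synthetic data, network width, and closed-form probe are all assumptions): the same binary label is embedded once as an "easy" feature on a single clean input dimension and once as a "hard" feature weakly diluted across many noisy dimensions, and a least-squares linear probe measures how decodable each is from one random, untrained ReLU layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def untrained_features(x, width=256):
    """One random (untrained) ReLU layer, standing in for an untrained model."""
    w = rng.normal(scale=1.0 / np.sqrt(x.shape[1]), size=(x.shape[1], width))
    return np.maximum(x @ w, 0.0)

def linear_decodability(h_train, y_train, h_test, y_test):
    """Held-out accuracy of a closed-form least-squares linear probe."""
    add_bias = lambda h: np.hstack([h, np.ones((h.shape[0], 1))])
    # Regress +/-1 targets on the features; classify by the sign of the output.
    w, *_ = np.linalg.lstsq(add_bias(h_train), 2.0 * y_train - 1.0, rcond=None)
    return float(np.mean((add_bias(h_test) @ w > 0) == y_test))

# Two versions of the same binary label: an "easy" feature on one clean input
# dimension, and a "hard" feature weakly diluted across all input dimensions.
n, d = 4000, 32
y = rng.integers(0, 2, n)
s = 2.0 * y - 1.0
x_easy = rng.normal(size=(n, d)); x_easy[:, 0] += 3.0 * s
x_hard = rng.normal(size=(n, d)); x_hard += 0.1 * s[:, None]

tr, te = slice(0, 3000), slice(3000, None)
h_easy, h_hard = untrained_features(x_easy), untrained_features(x_hard)
acc_easy = linear_decodability(h_easy[tr], y[tr], h_easy[te], y[te])
acc_hard = linear_decodability(h_hard[tr], y[tr], h_hard[te], y[te])
print(f"easy-feature decodability: {acc_easy:.2f}")
print(f"hard-feature decodability: {acc_hard:.2f}")
```

In this toy setup the easy feature should come out markedly more decodable from the untrained features than the hard one, mirroring the kind of initialization-time bias the abstract reports the trained model's preference to reflect.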
Related papers
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- On the Foundations of Shortcut Learning [20.53986437152018]
We study how predictivity and availability interact to shape models' feature use.
We find that linear models are relatively unbiased, but introducing a single hidden layer with ReLU or Tanh units yields a bias.
arXiv Detail & Related papers (2023-10-24T22:54:05Z)
- Small Language Models for Tabular Data [0.0]
We show that deep representation learning can address classification and regression problems on small and poorly formed datasets.
We find that small models have sufficient capacity to approximate various functions and achieve record accuracy on classification benchmarks.
arXiv Detail & Related papers (2022-11-05T16:57:55Z)
- Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate that no single model works best for all cases.
By choosing an appropriate bias model, we can obtain better robustness than baselines with more sophisticated model designs.
arXiv Detail & Related papers (2022-10-28T17:52:10Z)
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
- Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models [55.41644538483948]
In modern data science, it is often more valuable to understand a model's properties and which of its parts could be replaced to obtain better results.
We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
arXiv Detail & Related papers (2021-07-07T11:17:09Z)
- Sufficiently Accurate Model Learning for Planning [119.80502738709937]
This paper introduces the constrained Sufficiently Accurate model learning approach.
It provides examples of such problems and presents a theorem on how close approximate solutions can be to the optimum.
The approximate solution quality will depend on the function parameterization, loss and constraint function smoothness, and the number of samples in model learning.
arXiv Detail & Related papers (2021-02-11T16:27:31Z)
- What do we expect from Multiple-choice QA Systems? [70.86513724662302]
We consider a top performing model on several Multiple Choice Question Answering (MCQA) datasets.
We evaluate it against a set of expectations one might have from such a model, using a series of zero-information perturbations of the model's inputs.
arXiv Detail & Related papers (2020-11-20T21:27:10Z)
- Lifting Interpretability-Performance Trade-off via Automated Feature Engineering [5.802346990263708]
Complex black-box predictive models may achieve high performance, but their lack of interpretability causes problems.
We propose a method that uses elastic black-boxes as surrogate models to create simpler, less opaque, yet still accurate and interpretable glass-box models.
arXiv Detail & Related papers (2020-02-11T09:16:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.