On the Joint Interaction of Models, Data, and Features
- URL: http://arxiv.org/abs/2306.04793v1
- Date: Wed, 7 Jun 2023 21:35:26 GMT
- Title: On the Joint Interaction of Models, Data, and Features
- Authors: Yiding Jiang, Christina Baek, J. Zico Kolter
- Abstract summary: We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
- Score: 82.60073661644435
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning features from data is one of the defining characteristics of deep
learning, but our theoretical understanding of the role features play in deep
learning is still rudimentary. To address this gap, we introduce a new tool,
the interaction tensor, for empirically analyzing the interaction between data
and model through features. With the interaction tensor, we make several key
observations about how features are distributed in data and how models with
different random seeds learn different features. Based on these observations,
we propose a conceptual framework for feature learning. Under this framework,
the expected accuracy for a single hypothesis and agreement for a pair of
hypotheses can both be derived in closed-form. We demonstrate that the proposed
framework can explain empirically observed phenomena, including the recently
discovered Generalization Disagreement Equality (GDE) that allows for
estimating the generalization error with only unlabeled data. Further, our
theory also provides explicit construction of natural data distributions that
break the GDE. Thus, we believe this work provides valuable new insight into
our understanding of feature learning.
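The Generalization Disagreement Equality mentioned in the abstract states that, in expectation, the test error of a model equals the disagreement rate between two models trained with different random seeds, so generalization error can be estimated from unlabeled data alone. The following is a minimal illustrative sketch of that idea (not the paper's code; the dataset, architecture, and seeds are arbitrary choices for demonstration):

```python
# Sketch of the Generalization Disagreement Equality (GDE):
# E[test error] ~= E[disagreement], where disagreement is the rate at which
# two independently seeded models predict different labels on the same
# (unlabeled) inputs.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two hypotheses: identical architecture, different random seeds.
models = [
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                  random_state=seed).fit(X_train, y_train)
    for seed in (1, 2)
]
preds = [m.predict(X_test) for m in models]

# Disagreement requires no labels; test error does.
disagreement = np.mean(preds[0] != preds[1])
test_errors = [np.mean(p != y_test) for p in preds]
print(f"disagreement = {disagreement:.3f}, "
      f"test errors = {test_errors[0]:.3f}, {test_errors[1]:.3f}")
```

Under the GDE, the printed disagreement should track the two test errors; the paper's framework explains when this holds and constructs distributions where it breaks.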
Related papers
- Learning Discrete Concepts in Latent Hierarchical Models [73.01229236386148]
Learning concepts from natural high-dimensional data holds potential in building human-aligned and interpretable machine learning models.
We formalize concepts as discrete latent causal variables that are related via a hierarchical causal model.
We substantiate our theoretical claims with synthetic data experiments.
arXiv Detail & Related papers (2024-06-01T18:01:03Z)
- Class-wise Activation Unravelling the Enigma of Deep Double Descent [0.0]
Double descent presents a counter-intuitive aspect within the machine learning domain.
In this study, we revisited the phenomenon of double descent and discussed the conditions of its occurrence.
arXiv Detail & Related papers (2024-05-13T12:07:48Z)
- Harmonizing Feature Attributions Across Deep Learning Architectures: Enhancing Interpretability and Consistency [2.2237337682863125]
This study examines the generalization of feature attributions across various deep learning architectures.
We aim to develop a more coherent and optimistic understanding of feature attributions.
Our findings highlight the potential for harmonized feature attribution methods to improve interpretability and foster trust in machine learning applications.
arXiv Detail & Related papers (2023-07-05T09:46:41Z)
- How Spurious Features Are Memorized: Precise Analysis for Random and NTK Features [19.261178173399784]
We consider spurious features that are uncorrelated with the learning task.
We provide a precise characterization of how they are memorized via two separate terms.
We prove that the memorization of spurious features weakens as the generalization capability increases.
arXiv Detail & Related papers (2023-05-20T05:27:41Z)
- Exploring the cloud of feature interaction scores in a Rashomon set [17.775145325515993]
We introduce the feature interaction score (FIS) in the context of a Rashomon set.
We demonstrate the properties of the FIS via synthetic data and draw connections to other areas of statistics.
Our results suggest that the proposed FIS can provide valuable insights into the nature of feature interactions in machine learning models.
arXiv Detail & Related papers (2023-05-17T13:05:26Z)
- Towards a mathematical understanding of learning from few examples with nonlinear feature maps [68.8204255655161]
We consider the problem of data classification where the training set consists of just a few data points.
We reveal key relationships between the geometry of an AI model's feature space, the structure of the underlying data distributions, and the model's generalisation capabilities.
arXiv Detail & Related papers (2022-11-07T14:52:58Z)
- Inherent Inconsistencies of Feature Importance [6.02357145653815]
Feature importance is a method that assigns scores to the contribution of individual features on prediction outcomes.
This paper presents an axiomatic framework designed to establish coherent relationships among the different contexts of feature importance scores.
arXiv Detail & Related papers (2022-06-16T14:21:51Z)
- Learning from few examples with nonlinear feature maps [68.8204255655161]
We explore the phenomenon and reveal key relationships between dimensionality of AI model's feature space, non-degeneracy of data distributions, and the model's generalisation capabilities.
The main thrust of our present analysis is on the influence of nonlinear feature transformations mapping original data into higher- and possibly infinite-dimensional spaces on the resulting model's generalisation capabilities.
arXiv Detail & Related papers (2022-03-31T10:36:50Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- OR-Net: Pointwise Relational Inference for Data Completion under Partial Observation [51.083573770706636]
This work uses relational inference to fill in missing entries of incomplete data.
We propose Omni-Relational Network (OR-Net) to model the pointwise relativity in two aspects.
arXiv Detail & Related papers (2021-05-02T06:05:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.