Going Beyond Neural Network Feature Similarity: The Network Feature
Complexity and Its Interpretation Using Category Theory
- URL: http://arxiv.org/abs/2310.06756v2
- Date: Sun, 26 Nov 2023 17:24:31 GMT
- Title: Going Beyond Neural Network Feature Similarity: The Network Feature
Complexity and Its Interpretation Using Category Theory
- Authors: Yiting Chen, Zhanpeng Zhou, Junchi Yan
- Abstract summary: We provide the definition of what we call functionally equivalent features.
These features produce equivalent output under certain transformations.
We propose an efficient algorithm named Iterative Feature Merging.
- Score: 64.06519549649495
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The behavior of neural networks remains opaque, and a widely noted
recent phenomenon is that networks often achieve similar performance when
initialized with different random parameters. This phenomenon has drawn
significant attention to measuring the similarity between features learned by
distinct networks. However, feature similarity can be vague when describing the
same feature, since strictly equivalent features hardly exist. In this paper, we
expand the concept of equivalent features and define what we call
functionally equivalent features. These features produce equivalent output
under certain transformations. Using this definition, we aim to derive a more
intrinsic metric for the so-called feature complexity regarding the redundancy
of features learned by a neural network at each layer. We offer a formal
interpretation of our approach through the lens of category theory, a
well-developed area in mathematics. To quantify the feature complexity, we
further propose an efficient algorithm named Iterative Feature Merging. Our
experimental results validate our ideas and theories from various perspectives.
We empirically demonstrate that functional equivalence is widespread among the
features learned by the same neural network, and that the number of network
parameters can be reduced without affecting performance. Iterative Feature
Merging (IFM) shows great potential as a data-agnostic model pruning method. We have also
drawn several interesting empirical findings regarding the defined feature
complexity.
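
To make the Iterative Feature Merging (IFM) idea concrete, below is a minimal sketch for a single fully connected layer under the simplest possible merging rule: two hidden features are treated as approximately functionally equivalent when their incoming weight vectors are close, the pair is merged, and the outgoing weights are summed so the layer's output is roughly preserved. The function name, the distance threshold, and the averaging/summing rule are illustrative assumptions, not the paper's exact algorithm; note that the procedure only inspects weights, which is the sense in which such pruning is data-agnostic.

```python
import numpy as np

def iterative_feature_merging(W_out, W_in, threshold=0.1):
    """Sketch of an IFM-style merge for one hidden layer.

    W_out: (d_hidden, d_in)  weights that produce the hidden features.
    W_in:  (d_out, d_hidden) weights that consume the hidden features.
    Features whose producing weight vectors are within `threshold`
    (Euclidean distance) are merged; the consuming weights are summed
    so the downstream computation is approximately unchanged.
    The threshold and merge rule are illustrative, not the paper's exact criterion.
    """
    W_out, W_in = W_out.copy(), W_in.copy()
    merged = True
    while merged:
        merged = False
        d = W_out.shape[0]
        for i in range(d):
            for j in range(i + 1, d):
                if np.linalg.norm(W_out[i] - W_out[j]) < threshold:
                    # Merge feature j into feature i and drop feature j.
                    W_out[i] = (W_out[i] + W_out[j]) / 2.0
                    W_in[:, i] += W_in[:, j]
                    W_out = np.delete(W_out, j, axis=0)
                    W_in = np.delete(W_in, j, axis=1)
                    merged = True
                    break
            if merged:
                break
    return W_out, W_in

# Toy usage: one hidden feature is a near-duplicate of another.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))
W1[3] = W1[0] + 1e-3           # feature 3 nearly duplicates feature 0
W2 = rng.normal(size=(5, 4))
W1_m, W2_m = iterative_feature_merging(W1, W2, threshold=0.05)
print(W1_m.shape, W2_m.shape)  # (3, 8) (5, 3): one redundant feature removed
```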
Related papers
- A Random Matrix Theory Perspective on the Spectrum of Learned Features and Asymptotic Generalization Capabilities [30.737171081270322]
We study how fully-connected two-layer neural networks adapt to the target function after a single, but aggressive, gradient descent step.
This provides a sharp description of the impact of feature learning in the generalization of two-layer neural networks, beyond the random features and lazy training regimes.
arXiv Detail & Related papers (2024-10-24T17:24:34Z)
- Noise-Resilient Unsupervised Graph Representation Learning via Multi-Hop Feature Quality Estimation [53.91958614666386]
For unsupervised graph representation learning (UGRL) based on graph neural networks (GNNs), we propose a novel method based on Multi-hop feature Quality Estimation (MQE).
arXiv Detail & Related papers (2024-07-29T12:24:28Z)
- Deep Neural Networks are Adaptive to Function Regularity and Data Distribution in Approximation and Estimation [8.284464143581546]
We study how deep neural networks can adapt to different regularity in functions across different locations and scales.
Our results show that deep neural networks are adaptive to different regularity of functions and nonuniform data distributions.
arXiv Detail & Related papers (2024-06-08T02:01:50Z)
- Feature Network Methods in Machine Learning and Applications [0.0]
A machine learning (ML) feature network is a graph that connects ML features in learning tasks based on their similarity.
We provide an example of a deep tree-structured feature network, where hierarchical connections are formed through feature clustering and feed-forward learning.
arXiv Detail & Related papers (2024-01-10T01:57:12Z)
- Multilayer Multiset Neuronal Networks -- MMNNs [55.2480439325792]
The present work describes multilayer multiset neuronal networks incorporating two or more layers of coincidence similarity neurons.
The work also explores the utilization of counter-prototype points, which are assigned to the image regions to be avoided.
arXiv Detail & Related papers (2023-08-28T12:55:13Z)
- Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined heuristics.
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z)
- Similarity and Matching of Neural Network Representations [0.0]
We employ a toolset -- dubbed Dr. Frankenstein -- to analyse the similarity of representations in deep neural networks.
We aim to match the activations on given layers of two trained neural networks by joining them with a stitching layer.
arXiv Detail & Related papers (2021-10-27T17:59:46Z)
- Learning distinct features helps, provably [98.78384185493624]
We study the diversity of the features learned by a two-layer neural network trained with the least squares loss.
We measure the diversity by the average $L_2$-distance between the hidden-layer features.
arXiv Detail & Related papers (2021-06-10T19:14:45Z)
- UNIPoint: Universally Approximating Point Processes Intensities [125.08205865536577]
We provide a proof that a class of learnable functions can universally approximate any valid intensity function.
We implement UNIPoint, a novel neural point process model, using recurrent neural networks to parameterise sums of basis functions upon each event.
arXiv Detail & Related papers (2020-07-28T09:31:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.