Linking convolutional kernel size to generalization bias in face
analysis CNNs
- URL: http://arxiv.org/abs/2302.03750v2
- Date: Sun, 3 Dec 2023 13:23:39 GMT
- Title: Linking convolutional kernel size to generalization bias in face
analysis CNNs
- Authors: Hao Liang, Josue Ortega Caro, Vikram Maheshri, Ankit B. Patel, Guha
Balakrishnan
- Abstract summary: We present a causal framework for linking an architectural hyperparameter to out-of-distribution algorithmic bias.
In our experiments, we focused on measuring the causal relationship between convolutional kernel size and face analysis classification bias.
We show that modifying kernel size, even in one layer of a CNN, changes the frequency content of learned features significantly across data subgroups.
- Score: 9.030335233143603
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training dataset biases are by far the most scrutinized factors when
explaining algorithmic biases of neural networks. In contrast, hyperparameters
related to the neural network architecture have largely been ignored even
though different network parameterizations are known to induce different
implicit biases over learned features. For example, convolutional kernel size
is known to affect the frequency content of features learned in CNNs. In this
work, we present a causal framework for linking an architectural hyperparameter
to out-of-distribution algorithmic bias. Our framework is experimental, in that
we train several versions of a network with an intervention to a specific
hyperparameter, and measure the resulting causal effect of this choice on
performance bias when a particular out-of-distribution image perturbation is
applied. In our experiments, we focused on measuring the causal relationship
between convolutional kernel size and face analysis classification bias across
different subpopulations (race/gender), with respect to high-frequency image
details. We show that modifying kernel size, even in one layer of a CNN,
changes the frequency content of learned features significantly across data
subgroups, leading to biased generalization performance even in the presence
of a balanced training dataset.
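The experimental framework above lends itself to a short, hedged sketch (PyTorch; this is not the authors' released code). Toy classifiers that differ only in the first convolutional layer's kernel size are evaluated on per-subgroup test sets after a Fourier low-pass filter removes high-frequency image detail, and the spread in subgroup accuracy serves as a simple bias measure. The architecture, the filter radius, and the random placeholder data are all illustrative assumptions; in the actual experiments each model would first be trained on a balanced face dataset.

```python
import torch
import torch.nn as nn


class SmallFaceCNN(nn.Module):
    """Toy binary face-attribute classifier; only the first conv kernel size varies."""

    def __init__(self, first_kernel: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, first_kernel, padding=first_kernel // 2), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 2)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))


def remove_high_frequencies(images: torch.Tensor, keep_radius: int = 8) -> torch.Tensor:
    """Test-time perturbation: keep only low-frequency content via a circular Fourier mask."""
    f = torch.fft.fftshift(torch.fft.fft2(images), dim=(-2, -1))
    h, w = images.shape[-2:]
    yy, xx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    mask = (((yy - h // 2) ** 2 + (xx - w // 2) ** 2) <= keep_radius ** 2).float()
    return torch.fft.ifft2(torch.fft.ifftshift(f * mask, dim=(-2, -1))).real


@torch.no_grad()
def accuracy(model: nn.Module, images: torch.Tensor, labels: torch.Tensor) -> float:
    return (model(images).argmax(dim=1) == labels).float().mean().item()


def bias_gap(model: nn.Module, subgroups: dict) -> float:
    """Spread of post-perturbation accuracy across subgroups (a simple bias measure)."""
    accs = [accuracy(model, remove_high_frequencies(x), y) for x, y in subgroups.values()]
    return max(accs) - min(accs)


# Placeholder data: random tensors stand in for balanced per-subgroup test sets.
subgroups = {
    name: (torch.randn(64, 3, 64, 64), torch.randint(0, 2, (64,)))
    for name in ["group_a", "group_b", "group_c"]
}

# The intervention: sweep the first-layer kernel size. In the actual experiments,
# each model would first be trained identically on a balanced face dataset.
for k in (3, 7, 11):
    model = SmallFaceCNN(first_kernel=k).eval()
    print(f"kernel={k}  bias gap under low-pass perturbation: {bias_gap(model, subgroups):.3f}")
```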
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Do deep neural networks have an inbuilt Occam's razor? [1.1470070927586016]
We show that structured data, combined with an intrinsic Occam's razor-like inductive bias towards (Kolmogorov) simple functions strong enough to counteract the exponential growth of functions with complexity, is key to the success of DNNs.
arXiv Detail & Related papers (2023-04-13T16:58:21Z) - Increasing biases can be more efficient than increasing weights [33.05856234084821]
The proposed unit emphasizes the importance of preserving uncorrupted information as it is passed from one unit to the next.
We show that focusing on increasing biases rather than weights can significantly improve a neural network model's performance.
arXiv Detail & Related papers (2023-01-03T01:36:31Z) - Interpreting Bias in the Neural Networks: A Peek Into Representational
Similarity [0.0]
We investigate the performance and internal representational structure of convolution-based neural networks trained on biased data.
We specifically study similarities in representations, using Centered Kernel Alignment (CKA), for different objective functions (a minimal linear CKA sketch appears after this list).
We observe that without progressively similar representations across the layers of a neural network, performance is less likely to be robust.
arXiv Detail & Related papers (2022-11-14T22:17:14Z) - What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z) - Redundant representations help generalization in wide neural networks [71.38860635025907]
We study the last hidden layer representations of various state-of-the-art convolutional neural networks.
We find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information, and differ from each other only by statistically independent noise.
arXiv Detail & Related papers (2021-06-07T10:18:54Z) - Post-mortem on a deep learning contest: a Simpson's paradox and the
complementary roles of scale metrics versus shape metrics [61.49826776409194]
We analyze a corpus of models made publicly available for a contest to predict the generalization accuracy of neural network (NN) models.
We identify what amounts to a Simpson's paradox, where "scale" metrics perform well overall but poorly on subpartitions of the data.
We present two novel shape metrics, one data-independent, and the other data-dependent, which can predict trends in the test accuracy of a series of NNs.
arXiv Detail & Related papers (2021-06-01T19:19:49Z) - ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that, with this regularization, CNNs maintain performance with a dramatic reduction in parameters and computation.
arXiv Detail & Related papers (2020-09-04T20:41:47Z) - Learning from Failure: Training Debiased Classifier from Biased
Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
arXiv Detail & Related papers (2020-07-06T07:20:29Z) - Spectral Bias and Task-Model Alignment Explain Generalization in Kernel
Regression and Infinitely Wide Neural Networks [17.188280334580195]
Generalization beyond a training dataset is a central goal of machine learning.
Recent observations in deep neural networks contradict conventional wisdom from classical statistics.
We show that more data may impair generalization when noisy or not expressible by the kernel.
arXiv Detail & Related papers (2020-06-23T17:53:11Z)
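Centered Kernel Alignment, mentioned in the "Interpreting Bias in the Neural Networks" entry above, has a compact linear form that is easy to sketch. The snippet below is an illustrative implementation of linear CKA rather than that paper's exact procedure (which may use a kernel or minibatch variant); the activation matrices are hypothetical stand-ins for layer features collected on a shared set of inputs.

```python
import numpy as np


def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear CKA between two (num_examples, num_features) activation matrices."""
    # Center each feature over examples; CKA is then invariant to isotropic
    # scaling and orthogonal transformations of either representation.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # CKA(X, Y) = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    numerator = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    denominator = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return float(numerator / denominator)


# Toy usage on hypothetical activations from two layers evaluated on the same inputs.
rng = np.random.default_rng(0)
layer_a = rng.standard_normal((512, 256))
Q, _ = np.linalg.qr(rng.standard_normal((256, 256)))  # random orthogonal matrix
layer_b = layer_a @ Q                                  # same representation up to rotation
layer_c = rng.standard_normal((512, 256))              # unrelated representation
print(linear_cka(layer_a, layer_b))  # 1.0: CKA is invariant to orthogonal transforms
print(linear_cka(layer_a, layer_c))  # near 0 for unrelated activations
```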