On the combined effect of class imbalance and concept complexity in deep
learning
- URL: http://arxiv.org/abs/2107.14194v1
- Date: Thu, 29 Jul 2021 17:30:00 GMT
- Title: On the combined effect of class imbalance and concept complexity in deep
learning
- Authors: Kushankur Ghosh, Colin Bellinger, Roberto Corizzo, Bartosz Krawczyk,
Nathalie Japkowicz
- Abstract summary: This paper studies the behavior of deep learning systems in settings that have previously been deemed challenging to classical machine learning systems.
Deep architectures seem to help with structural concept complexity but not with overlap challenges in simple artificial domains.
In the real-world image domains, where overfitting is a greater concern than in the artificial domains, the advantage of deeper architectures is less obvious.
- Score: 11.178586036657798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Structural concept complexity, class overlap, and data scarcity are some of
the most important factors influencing the performance of classifiers under
class imbalance conditions. When these effects were uncovered in the early
2000s, understandably, the classifiers on which they were demonstrated belonged
to the classical rather than Deep Learning categories of approaches. As Deep
Learning is gaining ground over classical machine learning and is beginning to
be used in critical applied settings, it is important to assess systematically
how well deep learning systems respond to the kinds of challenges their
classical counterparts have struggled with over the past two decades. The
purpose of this paper is to study the behavior of deep learning systems in
settings previously deemed challenging for classical machine learning systems,
in order to find out whether the depth of these systems is an asset in such
settings. The results in
both artificial and real-world image datasets (MNIST Fashion, CIFAR-10) show
that these settings remain mostly challenging for Deep Learning systems and
that deeper architectures seem to help with structural concept complexity but
not with overlap challenges in simple artificial domains. Data scarcity is not
overcome by deeper layers, either. In the real-world image domains, where
overfitting is a greater concern than in the artificial domains, the advantage
of deeper architectures is less obvious: while it is observed in certain cases,
it is quickly cancelled as models get deeper and perform worse than their
shallower counterparts.
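To make the experimental setting described above concrete, here is a minimal sketch, assuming PyTorch and torchvision, of inducing class imbalance on CIFAR-10 and comparing a shallower CNN with a deeper one. The minority class, imbalance ratio, depths, and training hyperparameters are illustrative assumptions, not the authors' exact protocol.

```python
# Illustrative sketch: subsample one CIFAR-10 class to create class imbalance,
# then train CNNs of different depths on the imbalanced data.
# All choices below (class 0 as minority, 5% keep ratio, 2 vs 4 conv blocks)
# are assumptions for illustration, not the paper's exact setup.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

def imbalanced_indices(targets, minority_class=0, keep_ratio=0.05):
    """Keep only a fraction of the minority class; other classes stay intact."""
    targets = torch.as_tensor(targets)
    minority = (targets == minority_class).nonzero(as_tuple=True)[0]
    majority = (targets != minority_class).nonzero(as_tuple=True)[0]
    kept = minority[: int(len(minority) * keep_ratio)]
    return torch.cat([majority, kept]).tolist()

def make_cnn(num_conv_blocks=2, num_classes=10):
    """Stack a variable number of conv blocks to vary network depth."""
    layers, channels = [], 3
    for i in range(num_conv_blocks):
        out = 32 * (i + 1)
        layers += [nn.Conv2d(channels, out, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)]
        channels = out
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, num_classes)]
    return nn.Sequential(*layers)

def train(model, loader, epochs=5, lr=1e-3, device="cpu"):
    model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

if __name__ == "__main__":
    full = datasets.CIFAR10("data", train=True, download=True,
                            transform=transforms.ToTensor())
    subset = Subset(full, imbalanced_indices(full.targets))
    loader = DataLoader(subset, batch_size=128, shuffle=True)
    shallow = train(make_cnn(num_conv_blocks=2), loader)
    deep = train(make_cnn(num_conv_blocks=4), loader)
    # Compare per-class recall on the untouched test set to see whether the
    # extra depth helps on the under-represented class.
```

Per-class recall or balanced accuracy on the untouched test set is the kind of metric under which the effect of depth on the minority class becomes visible.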
Related papers
- Breaking the Curse of Dimensionality in Deep Neural Networks by Learning
Invariant Representations [1.9580473532948401]
This thesis explores the theoretical foundations of deep learning by studying the relationship between the architecture of these models and the inherent structures found within the data they process.
We ask what drives the efficacy of deep learning algorithms and what allows them to beat the so-called curse of dimensionality.
Our methodology takes an empirical approach to deep learning, combining experimental studies with physics-inspired toy models.
arXiv Detail & Related papers (2023-10-24T19:50:41Z)
- Towards Understanding Mixture of Experts in Deep Learning [95.27215939891511]
We study how the MoE layer improves the performance of neural network learning.
Our results suggest that the cluster structure of the underlying problem and the non-linearity of the expert are pivotal to the success of MoE.
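For readers unfamiliar with the construction, a generic mixture-of-experts layer looks roughly like the sketch below: a softmax gate routes each input to a weighted combination of small non-linear experts. This is an illustrative PyTorch sketch with arbitrary sizes, not the specific MoE construction analysed in that paper.

```python
# Generic mixture-of-experts layer: a softmax gate routes each input to a
# weighted combination of small non-linear experts. Sizes and structure are
# illustrative assumptions, not the construction analysed in the cited paper.
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    def __init__(self, dim_in, dim_out, num_experts=4, hidden=64):
        super().__init__()
        self.gate = nn.Linear(dim_in, num_experts)   # learns the routing / cluster structure
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim_in, hidden), nn.ReLU(), nn.Linear(hidden, dim_out))
            for _ in range(num_experts)               # non-linear experts
        ])

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=-1)                # (batch, num_experts)
        outputs = torch.stack([e(x) for e in self.experts], dim=1)   # (batch, num_experts, dim_out)
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)          # gate-weighted mixture

# Example: MoELayer(16, 10)(torch.randn(8, 16)) has shape (8, 10).
```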
arXiv Detail & Related papers (2022-08-04T17:59:10Z)
- Fault-Tolerant Deep Learning: A Hierarchical Perspective [12.315753706063324]
We conduct a comprehensive survey of fault-tolerant deep learning design approaches.
We investigate these approaches at the model layer, the architecture layer, and the circuit layer, as well as across layers.
arXiv Detail & Related papers (2022-04-05T02:31:18Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Learning Contact Dynamics using Physically Structured Neural Networks [81.73947303886753]
We use connections between deep neural networks and differential equations to design a family of deep network architectures for representing contact dynamics between objects.
We show that these networks can learn discontinuous contact events in a data-efficient manner from noisy observations.
Our results indicate that an idealised form of touch feedback is a key component of making this learning problem tractable.
arXiv Detail & Related papers (2021-02-22T17:33:51Z)
- Recent advances in deep learning theory [104.01582662336256]
This paper reviews and organizes the recent advances in deep learning theory.
The literature is categorized in six groups, including: complexity and capacity-based approaches for analysing the generalizability of deep learning; differential equations and their dynamic systems for modelling gradient descent and its variants; the geometrical structures of the loss landscape that drive the trajectories of the dynamic systems; and the theoretical foundations of several special structures in network architectures.
arXiv Detail & Related papers (2020-12-20T14:16:41Z)
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
- Adaptive Hierarchical Decomposition of Large Deep Networks [4.272649614101117]
As datasets get larger, a natural question is whether existing deep learning architectures can be extended to handle the 50+K classes thought to be perceptible by a typical human.
This paper introduces a framework that automatically analyzes and configures a family of smaller deep networks as a replacement to a singular, larger network.
The resulting smaller networks are highly scalable, parallel and more practical to train, and achieve higher classification accuracy.
arXiv Detail & Related papers (2020-07-17T21:04:50Z)
- Understanding Deep Architectures with Reasoning Layer [60.90906477693774]
We show that properties of the algorithm layers, such as convergence, stability, and sensitivity, are intimately related to the approximation and generalization abilities of the end-to-end model.
Our theory can provide useful guidelines for designing deep architectures with reasoning layers.
arXiv Detail & Related papers (2020-06-24T00:26:35Z)
- Learn Class Hierarchy using Convolutional Neural Networks [0.9569316316728905]
We propose a new architecture for hierarchical classification of images, introducing a stack of deep linear layers that combines cross-entropy loss functions with a center loss.
We experimentally show that our hierarchical classifier presents advantages over traditional classification approaches, finding application in computer vision tasks.
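As a hedged illustration of the loss combination mentioned above, the sketch below (generic PyTorch; the hierarchical stacking of linear layers from the paper is omitted, and the weighting factor is an arbitrary choice) combines cross-entropy on the logits with a center loss on the feature embedding.

```python
# Illustrative combination of cross-entropy and center loss; the hierarchical
# stack of linear layers described in the paper is omitted, and the 0.1
# weighting factor is an arbitrary assumption.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CenterLoss(nn.Module):
    """Pulls each feature vector towards a learnable center for its class."""
    def __init__(self, num_classes, feat_dim):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features, labels):
        return ((features - self.centers[labels]) ** 2).sum(dim=1).mean()

def combined_loss(logits, features, labels, center_loss, weight=0.1):
    # Cross-entropy keeps classes separable; center loss keeps them compact.
    return F.cross_entropy(logits, labels) + weight * center_loss(features, labels)

# Usage: center = CenterLoss(num_classes=10, feat_dim=64)  # add center.parameters() to the optimizer
#        loss = combined_loss(logits, features, labels, center)
```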
arXiv Detail & Related papers (2020-05-18T12:06:43Z)
- Introducing Fuzzy Layers for Deep Learning [5.209583609264815]
We introduce a new layer to deep learning: the fuzzy layer.
Traditionally, a neural network architecture is composed of an input layer, some combination of hidden layers, and an output layer.
We propose the introduction of fuzzy layers into the deep learning architecture to exploit the powerful aggregation properties expressed through fuzzy methodologies.
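As an illustration only, one way such a fuzzy aggregation layer could look is an ordered weighted average (OWA) over parallel activations; the OWA operator is assumed here for concreteness and is not necessarily the aggregation used in that paper.

```python
# Illustrative "fuzzy" aggregation layer based on an ordered weighted average
# (OWA); the operator choice and tensor layout are assumptions for this sketch.
import torch
import torch.nn as nn

class OWALayer(nn.Module):
    """Aggregates n parallel activations per feature with learnable OWA weights."""
    def __init__(self, n_inputs):
        super().__init__()
        self.raw_weights = nn.Parameter(torch.zeros(n_inputs))

    def forward(self, x):
        # x: (batch, n_inputs, features), e.g. outputs of n parallel branches.
        weights = torch.softmax(self.raw_weights, dim=0)       # non-negative, sums to 1
        sorted_x, _ = torch.sort(x, dim=1, descending=True)    # order statistics per feature
        return (weights.view(1, -1, 1) * sorted_x).sum(dim=1)  # OWA aggregation

# Example: OWALayer(3)(torch.randn(8, 3, 128)) has shape (8, 128).
```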
arXiv Detail & Related papers (2020-02-21T19:33:30Z)