Polynomial Networks in Deep Classifiers
- URL: http://arxiv.org/abs/2104.07916v1
- Date: Fri, 16 Apr 2021 06:41:20 GMT
- Title: Polynomial Networks in Deep Classifiers
- Authors: Grigorios G Chrysos, Markos Georgopoulos, Jiankang Deng, Yannis
Panagakis
- Abstract summary: We cast the study of deep neural networks under a unifying framework.
Our framework provides insights into the inductive biases of each model.
The efficacy of the proposed models is evaluated on standard image and audio classification benchmarks.
- Score: 55.90321402256631
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks have been the driving force behind the success in
classification tasks, e.g., object and audio recognition. Impressive results
and generalization have been achieved by a variety of recently proposed
architectures, the majority of which are seemingly disconnected. In this work,
we cast the study of deep classifiers under a unifying framework. In
particular, we express state-of-the-art architectures (e.g., residual and
non-local networks) as polynomials of different degrees of the input.
Our framework provides insights into the inductive biases of each model and
enables natural extensions building upon their polynomial nature. The efficacy
of the proposed models is evaluated on standard image and audio classification
benchmarks. The expressivity of the proposed models is highlighted in terms of
both increased model performance and model compression. Lastly, the
extensions allowed by this taxonomy showcase benefits in the presence of
limited data and long-tailed data distributions. We expect this taxonomy to
provide links between existing domain-specific architectures.
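To make the polynomial view concrete, the following is a minimal sketch of a second-degree block in PyTorch. It illustrates the general idea rather than the paper's exact parametrization (the class name, layer names, and dimensions are illustrative assumptions): a Hadamard product of two linear maps of the input, plus a skip connection, is a degree-2 polynomial of the input, and stacking such blocks multiplies the degree.

```python
# Minimal sketch, not the paper's exact parametrization: a second-degree
# polynomial block. Dropping the product term recovers a plain (degree-1)
# residual transformation in this linear-activation setting.
import torch
import torch.nn as nn

class Degree2Block(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.lin_a = nn.Linear(dim, dim)  # first linear factor
        self.lin_b = nn.Linear(dim, dim)  # second linear factor

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (A x) * (B x) is quadratic in x; the skip keeps lower-order terms.
        return self.lin_a(x) * self.lin_b(x) + x

# Stacking N such blocks yields a polynomial of degree 2**N in the input,
# so expressivity grows quickly with depth even without activation functions.
model = nn.Sequential(Degree2Block(64), Degree2Block(64), nn.Linear(64, 10))
logits = model(torch.randn(8, 64))
print(logits.shape)  # torch.Size([8, 10])
```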
Related papers
- Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling [4.190836962132713]
This paper introduces Orchid, a novel architecture designed to address the quadratic complexity of traditional attention mechanisms.
At the core of this architecture lies a new data-dependent global convolution layer, which contextually adapts its kernel conditioned on the input sequence (a sketch of this idea appears after the list below).
We evaluate the proposed model across multiple domains, including language modeling and image classification, to highlight its performance and generality.
arXiv Detail & Related papers (2024-02-28T17:36:45Z) - Fine-Grained ImageNet Classification in the Wild [0.0]
Robustness tests can uncover several vulnerabilities and biases which go unnoticed during the typical model evaluation stage.
In our work, we perform fine-grained classification on closely related categories, which are identified with the help of hierarchical knowledge.
arXiv Detail & Related papers (2023-03-04T12:25:07Z) - Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z) - Visualization Of Class Activation Maps To Explain AI Classification Of
Network Packet Captures [0.0]
The number of connections and the addition of new applications in our networks generate a vast amount of log data.
Deep learning methods provide both feature extraction and classification from data in a single system.
We present a visual interactive tool that combines the classification of network data with an explanation technique to form an interface between experts, algorithms, and data.
arXiv Detail & Related papers (2022-09-05T16:34:43Z) - A Review of Sparse Expert Models in Deep Learning [23.721204843236006]
Sparse expert models are a thirty-year-old concept re-emerging as a popular architecture in deep learning.
We review the concept of sparse expert models, provide a basic description of the common algorithms, and contextualize the advances in the deep learning era.
arXiv Detail & Related papers (2022-09-04T18:00:29Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and transfer to new tasks in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Deep Variational Models for Collaborative Filtering-based Recommender
Systems [63.995130144110156]
Deep learning provides accurate collaborative filtering models to improve recommender system results.
Our proposed models apply the variational concept to inject stochasticity into the latent space of the deep architecture.
Results show the superiority of the proposed approach in scenarios where the variational enrichment exceeds the injected noise effect.
arXiv Detail & Related papers (2021-07-27T08:59:39Z) - Redefining Neural Architecture Search of Heterogeneous Multi-Network
Models by Characterizing Variation Operators and Model Components [71.03032589756434]
We investigate the effect of different variation operators in a complex domain, that of multi-network heterogeneous neural models.
We characterize both the variation operators, according to their effect on the complexity and performance of the model; and the models, relying on diverse metrics which estimate the quality of the different parts composing it.
arXiv Detail & Related papers (2021-06-16T17:12:26Z) - Neural Entity Linking: A Survey of Models Based on Deep Learning [82.43751915717225]
This survey presents a comprehensive description of recent neural entity linking (EL) systems developed since 2015.
Its goal is to systematize the design features of neural entity linking systems and compare their performance to classic methods on common benchmarks.
The survey touches on applications of entity linking, focusing on the recently emerged use-case of enhancing deep pre-trained masked language models.
arXiv Detail & Related papers (2020-05-31T18:02:26Z) - Principal Neighbourhood Aggregation for Graph Nets [4.339839287869653]
Graph Neural Networks (GNNs) have been shown to be effective models for different predictive tasks on graph-structured data.
Recent work on their expressive power has focused on isomorphism tasks and countable feature spaces.
We extend this theoretical framework to include continuous features which occur regularly in real-world input domains.
arXiv Detail & Related papers (2020-04-12T23:30:00Z)
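The Orchid entry above describes a data-dependent global convolution. The sketch below illustrates that general idea only; it is not Orchid's actual layer, and the class name and the single linear kernel generator are assumptions. The kernel is produced from the input itself and applied across the full sequence via FFT, giving O(L log L) cost instead of the O(L^2) of attention.

```python
# Hedged sketch of a data-dependent global convolution: the kernel is
# generated from the input and applied over the whole sequence with an
# FFT-based (non-circular) convolution.
import torch
import torch.nn as nn

class DataDependentGlobalConv(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.to_kernel = nn.Linear(dim, dim)  # conditions the kernel on the input

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, length, dim)
        length = x.shape[1]
        kernel = self.to_kernel(x)  # one input-dependent tap per position
        n = 2 * length              # zero-pad to avoid circular wrap-around
        xf = torch.fft.rfft(x, n=n, dim=1)
        kf = torch.fft.rfft(kernel, n=n, dim=1)
        return torch.fft.irfft(xf * kf, n=n, dim=1)[:, :length]

layer = DataDependentGlobalConv(32)
print(layer(torch.randn(2, 128, 32)).shape)  # torch.Size([2, 128, 32])
```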