Efficient CNN with uncorrelated Bag of Features pooling
- URL: http://arxiv.org/abs/2209.10865v1
- Date: Thu, 22 Sep 2022 09:00:30 GMT
- Title: Efficient CNN with uncorrelated Bag of Features pooling
- Authors: Firas Laakom, Jenni Raitoharju, Alexandros Iosifidis and Moncef Gabbouj
- Abstract summary: Bag of Features (BoF) pooling has recently been proposed to reduce the complexity of the connection between convolutional and fully connected layers.
We propose an approach that builds on top of BoF pooling to boost its efficiency by ensuring that the items of the learned dictionary are non-redundant.
The proposed strategy yields an efficient variant of BoF and further boosts its performance, without any additional parameters.
- Score: 98.78384185493624
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Despite the superior performance of CNNs, deploying them on devices
with low computational power is still limited, as they are typically
computationally expensive. One key cause of the high complexity is the
connection between the convolutional layers and the fully connected layers,
which typically requires a high number of parameters. To alleviate this issue,
Bag of Features (BoF) pooling has recently been proposed. BoF learns a
dictionary that is used to compile a histogram representation of the input. In
this paper, we propose an approach that builds on top of BoF pooling to boost
its efficiency by ensuring that the items of the learned dictionary are
non-redundant. We propose an additional loss term, based on the pairwise
correlation of the items of the dictionary, which complements the standard
loss to explicitly regularize the model to learn a more diverse and rich
dictionary. The proposed strategy yields an efficient variant of BoF and
further boosts its performance, without any additional parameters.
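To make the mechanism concrete, here is a minimal PyTorch sketch of BoF pooling with a pairwise-correlation penalty on the learned dictionary. The RBF-style soft assignment, the sigma temperature, and the exact form of the penalty are assumptions inferred from the abstract, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BoFPooling(nn.Module):
    """Bag-of-Features pooling: softly assign each spatial feature vector
    to K learned codewords and average the assignments into a K-bin
    histogram that replaces flatten-then-fully-connected pooling."""

    def __init__(self, in_channels, num_codewords, sigma=0.1):
        super().__init__()
        self.dictionary = nn.Parameter(torch.randn(num_codewords, in_channels))
        self.sigma = sigma  # assumed softness of the assignment

    def forward(self, x):  # x: (N, C, H, W) feature maps
        n, c, h, w = x.shape
        feats = x.permute(0, 2, 3, 1).reshape(n, h * w, c)  # (N, HW, C)
        dists = torch.cdist(feats, self.dictionary.unsqueeze(0).expand(n, -1, -1))
        memberships = F.softmax(-dists / self.sigma, dim=-1)  # (N, HW, K)
        return memberships.mean(dim=1)  # (N, K) histogram representation

    def decorrelation_loss(self):
        """Assumed penalty: mean squared pairwise correlation between
        dictionary items, pushing codewords to stay non-redundant."""
        centered = self.dictionary - self.dictionary.mean(dim=1, keepdim=True)
        v = F.normalize(centered, dim=1)
        corr = v @ v.t()  # (K, K) pairwise correlations
        off_diag = corr - torch.diag(torch.diag(corr))
        k = corr.shape[0]
        return (off_diag ** 2).sum() / (k * (k - 1))
```

During training, the penalty would simply be added to the task loss, e.g. loss = task_loss + lambda_corr * pool.decorrelation_loss(), where lambda_corr is a hypothetical weighting hyperparameter; since the penalty only constrains the existing dictionary, it adds no parameters.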
Related papers
- Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT [59.245414547751636]
We propose a circuit discovery framework alternative to activation patching.
Our framework suffers less from out-of-distribution issues and is more efficient in terms of complexity.
We dig into a small transformer trained on a synthetic task, Othello, and find a number of human-understandable fine-grained circuits inside it.
arXiv Detail & Related papers (2024-02-19T15:04:53Z)
- NFLAT: Non-Flat-Lattice Transformer for Chinese Named Entity Recognition [39.308634515653914]
We advocate a novel lexical enhancement method, InterFormer, that effectively reduces the amount of computational and memory costs.
Compared with FLAT, it reduces unnecessary attention calculations in the "word-character" and "word-word" interactions.
This reduces the memory usage by about 50% and can use more extensive lexicons or higher batches for network training.
arXiv Detail & Related papers (2022-05-12T01:55:37Z)
- Highly Parallel Autoregressive Entity Linking with Discriminative Correction [51.947280241185]
We propose a very efficient approach that parallelizes autoregressive linking across all potential mentions.
Our model is >70 times faster and more accurate than the previous generative method.
arXiv Detail & Related papers (2021-09-08T17:28:26Z)
- NodePiece: Compositional and Parameter-Efficient Representations of Large Knowledge Graphs [15.289356276538662]
We propose NodePiece, an anchor-based approach to learn a fixed-size entity vocabulary.
In NodePiece, a vocabulary of subword/sub-entity units is constructed from anchor nodes in a graph with known relation types.
Experiments show that NodePiece performs competitively in node classification, link prediction, and relation prediction tasks.
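A toy sketch of the anchor-based idea described above: each node is mapped to its k nearest anchors by hop distance, giving a fixed-size vocabulary instead of one embedding per entity. The BFS distances, the value of k, and the function names are illustrative assumptions; the actual NodePiece also hashes relational context and encodes the anchor tokens.

```python
from collections import deque
import numpy as np

def bfs_distances(adj, source):
    """Hop distance from `source` to every node over an adjacency list."""
    dist = [None] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def tokenize_nodes(adj, anchors, k=2):
    """Map every node to its k nearest anchors: a fixed-size vocabulary
    of anchor tokens instead of one embedding per entity."""
    d = np.array([bfs_distances(adj, a) for a in anchors], dtype=float)
    d = np.where(np.isnan(d), np.inf, d)  # unreachable anchors
    return [tuple(np.argsort(d[:, v])[:k]) for v in range(len(adj))]

# toy graph: a 0-1-2-3-4 path with anchors {0, 4}
adj = [[1], [0, 2], [1, 3], [2, 4], [3]]
print(tokenize_nodes(adj, anchors=[0, 4], k=2))  # each node -> anchor ids
```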
arXiv Detail & Related papers (2021-06-23T03:51:03Z)
- PUDLE: Implicit Acceleration of Dictionary Learning by Backpropagation [4.081440927534577]
This paper offers the first theoretical proof for empirical results through PUDLE, a Provable Unfolded Dictionary LEarning method.
We highlight the minimization impact of loss, unfolding, and backpropagation on convergence.
We complement our findings through synthetic and image denoising experiments.
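A toy sketch of the kind of unfolded dictionary learning PUDLE analyzes: a fixed number of ISTA iterations for the sparse code is unrolled as layers, and the dictionary is learned by backpropagating a reconstruction loss through the unrolled computation. The step count, threshold, and initialization are assumptions, not PUDLE's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnfoldedDictLearner(nn.Module):
    """Unroll T ISTA iterations for the sparse code as network layers and
    learn the dictionary W by backpropagating a reconstruction loss
    through the unrolled computation."""

    def __init__(self, dim, n_atoms, n_steps=10, lam=0.1):
        super().__init__()
        self.W = nn.Parameter(torch.randn(dim, n_atoms) / dim ** 0.5)
        self.n_steps, self.lam = n_steps, lam

    def forward(self, y):  # y: (batch, dim) signals
        L = torch.linalg.matrix_norm(self.W, 2) ** 2  # step size from spectral norm
        x = y.new_zeros(y.shape[0], self.W.shape[1])
        for _ in range(self.n_steps):  # unrolled ISTA iterations
            grad = (x @ self.W.t() - y) @ self.W
            x = F.softshrink(x - grad / L, (self.lam / L).item())
        return x @ self.W.t(), x  # reconstruction and sparse code

# training sketch: minimize reconstruction error by backprop
model = UnfoldedDictLearner(dim=64, n_atoms=128)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
y = torch.randn(32, 64)
recon, code = model(y)
F.mse_loss(recon, y).backward()
opt.step()
```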
arXiv Detail & Related papers (2021-05-31T18:49:58Z)
- The Temporal Dictionary Ensemble (TDE) Classifier for Time Series Classification [0.0]
The Temporal Dictionary Ensemble (TDE) is more accurate than other dictionary-based approaches.
We show that HIVE-COTE, which incorporates TDE, is significantly more accurate than the current best deep learning approach.
This advance represents a new state of the art for time series classification.
arXiv Detail & Related papers (2021-05-09T05:27:42Z)
- Text Information Aggregation with Centrality Attention [86.91922440508576]
We propose a new way of obtaining aggregation weights, called eigen-centrality self-attention.
We build a fully-connected graph for all the words in a sentence, then compute the eigen-centrality as the attention score of each word.
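A minimal numpy sketch of the described mechanism: build a fully connected similarity graph over the words, take its principal eigenvector (via power iteration) as the eigen-centrality of each word, and use the scores as attention weights. The cosine similarity and the softmax normalization are assumptions.

```python
import numpy as np

def eigen_centrality_attention(word_vecs, n_iter=50):
    """Principal eigenvector of a fully connected word-similarity graph,
    used as per-word attention weights for sentence aggregation."""
    v = word_vecs / np.linalg.norm(word_vecs, axis=1, keepdims=True)
    sim = np.maximum(v @ v.T, 0.0)  # non-negative similarity graph
    score = np.full(len(word_vecs), 1.0 / len(word_vecs))
    for _ in range(n_iter):  # power iteration -> eigen-centrality
        score = sim @ score
        score /= np.linalg.norm(score)
    weights = np.exp(score) / np.exp(score).sum()  # softmax to attention
    return weights @ word_vecs  # aggregated sentence representation
```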
arXiv Detail & Related papers (2020-11-16T13:08:48Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations.
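The summary above is terse, so the following sketch only illustrates the general atom-coefficient decomposition the title refers to: every convolution kernel is a linear combination of a small shared bank of atoms, which acts as a structural constraint across kernels. The bank size, sharing pattern, and initialization are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AtomDecomposedConv2d(nn.Module):
    """Each (out, in) kernel is a linear combination of a small bank of
    k x k atoms; sharing one bank across layers couples their kernels."""

    def __init__(self, in_ch, out_ch, atoms):
        super().__init__()
        self.atoms = atoms  # shared (A, k, k) parameter bank
        self.coeff = nn.Parameter(torch.randn(out_ch, in_ch, atoms.shape[0]) * 0.1)

    def forward(self, x):
        flat = self.atoms.flatten(1)  # (A, k*k)
        w = (self.coeff @ flat).view(*self.coeff.shape[:2], *self.atoms.shape[1:])
        return F.conv2d(x, w, padding=self.atoms.shape[-1] // 2)

# two layers sharing one atom bank
atoms = nn.Parameter(torch.randn(8, 3, 3) * 0.1)
conv1 = AtomDecomposedConv2d(3, 16, atoms)
conv2 = AtomDecomposedConv2d(16, 32, atoms)
out = conv2(F.relu(conv1(torch.randn(1, 3, 32, 32))))
```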
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing [112.2208052057002]
We propose Funnel-Transformer which gradually compresses the sequence of hidden states to a shorter one.
With comparable or fewer FLOPs, Funnel-Transformer outperforms the standard Transformer on a wide variety of sequence-level prediction tasks.
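A minimal sketch of the compression idea: pooled hidden states query the full-resolution states, and subsequent layers operate on the shorter sequence, cutting FLOPs roughly in half per compression step. The strided mean pooling and the single-block structure are simplifying assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FunnelBlock(nn.Module):
    """Halve the sequence with strided mean pooling; the pooled states
    query the full-resolution states, and later layers run on the
    shorter (cheaper) sequence."""

    def __init__(self, dim, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                nn.Linear(4 * dim, dim))

    def forward(self, h):  # h: (batch, seq, dim)
        q = F.avg_pool1d(h.transpose(1, 2), 2, stride=2).transpose(1, 2)
        attn_out, _ = self.attn(q, h, h)  # pooled queries, full keys/values
        h = q + attn_out
        return h + self.ff(h)  # output has seq/2 positions

hidden = torch.randn(2, 128, 256)
print(FunnelBlock(256)(hidden).shape)  # torch.Size([2, 64, 256])
```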
arXiv Detail & Related papers (2020-06-05T05:16:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.