On minimizers and convolutional filters: theoretical connections and
applications to genome analysis
- URL: http://arxiv.org/abs/2111.08452v6
- Date: Fri, 26 Jan 2024 16:55:28 GMT
- Title: On minimizers and convolutional filters: theoretical connections and
applications to genome analysis
- Authors: Yun William Yu
- Abstract summary: CNNs start with a wide array of randomly initialized convolutional filters, paired with a pooling operation, and then multiple additional neural layers to learn both the filters themselves and how they can be used to classify the sequence.
In empirical experiments, we find that this property manifests as decreased density in repetitive regions, both in simulation and on real human telomeres.
We train from scratch a CNN embedding of synthetic short-reads from the SARS-CoV-2 genome into 3D Euclidean space that locally recapitulates the linear sequence distance of the read origins.
- Score: 2.8282906214258805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Minimizers and convolutional neural networks (CNNs) are two quite distinct
popular techniques that have both been employed to analyze categorical
biological sequences. At face value, the methods seem entirely dissimilar.
Minimizers use min-wise hashing on a rolling window to extract a single
important k-mer feature per window. CNNs start with a wide array of randomly
initialized convolutional filters, paired with a pooling operation, and then
multiple additional neural layers to learn both the filters themselves and how
they can be used to classify the sequence.
Here, our main result is a careful mathematical analysis of hash function
properties showing that for sequences over a categorical alphabet, random
Gaussian initialization of convolutional filters with max-pooling is equivalent
to choosing a minimizer ordering such that selected k-mers are (in Hamming
distance) far from the k-mers within the sequence but close to other
minimizers. In empirical experiments, we find that this property manifests as
decreased density in repetitive regions, both in simulation and on real human
telomeres. We additionally train from scratch a CNN embedding of synthetic
short-reads from the SARS-CoV-2 genome into 3D Euclidean space that locally
recapitulates the linear sequence distance of the read origins, a modest step
towards building a deep learning assembler, though it is at present too slow to
be practical. In total, this manuscript provides a partial explanation for the
effectiveness of CNNs in categorical sequence analysis.
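For readers unfamiliar with the minimizer side of the comparison, the following is a minimal sketch (not code from the paper) of the classic scheme described in the abstract: within each rolling window of w consecutive k-mers, keep the k-mer that is smallest under a pseudo-random hash ordering. The hash function, parameter values, and function names here are illustrative assumptions.

```python
# Minimal sketch of classic minimizers (illustrative, not the paper's code):
# in every window of w consecutive k-mers, select the k-mer that is smallest
# under a pseudo-random ordering given by a hash function.
import hashlib


def kmer_rank(kmer: str, seed: int = 0) -> int:
    """Pseudo-random ordering of k-mers (a stand-in for min-wise hashing)."""
    digest = hashlib.blake2b(f"{seed}:{kmer}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minimizer_positions(seq: str, k: int, w: int, seed: int = 0) -> set:
    """Start positions of the selected k-mer for every window of w k-mers."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picks = set()
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        picks.add(min(window, key=lambda i: kmer_rank(kmers[i], seed)))
    return picks


if __name__ == "__main__":
    seq = "ACGTACGTTGACCTGAACGTAGGCT"
    pos = sorted(minimizer_positions(seq, k=5, w=4))
    # Density = fraction of k-mer positions that end up selected as minimizers.
    print(pos, "density:", round(len(pos) / (len(seq) - 5 + 1), 3))
```

Because consecutive windows overlap, the same k-mer is often selected repeatedly, which is why the density of selected positions stays well below one.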
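The CNN side of the equivalence can be sketched in the same style: score every k-mer by its inner product with a single randomly initialized Gaussian filter applied to a one-hot encoding, and max-pool over each window. Since the filter is fixed, the scores induce a total order on k-mers, so the selection rule is itself a minimizer scheme under that ordering. This is a hedged illustration of the idea, assuming a one-hot DNA encoding and the toy parameters below; it is not the authors' implementation.

```python
# Sketch of the CNN view (illustrative assumptions, not the paper's code):
# one random Gaussian convolutional filter plus max-pooling over a window picks,
# per window, the k-mer with the largest filter response, i.e. a minimizer
# scheme whose ordering is defined by the filter.
import numpy as np

ALPHABET = {"A": 0, "C": 1, "G": 2, "T": 3}


def one_hot(kmer: str) -> np.ndarray:
    """Flattened one-hot encoding of a k-mer over the DNA alphabet."""
    x = np.zeros((len(kmer), len(ALPHABET)))
    for i, base in enumerate(kmer):
        x[i, ALPHABET[base]] = 1.0
    return x.ravel()


def gaussian_filter_positions(seq: str, k: int, w: int, seed: int = 0) -> set:
    """Positions selected by max-pooling one random Gaussian filter per window."""
    rng = np.random.default_rng(seed)
    filt = rng.normal(size=k * len(ALPHABET))                 # random Gaussian filter
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    scores = np.array([one_hot(km) @ filt for km in kmers])   # convolution outputs
    picks = set()
    for start in range(len(kmers) - w + 1):
        picks.add(start + int(np.argmax(scores[start:start + w])))  # max-pooling
    return picks


if __name__ == "__main__":
    repetitive = "ACG" * 20                                    # toy repetitive region
    random_seq = "".join(np.random.default_rng(1).choice(list("ACGT"), size=60))
    for name, s in [("repetitive", repetitive), ("random", random_seq)]:
        picks = gaussian_filter_positions(s, k=5, w=4)
        print(name, "density:", round(len(picks) / (len(s) - 5 + 1), 3))
```

On this toy input the density gap between the repetitive and random sequences may be small; the paper's empirical comparison of densities in repetitive regions uses simulations and real human telomeres.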
Related papers
- On the rates of convergence for learning with convolutional neural networks [9.772773527230134]
We study approximation and learning capacities of convolutional neural networks (CNNs) with one-sided zero-padding and multiple channels.
We derive convergence rates for estimators based on CNNs in many learning problems.
It is also shown that the obtained rates for classification are minimax optimal in some common settings.
arXiv Detail & Related papers (2024-03-25T06:42:02Z)
- Optimized classification with neural ODEs via separability [0.0]
Classification of $N$ points becomes a simultaneous control problem when viewed through the lens of neural ordinary differential equations (neural ODEs).
In this study, we focus on estimating the number of neurons required for efficient cluster-based classification.
We propose a new constructive algorithm that simultaneously classifies clusters of $d$ points from any initial configuration.
arXiv Detail & Related papers (2023-12-21T12:56:40Z)
- The Power of Linear Combinations: Learning with Random Convolutions [2.0305676256390934]
Modern CNNs can achieve high test accuracies without ever updating randomly initialized (spatial) convolution filters.
Learned linear combinations of these random filters can implicitly regularize the resulting operations.
Although we only observe relatively small gains from learning $3\times 3$ convolutions, the learning gains increase proportionally with kernel size.
arXiv Detail & Related papers (2023-01-26T19:17:10Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between the test image and its top-k neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- Batch Normalization Tells You Which Filter is Important [49.903610684578716]
We propose a simple yet effective filter pruning method by evaluating the importance of each filter based on the BN parameters of pre-trained CNNs.
The experimental results on CIFAR-10 and ImageNet demonstrate that the proposed method can achieve outstanding performance.
arXiv Detail & Related papers (2021-12-02T12:04:59Z)
- Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
We study adversary-resilient distributed optimization, in which machines can independently compute gradients and cooperate.
Our algorithm is based on a new concentration technique, and we analyze its sample complexity.
It is very practical: it improves upon the performance of all prior methods when no Byzantine machines are present.
arXiv Detail & Related papers (2020-12-28T17:19:32Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed stochastic AUC maximization at large scale with a deep neural network as the predictive model.
Our method requires far fewer communication rounds in theory.
Experiments on several datasets demonstrate the effectiveness of our method and also confirm the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
- Improving the Backpropagation Algorithm with Consequentialism Weight Updates over Mini-Batches [0.40611352512781856]
We show that it is possible to consider a multi-layer neural network as a stack of adaptive filters.
We introduce a better algorithm by predicting then emending the adverse consequences of the actions that take place in BP even before they happen.
Our experiments show the usefulness of our algorithm in the training of deep neural networks.
arXiv Detail & Related papers (2020-03-11T08:45:36Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient-based optimization combined with nonconvexity renders learning susceptible to initialization issues.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.