Effective Version Space Reduction for Convolutional Neural Networks
- URL: http://arxiv.org/abs/2006.12456v1
- Date: Mon, 22 Jun 2020 17:40:03 GMT
- Title: Effective Version Space Reduction for Convolutional Neural Networks
- Authors: Jiayu Liu, Ioannis Chiotellis, Rudolph Triebel, Daniel Cremers
- Abstract summary: In active learning, sampling bias could pose a serious inconsistency problem and hinder the algorithm from finding the optimal hypothesis.
We examine active learning with convolutional neural networks through the principled lens of version space reduction.
- Score: 61.84773892603885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In active learning, sampling bias could pose a serious inconsistency problem
and hinder the algorithm from finding the optimal hypothesis. However, many
methods for neural networks are hypothesis space agnostic and do not address
this problem. We examine active learning with convolutional neural networks
through the principled lens of version space reduction. We identify the
connection between two approaches, prior mass reduction and diameter
reduction, and propose a new diameter-based querying method: the minimum
Gibbs-vote disagreement. By estimating version space diameter and bias, we
illustrate how version space of neural networks evolves and examine the
realizability assumption. With experiments on MNIST, Fashion-MNIST, SVHN and
STL-10 datasets, we demonstrate that diameter reduction methods reduce the
version space more effectively and perform better than prior mass reduction and
other baselines, and that the Gibbs-vote disagreement is on par with the best
query method.
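To make the querying idea concrete, here is a minimal sketch, not the authors' implementation: the Gibbs-vote disagreement at a pool point is the fraction of hypotheses sampled from the version space whose prediction differs from the majority-vote prediction. Below, an ensemble of trained classifiers stands in for samples from the version space; how the paper turns this quantity into its minimum Gibbs-vote disagreement query rule is not reproduced.

```python
# A minimal sketch (not the authors' implementation): estimate the Gibbs-vote
# disagreement on an unlabeled pool, using an ensemble of trained classifiers
# as a stand-in for hypotheses sampled from the version space.
import numpy as np

def gibbs_vote_disagreement(ensemble_preds):
    """ensemble_preds: (n_models, n_points) array of predicted class labels."""
    n_models, n_points = ensemble_preds.shape
    disagreement = np.empty(n_points)
    for i in range(n_points):
        votes = ensemble_preds[:, i]
        labels, counts = np.unique(votes, return_counts=True)
        vote_label = labels[np.argmax(counts)]          # majority-vote prediction
        # Fraction of sampled hypotheses that disagree with the vote classifier:
        disagreement[i] = np.mean(votes != vote_label)
    return disagreement

# Toy usage: 5 "hypotheses", 4 pool points, 3 classes.
rng = np.random.default_rng(0)
preds = rng.integers(0, 3, size=(5, 4))
print(gibbs_vote_disagreement(preds))
```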
Related papers
- Robust Weight Initialization for Tanh Neural Networks with Fixed Point Analysis [5.016205338484259]
The proposed initialization is more robust to variations in network size than existing methods.
When applied to Physics-Informed Neural Networks, it exhibits faster convergence and retains this robustness to network size.
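The summary does not state the paper's initialization formula, so the following is only a generic Xavier/Glorot-style variance-scaling rule for tanh layers, the kind of scheme that fixed-point analyses of tanh networks refine; it is not the proposed method.

```python
# Illustrative only: a Xavier/Glorot-style variance-scaling initialization for
# tanh layers, not the paper's fixed-point-derived scheme.
import numpy as np

def tanh_layer_init(fan_in, fan_out, rng=None):
    # Glorot-uniform limit; keeps pre-activation variance roughly constant
    rng = rng or np.random.default_rng(0)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

W1 = tanh_layer_init(784, 256)
print(W1.std())  # approximately sqrt(2 / (fan_in + fan_out))
```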
arXiv Detail & Related papers (2024-10-03T06:30:27Z)
- The Unreasonable Effectiveness of Solving Inverse Problems with Neural Networks [24.766470360665647]
We show that neural networks trained to learn solutions to inverse problems can find better solutions than classical methods, even on their training set.
Our findings suggest an alternative use for neural networks: rather than generalizing to new data for fast inference, they can also be used to find better solutions on known data.
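As a sketch of this "alternative use", one can treat a small network purely as a solution parameterization and optimize it on a single known inverse problem instead of training it to generalize. The forward operator, sizes, and architecture below are illustrative assumptions, not taken from the paper.

```python
# Sketch of the "alternative use" idea: use a small network as a solution
# parameterization and optimize it on one known inverse problem. The operator
# A, the sizes, and the architecture are illustrative assumptions.
import torch

torch.manual_seed(0)
n, m = 64, 32
A = torch.randn(m, n)                # ill-posed forward operator (m < n)
x_true = torch.randn(n)
y = A @ x_true                       # noiseless measurements

net = torch.nn.Sequential(           # small network whose output is the solution
    torch.nn.Linear(16, 64), torch.nn.Tanh(), torch.nn.Linear(64, n)
)
z = torch.randn(16)                  # fixed latent input
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for step in range(2000):
    opt.zero_grad()
    x_hat = net(z)
    loss = torch.sum((A @ x_hat - y) ** 2)   # data-fidelity objective
    loss.backward()
    opt.step()

print("final residual:", loss.item())
```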
arXiv Detail & Related papers (2024-08-15T12:38:10Z)
- Sparsifying dimensionality reduction of PDE solution data with Bregman learning [1.2016264781280588]
We propose a multistep algorithm that induces sparsity in the encoder-decoder networks for effective reduction in the number of parameters and additional compression of the latent space.
Compared to conventional training methods like Adam, the proposed method achieves similar accuracy with 30% fewer parameters and a significantly smaller latent space.
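Bregman learning induces sparsity through shrinkage on an auxiliary (Bregman) variable. The snippet below sketches a linearized-Bregman-style sparsifying update on a plain least-squares problem; the paper's multistep algorithm for encoder-decoder networks is not reproduced.

```python
# Sketch of a linearized-Bregman-style sparsifying update (soft-thresholding on
# an auxiliary variable), shown on a plain least-squares problem rather than on
# the paper's encoder-decoder networks.
import numpy as np

def soft_threshold(v, lam):
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100))
w_true = np.zeros(100); w_true[:5] = 1.0
y = A @ w_true

lam, tau = 0.5, 1e-3
v = np.zeros(100)                      # subgradient (Bregman) variable
w = soft_threshold(v, lam)
for _ in range(5000):
    grad = A.T @ (A @ w - y)           # gradient of 0.5 * ||A w - y||^2
    v -= tau * grad                    # Bregman/dual update
    w = soft_threshold(v, lam)         # primal update stays sparse

print("nonzeros:", np.count_nonzero(w))
```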
arXiv Detail & Related papers (2024-06-18T14:45:30Z)
- What to Do When Your Discrete Optimization Is the Size of a Neural Network? [24.546550334179486]
Machine learning applications using neural networks often involve solving discrete optimization problems, but classical discrete approaches do not scale well to problems the size of a large neural network.
We take continuation path (CP) methods as representative of approaches that rely purely on the continuous relaxation, and Monte Carlo (MC) methods as representative of approaches that work with the discrete problem directly.
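As a sketch of the purely continuous side, the snippet below relaxes binary variables with a sigmoid and anneals a temperature along a continuation path so the relaxed problem gradually approaches the discrete one. The objective and sizes are illustrative, not the paper's benchmarks or exact CP formulation.

```python
# Sketch of a continuation-path-style relaxation: binary variables are replaced
# by sigmoid(theta / T), and the temperature T is annealed so the relaxed
# problem approaches the discrete one. Objective and sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
c = rng.standard_normal(20)            # minimize c . b  =>  pick entries with c < 0

theta = np.zeros(20)
lr = 0.5
for T in np.geomspace(1.0, 0.01, 200): # continuation path over temperatures
    for _ in range(20):
        b = 1.0 / (1.0 + np.exp(-theta / T))   # relaxed binary variables
        grad_b = c                             # d/db of the objective  c . b
        grad_theta = grad_b * b * (1 - b) / T  # chain rule through the sigmoid
        theta -= lr * grad_theta

b_discrete = (theta > 0).astype(int)
print("objective:", c @ b_discrete, "optimum:", c[c < 0].sum())
```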
arXiv Detail & Related papers (2024-02-15T21:57:43Z)
- Solutions to Elliptic and Parabolic Problems via Finite Difference Based Unsupervised Small Linear Convolutional Neural Networks [1.124958340749622]
We propose a fully unsupervised approach, requiring no training data, to estimate finite difference solutions for PDEs directly via small linear convolutional neural networks.
Our proposed approach uses substantially fewer parameters than similar finite difference-based approaches while also demonstrating comparable accuracy to the true solution for several selected elliptic and parabolic problems.
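A minimal sketch of the unsupervised finite-difference idea: a fixed 5-point Laplacian stencil is applied as a convolution and the PDE residual is minimized with no training data. For brevity the solution grid is optimized directly rather than a small linear CNN as in the paper; the grid size and source term are illustrative.

```python
# Unsupervised finite-difference sketch: minimize the residual of -Laplace(u)=f
# with a 5-point stencil applied via a convolution. The paper trains a small
# linear CNN; here the grid itself is optimized to keep the example short.
import torch
import torch.nn.functional as F

n = 32
h = 1.0 / (n + 1)
stencil = torch.tensor([[0., 1., 0.],
                        [1., -4., 1.],
                        [0., 1., 0.]]).view(1, 1, 3, 3) / h**2

f = torch.ones(1, 1, n, n)                         # constant source term
u = torch.zeros(1, 1, n, n, requires_grad=True)    # interior unknowns
opt = torch.optim.Adam([u], lr=1e-2)

for step in range(5000):
    opt.zero_grad()
    lap_u = F.conv2d(F.pad(u, (1, 1, 1, 1)), stencil)  # zero padding = Dirichlet BC
    loss = torch.mean((-lap_u - f) ** 2)               # unsupervised PDE residual
    loss.backward()
    opt.step()

print("final residual:", loss.item())
```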
arXiv Detail & Related papers (2023-11-01T03:15:10Z)
- Deep Graph Neural Networks via Posteriori-Sampling-based Node-Adaptive Residual Module [65.81781176362848]
Graph Neural Networks (GNNs) can learn from graph-structured data through neighborhood information aggregation.
As the number of layers increases, node representations become indistinguishable, which is known as over-smoothing.
We propose a Posterior-Sampling-based, Node-Adaptive Residual module (PSNR).
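The summary does not give PSNR's exact form, so the following shows only a generic node-adaptive residual connection: a learned per-node gate mixes aggregated features with the previous layer's features, which is the general mechanism such modules use against over-smoothing. It is not the paper's PSNR module.

```python
# Illustrative only: a generic node-adaptive residual connection for GNN layers,
# not the paper's PSNR module.
import torch
import torch.nn as nn

class NodeAdaptiveResidual(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, 1)

    def forward(self, h_prev, h_agg):
        # h_prev, h_agg: (num_nodes, dim) node features before/after aggregation
        alpha = torch.sigmoid(self.gate(torch.cat([h_prev, h_agg], dim=-1)))
        return alpha * h_agg + (1 - alpha) * h_prev   # per-node mixing

# Toy usage with a random, row-normalized adjacency for mean aggregation.
num_nodes, dim = 5, 8
A = (torch.rand(num_nodes, num_nodes) > 0.5).float()
A = A / A.sum(dim=1, keepdim=True).clamp(min=1)
h = torch.randn(num_nodes, dim)
layer = NodeAdaptiveResidual(dim)
out = layer(h, A @ h)
print(out.shape)
```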
arXiv Detail & Related papers (2023-05-09T12:03:42Z)
- Decomposed Diffusion Sampler for Accelerating Large-Scale Inverse Problems [64.29491112653905]
We propose a novel and efficient diffusion sampling strategy that synergistically combines diffusion sampling with Krylov subspace methods.
Specifically, we prove that if the tangent space at a sample denoised via Tweedie's formula forms a Krylov subspace, then running conjugate gradient (CG) on the denoised data ensures that the data-consistency update remains in that tangent space.
Our proposed method achieves more than 80 times faster inference time than the previous state-of-the-art method.
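A minimal sketch of the CG data-consistency step this sampler builds on: starting from a denoised estimate, a few conjugate-gradient iterations on the normal equations pull it toward consistency with the measurements. The operator and sizes are illustrative, and the Krylov-subspace/tangent-space argument itself is not reproduced.

```python
# Sketch of a CG-based data-consistency step: starting from a denoised estimate
# x0, a few conjugate-gradient iterations on the normal equations A^T A x = A^T y
# pull it toward consistency with the measurements y. Not the full sampler.
import numpy as np

def cg_data_consistency(A, y, x0, n_iters=10):
    AtA = A.T @ A
    b = A.T @ y
    x = x0.copy()
    r = b - AtA @ x                       # residual of the normal equations
    p = r.copy()
    for _ in range(n_iters):
        Ap = AtA @ p
        alpha = (r @ r) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 80))             # underdetermined measurement operator
x_true = rng.standard_normal(80)
y = A @ x_true
x0 = x_true + 0.3 * rng.standard_normal(80)   # stand-in for a denoised sample
x = cg_data_consistency(A, y, x0, n_iters=10)
print("residual before:", np.linalg.norm(A @ x0 - y),
      "after:", np.linalg.norm(A @ x - y))
```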
arXiv Detail & Related papers (2023-03-10T07:42:49Z)
- Non-Gradient Manifold Neural Network [79.44066256794187]
Deep neural networks (DNNs) generally take thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed stochastic AUC maximization for large-scale problems in which the predictive model is a deep neural network.
Our method requires substantially fewer communication rounds while retaining its theoretical guarantees.
Experiments on several datasets demonstrate the effectiveness of the method and corroborate the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient-based optimization combined with the nonconvexity of the learning problem makes training sensitive to how the network is initialized.
We propose fusing neighboring layers of deeper networks that are initialized with random weights.
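The entry does not state the paper's MSE-optimal fusion rule, so the snippet below shows only the basic algebra layer fusion relies on: two consecutive linear layers with no nonlinearity in between collapse into a single layer. How the paper handles nonlinearities optimally is not reproduced.

```python
# Illustrative only: the basic algebra behind layer fusion. Two consecutive
# linear layers with no nonlinearity in between collapse into a single layer
# W = W2 @ W1, b = W2 @ b1 + b2. The paper's MSE-optimal treatment of the
# nonlinear case is not reproduced here.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((64, 32)), rng.standard_normal(64)
W2, b2 = rng.standard_normal((16, 64)), rng.standard_normal(16)

W_fused = W2 @ W1
b_fused = W2 @ b1 + b2

x = rng.standard_normal(32)
out_two_layers = W2 @ (W1 @ x + b1) + b2
out_fused = W_fused @ x + b_fused
print(np.allclose(out_two_layers, out_fused))   # True
```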
arXiv Detail & Related papers (2020-01-28T18:25:15Z)