Convolutional Neural Nets: Foundations, Computations, and New
Applications
- URL: http://arxiv.org/abs/2101.04869v1
- Date: Wed, 13 Jan 2021 04:20:42 GMT
- Title: Convolutional Neural Nets: Foundations, Computations, and New
Applications
- Authors: Shengli Jiang and Victor M. Zavala
- Abstract summary: CNNs are powerful machine learning models that highlight features from grid data to make predictions (regression and classification).
A common misconception is that CNNs are only capable of processing image or video data.
Here, we show how to apply CNNs to new types of applications such as optimal control, flow cytometry, multivariate process monitoring, and molecular simulations.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We review mathematical foundations of convolutional neural nets (CNNs) with
the goals of: i) highlighting connections with techniques from statistics,
signal processing, linear algebra, differential equations, and optimization,
ii) demystifying underlying computations, and iii) identifying new types of
applications. CNNs are powerful machine learning models that highlight features
from grid data to make predictions (regression and classification). The grid
data object can be represented as vectors (in 1D), matrices (in 2D), or tensors
(in 3D or higher dimensions) and can incorporate multiple channels (thus
providing high flexibility in the input data representation). For example, an
image can be represented as a 2D grid data object that contains red, green, and
blue (RGB) channels (each channel is a 2D matrix). Similarly, a video can be
represented as a 3D grid data object (two spatial dimensions plus time) with
RGB channels (each channel is a 3D tensor). CNNs highlight features from the
grid data by performing convolution operations with different types of
operators. The operators highlight different types of features (e.g., patterns,
gradients, geometrical features) and are learned by using optimization
techniques. In other words, CNNs seek to identify optimal operators that best
map the input data to the output data. A common misconception is that CNNs are
only capable of processing image or video data, but their application scope is
much wider; specifically, datasets encountered in diverse applications can be
expressed as grid data. Here, we show how to apply CNNs to new types of
applications such as optimal control, flow cytometry, multivariate process
monitoring, and molecular simulations.
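
As a minimal illustration of the convolution operation the abstract describes (not code from the paper), the sketch below slides a fixed gradient operator over a small 2D grid and then sums per-channel results for a multi-channel input. The helper names `convolve2d_valid` and `convolve_channels`, the toy grid, and the choice of a Sobel operator are assumptions made for this example. In a CNN the operator entries would be learned by optimization rather than fixed.

```python
# Illustrative sketch only; helper names and data are hypothetical.

def convolve2d_valid(grid, operator):
    """Slide `operator` over `grid` (no padding, stride 1), summing elementwise products.

    As in most CNN libraries, the operator is not flipped (cross-correlation).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    k_rows, k_cols = len(operator), len(operator[0])
    out = []
    for i in range(n_rows - k_rows + 1):
        row = []
        for j in range(n_cols - k_cols + 1):
            total = 0.0
            for a in range(k_rows):
                for b in range(k_cols):
                    total += grid[i + a][j + b] * operator[a][b]
            row.append(total)
        out.append(row)
    return out


def convolve_channels(channels, operators):
    """Multi-channel input: convolve each channel with its own operator, then sum."""
    maps = [convolve2d_valid(c, k) for c, k in zip(channels, operators)]
    n_rows, n_cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[i][j] for m in maps) for j in range(n_cols)]
            for i in range(n_rows)]


# A 4x6 single-channel grid with a vertical edge between columns 2 and 3.
grid = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Fixed horizontal-gradient (Sobel) operator; a CNN would learn these entries.
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# Nonzero entries mark where the window straddles the edge.
feature_map = convolve2d_valid(grid, sobel_x)

# Three identical channels stand in for the RGB channels of an image.
rgb_map = convolve_channels([grid, grid, grid], [sobel_x, sobel_x, sobel_x])
```

The same routine applies to any data that can be arranged on a grid, since nothing in it is specific to images.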
Related papers
- Topology-Agnostic Graph U-Nets for Scalar Field Prediction on Unstructured Meshes [2.4306216325375196]
TAG U-Net is a graph convolutional network that can be trained to input any mesh or graph structure.
The model constructs coarsened versions of each input graph and performs a set of convolution and pooling operations to predict the node-wise outputs on the original graph.
arXiv Detail & Related papers (2024-10-08T22:27:35Z)
- SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation [53.83313235792596]
We present a new methodology for real-time semantic mapping from RGB-D sequences.
It combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping.
Our system achieves state-of-the-art semantic mapping quality within 2D-3D networks-based systems.
arXiv Detail & Related papers (2023-06-28T22:36:44Z)
- CNN Filter DB: An Empirical Investigation of Trained Convolutional Filters [2.0305676256390934]
We show that model pre-training can succeed on arbitrary datasets if they meet size and variance conditions.
We show that many pre-trained models contain degenerated filters which make them less robust and less suitable for fine-tuning on target applications.
arXiv Detail & Related papers (2022-03-29T08:25:42Z)
- SpectralNET: Exploring Spatial-Spectral WaveletCNN for Hyperspectral Image Classification [0.0]
Hyperspectral Image (HSI) classification using Convolutional Neural Networks (CNN) is widely found in the current literature.
We propose SpectralNET, a wavelet CNN, which is a variation of 2D CNN for multi-resolution HSI classification.
arXiv Detail & Related papers (2021-04-01T08:45:15Z)
- Spherical Transformer: Adapting Spherical Signal to CNNs [53.18482213611481]
Spherical Transformer can transform spherical signals into vectors that can be directly processed by standard CNNs.
We evaluate our approach on the tasks of spherical MNIST recognition, 3D object classification and omnidirectional image semantic segmentation.
arXiv Detail & Related papers (2021-01-11T12:33:16Z)
- 2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition [84.697097472401]
We introduce Ada3D, a conditional computation framework that learns instance-specific 3D usage policies to determine frames and convolution layers to be used in a 3D network.
We demonstrate that our method achieves similar accuracies to state-of-the-art 3D models while requiring 20%-50% less computation across different datasets.
arXiv Detail & Related papers (2020-12-29T21:40:38Z)
- TSGCNet: Discriminative Geometric Feature Learning with Two-Stream Graph Convolutional Network for 3D Dental Model Segmentation [141.2690520327948]
We propose a two-stream graph convolutional network (TSGCNet) to learn multi-view information from different geometric attributes.
We evaluate our proposed TSGCNet on a real-patient dataset of dental models acquired by 3D intraoral scanners.
arXiv Detail & Related papers (2020-12-26T08:02:56Z)
- Deep Polynomial Neural Networks [77.70761658507507]
$\Pi$-Nets are a new class of function approximators based on polynomial expansions.
$\Pi$-Nets produce state-of-the-art results in three challenging tasks, i.e., image generation, face verification, and 3D mesh representation learning.
arXiv Detail & Related papers (2020-06-20T16:23:32Z)
- Emotion Recognition on large video dataset based on Convolutional Feature Extractor and Recurrent Neural Network [0.2855485723554975]
Our model combines a convolutional neural network (CNN) with a recurrent neural network (RNN) to predict dimensional emotions on video data.
Experiments are performed on publicly available datasets including the largest modern Aff-Wild2 database.
arXiv Detail & Related papers (2020-06-19T14:54:13Z)
- Learning Local Neighboring Structure for Robust 3D Shape Representation [143.15904669246697]
Representation learning for 3D meshes is important in many computer vision and graphics applications.
We propose a local structure-aware anisotropic convolutional operation (LSA-Conv).
Our model produces significant improvement in 3D shape reconstruction compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-04-21T13:40:03Z)
- CNNTOP: a CNN-based Trajectory Owner Prediction Method [1.3793594968500604]
Trajectory owner prediction is the basis for many applications such as personalized recommendation and urban planning.
Existing methods mainly employ RNNs to model trajectories semantically.
We propose a CNN-based Trajectory Owner Prediction (CNNTOP) method.
arXiv Detail & Related papers (2020-01-05T07:58:28Z)