The Intrinsic Dimension of Images and Its Impact on Learning
- URL: http://arxiv.org/abs/2104.08894v1
- Date: Sun, 18 Apr 2021 16:29:23 GMT
- Title: The Intrinsic Dimension of Images and Its Impact on Learning
- Authors: Phillip Pope, Chen Zhu, Ahmed Abdelkader, Micah Goldblum, Tom
Goldstein
- Abstract summary: It is widely believed that natural image data exhibits low-dimensional structure despite the high dimensionality of conventional pixel representations.
In this work, we apply dimension estimation tools to popular datasets and investigate the role of low-dimensional structure in deep learning.
- Score: 60.811039723427676
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: It is widely believed that natural image data exhibits low-dimensional
structure despite the high dimensionality of conventional pixel
representations. This idea underlies a common intuition for the remarkable
success of deep learning in computer vision. In this work, we apply dimension
estimation tools to popular datasets and investigate the role of
low-dimensional structure in deep learning. We find that common natural image
datasets indeed have very low intrinsic dimension relative to the high number
of pixels in the images. Additionally, we find that low dimensional datasets
are easier for neural networks to learn, and models solving these tasks
generalize better from training to test data. Along the way, we develop a
technique for validating our dimension estimation tools on synthetic data
generated by GANs, allowing us to actively manipulate the intrinsic dimension by
controlling the image generation process. Code for our experiments may be found
here https://github.com/ppope/dimensions.
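The dimension estimation the abstract refers to is typically done with nearest-neighbor estimators such as the Levina–Bickel maximum-likelihood (MLE) estimator. As a rough illustration (not the authors' implementation; see their repository for the actual code), a minimal NumPy sketch of the MLE estimate with the MacKay–Ghahramani averaging correction might look like this:

```python
import numpy as np

def mle_intrinsic_dimension(X, k=10):
    """Levina-Bickel MLE estimate of intrinsic dimension.

    X : (n, D) array of points; k : number of nearest neighbors.
    Returns a single global estimate via inverse averaging
    (MacKay-Ghahramani correction).
    """
    # Full pairwise Euclidean distance matrix (fine for small n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    # Sort each row; column 0 is the self-distance (0), so take
    # columns 1..k as the k nearest-neighbor distances T_1..T_k.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    # Per point: (1/(k-1)) * sum_j log(T_k / T_j) is the inverse
    # of the local MLE dimension estimate.
    inv_local = np.log(knn[:, -1:] / knn[:, :-1]).mean(axis=1)
    # Average the inverses over all points, then invert.
    return 1.0 / inv_local.mean()
```

For data lying on a low-dimensional manifold embedded in a higher-dimensional space, the estimate should recover the manifold dimension; for example, points sampled from a 2-D plane embedded in 5-D should yield an estimate close to 2, well below the ambient dimension.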
Related papers
- GRIN: Zero-Shot Metric Depth with Pixel-Level Diffusion [27.35300492569507]
We present GRIN, an efficient diffusion model designed to ingest sparse unstructured training data.
We show that GRIN establishes a new state of the art in zero-shot metric monocular depth estimation even when trained from scratch.
arXiv Detail & Related papers (2024-09-15T23:32:04Z)
- Deep Image Composition Meets Image Forgery [0.0]
Image forgery has been studied for many years.
Deep learning models require large amounts of labeled data for training.
We use state of the art image composition deep learning models to generate spliced images close to the quality of real-life manipulations.
arXiv Detail & Related papers (2024-04-03T17:54:37Z)
- GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images [79.39247661907397]
We introduce an effective framework Generalizable Model-based Neural Radiance Fields to synthesize free-viewpoint images.
Specifically, we propose a geometry-guided attention mechanism to register the appearance code from multi-view 2D images to a geometry proxy.
arXiv Detail & Related papers (2023-03-24T03:32:02Z)
- Geo-SIC: Learning Deformable Geometric Shapes in Deep Image Classifiers [8.781861951759948]
This paper presents Geo-SIC, the first deep learning model to learn deformable shapes in a deformation space for an improved performance of image classification.
We introduce a newly designed framework that (i) simultaneously derives features from both image and latent shape spaces with large intra-class variations.
We develop a boosted classification network, equipped with an unsupervised learning of geometric shape representations.
arXiv Detail & Related papers (2022-10-25T01:55:17Z)
- GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs [49.55919802779889]
We propose a Graph Convolution based Spatial Propagation Network (GraphCSPN) as a general approach for depth completion.
In this work, we leverage convolutional neural networks as well as graph neural networks in a complementary way for geometric representation learning.
Our method achieves state-of-the-art performance, especially when compared in the case of using only a few propagation steps.
arXiv Detail & Related papers (2022-10-19T17:56:03Z)
- Homography augmented momentum contrastive learning for SAR image retrieval [3.9743795764085545]
We propose a deep learning-based image retrieval approach using homography transformation augmented contrastive learning.
We also propose a training method for the DNNs induced by contrastive learning that does not require any labeling procedure.
arXiv Detail & Related papers (2021-09-21T17:27:07Z)
- DONet: Learning Category-Level 6D Object Pose and Size Estimation from Depth Observation [53.55300278592281]
We propose a method of Category-level 6D Object Pose and Size Estimation (COPSE) from a single depth image.
Our framework makes inferences based on the rich geometric information of the object in the depth channel alone.
Our framework competes with state-of-the-art approaches that require labeled real-world images.
arXiv Detail & Related papers (2021-06-27T10:41:50Z)
- Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes [77.6741486264257]
We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs.
We show that our representation is 2-3 orders of magnitude more efficient in terms of rendering speed compared to previous works.
arXiv Detail & Related papers (2021-01-26T18:50:22Z)
- Learning Depth With Very Sparse Supervision [57.911425589947314]
This paper explores the idea that perception gets coupled to 3D properties of the world via interaction with the environment.
We train a specialized global-local network architecture with what would be available to a robot interacting with the environment.
Experiments on several datasets show that, when ground truth is available even for just one of the image pixels, the proposed network can learn monocular dense depth estimation up to 22.5% more accurately than state-of-the-art approaches.
arXiv Detail & Related papers (2020-03-02T10:44:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.