Neural Field Representations of Mobile Computational Photography
- URL: http://arxiv.org/abs/2508.05907v1
- Date: Fri, 08 Aug 2025 00:03:46 GMT
- Title: Neural Field Representations of Mobile Computational Photography
- Authors: Ilya Chugunov
- Abstract summary: I show how carefully designed neural field models can compactly represent complex geometry and lighting effects, enabling applications such as depth estimation, layer separation, and image stitching directly from in-the-wild mobile photography data.
- Score: 4.459996749171579
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Over the past two decades, mobile imaging has experienced a profound transformation, with cell phones rapidly eclipsing all other forms of digital photography in popularity. Today's cell phones are equipped with a diverse range of imaging technologies - laser depth ranging, multi-focal camera arrays, and split-pixel sensors - alongside non-visual sensors such as gyroscopes, accelerometers, and magnetometers. This, combined with on-board integrated chips for image and signal processing, makes the cell phone a versatile pocket-sized computational imaging platform. Parallel to this, we have seen in recent years how neural fields - small neural networks trained to map continuous spatial input coordinates to output signals - enable the reconstruction of complex scenes without explicit data representations such as pixel arrays or point clouds. In this thesis, I demonstrate how carefully designed neural field models can compactly represent complex geometry and lighting effects, enabling applications such as depth estimation, layer separation, and image stitching directly from in-the-wild mobile photography data. These methods outperform state-of-the-art approaches without relying on complex pre-processing steps, labeled ground-truth data, or machine learning priors. Instead, they leverage well-constructed, self-regularized models that tackle challenging inverse problems through stochastic gradient descent, fitting directly to raw measurements from a smartphone.
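The core idea of a neural field - a small model trained by stochastic gradient descent to map continuous coordinates to signal values - can be illustrated with a minimal sketch. The encoding size, frequencies, and training target below are illustrative assumptions, not the architecture used in the thesis; for simplicity a linear layer on a random Fourier positional encoding stands in for a full MLP.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy continuous signal standing in for raw image measurements.
def target(xy):
    return np.sin(4 * np.pi * xy[:, 0]) * np.cos(2 * np.pi * xy[:, 1])

# Fourier positional encoding: lifts 2D coordinates to a higher-dimensional
# space so a simple model can represent high-frequency detail.
B = rng.normal(scale=3.0, size=(2, 64))  # random frequency matrix (assumed scale)
def encode(xy):
    proj = 2 * np.pi * xy @ B
    return np.concatenate([np.sin(proj), np.cos(proj)], axis=1)  # (N, 128)

# Training data: coordinates in [0,1]^2 paired with signal samples.
xy = rng.uniform(size=(2048, 2))
y = target(xy)

# "Field" model: a linear layer on the encoding, fit by mini-batch SGD on MSE.
w = np.zeros(128)
lr = 1e-2
for step in range(2000):
    idx = rng.integers(0, len(xy), size=256)      # mini-batch of coordinates
    feats = encode(xy[idx])
    residual = feats @ w - y[idx]
    w -= lr * feats.T @ residual / len(idx)       # gradient step on MSE

# Evaluate reconstruction error at fresh, unseen coordinates.
xy_test = rng.uniform(size=(512, 2))
mse = np.mean((encode(xy_test) @ w - target(xy_test)) ** 2)
print(f"test MSE: {mse:.4f}")
```

Because the representation is a continuous function of coordinates, it can be queried at any resolution after fitting - the property the thesis exploits for depth estimation and stitching.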
Related papers
- Fast and accurate neural reflectance transformation imaging through knowledge distillation [3.8135470187943556]
Reflectance Transformation Imaging (RTI) is popular for its ability to support visual analysis of surfaces. Traditional methods like Polynomial Texture Maps (PTM) and Hemispherical Harmonics (HSH) are compact and fast. We propose to reduce the computational cost of neural RTI through a novel solution based on Knowledge Distillation (DisK-NeuralRTI).
arXiv Detail & Related papers (2025-10-28T15:00:07Z) - Visual Odometry with Transformers [68.453547770334]
We introduce the Visual Odometry Transformer (VoT), which processes sequences of monocular frames by extracting features. Unlike prior methods, VoT directly predicts camera motion without estimating dense geometry and relies solely on camera poses for supervision. VoT scales effectively with larger datasets, benefits substantially from stronger pre-trained backbones, generalizes across diverse camera motions and calibration settings, and outperforms traditional methods while running more than 3 times faster.
arXiv Detail & Related papers (2025-10-02T17:00:14Z) - Learned Lightweight Smartphone ISP with Unpaired Data [55.2480439325792]
We propose a novel training method for a learnable Image Signal Processor (ISP). Our unpaired approach employs a multi-term loss function guided by adversarial training. Compared to paired training methods, our unpaired learning strategy shows strong potential and achieves high fidelity.
arXiv Detail & Related papers (2025-05-15T15:37:51Z) - Hardware, Algorithms, and Applications of the Neuromorphic Vision Sensor: a Review [0.0]
Neuromorphic, or event, cameras represent a transformation in the classical approach to visual sensing: they encode detected instantaneous per-pixel illumination changes into an asynchronous stream of event packets. Their novelty lies in the transition from capturing full picture frames at fixed time intervals to a sparse data format which, with its distinctive qualities, offers potential improvements in various applications.
arXiv Detail & Related papers (2025-04-11T14:46:36Z) - Neuromorphic Optical Tracking and Imaging of Randomly Moving Targets through Strongly Scattering Media [8.480104395572418]
We develop an end-to-end neuromorphic optical engineering and computational approach to track and image normally invisible objects. Photons emerging from dense scattering media are detected by the event camera and converted to pixel-wise asynchronous spike trains. We demonstrate tracking and imaging of randomly moving objects in dense turbid media, as well as image reconstruction of spatially stationary but optically dynamic objects.
arXiv Detail & Related papers (2025-01-07T15:38:13Z) - Learning Robust Multi-Scale Representation for Neural Radiance Fields from Unposed Images [65.41966114373373]
We present an improved solution to the neural image-based rendering problem in computer vision.
The proposed approach could synthesize a realistic image of the scene from a novel viewpoint at test time.
arXiv Detail & Related papers (2023-11-08T08:18:23Z) - Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network [69.96295927854042]
Low-light environments usually lead to less informative large-scale dark areas.
We propose to integrate the effectiveness of gamma correction with the strong modelling capacities of deep networks.
Because the exponential operation introduces high computational complexity, we propose to use a Taylor series to approximate gamma correction.
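The Taylor-series trick referenced above can be sketched generically (this is a standard illustration, not the authors' exact formulation): since x^gamma = exp(gamma * ln x), the exponential can be truncated to a few Taylor terms, avoiding a direct power operation.

```python
import math

def gamma_taylor(x, gamma, terms=8):
    """Approximate x**gamma via exp(u) ~ sum_k u^k / k!, where u = gamma * ln(x).

    Accuracy degrades as x approaches 0, where |ln x| grows and more terms
    are needed; `terms=8` is an illustrative choice.
    """
    u = gamma * math.log(x)
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= u / k          # builds u^k / k! incrementally
        total += term
    return total

x, gamma = 0.5, 1.0 / 2.2      # a typical display gamma
exact = x ** gamma
approx = gamma_taylor(x, gamma)
print(exact, approx)
```

For pixel values not too close to zero, a handful of terms already gives errors far below 8-bit quantization noise.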
arXiv Detail & Related papers (2023-08-16T08:46:51Z) - Searching a Compact Architecture for Robust Multi-Exposure Image Fusion [55.37210629454589]
Two major stumbling blocks hinder the development, including pixel misalignment and inefficient inference.
This study introduces an architecture search-based paradigm incorporating self-alignment and detail repletion modules for robust multi-exposure image fusion.
The proposed method outperforms various competitive schemes, achieving a noteworthy 3.19% improvement in PSNR for general scenarios and an impressive 23.5% enhancement in misaligned scenarios.
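For reference, the PSNR metric quoted above is defined as 10 * log10(MAX^2 / MSE). A minimal sketch on synthetic data (the image sizes and noise level here are arbitrary, not the paper's benchmark):

```python
import numpy as np

def psnr(ref, test, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

rng = np.random.default_rng(0)
clean = rng.uniform(size=(64, 64))                     # toy "ground truth" image
noisy = np.clip(clean + rng.normal(scale=0.05, size=clean.shape), 0.0, 1.0)
value = psnr(clean, noisy)
print(f"{value:.2f} dB")
```

Higher PSNR means lower reconstruction error; a percentage improvement in PSNR, as reported above, is therefore a relative gain in this logarithmic score.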
arXiv Detail & Related papers (2023-05-20T17:01:52Z) - High Dynamic Range and Super-Resolution from Raw Image Bursts [52.341483902624006]
This paper introduces the first approach to reconstruct high-resolution, high-dynamic range color images from raw photographic bursts captured by a handheld camera with exposure bracketing.
The proposed algorithm is fast, with low memory requirements compared to state-of-the-art learning-based approaches to image restoration.
Experiments demonstrate its excellent performance with super-resolution factors of up to $\times 4$ on real photographs taken in the wild with hand-held cameras.
arXiv Detail & Related papers (2022-07-29T13:31:28Z) - Sim-to-real for high-resolution optical tactile sensing: From images to 3D contact force distributions [5.939410304994348]
This article proposes a strategy to generate tactile images in simulation for a vision-based tactile sensor based on an internal camera.
The deformation of the material is simulated in a finite element environment under a diverse set of contact conditions, and spherical particles are projected to a simulated image.
Features extracted from the images are mapped to the 3D contact force distribution, with the ground truth also obtained via finite-element simulations.
arXiv Detail & Related papers (2020-12-21T12:43:33Z) - Mesoscopic photogrammetry with an unstabilized phone camera [8.210210271599134]
We present a feature-free photogrammetric computation technique that enables quantitative 3D mesoscopic (mm-scale height variation) imaging.
Our end-to-end, pixel-intensity-based approach jointly registers and stitches all the images by estimating a coaligned height map.
We also propose strategies for reducing time and memory, applicable to other multi-frame registration problems.
arXiv Detail & Related papers (2020-12-11T00:09:18Z) - Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.