What Happened 3 Seconds Ago? Inferring the Past with Thermal Imaging
- URL: http://arxiv.org/abs/2304.13651v1
- Date: Wed, 26 Apr 2023 16:23:10 GMT
- Title: What Happened 3 Seconds Ago? Inferring the Past with Thermal Imaging
- Authors: Zitian Tang, Wenjie Ye, Wei-Chiu Ma, Hang Zhao
- Abstract summary: We collect the first RGB-Thermal dataset for human motion analysis, dubbed Thermal-IM.
We develop a three-stage neural network model for accurate past human pose estimation.
- Score: 22.923237551192834
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inferring past human motion from RGB images is challenging due to the
inherent uncertainty of the prediction problem. Thermal images, on the other
hand, encode traces of past human-object interactions left in the environment
via thermal radiation measurement. Based on this observation, we collect the
first RGB-Thermal dataset for human motion analysis, dubbed Thermal-IM. Then we
develop a three-stage neural network model for accurate past human pose
estimation. Comprehensive experiments show that thermal cues significantly
reduce the ambiguities of this task, and the proposed model achieves remarkable
performance. The dataset is available at
https://github.com/ZitianTang/Thermal-IM.
Related papers
- CattleFace-RGBT: RGB-T Cattle Facial Landmark Benchmark [4.463254896517738]
CattleFace-RGBT is an RGB-T Cattle Facial Landmark dataset consisting of 2,300 RGB-T image pairs, a total of 4,600 images.
Applying AI to thermal images is challenging due to suboptimal results from direct thermal training and infeasible RGB-thermal alignment.
We transfer models trained on RGB to thermal images and refine them using our AI-assisted annotation tool.
arXiv Detail & Related papers (2024-06-05T16:29:13Z)
- ThermoNeRF: Multimodal Neural Radiance Fields for Thermal Novel View Synthesis [5.66229031510643]
We propose ThermoNeRF, a novel approach to rendering new RGB and thermal views of a scene jointly.
To overcome the lack of texture in thermal images, we use paired RGB and thermal images to learn scene density.
We also introduce ThermoScenes, a new dataset to palliate the lack of available RGB+thermal datasets for scene reconstruction.
arXiv Detail & Related papers (2024-03-18T18:10:34Z)
- Closing the Gap in Human Behavior Analysis: A Pipeline for Synthesizing Trimodal Data [1.8024397171920885]
We introduce a novel generative technique for creating trimodal, i.e., RGB, thermal, and depth, human-focused datasets.
This technique capitalizes on human segmentation masks derived from RGB images, combined with thermal and depth backgrounds that are sourced automatically.
By employing this approach, we generate trimodal data that can be leveraged to train models for settings with limited data, bad lighting conditions, or privacy-sensitive areas.
arXiv Detail & Related papers (2024-02-02T16:27:45Z)
- Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction [66.10717041384625]
Zolly is the first 3D human mesh reconstruction (3DHMR) method focusing on perspective-distorted images.
We propose a new camera model and a novel 2D representation, termed distortion image, which describes the 2D dense distortion scale of the human body.
We extend two real-world datasets tailored for this task, both containing perspective-distorted human images.
arXiv Detail & Related papers (2023-03-24T04:22:41Z)
- Does Thermal Really Always Matter for RGB-T Salient Object Detection? [153.17156598262656]
This paper proposes a network named TNet to solve the RGB-T salient object detection (SOD) task.
In this paper, we introduce a global illumination estimation module to predict the global illuminance score of the image.
We also introduce a two-stage localization and complementation module in the decoding phase to transfer the object localization cue and internal integrity cue in thermal features to the RGB modality.
arXiv Detail & Related papers (2022-10-09T13:50:12Z)
- A Novel Fully Annotated Thermal Infrared Face Dataset: Recorded in Various Environment Conditions and Distances From The Camera [3.2872586139884623]
This article presents a novel public dataset on facial thermography, which we call Charlotte-ThermalFace.
Charlotte-ThermalFace contains more than 10,000 infrared thermal images in varying thermal conditions, several distances from the camera, and different head positions.
The data is fully annotated with the facial landmarks, ambient temperature, relative humidity, the air speed of the room, distance to the camera, and subject thermal sensation at the time of capturing each image.
arXiv Detail & Related papers (2022-04-29T17:57:54Z)
- Learning Dynamic View Synthesis With Few RGBD Cameras [60.36357774688289]
We propose to utilize RGBD cameras to synthesize free-viewpoint videos of dynamic indoor scenes.
We generate point clouds from RGBD frames and then render them into free-viewpoint videos via a neural feature.
We introduce a simple Regional Depth-Inpainting module that adaptively inpaints missing depth values to render complete novel views.
arXiv Detail & Related papers (2022-04-22T03:17:35Z)
- Real-time RGBD-based Extended Body Pose Estimation [57.61868412206493]
We present a system for real-time RGBD-based estimation of 3D human pose.
We use a parametric 3D deformable human mesh model (SMPL-X) as a representation.
We train estimators of body pose and facial expression parameters.
arXiv Detail & Related papers (2021-03-05T13:37:50Z)
- A Large-Scale, Time-Synchronized Visible and Thermal Face Dataset [62.193924313292875]
We present the DEVCOM Army Research Laboratory Visible-Thermal Face dataset (ARL-VTF).
With over 500,000 images from 395 subjects, the ARL-VTF dataset represents, to the best of our knowledge, the largest collection of paired visible and thermal face images to date.
This paper presents benchmark results and analysis on thermal face landmark detection and thermal-to-visible face verification by evaluating state-of-the-art models on the ARL-VTF dataset.
arXiv Detail & Related papers (2021-01-07T17:17:12Z) - The Use of AI for Thermal Emotion Recognition: A Review of Problems and
Limitations in Standard Design and Data [36.33347149799959]
With the increased attention on thermal imagery for Covid-19 screening, the public sector may believe there are new opportunities to exploit thermal as a modality for computer vision and AI.
This paper takes the reader on a short review of machine learning in thermal FER and the limitations of collecting and developing thermal FER data for AI training.
arXiv Detail & Related papers (2020-09-22T14:58:59Z) - I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human
Pose and Mesh Estimation from a Single RGB Image [79.040930290399]
We propose I2L-MeshNet, an image-to-lixel (line+pixel) prediction network.
The proposed I2L-MeshNet predicts the per-lixel likelihood on 1D heatmaps for each mesh coordinate instead of directly regressing the parameters.
Our lixel-based 1D heatmap preserves the spatial relationship in the input image and models the prediction uncertainty.
arXiv Detail & Related papers (2020-08-09T12:13:31Z)