CattleFace-RGBT: RGB-T Cattle Facial Landmark Benchmark
- URL: http://arxiv.org/abs/2406.03431v1
- Date: Wed, 5 Jun 2024 16:29:13 GMT
- Title: CattleFace-RGBT: RGB-T Cattle Facial Landmark Benchmark
- Authors: Ethan Coffman, Reagan Clark, Nhat-Tan Bui, Trong Thang Pham, Beth Kegley, Jeremy G. Powell, Jiangchao Zhao, Ngan Le
- Abstract summary: CattleFace-RGBT is an RGB-T cattle facial landmark dataset consisting of 2,300 RGB-T image pairs (4,600 images in total).
Applying AI to thermal images is challenging due to suboptimal results from direct thermal training and infeasible RGB-thermal alignment.
We transfer models trained on RGB to thermal images and refine them using our AI-assisted annotation tool.
- Score: 4.463254896517738
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To address this challenge, we introduce CattleFace-RGBT, an RGB-T Cattle Facial Landmark dataset consisting of 2,300 RGB-T image pairs, a total of 4,600 images. Creating a landmark dataset is time-consuming, but AI-assisted annotation can help. However, applying AI to thermal images is challenging due to suboptimal results from direct thermal training and infeasible RGB-thermal alignment due to different camera views. Therefore, we opt to transfer models trained on RGB to thermal images and refine them using our AI-assisted annotation tool following a semi-automatic annotation approach. Accurately localizing facial key points on both RGB and thermal images enables us to not only discern the cattle's respiratory signs but also measure temperatures to assess the animal's thermal state. To the best of our knowledge, this is the first dataset for cattle facial landmarks on RGB-T images. We conduct benchmarking of the CattleFace-RGBT dataset across various backbone architectures, with the objective of establishing baselines for future research, analysis, and comparison. The dataset and models are available at https://github.com/UARK-AICV/CattleFace-RGBT-benchmark
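The abstract notes that landmarks localized on the thermal image make it possible to measure temperatures at facial key points. A minimal sketch of that idea follows, under stated assumptions: the thermal image is already available as a 2D NumPy array of temperatures in degrees Celsius, landmarks are given as (x, y) pixel coordinates, and the function name `landmark_temperatures` and its `radius` parameter are hypothetical. None of this is taken from the paper or its repository.

```python
import numpy as np


def landmark_temperatures(thermal_c, landmarks_xy, radius=2):
    """Mean temperature in a small square window around each landmark.

    thermal_c    -- 2D array of per-pixel temperatures (assumed degrees Celsius)
    landmarks_xy -- iterable of (x, y) pixel coordinates (x = column, y = row)
    radius       -- half-width of the averaging window (hypothetical parameter)
    """
    height, width = thermal_c.shape
    temps = []
    for x, y in landmarks_xy:
        col, row = int(round(x)), int(round(y))
        # Landmarks that fall outside the image get NaN instead of a value.
        if not (0 <= row < height and 0 <= col < width):
            temps.append(float("nan"))
            continue
        # Clip the window to the image borders before averaging.
        r0, r1 = max(row - radius, 0), min(row + radius + 1, height)
        c0, c1 = max(col - radius, 0), min(col + radius + 1, width)
        temps.append(float(thermal_c[r0:r1, c0:c1].mean()))
    return temps


# Synthetic example: a 10x10 map at 30 C with a warmer 3x3 patch at 38 C.
thermal = np.full((10, 10), 30.0)
thermal[4:7, 4:7] = 38.0
result = landmark_temperatures(thermal, [(5, 5), (0, 0), (20, 3)], radius=1)
```

Averaging over a window rather than reading a single pixel is a design choice of this sketch, intended to reduce sensitivity to small landmark localization errors.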
Related papers
- T-FAKE: Synthesizing Thermal Images for Facial Landmarking [8.20594611891252]
We introduce the T-FAKE dataset, a new large-scale synthetic thermal dataset with sparse and dense landmarks.
Our models show excellent performance with both sparse 70-point landmarks and dense 478-point landmark annotations.
arXiv Detail & Related papers (2024-08-27T15:07:58Z) - Alignment-Free RGBT Salient Object Detection: Semantics-guided Asymmetric Correlation Network and A Unified Benchmark [15.435695491233982]
RGB and Thermal (RGBT) Salient Object Detection (SOD) aims to achieve high-quality saliency prediction.
Existing methods are tailored for manually aligned image pairs, which are labor-intensive.
We make the first attempt to address RGBT SOD for initially captured RGB and thermal image pairs without manual alignment.
arXiv Detail & Related papers (2024-06-03T01:01:58Z) - Caltech Aerial RGB-Thermal Dataset in the Wild [14.699908177967181]
We present the first publicly-available RGB-thermal dataset designed for aerial robotics operating in natural environments.
Our dataset captures a variety of terrain across the United States, including rivers, lakes, coastlines, deserts, and forests.
We provide semantic segmentation annotations for 10 classes commonly encountered in natural settings.
arXiv Detail & Related papers (2024-03-13T23:31:04Z) - Visible to Thermal image Translation for improving visual task in low light conditions [0.0]
We have collected images from two different locations using the Parrot Anafi Thermal drone.
We created a two-stream network, preprocessed and augmented the image data, and trained the generator and discriminator models from scratch.
The findings demonstrate that it is feasible to translate RGB training data to thermal data using GAN.
arXiv Detail & Related papers (2023-10-31T05:18:53Z) - What Happened 3 Seconds Ago? Inferring the Past with Thermal Imaging [22.923237551192834]
We collect the first RGB-Thermal dataset for human motion analysis, dubbed Thermal-IM.
We develop a three-stage neural network model for accurate past human pose estimation.
arXiv Detail & Related papers (2023-04-26T16:23:10Z) - Precise Facial Landmark Detection by Reference Heatmap Transformer [52.417964103227696]
We propose a novel Reference Heatmap Transformer (RHT) for more precise facial landmark detection.
The experimental results from challenging benchmark datasets demonstrate that our proposed method outperforms the state-of-the-art methods in the literature.
arXiv Detail & Related papers (2023-03-14T12:26:48Z) - Blind Face Restoration: Benchmark Datasets and a Baseline Model [63.053331687284064]
Blind Face Restoration (BFR) aims to construct a high-quality (HQ) face image from its corresponding low-quality (LQ) input.
We first synthesize two blind face restoration benchmark datasets called EDFace-Celeb-1M (BFR128) and EDFace-Celeb-150K (BFR512).
State-of-the-art methods are benchmarked on them under five settings: blur, noise, low resolution, JPEG compression artifacts, and their combination (full degradation).
arXiv Detail & Related papers (2022-06-08T06:34:24Z) - GradViT: Gradient Inversion of Vision Transformers [83.54779732309653]
We demonstrate the vulnerability of vision transformers (ViTs) to gradient-based inversion attacks.
We introduce a method, named GradViT, that optimizes random noise into natural-looking images.
We observe unprecedentedly high fidelity and closeness to the original (hidden) data.
arXiv Detail & Related papers (2022-03-22T17:06:07Z) - Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images [69.5662419067878]
Grounding referring expressions in RGBD images is an emerging field.
We present a novel task of 3D visual grounding in single-view RGBD images, where the referred objects are often only partially scanned due to occlusion.
Our approach first fuses the language and the visual features at the bottom level to generate a heatmap that localizes the relevant regions in the RGBD image.
Then our approach conducts an adaptive feature learning based on the heatmap and performs the object-level matching with another visio-linguistic fusion to finally ground the referred object.
arXiv Detail & Related papers (2021-03-14T11:18:50Z) - A Large-Scale, Time-Synchronized Visible and Thermal Face Dataset [62.193924313292875]
We present the DEVCOM Army Research Laboratory Visible-Thermal Face dataset (ARL-VTF).
With over 500,000 images from 395 subjects, the ARL-VTF dataset represents, to the best of our knowledge, the largest collection of paired visible and thermal face images to date.
This paper presents benchmark results and analysis on thermal face landmark detection and thermal-to-visible face verification by evaluating state-of-the-art models on the ARL-VTF dataset.
arXiv Detail & Related papers (2021-01-07T17:17:12Z) - Multi-Scale Thermal to Visible Face Verification via Attribute Guided Synthesis [55.29770222566124]
We use attributes extracted from visible images to synthesize attribute-preserved visible images from thermal imagery for cross-modal matching.
A novel multi-scale generator is proposed to synthesize the visible image from the thermal image guided by the extracted attributes.
A pre-trained VGG-Face network is leveraged to extract features from the synthesized image and the input visible image for verification.
arXiv Detail & Related papers (2020-04-20T01:45:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.