Photometric Redshift Estimation with Convolutional Neural Networks and
Galaxy Images: A Case Study of Resolving Biases in Data-Driven Methods
- URL: http://arxiv.org/abs/2202.09964v1
- Date: Mon, 21 Feb 2022 02:59:33 GMT
- Title: Photometric Redshift Estimation with Convolutional Neural Networks and
Galaxy Images: A Case Study of Resolving Biases in Data-Driven Methods
- Authors: Q. Lin, D. Fouchez, J. Pasquet, M. Treyer, R. Ait Ouahmed, S. Arnouts,
and O. Ilbert
- Abstract summary: We investigate two major forms of biases, i.e., class-dependent residuals and mode collapse, in a case study of estimating photometric redshifts.
We propose a set of consecutive steps for resolving the two biases based on CNN models.
Experiments show that our methods control biases better than benchmark methods.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Learning models have been increasingly exploited in astrophysical
studies, yet such data-driven algorithms are prone to producing biased outputs
detrimental to subsequent analyses. In this work, we investigate two major
forms of bias, namely class-dependent residuals and mode collapse, in a case
study of estimating photometric redshifts as a classification problem using
Convolutional Neural Networks (CNNs) and galaxy images with spectroscopic
redshifts. We focus on point estimates and propose a set of consecutive steps
for resolving the two biases based on CNN models, involving representation
learning with multi-channel outputs, balancing the training data and leveraging
soft labels. The residuals can be viewed as a function of either spectroscopic
or photometric redshift, and the biases under these two definitions are not
interchangeable and should be treated separately. We suggest
that resolving biases in the spectroscopic space is a prerequisite for
resolving biases in the photometric space. Experiments show that our methods
control biases better than benchmark methods and remain robust under varying
implementation and training conditions, provided high-quality data are
available. Our methods hold promise for future cosmological surveys that
require tight control of biases, and may
be applied to regression problems and other studies that make use of
data-driven models. Nonetheless, the bias-variance trade-off and the demand for
sufficient statistics point to the need for better methodologies and optimized
data usage strategies.
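As a rough illustration of the ingredients named in the abstract (treating photometric redshift estimation as classification over redshift bins, smoothing the targets into soft labels, and balancing the training data), here is a minimal PyTorch sketch. The bin edges, smoothing width, network architecture, sampler and loss are assumptions made for this example, not the configuration used in the paper.

```python
# Hypothetical sketch, not the paper's code: photometric redshift estimation as
# classification over redshift bins, with Gaussian-smoothed soft labels and
# class-balanced sampling. Bin edges, smoothing width, architecture and loss
# are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import TensorDataset, DataLoader, WeightedRandomSampler

N_BINS, Z_MAX = 180, 0.9                      # assumed binning of the redshift range
bin_edges = np.linspace(0.0, Z_MAX, N_BINS + 1)
bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])

def soft_labels(z_spec, sigma=0.01):
    """Gaussian-smoothed class probabilities centred on each spectroscopic redshift."""
    d = bin_centers[None, :] - np.asarray(z_spec)[:, None]
    p = np.exp(-0.5 * (d / sigma) ** 2)
    return p / p.sum(axis=1, keepdims=True)

def balancing_weights(z_spec):
    """Inverse-frequency sampling weights so redshift bins are drawn more evenly."""
    idx = np.clip(np.digitize(z_spec, bin_edges) - 1, 0, N_BINS - 1)
    counts = np.bincount(idx, minlength=N_BINS).astype(float)
    return 1.0 / counts[idx]

class SmallCNN(nn.Module):
    """Toy CNN producing multi-channel (per-bin) outputs; a stand-in architecture."""
    def __init__(self, n_channels=5, n_bins=N_BINS):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(n_channels, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, n_bins)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))  # logits over redshift bins

def train_step(model, optimiser, images, targets):
    """One optimisation step with a KL-divergence loss against the soft labels."""
    optimiser.zero_grad()
    log_p = F.log_softmax(model(images), dim=1)
    loss = F.kl_div(log_p, targets, reduction="batchmean")
    loss.backward()
    optimiser.step()
    return loss.item()

# Usage sketch (images: float tensor [N, 5, 64, 64], z_spec: array of length N):
# targets = torch.as_tensor(soft_labels(z_spec), dtype=torch.float32)
# sampler = WeightedRandomSampler(torch.as_tensor(balancing_weights(z_spec)),
#                                 num_samples=len(z_spec))
# loader = DataLoader(TensorDataset(images, targets), batch_size=64, sampler=sampler)
# z_phot = (F.softmax(model(images), dim=1).detach().numpy() * bin_centers).sum(axis=1)
```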
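The distinction between residuals viewed in the spectroscopic space and in the photometric space can be made concrete with a small diagnostic. The sketch below assumes the conventional normalised residual (z_phot - z_spec) / (1 + z_spec) and an illustrative binning; the abstract does not spell out the exact definitions used.

```python
import numpy as np

def binned_residuals(z_phot, z_spec, bin_on, edges):
    """Mean normalised residual (z_phot - z_spec) / (1 + z_spec) in bins of `bin_on`."""
    dz = (z_phot - z_spec) / (1.0 + z_spec)
    idx = np.digitize(bin_on, edges) - 1
    return np.array([dz[idx == i].mean() if np.any(idx == i) else np.nan
                     for i in range(len(edges) - 1)])

edges = np.linspace(0.0, 0.9, 19)  # illustrative binning of the redshift range
# bias_vs_zspec = binned_residuals(z_phot, z_spec, bin_on=z_spec, edges=edges)  # spectroscopic space
# bias_vs_zphot = binned_residuals(z_phot, z_spec, bin_on=z_phot, edges=edges)  # photometric space
```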
Related papers
- CLAP. I. Resolving miscalibration for deep learning-based galaxy photometric redshift estimation [3.611102630303458]
We develop a novel method called the Contrastive Learning and Adaptive KNN for Photometric Redshift (CLAP).
It leverages supervised contrastive learning (SCL) and k-nearest neighbours (KNN) to construct and calibrate raw probability density estimates.
The harmonic mean is adopted to combine an ensemble of estimates from multiple realisations for improving accuracy.
arXiv Detail & Related papers (2024-10-25T08:46:55Z)
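A minimal sketch of the harmonic-mean combination step mentioned in the CLAP summary, assuming each realisation supplies a per-galaxy redshift PDF on a common grid; the array shapes and the small epsilon guard are assumptions for illustration, not the CLAP implementation.

```python
import numpy as np

def harmonic_mean_combine(pdf_ensemble, eps=1e-12):
    """Combine per-galaxy redshift PDFs from several realisations
    (shape [n_realisations, n_galaxies, n_bins]) with the harmonic mean,
    then renormalise each combined PDF over the redshift grid."""
    hm = pdf_ensemble.shape[0] / np.sum(1.0 / (pdf_ensemble + eps), axis=0)
    return hm / hm.sum(axis=-1, keepdims=True)
```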
- Learning Diffusion Model from Noisy Measurement using Principled Expectation-Maximization Method [9.173055778539641]
We propose a principled expectation-maximization (EM) framework that iteratively learns diffusion models from noisy data with arbitrary corruption types.
Our framework employs a plug-and-play Monte Carlo method to accurately estimate clean images from noisy measurements, followed by training the diffusion model using the reconstructed images.
arXiv Detail & Related papers (2024-10-15T03:54:59Z)
- Few-shot learning for COVID-19 Chest X-Ray Classification with Imbalanced Data: An Inter vs. Intra Domain Study [49.5374512525016]
Medical image datasets are essential for training models used in computer-aided diagnosis, treatment planning, and medical research.
Some challenges are associated with these datasets, including variability in data distribution, data scarcity, and transfer learning issues when using models pre-trained from generic images.
We propose a methodology based on Siamese neural networks in which a series of techniques are integrated to mitigate the effects of data scarcity and distribution imbalance.
arXiv Detail & Related papers (2024-01-18T16:59:27Z)
- Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z)
- Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected differential equation evolving on the support of the data.
Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z)
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features, addressing the dynamic nature of bias that existing methods neglect.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- Analyzing the Effects of Handling Data Imbalance on Learned Features from Medical Images by Looking Into the Models [50.537859423741644]
Training a model on an imbalanced dataset can introduce unique challenges to the learning problem.
We look deeper into the internal units of neural networks to observe how handling data imbalance affects the learned features.
arXiv Detail & Related papers (2022-04-04T09:38:38Z)
- Learning Discriminative Shrinkage Deep Networks for Image Deconvolution [122.79108159874426]
We propose an effective non-blind deconvolution approach by learning discriminative shrinkage functions to implicitly model these terms.
Experimental results show that the proposed method performs favorably against the state-of-the-art ones in terms of efficiency and accuracy.
arXiv Detail & Related papers (2021-11-27T12:12:57Z)
- Score-based diffusion models for accelerated MRI [35.3148116010546]
We introduce a way to sample data from a conditional distribution given the measurements, such that the model can be readily used for solving inverse problems in imaging.
Our model requires magnitude images only for training, and yet is able to reconstruct complex-valued data, and even extends to parallel imaging.
arXiv Detail & Related papers (2021-10-08T08:42:03Z)
- Visual Recognition with Deep Learning from Biased Image Datasets [6.10183951877597]
We show how biasing models can be applied to remedy problems in the context of visual recognition.
Based on the (approximate) knowledge of the biasing mechanisms at work, our approach consists in reweighting the observations.
We propose to use a low dimensional image representation, shared across the image databases.
arXiv Detail & Related papers (2021-09-06T10:56:58Z)
- Scalable Statistical Inference of Photometric Redshift via Data Subsampling [0.3222802562733786]
Handling big data has been a major bottleneck for traditional statistical models.
We develop a data-driven statistical modeling framework that combines the uncertainties from an ensemble of statistical models.
We demonstrate this method on a photometric redshift estimation problem in cosmology.
arXiv Detail & Related papers (2021-03-30T02:49:50Z)
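The data-subsampling idea in the last entry above can be illustrated by fitting one model per random subsample and pooling the predictions. The regressor choice and the mean/spread combination rule below are assumptions made for the sketch, not the paper's actual framework.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def subsample_ensemble(X, y, X_test, n_models=10, frac=0.2, seed=0):
    """Fit one regressor per random data subsample and pool the predictions.
    The ensemble mean is the point estimate; the spread is a rough uncertainty."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_models):
        idx = rng.choice(len(X), size=int(frac * len(X)), replace=False)
        model = GradientBoostingRegressor().fit(X[idx], y[idx])
        preds.append(model.predict(X_test))
    preds = np.stack(preds)                      # [n_models, n_test]
    return preds.mean(axis=0), preds.std(axis=0)
```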
This list is automatically generated from the titles and abstracts of the papers on this site.