Related papers: Assessing Annotation Accuracy in Ice Sheets Using Quantitative Metrics

URL: http://arxiv.org/abs/2407.09535v1
Date: Wed, 26 Jun 2024 04:43:51 GMT
Title: Assessing Annotation Accuracy in Ice Sheets Using Quantitative Metrics
Authors: Bayu Adhi Tama, Vandana Janeja, Sanjay Purushotham
Abstract summary: This study addresses the need for accurate ice sheet data interpretation by introducing a suite of quantitative metrics designed to validate ice sheet annotation techniques. Our methodology incorporates several computer vision metrics, traditionally underutilized in glaciological research, to evaluate the continuity and connectivity of ice layer annotations.
Score: 10.770434484584342
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The increasing threat of sea level rise due to climate change necessitates a deeper understanding of ice sheet structures. This study addresses the need for accurate ice sheet data interpretation by introducing a suite of quantitative metrics designed to validate ice sheet annotation techniques. Focusing on both manual and automated methods, including ARESELP and its modified version, MARESELP, we assess their accuracy against expert annotations. Our methodology incorporates several computer vision metrics, traditionally underutilized in glaciological research, to evaluate the continuity and connectivity of ice layer annotations. The results demonstrate that while manual annotations provide invaluable expert insights, automated methods, particularly MARESELP, improve layer continuity and alignment with expert labels.

Related papers

Towards a Principled Evaluation of Knowledge Editors [2.497666465251894]
We show that choosing different metrics and evaluation methodologies as well as different edit batch sizes can lead to a different ranking of knowledge editors. We also include a manual assessment of the string matching based evaluation method for knowledge editing that is favored by recently released datasets, revealing a tendency to produce false positive matches.
arXiv Detail & Related papers (2025-07-08T12:37:54Z)
AI-ready Snow Radar Echogram Dataset (SRED) for climate change monitoring [0.32985979395737786]
This study introduces the first comprehensive radar echogram dataset derived from Snow Radar airborne data collected in 2012. To demonstrate its utility, we evaluated the performance of five deep learning models on the dataset.
arXiv Detail & Related papers (2025-05-01T18:29:36Z)
IceBench: A Benchmark for Deep Learning based Sea Ice Type Classification [1.2499537119440243]
We introduce IceBench, a comprehensive benchmarking framework for sea ice type classification. IceBench is open-source and allows for convenient integration and evaluation of other sea ice type classification methods. We conduct an in-depth comparative study on representative models to assess their strengths and limitations.
arXiv Detail & Related papers (2025-03-22T23:14:50Z)
Advancing climate model interpretability: Feature attribution for Arctic melt anomalies [0.0]
The Arctic and Antarctic ice sheets are experiencing rapid surface melting and increased freshwater runoff, contributing significantly to global sea level rise. We present a novel unsupervised attribution method that leverages a counterfactual explanation method to analyze detected anomalies in ERA5 and GEMB models.
arXiv Detail & Related papers (2025-02-11T18:05:54Z)
Data Augmentation via Latent Diffusion for Saliency Prediction [67.88936624546076]
Saliency prediction models are constrained by the limited diversity and quantity of labeled data. We propose a novel data augmentation method for deep saliency prediction that edits natural images while preserving the complexity and variability of real-world scenes.
arXiv Detail & Related papers (2024-09-11T14:36:24Z)
Enabling Quick, Accurate Crowdsourced Annotation for Elevation-Aware Flood Extent Mapping [6.55068241536296]
FloodTrace is an application that enables effective crowdsourcing for flooded region annotation for machine learning training data. We provide a framework for researchers to review aggregated crowdsourced annotations and correct inaccuracies using methods inspired by uncertainty visualization.
arXiv Detail & Related papers (2024-07-31T23:42:05Z)
Partial Label Learning with Focal Loss for Sea Ice Classification Based on Ice Charts [2.0270474485799017]
We present a novel GeoAI approach to training sea ice classification by formalizing it as a partial label learning task with explicit confidence scores. We treat the polygon-level labels as candidate partial labels, assign the corresponding ice concentrations as confidence scores to each label, and integrate them with focal loss to train a Convolutional Neural Network (CNN). Our proposed approach leads to enhanced performance for sea ice classification in Sentinel-1 dual-polarized SAR images, improving classification accuracy (from 87% to 92%) and weighted average F-1 score (from 90% to 93%) compared to the conventional training approach.
arXiv Detail & Related papers (2024-06-05T22:49:30Z)
Region-level labels in ice charts can produce pixel-level segmentation for Sea Ice types [12.480532138980834]
We present a weakly supervised learning method for sea ice classification with lower-resolution labels from expert-annotated ice charts. Our method outperforms the fully supervised U-Net benchmark in both mapping resolution and class-wise accuracy.
arXiv Detail & Related papers (2024-05-16T21:54:33Z)
Enhancing Post-Hoc Explanation Benchmark Reliability for Image Classification [0.0]
Empirical evaluations demonstrate significant improvements in benchmark reliability across metrics, datasets, and post-hoc methods. This pioneering work establishes a foundation for more reliable evaluation practices in the realm of post-hoc explanation methods.
arXiv Detail & Related papers (2023-11-29T18:21:24Z)
Goodhart's Law Applies to NLP's Explanation Benchmarks [57.26445915212884]
We critically examine two sets of metrics: the ERASER metrics (comprehensiveness and sufficiency) and the EVAL-X metrics. We show that we can inflate a model's comprehensiveness and sufficiency scores dramatically without altering its predictions or explanations on in-distribution test inputs. Our results raise doubts about the ability of current metrics to guide explainability research, underscoring the need for a broader reassessment of what precisely these metrics are intended to capture.
arXiv Detail & Related papers (2023-08-28T03:03:03Z)
An Experimental Investigation into the Evaluation of Explainability Methods [60.54170260771932]
This work compares 14 different metrics when applied to nine state-of-the-art XAI methods and three dummy methods (e.g., random saliency maps) used as references. Experimental results show which of these metrics produce highly correlated results, indicating potential redundancy.
arXiv Detail & Related papers (2023-05-25T08:07:07Z)
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning [63.77667876176978]
Large language models show improved downstream task interpretability when prompted to generate step-by-step reasoning to justify their final answers. These reasoning steps greatly improve model interpretability and verification, but objectively studying their correctness is difficult. We present ROSCOE, a suite of interpretable, unsupervised automatic scores that improve and extend previous text generation evaluation metrics.
arXiv Detail & Related papers (2022-12-15T15:52:39Z)
FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment [93.09267863425492]
We argue that understanding both high-level semantics and internal temporal structures of actions in competitive sports videos is the key to making predictions accurate and interpretable. We construct a new fine-grained dataset, called FineDiving, developed on diverse diving events with detailed annotations on action procedures.
arXiv Detail & Related papers (2022-04-07T17:59:32Z)
Simulating surface height and terminus position for marine outlet glaciers using a level set method with data assimilation [0.0]
We implement a data assimilation framework for integrating ice surface and terminus position observations into a numerical ice-flow model. The model is also applied to simulate Helheim Glacier, a major tidewater-terminating glacier of the Greenland Ice Sheet.
arXiv Detail & Related papers (2022-01-28T16:45:37Z)
Deep Semi-supervised Knowledge Distillation for Overlapping Cervical Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation. We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining. Experiments show that the proposed method improves the performance significantly compared with the supervised method learned from labeled data only.
arXiv Detail & Related papers (2020-07-21T13:27:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.