A new baseline for retinal vessel segmentation: Numerical identification
and correction of methodological inconsistencies affecting 100+ papers
- URL: http://arxiv.org/abs/2111.03853v1
- Date: Sat, 6 Nov 2021 11:09:11 GMT
- Title: A new baseline for retinal vessel segmentation: Numerical identification
and correction of methodological inconsistencies affecting 100+ papers
- Authors: György Kovács, Attila Fazekas
- Abstract summary: We performed a detailed numerical analysis of the coherence of the published performance scores.
We found inconsistencies in the reported scores related to the use of the field of view.
The highest accuracy score achieved to date is 0.9582 in the FoV region, which is 1% higher than that of human annotators.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the last 15 years, the segmentation of vessels in retinal images has
become an intensively researched problem in medical imaging, with hundreds of
algorithms published. One of the de facto benchmarking data sets of vessel
segmentation techniques is the DRIVE data set. Since DRIVE contains a
predefined split of training and test images, the published performance results
of the various segmentation techniques should provide a reliable ranking of the
algorithms. Including more than 100 papers in the study, we performed a
detailed numerical analysis of the coherence of the published performance
scores. We found inconsistencies in the reported scores related to the use of
the field of view (FoV), which has a significant impact on the performance
scores. We attempted to eliminate the biases using numerical techniques to
provide a more realistic picture of the state of the art. Based on the results,
we have formulated several findings, most notably: despite the well-defined
test set of DRIVE, most rankings in published papers are based on
non-comparable figures; in contrast to the near-perfect accuracy scores
reported in the literature, the highest accuracy score achieved to date is
0.9582 in the FoV region, which is 1% higher than that of human annotators. The
methods we have developed for identifying and eliminating the evaluation biases
can be easily applied to other domains where similar problems may arise.
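The inconsistency the paper quantifies comes down to which pixels enter the accuracy computation: outside the circular field of view (FoV) both the ground truth and any reasonable prediction are background, so including those pixels inflates the score. The sketch below is a minimal NumPy illustration of this effect on synthetic data (the image size, FoV radius, vessel density, and error rate are all made up, and this is not the authors' evaluation code).

```python
import numpy as np

def accuracy(pred, gt, mask=None):
    """Pixel accuracy, optionally restricted to a region-of-interest mask."""
    if mask is not None:
        pred, gt = pred[mask], gt[mask]
    return float((pred == gt).mean())

# Toy 100x100 "retinal image": a circular FoV covering roughly 60% of the pixels.
yy, xx = np.mgrid[:100, :100]
fov = (yy - 50) ** 2 + (xx - 50) ** 2 <= 44 ** 2

rng = np.random.default_rng(0)
gt = fov & (rng.random((100, 100)) < 0.12)    # ~12% of FoV pixels are vessel
flip = fov & (rng.random((100, 100)) < 0.05)  # misclassify 5% of FoV pixels
pred = np.where(flip, ~gt, gt)

# Outside the FoV both prediction and ground truth are background, so counting
# those pixels inflates the score relative to the FoV-restricted accuracy.
print("accuracy over the whole image:", accuracy(pred, gt))
print("accuracy inside the FoV only :", accuracy(pred, gt, fov))
```

The gap between the two printed scores is exactly the kind of bias that makes whole-image and FoV-restricted accuracies non-comparable across papers.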
Related papers
- Efficient Data-Sketches and Fine-Tuning for Early Detection of Distributional Drift in Medical Imaging [5.1358645354733765]
This paper presents an accurate and sensitive approach to detect distributional drift in CT-scan medical images.
We developed a robust library model for real-time anomaly detection, allowing for efficient comparison of incoming images.
We fine-tuned a pre-trained vision transformer model on breast cancer images to extract relevant features.
arXiv Detail & Related papers (2024-08-15T23:46:37Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- Intrinsic Self-Supervision for Data Quality Audits [35.69673085324971]
Benchmark datasets in computer vision often contain off-topic images, near duplicates, and label errors.
In this paper, we revisit the task of data cleaning and formalize it as either a ranking problem or a scoring problem.
We find that a specific combination of context-aware self-supervised representation learning and distance-based indicators is effective in finding issues without annotation biases.
arXiv Detail & Related papers (2023-05-26T15:57:04Z)
- Devil is in the Queries: Advancing Mask Transformers for Real-world Medical Image Segmentation and Out-of-Distribution Localization [40.013449382899566]
A trustworthy medical AI algorithm should demonstrate its effectiveness on tail conditions to avoid clinically dangerous harm.
We adopt the concept of object queries in Mask Transformers to formulate semantic segmentation as a soft cluster assignment.
Our framework is tested on two real-world segmentation tasks, i.e., segmentation of pancreatic and liver tumors.
arXiv Detail & Related papers (2023-04-01T03:24:03Z)
- Improving Object Detection in Medical Image Analysis through Multiple Expert Annotators: An Empirical Investigation [0.3670422696827525]
The work discusses the use of machine learning algorithms for anomaly detection in medical image analysis.
We introduce a simple and effective approach that aggregates annotations from multiple annotators with varying levels of expertise.
We then aim to improve predictive models on abnormality detection tasks by estimating hidden labels from the multiple annotations and training with a re-weighted loss function.
arXiv Detail & Related papers (2023-03-29T07:34:20Z)
- TeTIm-Eval: a novel curated evaluation data set for comparing text-to-image models [1.1252184947601962]
Evaluating and comparing text-to-image models is a challenging problem.
In this paper, a novel evaluation approach is tested, based on: (i) a curated data set divided into ten categories; (ii) a quantitative metric, the CLIP-score; and (iii) a human evaluation task to distinguish, for a given text, the real images from the generated ones.
Early experimental results show that the accuracy of human judgement is fully consistent with the CLIP-score.
arXiv Detail & Related papers (2022-12-15T13:52:03Z)
- PCA: Semi-supervised Segmentation with Patch Confidence Adversarial Training [52.895952593202054]
We propose a new semi-supervised adversarial method called Patch Confidence Adversarial Training (PCA) for medical image segmentation.
PCA learns the pixel structure and context information in each patch to get enough gradient feedback, which aids the discriminator in converging to an optimal state.
Our method outperforms the state-of-the-art semi-supervised methods, which demonstrates its effectiveness for medical image segmentation.
arXiv Detail & Related papers (2022-07-24T07:45:47Z)
- Fake It Till You Make It: Near-Distribution Novelty Detection by Score-Based Generative Models [54.182955830194445]
Existing models either fail or face a dramatic drop under the so-called "near-distribution" setting.
We propose to exploit a score-based generative model to produce synthetic near-distribution anomalous data.
Our method improves near-distribution novelty detection by 6% and surpasses the state of the art by 1% to 5% across nine novelty detection benchmarks.
arXiv Detail & Related papers (2022-05-28T02:02:53Z)
- Vision Transformers for femur fracture classification [59.99241204074268]
The Vision Transformer (ViT) was able to correctly predict 83% of the test images.
Good results were also obtained on sub-fracture classification, using the largest and richest dataset collected to date.
arXiv Detail & Related papers (2021-08-07T10:12:42Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of a given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
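For the last related entry above, the unlabeled data is used by penalizing disagreement between predictions of the same input under different perturbations. The following is a minimal, hypothetical PyTorch sketch of such a consistency term; it illustrates generic consistency regularization rather than the paper's specific relation-driven self-ensembling formulation, and `model`, `augment`, `x_unlabeled`, and `lambda_u` are placeholder names.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x_unlabeled, augment):
    """Mean-squared disagreement between predictions for two augmented views."""
    p1 = F.softmax(model(augment(x_unlabeled)), dim=1)
    with torch.no_grad():                      # treat the second view as the target
        p2 = F.softmax(model(augment(x_unlabeled)), dim=1)
    return F.mse_loss(p1, p2)

# Hypothetical use inside a training step, weighting the unsupervised term:
#   loss = F.cross_entropy(model(x_labeled), y_labeled) \
#          + lambda_u * consistency_loss(model, x_unlabeled, augment)
```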
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.