Effect of the output activation function on the probabilities and errors in medical image segmentation
- URL: http://arxiv.org/abs/2109.00903v1
- Date: Thu, 2 Sep 2021 12:51:14 GMT
- Title: Effect of the output activation function on the probabilities and errors in medical image segmentation
- Authors: Lars Nieradzik and Gerik Scheuermann and Dorothee Saur and Christina Gillmann
- Abstract summary: The sigmoid activation is the standard output activation function in binary classification and segmentation with neural networks.
We consider how the asymptotic behavior of different output activation and loss functions affects the prediction probabilities and the corresponding segmentation errors.
- Score: 3.0625089376654664
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The sigmoid activation is the standard output activation function in binary classification and segmentation with neural networks. Still, there exists a variety of other potential output activation functions, which may lead to improved results in medical image segmentation. In this work, we consider how the asymptotic behavior of different output activation and loss functions affects the prediction probabilities and the corresponding segmentation errors. For cross entropy, we show that a faster rate of change of the activation function correlates with better predictions, while a slower rate of change can improve the calibration of probabilities. For Dice loss, we find that the arctangent activation function is superior to the sigmoid function.
Furthermore, we provide a test space for arbitrary output activation functions
in the area of medical image segmentation. We tested seven activation functions
in combination with three loss functions on four different medical image
segmentation tasks to provide a classification of which function is best suited
in this application scenario.
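To make the abstract's comparison concrete, below is a minimal sketch in PyTorch of two output activations that both map logits to (0, 1), the standard sigmoid and a rescaled arctangent, together with a soft Dice loss. The specific arctangent rescaling and the Dice formulation shown here are illustrative assumptions, not necessarily the exact definitions used in the paper.

```python
# Minimal sketch (PyTorch assumed): two output activations mapping logits to
# (0, 1). The arctangent rescaling below is an illustrative choice, not
# necessarily the paper's exact definition.
import torch

def sigmoid_prob(logits: torch.Tensor) -> torch.Tensor:
    # Saturates exponentially: 1 - sigmoid(x) ~ exp(-x) for large x.
    return torch.sigmoid(logits)

def arctan_prob(logits: torch.Tensor) -> torch.Tensor:
    # Rescale atan from (-pi/2, pi/2) to (0, 1); saturates only
    # polynomially: 1 - arctan_prob(x) ~ 1 / (pi * x) for large x.
    return torch.atan(logits) / torch.pi + 0.5

def soft_dice_loss(probs: torch.Tensor, target: torch.Tensor,
                   eps: float = 1e-6) -> torch.Tensor:
    # A common soft Dice loss on per-pixel foreground probabilities;
    # one plausible form of the Dice loss named in the abstract.
    intersection = (probs * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (probs.sum() + target.sum() + eps)

logits = torch.tensor([0.0, 2.0, 5.0, 10.0])
print(sigmoid_prob(logits))  # tensor([0.5000, 0.8808, 0.9933, 1.0000])
print(arctan_prob(logits))   # tensor([0.5000, 0.8524, 0.9372, 0.9683])
```

The printed values illustrate the asymptotic difference the abstract refers to: the sigmoid drives large logits to near-certain probabilities, while the arctangent's polynomial tails keep them less extreme, which is one mechanism by which an activation's rate of change can affect calibration.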
Related papers
- Interpretable Enzyme Function Prediction via Residue-Level Detection [58.30647671797602]
We present an attention-based framework, namely ProtDETR, for enzyme function prediction.
It uses a set of learnable functional queries to adaptively extract different local representations from the sequence of residue-level features.
ProtDETR significantly outperforms existing deep learning-based enzyme function prediction methods.
arXiv Detail & Related papers (2025-01-10T01:02:43Z) - Activation Scaling for Steering and Interpreting Language Models [55.59689963561315]
We argue that successfully intervening on a model is a prerequisite for interpreting its internal workings.
We establish a three-term objective: a successful intervention should flip the model's prediction from the correct token to the wrong one and vice versa.
Using gradient-based optimization, this objective lets us learn (and later evaluate) a specific kind of efficient and interpretable intervention.
arXiv Detail & Related papers (2024-10-07T12:01:32Z) - Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features [115.33889811527533]
Diffusion models were initially designed for image generation.
Recent research shows that the internal signals within their backbones, known as activations, can also serve as dense features for various discriminative tasks.
arXiv Detail & Related papers (2024-10-04T16:05:14Z) - Trainable Highly-expressive Activation Functions [8.662179223772089]
We introduce DiTAC, a trainable highly-expressive activation function.
DiTAC enhances model expressiveness and performance, often yielding substantial improvements.
It also outperforms existing activation functions (regardless of whether the latter are fixed or trainable) in tasks such as semantic segmentation, image generation, regression problems, and image classification.
arXiv Detail & Related papers (2024-07-10T11:49:29Z) - Your Network May Need to Be Rewritten: Network Adversarial Based on High-Dimensional Function Graph Decomposition [0.994853090657971]
We propose a network adversarial method to address the aforementioned challenges.
This is the first method to use different activation functions in a network.
We have achieved a substantial improvement over standard activation functions regarding both training efficiency and predictive accuracy.
arXiv Detail & Related papers (2024-05-04T11:22:30Z) - Data-aware customization of activation functions reduces neural network error [0.35172332086962865]
We show that data-aware customization of activation functions can result in striking reductions in neural network error.
A simple substitution with the "seagull" activation function in an already-refined neural network can lead to an order-of-magnitude reduction in error.
arXiv Detail & Related papers (2023-01-16T23:38:37Z) - Stochastic Adaptive Activation Function [1.9199289015460212]
This study proposes a simple yet effective activation function that facilitates different thresholds and adaptive activations according to the positions of units and the contexts of inputs.
Experimental analysis demonstrates that our activation function can provide the benefits of more accurate prediction and earlier convergence in many deep learning applications.
arXiv Detail & Related papers (2022-10-21T01:57:25Z) - Transformers with Learnable Activation Functions [63.98696070245065]
We use a Rational Activation Function (RAF) to learn optimal activation functions during training according to the input data.
RAF opens a new research direction for analyzing and interpreting pre-trained models according to the learned activation functions.
arXiv Detail & Related papers (2022-08-30T09:47:31Z) - Inference on Strongly Identified Functionals of Weakly Identified Functions [71.42652863687117]
We study a novel condition for the functional to be strongly identified even when the nuisance function is not.
We propose penalized minimax estimators for both the primary and debiasing nuisance functions.
arXiv Detail & Related papers (2022-08-17T13:38:31Z) - Activation Functions: Dive into an optimal activation function [1.52292571922932]
We find an optimal activation function by defining it as a weighted sum of existing activation functions; a minimal sketch of this idea appears after this list.
The study uses three activation functions (ReLU, tanh, and sin) over three popular image datasets.
arXiv Detail & Related papers (2022-02-24T12:44:11Z) - Why Do Better Loss Functions Lead to Less Transferable Features? [93.47297944685114]
This paper studies how the choice of training objective affects the transferability of the hidden representations of convolutional neural networks trained on ImageNet.
We show that many objectives lead to statistically significant improvements in ImageNet accuracy over vanilla softmax cross-entropy, but the resulting fixed feature extractors transfer substantially worse to downstream tasks.
arXiv Detail & Related papers (2020-10-30T17:50:31Z)
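For the "Activation Functions: Dive into an optimal activation function" entry above, the weighted-sum idea is simple enough to sketch. The parametrization below (one trainable weight per basis activation, over ReLU, tanh, and sin) is a plausible reading of the abstract, not the authors' exact formulation.

```python
# Hypothetical sketch: an activation defined as a learnable weighted sum of
# ReLU, tanh, and sin. The weighting scheme is an assumption based on the
# abstract, not the authors' exact formulation.
import torch
import torch.nn as nn

class WeightedSumActivation(nn.Module):
    def __init__(self):
        super().__init__()
        # One trainable weight per basis activation, initialized uniformly
        # and learned jointly with the rest of the network.
        self.weights = nn.Parameter(torch.full((3,), 1.0 / 3.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weights
        return w[0] * torch.relu(x) + w[1] * torch.tanh(x) + w[2] * torch.sin(x)

# Usage: a drop-in replacement for a fixed nonlinearity.
layer = nn.Sequential(nn.Linear(16, 16), WeightedSumActivation())
out = layer(torch.randn(4, 16))
```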
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.