Related papers: Fine-tune your Classifier: Finding Correlations With Temperature

URL: http://arxiv.org/abs/2210.09715v1
Date: Tue, 18 Oct 2022 09:48:46 GMT
Title: Fine-tune your Classifier: Finding Correlations With Temperature
Authors: Benjamin Chamand, Olivier Risser-Maroix, Camille Kurtz, Philippe Joly, Nicolas Loménie
Abstract summary: We analyze the impact of temperature on classification tasks by describing a dataset as a set of statistics computed on representations. We study the correlation between these extracted statistics and the observed optimal temperatures.
Score: 2.071516130824992
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Temperature is a widely used hyperparameter in various tasks involving neural networks, such as classification or metric learning, and its choice can have a direct impact on model performance. Most existing works select its value using hyperparameter optimization methods that require several runs to find the optimal value. We propose to analyze the impact of temperature on classification tasks by describing a dataset as a set of statistics computed on representations, on which we can build a heuristic giving a default temperature value. We study the correlation between these extracted statistics and the observed optimal temperatures. This preliminary study on more than a hundred combinations of different datasets and feature extractors highlights promising results towards the construction of a general heuristic for temperature.
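
As a rough illustration of what the temperature hyperparameter does in a classifier, the sketch below applies a temperature-scaled softmax to a set of logits. It assumes the common convention in which logits are divided by the temperature before normalization; this listing does not give the paper's exact formulation, and the function name `softmax_with_temperature` and the sample logits are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Return class probabilities from logits scaled by a temperature.

    Assumption: logits are divided by the temperature (a common convention,
    not necessarily the one used in the paper).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a three-class problem.
logits = [2.0, 1.0, 0.1]
for t in (0.5, 1.0, 5.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Under this convention, temperatures below 1 sharpen the distribution toward the highest logit, while temperatures above 1 flatten it toward uniform, which is why the choice of value can affect training and accuracy.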

Related papers

Exploring the Impact of Temperature on Large Language Models: Hot or Cold? [9.70280446429164]
We evaluate the impact of temperature in the range of 0 to 2 on data sets designed to assess six different capabilities. Our findings reveal skill-specific effects of temperature on model performance, highlighting the complexity of optimal temperature selection. We propose a BERT-based temperature selector that takes advantage of these observed effects to identify the optimal temperature for a given prompt.
arXiv Detail & Related papers (2025-06-08T21:36:26Z)
Exploring the Impact of Temperature Scaling in Softmax for Classification and Adversarial Robustness [8.934328206473456]
This study delves into the often-overlooked parameter within the softmax function, known as "temperature." Our empirical studies, adopting convolutional neural networks and transformers, reveal that moderate temperatures generally yield better overall performance. For the first time, we discover a surprising benefit of elevated temperatures: enhanced model robustness against common corruption, natural perturbation, and non-targeted adversarial attacks like Projected Gradient Descent.
arXiv Detail & Related papers (2025-02-28T00:07:45Z)
Optimizing Temperature for Language Models with Multi-Sample Inference [47.14991144052361]
This paper addresses the challenge of automatically identifying the (near)-optimal temperature for different large language models. We provide a comprehensive analysis of temperature's role in performance optimization, considering variations in model architectures, datasets, task types, model sizes, and predictive accuracy. We propose a novel entropy-based metric for automated temperature optimization, which consistently outperforms fixed-temperature baselines.
arXiv Detail & Related papers (2025-02-07T19:35:25Z)
Adaptive Decoding via Latent Preference Optimization [55.70602730588745]
We introduce Adaptive Decoding, a layer added to the model to select the sampling temperature dynamically at inference time. Our method outperforms all fixed decoding temperatures across a range of tasks that require different temperatures.
arXiv Detail & Related papers (2024-11-14T18:31:39Z)
Optimal Kernel Choice for Score Function-based Causal Discovery [92.65034439889872]
We propose a kernel selection method within the generalized score function that automatically selects the optimal kernel that best fits the data. We conduct experiments on both synthetic data and real-world benchmarks, and the results demonstrate that our proposed method outperforms kernel selection methods.
arXiv Detail & Related papers (2024-07-14T09:32:20Z)
EDT: Improving Large Language Models' Generation by Entropy-based Dynamic Temperature Sampling [31.663507929452564]
We propose an effective Entropy-based Dynamic Temperature (EDT) Sampling method to balance generation quality and diversity. Our experiments show that EDT significantly outperforms the existing strategies across different tasks.
arXiv Detail & Related papers (2024-03-21T16:41:12Z)
Emerging Statistical Machine Learning Techniques for Extreme Temperature Forecasting in U.S. Cities [0.0]
We present a comprehensive analysis of extreme temperature patterns using emerging statistical machine learning techniques. We apply these methods to climate time series data from the five most populated U.S. cities. Our findings highlight the differences between the statistical methods and identify Multilayer Perceptrons as the most effective approach.
arXiv Detail & Related papers (2023-07-26T16:38:32Z)
Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization [51.41175648612714]
We propose a new robust contrastive loss inspired by distributionally robust optimization (DRO). We show that our algorithm automatically learns a suitable $\tau$ for each sample. Our method outperforms prior strong baselines on unimodal and bimodal datasets.
arXiv Detail & Related papers (2023-05-19T19:25:56Z)
Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models. In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models. We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class. This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples. Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
Dynamic Bayesian Approach for decision-making in Ego-Things [8.577234269009042]
This paper presents a novel approach to detect abnormalities in dynamic systems based on multisensory data and feature selection. Growing neural gas (GNG) is employed for clustering multisensory data into a set of nodes. Our method uses a Markov Jump particle filter (MJPF) for state estimation and abnormality detection.
arXiv Detail & Related papers (2020-10-28T11:38:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.