Rethinking Ranking-based Loss Functions: Only Penalizing Negative
Instances before Positive Ones is Enough
- URL: http://arxiv.org/abs/2102.04640v1
- Date: Tue, 9 Feb 2021 04:30:15 GMT
- Title: Rethinking Ranking-based Loss Functions: Only Penalizing Negative
Instances before Positive Ones is Enough
- Authors: Zhuo Li, Weiqing Min, Jiajun Song, Yaohui Zhu, Shuqiang Jiang
- Abstract summary: We argue that only penalizing negative instances before positive ones is enough, because the loss only comes from them.
Instead of following the AP-based loss, we propose a new loss, namely Penalizing Negative instances before Positive ones (PNP).
PNP-D may be more suitable for real-world data, which usually contains several local clusters for one class.
- Score: 55.55081785232991
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optimising the approximation of Average Precision (AP) has been widely
studied for retrieval. Such methods consider both negative and positive
instances before each target positive one according to the definition of AP.
However, we argue that only penalizing negative instances before positive ones
is enough, because the loss only comes from them. To this end, instead of
following the AP-based loss, we propose a new loss, namely Penalizing Negative
instances before Positive ones (PNP), which directly minimizes the number of
negative instances before each positive one. Meanwhile, limited by the
definition of AP, AP-based methods only adopt a specific gradient assignment
strategy. We wonder whether better ones exist. Instead, we
systematically investigate different gradient assignment solutions via
constructing derivative functions of the loss, resulting in PNP-I with
increasing derivative functions and PNP-D with decreasing ones. Because of
their gradient assignment strategies, PNP-I tries to pull all the relevant
instances together, while PNP-D only quickly corrects positive instances with
fewer negative instances before them. Thus, PNP-D may be more suitable for real-world
data, which usually contains several local clusters for one class. Extensive
evaluations on three standard retrieval datasets also show that PNP-D achieves
the state-of-the-art performance.
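The core idea above — directly minimizing a smooth count of the negatives ranked before each positive, with an increasing (PNP-I) or decreasing (PNP-D) derivative shaping the gradient assignment — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sigmoid temperature `tau` and the concrete choices of `0.5 * r**2` (increasing derivative) and `log1p(r)` (decreasing derivative) are assumptions standing in for the paper's derivative-function families.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pnp_loss(pos_sim, neg_sim, tau=0.05, variant="D"):
    """Sketch of a PNP-style loss for one query.

    pos_sim: (P,) similarities of positive instances to the query
    neg_sim: (N,) similarities of negative instances to the query
    """
    # Smooth count of negatives ranked above each positive:
    # sigmoid((s_neg - s_pos) / tau) ~ 1 when a negative outranks a positive.
    diff = neg_sim[None, :] - pos_sim[:, None]          # (P, N)
    rank_neg = sigmoid(diff / tau).sum(axis=1)          # ~ #negatives before each positive
    if variant == "I":
        # Increasing derivative (PNP-I-like): harder-ranked positives get larger gradients,
        # pushing all relevant instances together.
        per_pos = 0.5 * rank_neg ** 2
    else:
        # Decreasing derivative (PNP-D-like): positives with few negatives before them
        # are corrected first, which suits classes with several local clusters.
        per_pos = np.log1p(rank_neg)
    return per_pos.mean()
```

A well-ranked query (all positives above all negatives) yields a near-zero loss under either variant, while a badly-ranked one yields a large loss; only the distribution of gradients across positives differs between PNP-I and PNP-D.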
Related papers
- Policy Gradient with Active Importance Sampling [55.112959067035916]
Policy gradient (PG) methods significantly benefit from importance sampling (IS), enabling the effective reuse of previously collected samples.
However, IS is employed in RL as a passive tool for re-weighting historical samples.
We look for the best behavioral policy from which to collect samples to reduce the policy gradient variance.
arXiv Detail & Related papers (2024-05-09T09:08:09Z) - The Role of Baselines in Policy Gradient Optimization [83.42050606055822]
We show that the state value baseline allows on-policy natural policy gradient (NPG) to converge to a globally optimal policy at an $O(1/t)$ rate.
We find that the primary effect of the value baseline is to reduce the aggressiveness of the updates rather than their variance.
arXiv Detail & Related papers (2023-01-16T06:28:00Z) - Variance Reduction for Score Functions Using Optimal Baselines [0.0]
This paper studies baselines, a variance reduction technique for score functions.
Motivated primarily by reinforcement learning, we derive for the first time an expression for the optimal state-dependent baseline.
arXiv Detail & Related papers (2022-12-27T19:17:28Z) - Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality [94.89246810243053]
This paper studies offline policy learning, which aims at utilizing observations collected a priori to learn an optimal individualized decision rule.
Existing policy learning methods rely on a uniform overlap assumption, i.e., the propensities of exploring all actions for all individual characteristics must be lower bounded.
We propose Pessimistic Policy Learning (PPL), a new algorithm that optimizes lower confidence bounds (LCBs) instead of point estimates.
arXiv Detail & Related papers (2022-12-19T22:43:08Z) - Post-Processing Temporal Action Detection [134.26292288193298]
Temporal Action Detection (TAD) methods typically take a pre-processing step in converting an input varying-length video into a fixed-length snippet representation sequence.
This pre-processing step would temporally downsample the video, reducing the inference resolution and hampering the detection performance in the original temporal resolution.
We introduce a novel model-agnostic post-processing method without model redesign and retraining.
arXiv Detail & Related papers (2022-11-27T19:50:37Z) - Unified Negative Pair Generation toward Well-discriminative Feature Space for Face Recognition [2.816374336026564]
Face recognition models form a well-discriminative feature space (WDFS) that satisfies $\inf \mathcal{S}_p > \sup \mathcal{S}_n$.
This paper proposes a unified negative pair generation (UNPG) by combining two PG strategies.
arXiv Detail & Related papers (2022-03-22T10:21:11Z) - Variance Penalized On-Policy and Off-Policy Actor-Critic [60.06593931848165]
We propose on-policy and off-policy actor-critic algorithms that optimize a performance criterion involving both mean and variance in the return.
Our approach not only performs on par with actor-critic and prior variance-penalization baselines in terms of expected return, but also generates trajectories which have lower variance in the return.
arXiv Detail & Related papers (2021-02-03T10:06:16Z) - Direct loss minimization algorithms for sparse Gaussian processes [9.041035455989181]
The paper provides a thorough investigation of direct loss minimization (DLM), which optimizes the posterior to minimize predictive loss in sparse Gaussian processes.
The application of DLM in non-conjugate cases is more complex because the minimization of expectation in the log-loss DLM objective is often intractable.
arXiv Detail & Related papers (2020-04-07T02:31:00Z) - Resolving learning rates adaptively by locating Stochastic Non-Negative
Associated Gradient Projection Points using line searches [0.0]
Learning rates in neural network training are currently determined a priori to training using expensive manual or automated tuning.
This study proposes gradient-only line searches to resolve the learning rate for neural network training algorithms.
arXiv Detail & Related papers (2020-01-15T03:08:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.