Self-supervised Representation Learning with Relative Predictive Coding
- URL: http://arxiv.org/abs/2103.11275v1
- Date: Sun, 21 Mar 2021 01:04:24 GMT
- Title: Self-supervised Representation Learning with Relative Predictive Coding
- Authors: Yao-Hung Hubert Tsai, Martin Q. Ma, Muqiao Yang, Han Zhao,
Louis-Philippe Morency, Ruslan Salakhutdinov
- Abstract summary: Relative Predictive Coding (RPC) is a new contrastive representation learning objective.
RPC maintains a good balance among training stability, minibatch size sensitivity, and downstream task performance.
We empirically verify the effectiveness of RPC on benchmark vision and speech self-supervised learning tasks.
- Score: 102.93854542031396
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces Relative Predictive Coding (RPC), a new contrastive
representation learning objective that maintains a good balance among training
stability, minibatch size sensitivity, and downstream task performance. The key
to the success of RPC is two-fold. First, RPC introduces the relative
parameters to regularize the objective for boundedness and low variance.
Second, RPC contains no logarithm and exponential score functions, which are
the main cause of training instability in prior contrastive objectives. We
empirically verify the effectiveness of RPC on benchmark vision and speech
self-supervised learning tasks. Lastly, we relate RPC with mutual information
(MI) estimation, showing RPC can be used to estimate MI with low variance.
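As a reading aid, here is a minimal PyTorch sketch of an RPC-style objective consistent with the abstract: relative parameters alpha, beta, gamma regularize the score statistics, and no logarithm or exponential appears in the loss. The exact functional form, the function name, and the default parameter values below are assumptions to be checked against the paper, not the authors' code.

```python
import torch

def rpc_loss(pos_scores, neg_scores, alpha=1.0, beta=0.005, gamma=1.0):
    """RPC-style contrastive loss (sketch, not the paper's exact code).

    pos_scores: critic scores f(x, y) on positive pairs (drawn jointly).
    neg_scores: critic scores on negative pairs (drawn independently).
    alpha, beta, gamma are the 'relative parameters'; values here are
    illustrative, not the paper's tuned settings.
    """
    # Maximize E_pos[f] - alpha*E_neg[f] - beta/2*E_pos[f^2] - gamma/2*E_neg[f^2].
    # The squared terms keep the objective bounded with low variance,
    # and the score function involves no log or exp.
    objective = (pos_scores.mean()
                 - alpha * neg_scores.mean()
                 - 0.5 * beta * pos_scores.pow(2).mean()
                 - 0.5 * gamma * neg_scores.pow(2).mean())
    return -objective  # minimize the negative of the objective
```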
Related papers
- Performance Optimization of Ratings-Based Reinforcement Learning [1.6133809033337525]
This paper explores multiple optimization methods to improve the performance of rating-based reinforcement learning (RbRL).
RbRL infers reward functions in reward-free environments for subsequent policy learning via standard reinforcement learning.
arXiv Detail & Related papers (2025-01-13T23:56:24Z)
- ResFlow: Fine-tuning Residual Optical Flow for Event-based High Temporal Resolution Motion Estimation [50.80115710105251]
Event cameras hold significant promise for high-temporal-resolution (HTR) motion estimation.
We propose a residual-based paradigm for estimating HTR optical flow with event data; a toy sketch of the residual idea follows this entry.
arXiv Detail & Related papers (2024-12-12T09:35:47Z)
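Purely as an illustration of a residual paradigm for HTR flow, here is a toy sketch; the module shapes, the event-voxel input, and the refinement design are assumptions, not the paper's architecture.

```python
import torch

class ResidualFlowRefiner(torch.nn.Module):
    """Toy residual refiner: predict only a correction to a coarse flow,
    conditioned on an event representation (hypothetical shapes)."""
    def __init__(self, event_channels=10):
        super().__init__()
        # 2 flow channels (u, v) concatenated with event voxel channels.
        self.residual_net = torch.nn.Conv2d(event_channels + 2, 2,
                                            kernel_size=3, padding=1)

    def forward(self, coarse_flow, event_voxels):
        x = torch.cat([coarse_flow, event_voxels], dim=1)
        return coarse_flow + self.residual_net(x)  # base flow + residual
```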
- Efficient Recurrent Off-Policy RL Requires a Context-Encoder-Specific Learning Rate [4.6659670917171825]
Recurrent reinforcement learning (RL) uses a context encoder based on recurrent neural networks (RNNs) to predict unobservable states.
Previous RL methods face training stability issues due to the gradient instability of RNNs.
We propose Recurrent Off-policy RL with Context-Encoder-Specific Learning Rate (RESeL) to tackle this issue; a minimal sketch follows this entry.
arXiv Detail & Related papers (2024-05-24T09:33:47Z)
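The mechanism named in the title is concrete enough for a short illustration: give the RNN context encoder its own, smaller learning rate via optimizer parameter groups. The module shapes and rates below are placeholders, not RESeL's settings.

```python
import torch

# Placeholder networks standing in for the recurrent actor-critic pieces.
context_encoder = torch.nn.GRU(input_size=32, hidden_size=64, batch_first=True)
policy_head = torch.nn.Linear(64, 4)

# Context-encoder-specific learning rate: a slow rate for the RNN to damp
# its gradient instability, a normal rate for the feed-forward layers.
optimizer = torch.optim.Adam([
    {"params": context_encoder.parameters(), "lr": 1e-5},
    {"params": policy_head.parameters(), "lr": 3e-4},
])
```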
- Prior Constraints-based Reward Model Training for Aligning Large Language Models [58.33118716810208]
This paper proposes a Prior Constraints-based Reward Model (PCRM) training method to mitigate this problem.
PCRM incorporates prior constraints, specifically the length ratio and the cosine similarity between the outputs of each comparison pair, during reward model training to regulate optimization magnitude and control score margins; a hedged sketch follows this entry.
Experimental results demonstrate that PCRM significantly improves alignment performance by effectively constraining reward score scaling.
arXiv Detail & Related papers (2024-04-01T07:49:11Z)
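Below is a hedged sketch of a pairwise reward-model loss with prior constraints as the summary describes them; exactly how the length ratio and cosine similarity enter the margin is an assumption here, not PCRM's published formula.

```python
import torch
import torch.nn.functional as F

def pcrm_style_loss(r_chosen, r_rejected, len_chosen, len_rejected, cos_sim):
    """Pairwise ranking loss whose required margin is set by priors computed
    from the pair itself (sketch; functional forms are assumptions)."""
    length_ratio = (torch.minimum(len_chosen, len_rejected)
                    / torch.maximum(len_chosen, len_rejected))
    # Near-duplicate outputs (high cosine similarity, comparable lengths)
    # demand a smaller score margin, which bounds reward score scaling.
    margin = (1.0 - cos_sim) * length_ratio
    return -F.logsigmoid(r_chosen - r_rejected - margin).mean()
```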
- Directly Attention Loss Adjusted Prioritized Experience Replay [0.07366405857677226]
Prioritized Experience Replay (PER) enables the model to learn more from relatively important samples by artificially changing how frequently they are sampled.
DALAP is proposed, which directly quantifies the extent of the distribution shift through a Parallel Self-Attention network.
arXiv Detail & Related papers (2023-11-24T10:14:05Z)
- Continual Contrastive Finetuning Improves Low-Resource Relation Extraction [34.76128090845668]
Relation extraction (RE) has been particularly challenging in low-resource scenarios and domains.
Recent literature has tackled low-resource RE with self-supervised learning.
We propose to pretrain and finetune the RE model using consistent contrastive learning objectives; a minimal sketch of such an objective follows this entry.
arXiv Detail & Related papers (2022-12-21T07:30:22Z)
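The "consistent objectives" idea suggests reusing one contrastive loss across pretraining and finetuning; below is a standard InfoNCE loss as one plausible instance. How anchors and positives are built from relation representations is the paper's detail and is not shown here.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, temperature=0.1):
    """Standard InfoNCE over a batch: row i of `anchor` should match row i
    of `positive`; all other rows serve as in-batch negatives."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature            # (N, N) similarities
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)      # diagonal = positives
```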
- Adversarial Intrinsic Motivation for Reinforcement Learning [60.322878138199364]
We investigate whether the Wasserstein-1 distance between a policy's state visitation distribution and a target distribution can be utilized effectively for reinforcement learning tasks.
Our approach, termed Adversarial Intrinsic Motivation (AIM), estimates this Wasserstein-1 distance through its dual objective and uses it to compute a supplemental reward function; a sketch of the dual follows this entry.
arXiv Detail & Related papers (2021-05-27T17:51:34Z)
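Here is a sketch of the Wasserstein-1 dual named in the summary: train a potential function to separate target states from policy-visited states under a soft 1-Lipschitz constraint. The gradient-penalty placement and the supplemental-reward form are assumptions.

```python
import torch

def w1_dual_loss(potential, target_states, policy_states, lam=10.0):
    """Dual objective for W1 (sketch): maximize the potential's gap between
    the target and policy state distributions; a gradient penalty acts as
    a soft 1-Lipschitz constraint (penalty location is an assumption)."""
    gap = potential(target_states).mean() - potential(policy_states).mean()
    x = policy_states.detach().clone().requires_grad_(True)
    grad = torch.autograd.grad(potential(x).sum(), x, create_graph=True)[0]
    penalty = ((grad.norm(dim=-1) - 1.0) ** 2).mean()
    return -gap + lam * penalty  # minimizing this maximizes the dual

# The supplemental reward can then score progress along the potential,
# e.g. r(s, s') = potential(s') - potential(s).
```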
- Strategy for Boosting Pair Comparison and Improving Quality Assessment Accuracy [29.849156371902943]
Pair Comparison (PC) has a significant advantage over Absolute Category Rating (ACR) in terms of discriminability.
In this study, we employ a generic model to bridge pair comparison data and ACR data, recovering the variance term so that the obtained information is more complete; a hedged sketch of such a bridge follows this entry.
In this way, the proposed methodology achieves the same accuracy as pair comparison while keeping complexity as low as that of ACR.
arXiv Detail & Related papers (2020-10-01T13:05:09Z)
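One plausible instance of a "generic model" bridging ACR and PC data is a Thurstone-style Gaussian model, sketched below; whether this matches the paper's exact model is an assumption, but it shows how a recovered variance term enters the preference probability.

```python
from math import erf, sqrt

def pc_prob_from_acr(mu_a, sigma_a, mu_b, sigma_b):
    """P(A preferred over B) under a Thurstone-style model where each
    stimulus's opinion score is Gaussian with the given mean and
    (recovered) standard deviation."""
    z = (mu_a - mu_b) / sqrt(sigma_a ** 2 + sigma_b ** 2)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF at z
```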
- Robust Learning Through Cross-Task Consistency [92.42534246652062]
We propose a broadly applicable and fully computational method for augmenting learning with cross-task consistency.
We observe that learning with cross-task consistency leads to more accurate predictions and better generalization to out-of-distribution inputs; a minimal sketch of a consistency term follows this entry.
arXiv Detail & Related papers (2020-06-07T09:24:33Z)
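A minimal sketch of a cross-task consistency term under assumed names: predictions for task A, mapped into task B's space by a transfer function, should agree with direct task-B predictions.

```python
import torch
import torch.nn.functional as F

def cross_task_consistency(pred_a, pred_b, a_to_b):
    """Penalize disagreement between task-B predictions reached directly
    and via task A (`a_to_b` is a learned or analytic transfer map)."""
    return F.l1_loss(a_to_b(pred_a), pred_b)

# Added to each task's supervised loss, this term couples the tasks:
# total_loss = loss_a + loss_b + lam * cross_task_consistency(pa, pb, g)
```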
- An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction [84.49035467829819]
We show that the trade-off between rationale conciseness and end-task performance can be better managed by optimizing a bound on the Information Bottleneck (IB) objective.
Our fully unsupervised approach jointly learns an explainer that predicts sparse binary masks over sentences, and an end-task predictor that considers only the extracted rationale; a hedged sketch of the bound follows this entry.
arXiv Detail & Related papers (2020-05-01T23:26:41Z)
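A hedged sketch of the kind of IB bound the summary describes: the end-task loss plus a KL term pushing per-sentence mask probabilities toward a sparse prior. The prior value and weighting below are illustrative, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def ib_rationale_loss(task_logits, labels, mask_probs, prior=0.2, beta=1.0):
    """End-task cross-entropy plus KL(Bernoulli(p) || Bernoulli(prior)),
    averaged over sentences; the KL controls rationale conciseness."""
    task_loss = F.cross_entropy(task_logits, labels)
    p = mask_probs.clamp(1e-6, 1.0 - 1e-6)
    kl = (p * torch.log(p / prior)
          + (1.0 - p) * torch.log((1.0 - p) / (1.0 - prior))).mean()
    return task_loss + beta * kl
```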
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.