Related papers: Structured Probabilistic Coding

Structured Probabilistic Coding

URL: http://arxiv.org/abs/2312.13933v5
Date: Thu, 2 May 2024 05:18:45 GMT
Title: Structured Probabilistic Coding
Authors: Dou Hu, Lingwei Wei, Yaxin Liu, Wei Zhou, Songlin Hu,
Abstract summary: This paper presents a new supervised representation learning framework, namely structured probabilistic coding (SPC) SPC is an encoder-only probabilistic coding technology with a structured regularization from the target space. It can enhance the generalization ability of pre-trained language models for better language understanding.
Score: 28.46046583495838
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper presents a new supervised representation learning framework, namely structured probabilistic coding (SPC), to learn compact and informative representations from input related to the target task. SPC is an encoder-only probabilistic coding technology with a structured regularization from the target space. It can enhance the generalization ability of pre-trained language models for better language understanding. Specifically, our probabilistic coding simultaneously performs information encoding and task prediction in one module to more fully utilize the effective information from input data. It uses variational inference in the output space to reduce randomness and uncertainty. Besides, to better control the learning process of probabilistic representations, a structured regularization is proposed to promote uniformity across classes in the latent space. With the regularization term, SPC can preserve the Gaussian structure of the latent code and achieve better coverage of the hidden space with class uniformly. Experimental results on 12 natural language understanding tasks demonstrate that our SPC effectively improves the performance of pre-trained language models for classification and regression. Extensive experiments show that SPC can enhance the generalization capability, robustness to label noise, and clustering quality of output representations.

Related papers

CSE-SFP: Enabling Unsupervised Sentence Representation Learning via a Single Forward Pass [3.0566617373924325]
Recent advances in pre-trained language models (PLMs) have driven remarkable progress in this field. We propose CSE-SFP, an innovative method that exploits the structural characteristics of generative models. We show that CSE-SFP not only produces higher-quality embeddings but also significantly reduces both training time and memory consumption.
arXiv Detail & Related papers (2025-05-01T08:27:14Z)
Style Quantization for Data-Efficient GAN Training [18.40243591024141]
Under limited data setting, GANs often struggle to navigate and effectively exploit the input latent space. We propose textitSQ-GAN, a novel approach that enhances consistency regularization. Experiments demonstrate significant improvements in both discriminator robustness and generation quality.
arXiv Detail & Related papers (2025-03-31T16:28:44Z)
Semi-supervised Semantic Segmentation with Multi-Constraint Consistency Learning [81.02648336552421]
We propose a Multi-Constraint Consistency Learning approach to facilitate the staged enhancement of the encoder and decoder. Self-adaptive feature masking and noise injection are designed in an instance-specific manner to perturb the features for robust learning of the decoder. Experimental results on Pascal VOC2012 and Cityscapes datasets demonstrate that our proposed MCCL achieves new state-of-the-art performance.
arXiv Detail & Related papers (2025-03-23T03:21:33Z)
Uncertainty-Participation Context Consistency Learning for Semi-supervised Semantic Segmentation [9.546065701435532]
Semi-supervised semantic segmentation has attracted considerable attention for its ability to mitigate the reliance on extensive labeled data. This paper proposes the Uncertainty-participation Context Consistency Learning (UCCL) method to explore richer supervisory signals.
arXiv Detail & Related papers (2024-12-23T06:49:59Z)
Structural Entropy Guided Probabilistic Coding [52.01765333755793]
We propose a novel structural entropy-guided probabilistic coding model, named SEPC. We incorporate the relationship between latent variables into the optimization by proposing a structural entropy regularization loss. Experimental results across 12 natural language understanding tasks, including both classification and regression tasks, demonstrate the superior performance of SEPC.
arXiv Detail & Related papers (2024-12-12T00:37:53Z)
SEER-ZSL: Semantic Encoder-Enhanced Representations for Generalized Zero-Shot Learning [0.7420433640907689]
Generalized Zero-Shot Learning (GZSL) recognizes unseen classes by transferring knowledge from the seen classes. This paper introduces a dual strategy to address the generalization gap.
arXiv Detail & Related papers (2023-12-20T15:18:51Z)
Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP [81.00800920928621]
We study representation learning in partially observable Markov Decision Processes (POMDPs) We first present an algorithm for decodable POMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU) We then show how to adapt this algorithm to also work in the broader class of $gamma$-observable POMDPs.
arXiv Detail & Related papers (2023-06-21T16:04:03Z)
Semantic Probabilistic Layers for Neuro-Symbolic Learning [83.25785999205932]
We design a predictive layer for structured-output prediction (SOP) It can be plugged into any neural network guaranteeing its predictions are consistent with a set of predefined symbolic constraints. Our Semantic Probabilistic Layer (SPL) can model intricate correlations, and hard constraints, over a structured output space.
arXiv Detail & Related papers (2022-06-01T12:02:38Z)
Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization [76.68866368409216]
We propose learning to dynamically select discretization tightness conditioned on inputs. We show that dynamically varying tightness in communication bottlenecks can improve model performance on visual reasoning and reinforcement learning tasks.
arXiv Detail & Related papers (2022-02-02T23:54:26Z)
SSA: Semantic Structure Aware Inference for Weakly Pixel-Wise Dense Predictions without Cost [36.27226683586425]
The semantic structure aware inference (SSA) is proposed to explore the semantic structure information hidden in different stages of the CNN-based network to generate high-quality CAM in the model inference. The proposed method has the advantage of no parameters and does not need to be trained. Therefore, it can be applied to a wide range of weakly-supervised pixel-wise dense prediction tasks.
arXiv Detail & Related papers (2021-11-05T11:07:21Z)
pRSL: Interpretable Multi-label Stacking by Learning Probabilistic Rules [0.0]
We present the probabilistic rule stacking (pRSL) which uses probabilistic propositional logic rules and belief propagation to combine the predictions of several underlying classifiers. We derive algorithms for exact and approximate inference and learning, and show that pRSL reaches state-of-the-art performance on various benchmark datasets.
arXiv Detail & Related papers (2021-05-28T14:06:21Z)
Network Classifiers Based on Social Learning [71.86764107527812]
We propose a new way of combining independently trained classifiers over space and time. The proposed architecture is able to improve prediction performance over time with unlabeled data. We show that this strategy results in consistent learning with high probability, and it yields a robust structure against poorly trained classifiers.
arXiv Detail & Related papers (2020-10-23T11:18:20Z)
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components [96.42805872177067]
We propose a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the latent space. We encourage this latent structure by maximizing an estimate of predictive information of latent feature sequences, which is the mutual information between past and future windows at each time step. We demonstrate that our method recovers the latent space of noisy dynamical systems, extracts predictive features for forecasting tasks, and improves automatic speech recognition when used to pretrain the encoder on large amounts of unlabeled data.
arXiv Detail & Related papers (2020-10-07T03:34:01Z)
Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate Reduction [32.21975128854042]
We propose the principle of Maximal Coding Rate Reduction ($textMCR2$), an information-theoretic measure that maximizes the coding rate difference between the whole dataset and the sum of each individual class. We clarify its relationships with most existing frameworks such as cross-entropy, information bottleneck, information gain, contractive and contrastive learning, and provide theoretical guarantees for learning diverse and discriminative features.
arXiv Detail & Related papers (2020-06-15T17:23:55Z)
Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method. PCL implicitly encodes semantic structures of the data into the learned embedding space. PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.