Pragmatic Constraint on Distributional Semantics
- URL: http://arxiv.org/abs/2211.11041v1
- Date: Sun, 20 Nov 2022 17:51:06 GMT
- Title: Pragmatic Constraint on Distributional Semantics
- Authors: Elizaveta Zhemchuzhina and Nikolai Filippov and Ivan P. Yamshchikov
- Abstract summary: We show that Zipf-law token distribution emerges irrespective of the chosen tokenization.
We show that Zipf distribution is characterized by two distinct groups of tokens that differ both in terms of their frequency and their semantics.
- Score: 6.091096843566857
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the limits of language models' statistical learning in the
context of Zipf's law. First, we demonstrate that Zipf-law token distribution
emerges irrespective of the chosen tokenization. Second, we show that Zipf
distribution is characterized by two distinct groups of tokens that differ both
in terms of their frequency and their semantics. Namely, the tokens that have a
one-to-one correspondence with one semantic concept have different statistical
properties than those with semantic ambiguity. Finally, we demonstrate how
these properties interfere with statistical learning procedures motivated by
distributional semantics.
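The paper's first claim, that a Zipf-type rank-frequency curve appears regardless of the chosen tokenization, can be illustrated with a minimal sketch. The inline placeholder corpus, the variable names, and the simple least-squares fit below are illustrative assumptions, not the paper's actual procedure; a real check would use a sizeable corpus.

```python
from collections import Counter
import math

def rank_frequency(tokens):
    """Token frequencies sorted in decreasing order (rank 1 = most frequent)."""
    return sorted(Counter(tokens).values(), reverse=True)

def zipf_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).
    Under Zipf's law the slope is close to -1."""
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Any plain-text corpus works here; the larger, the cleaner the power law.
text = "the quick brown fox jumps over the lazy dog " * 50  # placeholder corpus

word_tokens = text.split()                                     # whitespace tokenization
bigram_tokens = [text[i:i + 2] for i in range(len(text) - 1)]  # character-bigram tokenization

for name, toks in [("words", word_tokens), ("char bigrams", bigram_tokens)]:
    print(f"{name}: log-log slope = {zipf_slope(rank_frequency(toks)):.2f}")
```

On a large natural-language corpus both tokenizations yield a roughly linear log-log rank-frequency plot with slope near -1, which is the pattern the abstract refers to.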
Related papers
- Distributional Semantics, Holism, and the Instability of Meaning [0.0]
A standard objection to meaning holism is the charge of instability.
In this article we examine whether the instability objection poses a problem for distributional models of meaning.
arXiv Detail & Related papers (2024-05-20T14:53:25Z)
- Enriching Disentanglement: From Logical Definitions to Quantitative Metrics [59.12308034729482]
Disentangling the explanatory factors in complex data is a promising approach for generalizable and data-efficient representation learning.
We establish a theoretical connection between logical definitions of disentanglement and quantitative metrics using topos theory and enriched category theory.
We empirically demonstrate the effectiveness of the proposed metrics by isolating different aspects of disentangled representations.
arXiv Detail & Related papers (2023-05-19T08:22:23Z)
- Beyond Demographic Parity: Redefining Equal Treatment [23.28973277699437]
We show the theoretical properties of our notion of equal treatment and devise a two-sample test based on the AUC of an equal treatment inspector.
We release explanationspace, an open-source Python package with methods and tutorials.
arXiv Detail & Related papers (2023-03-14T16:19:44Z)
- Learning versus Refutation in Noninteractive Local Differential Privacy [133.80204506727526]
We study two basic statistical tasks in non-interactive local differential privacy (LDP): learning and refutation.
Our main result is a complete characterization of the sample complexity of PAC learning for non-interactive LDP protocols.
arXiv Detail & Related papers (2022-10-26T03:19:24Z)
- Label Uncertainty Modeling and Prediction for Speech Emotion Recognition using t-Distributions [15.16865739526702]
We propose to model the label distribution using a Student's t-distribution.
We derive the corresponding Kullback-Leibler divergence based loss function and use it to train an estimator for the distribution of emotion labels.
Results reveal that our t-distribution based approach improves over the Gaussian approach, achieving state-of-the-art uncertainty modeling results.
arXiv Detail & Related papers (2022-07-25T12:38:20Z)
- Measuring Fairness of Text Classifiers via Prediction Sensitivity [63.56554964580627]
ACCUMULATED PREDICTION SENSITIVITY measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features.
We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness.
arXiv Detail & Related papers (2022-03-16T15:00:33Z)
- Label Distribution Amendment with Emotional Semantic Correlations for Facial Expression Recognition [69.18918567657757]
We propose a new method that amends the label distribution of each facial image by leveraging correlations among expressions in the semantic space.
By comparing semantic and task class-relation graphs of each image, the confidence of its label distribution is evaluated.
Experimental results demonstrate that the proposed method is more effective than the compared state-of-the-art methods.
arXiv Detail & Related papers (2021-07-23T07:46:14Z)
- Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning [80.05441565830726]
This paper addresses imbalanced semi-supervised learning, where heavily biased pseudo-labels can harm the model performance.
Motivated by this observation, we propose a general pseudo-labeling framework to address the bias.
We term the novel pseudo-labeling framework for imbalanced SSL as Distribution-Aware Semantics-Oriented (DASO) Pseudo-label.
arXiv Detail & Related papers (2021-06-10T11:58:25Z)
- On the Sentence Embeddings from Pre-trained Language Models [78.45172445684126]
In this paper, we argue that the semantic information in the BERT embeddings is not fully exploited.
We find that BERT always induces a non-smooth anisotropic semantic space of sentences, which harms its performance on semantic similarity tasks.
We propose to transform the anisotropic sentence embedding distribution to a smooth and isotropic Gaussian distribution through normalizing flows that are learned with an unsupervised objective.
arXiv Detail & Related papers (2020-11-02T13:14:57Z)
- The empirical structure of word frequency distributions [0.0]
I show that first names form natural communicative distributions in most languages.
I then show this pattern of findings replicates in communicative distributions of English nouns and verbs.
arXiv Detail & Related papers (2020-01-09T20:52:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.