Capturing Symmetry and Antisymmetry in Language Models through Symmetry-Aware Training Objectives
- URL: http://arxiv.org/abs/2504.16312v1
- Date: Tue, 22 Apr 2025 23:11:50 GMT
- Title: Capturing Symmetry and Antisymmetry in Language Models through Symmetry-Aware Training Objectives
- Authors: Zhangdie Yuan, Andreas Vlachos,
- Abstract summary: We introduce a novel Wikidata-derived natural language inference dataset designed to evaluate large language models (LLMs)<n>Our findings reveal that LLMs perform comparably to random chance on this benchmark, highlighting a gap in relational understanding.<n>To address this, we explore encoder retraining via contrastive learning with k-nearest neighbors.
- Score: 11.024798322342413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Capturing symmetric (e.g., country borders another country) and antisymmetric (e.g., parent_of) relations is crucial for a variety of applications. This paper tackles this challenge by introducing a novel Wikidata-derived natural language inference dataset designed to evaluate large language models (LLMs). Our findings reveal that LLMs perform comparably to random chance on this benchmark, highlighting a gap in relational understanding. To address this, we explore encoder retraining via contrastive learning with k-nearest neighbors. The retrained encoder matches the performance of fine-tuned classification heads while offering additional benefits, including greater efficiency in few-shot learning and improved mitigation of catastrophic forgetting.
Related papers
- ReLearn: Unlearning via Learning for Large Language Models [64.2802606302194]
We propose ReLearn, a data augmentation and fine-tuning pipeline for effective unlearning.<n>This framework introduces Knowledge Forgetting Rate (KFR) and Knowledge Retention Rate (KRR) to measure knowledge-level preservation.<n>Our experiments show that ReLearn successfully achieves targeted forgetting while preserving high-quality output.
arXiv Detail & Related papers (2025-02-16T16:31:00Z) - Analysis of LLM as a grammatical feature tagger for African American English [0.6927055673104935]
African American English (AAE) presents unique challenges in natural language processing (NLP)<n>This research systematically compares the performance of available NLP models.<n>This study highlights the necessity for improved model training and architectural adjustments to better accommodate AAE's unique linguistic characteristics.
arXiv Detail & Related papers (2025-02-09T19:46:33Z) - SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation.
Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
arXiv Detail & Related papers (2024-08-28T06:33:03Z) - Enhancing Idiomatic Representation in Multiple Languages via an Adaptive Contrastive Triplet Loss [9.807885676930308]
We propose an approach to model idiomaticity using a triplet loss that incorporates the asymmetric contribution of components words to an idiomatic meaning for training language models.
Our proposed method is evaluated on a SemEval challenge and outperforms previous alternatives significantly in many metrics.
arXiv Detail & Related papers (2024-06-21T14:21:41Z) - CSS: Contrastive Semantic Similarity for Uncertainty Quantification of LLMs [1.515687944002438]
We propose Contrastive Semantic Similarity, a module to obtain similarity features for measuring uncertainty for text pairs.
We conduct extensive experiments with three large language models (LLMs) on several benchmark question-answering datasets.
Results show that our proposed method performs better in estimating reliable responses of LLMs than comparable baselines.
arXiv Detail & Related papers (2024-06-05T11:35:44Z) - Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
We propose a new evaluation method, SQC-Score.
Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score.
Results on three information extraction tasks show that SQC-Score is more preferred by human annotators than the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z) - Machine Translation Meta Evaluation through Translation Accuracy
Challenge Sets [92.38654521870444]
We introduce ACES, a contrastive challenge set spanning 146 language pairs.
This dataset aims to discover whether metrics can identify 68 translation accuracy errors.
We conduct a large-scale study by benchmarking ACES on 50 metrics submitted to the WMT 2022 and 2023 metrics shared tasks.
arXiv Detail & Related papers (2024-01-29T17:17:42Z) - Efficient Nearest Neighbor Language Models [114.40866461741795]
Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore.
We show how to achieve up to a 6x speed-up in inference speed while retaining comparable performance.
arXiv Detail & Related papers (2021-09-09T12:32:28Z) - Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT)
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.