SMCLM: Semantically Meaningful Causal Language Modeling for Autoregressive Paraphrase Generation
- URL: http://arxiv.org/abs/2507.03415v1
- Date: Fri, 04 Jul 2025 09:23:13 GMT
- Title: SMCLM: Semantically Meaningful Causal Language Modeling for Autoregressive Paraphrase Generation
- Authors: Michał Perełkiewicz, Sławomir Dadas, Rafał Poświata
- Abstract summary: This article introduces semantically meaningful causal language modeling (SMCLM). SMCLM is a self-supervised method of training autoregressive models to generate semantically equivalent text. The proposed method is competitive with supervised methods and achieves state-of-the-art results among unsupervised approaches.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This article introduces semantically meaningful causal language modeling (SMCLM), a self-supervised method of training autoregressive models to generate semantically equivalent text. Our approach uses a semantically meaningful text representation as an initial embedding in the autoregressive training and generation processes. An extensive empirical study demonstrates that the SMCLM approach makes autoregressive models capable of learning robust and high-quality paraphrase generation. The proposed method is competitive with supervised methods and achieves state-of-the-art results among unsupervised approaches. This article also presents a comprehensive set of automatic metrics that cover a wide range of evaluation aspects for automatically generated paraphrases. At the same time, this article highlights the low reliability of metrics that are widely used in paraphrase generation evaluation, including BLEU, ROUGE, and BERTScore.
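To make the core idea concrete, below is a minimal sketch of what "semantically meaningful text representation as an initial embedding" could look like in practice, assuming a GPT-2 backbone and an all-MiniLM-L6-v2 sentence encoder; the projection layer, model choices, and training details are illustrative assumptions, not the paper's actual architecture.

```python
# A minimal sketch of conditioning an autoregressive LM on a semantic
# embedding used as the initial input embedding. Assumptions (not the
# paper's actual setup): GPT-2 backbone, all-MiniLM-L6-v2 sentence encoder,
# and a learned linear projection between their embedding spaces.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel
from sentence_transformers import SentenceTransformer

class SemanticPrefixLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim output
        self.lm = GPT2LMHeadModel.from_pretrained("gpt2")        # 768-dim embeddings
        # Project the sentence embedding into the LM's token-embedding space
        # so it can serve as the first position of the input sequence.
        self.proj = nn.Linear(384, self.lm.config.n_embd)

    def forward(self, source_texts, input_ids, labels):
        with torch.no_grad():  # keep the semantic encoder frozen
            sem = self.encoder.encode(source_texts, convert_to_tensor=True)
        prefix = self.proj(sem).unsqueeze(1)                 # (B, 1, 768)
        tokens = self.lm.transformer.wte(input_ids)          # (B, T, 768)
        inputs_embeds = torch.cat([prefix, tokens], dim=1)   # (B, T+1, 768)
        # No LM loss on the semantic prefix position (-100 is ignored).
        pad = torch.full((labels.size(0), 1), -100,
                         dtype=labels.dtype, device=labels.device)
        return self.lm(inputs_embeds=inputs_embeds,
                       labels=torch.cat([pad, labels], dim=1))
```

In the self-supervised setting the abstract describes, such a model can be trained to reconstruct the input text from its own semantic embedding; decoding from a new sentence's embedding at inference time then yields semantically equivalent text.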
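The abstract's point about metric reliability is also easy to see for oneself: a valid paraphrase with little lexical overlap scores near zero on n-gram metrics despite being semantically equivalent. A small illustration using common metric libraries follows; the example sentences are ours, not from the paper.

```python
# Illustration of why n-gram metrics can mislead for paraphrase evaluation:
# a valid paraphrase with low lexical overlap scores poorly on BLEU/ROUGE.
# (Example sentences are ours, not from the paper.)
import sacrebleu
from rouge_score import rouge_scorer
from bert_score import score as bert_score

reference = "The committee postponed the decision until next week."
paraphrase = "A final ruling was pushed back seven days by the panel."

bleu = sacrebleu.sentence_bleu(paraphrase, [reference]).score
rouge_l = rouge_scorer.RougeScorer(["rougeL"]).score(
    reference, paraphrase)["rougeL"].fmeasure
_, _, f1 = bert_score([paraphrase], [reference], lang="en")

print(f"BLEU: {bleu:.1f}")               # near zero despite semantic equivalence
print(f"ROUGE-L F1: {rouge_l:.2f}")       # likewise low
print(f"BERTScore F1: {f1.item():.2f}")   # higher, but still an imperfect proxy
```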
Related papers
- LLMs Are Not Scorers: Rethinking MT Evaluation with Generation-Based Methods [0.0]
We propose a generation-based evaluation paradigm that leverages decoder-only language models to produce high-quality references. Empirical results show that our method outperforms both intra-LLM direct scoring baselines and external non-LLM reference-free metrics from MTME.
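As a rough sketch of how such a generation-based pipeline could be wired up (the model choice, prompt, and chrF scoring step are our assumptions, not necessarily this paper's setup):

```python
# Sketch of generation-based MT evaluation as the blurb describes it:
# a decoder-only LM produces a reference translation, and the MT hypothesis
# is scored against it. Model, prompt, and the chrF metric are assumptions.
import sacrebleu
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

def generation_based_score(source: str, hypothesis: str) -> float:
    prompt = f"Translate to English, reply with the translation only:\n{source}"
    out = generator(prompt, max_new_tokens=64, return_full_text=False)
    generated_reference = out[0]["generated_text"].strip()
    # Score the MT hypothesis against the LM-generated reference.
    return sacrebleu.sentence_chrf(hypothesis, [generated_reference]).score
```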
arXiv Detail & Related papers (2025-05-22T02:14:38Z) - Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models [60.00178316095646]
Sentence embedding is essential for many NLP tasks, with contrastive learning methods achieving strong performance using datasets like NLI. Recent studies leverage large language models (LLMs) to generate sentence pairs, reducing annotation dependency. We propose a method for controlling the generation direction of LLMs in the latent space. Unlike unconstrained generation, the controlled approach ensures meaningful semantic divergence. Experiments on multiple benchmarks demonstrate that our method achieves new SOTA performance with a modest cost in ranking sentence synthesis.
arXiv Detail & Related papers (2025-02-19T12:07:53Z) - Combining Autoregressive and Autoencoder Language Models for Text Classification [1.0878040851638]
CAALM-TC is a novel method that enhances text classification by integrating autoregressive and autoencoder language models. Experimental results on four benchmark datasets demonstrate that CAALM consistently outperforms existing methods.
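The blurb does not spell out how the two model families are integrated; a generic way to combine them is to concatenate sentence representations from an autoencoder (BERT-style) and an autoregressive (GPT-style) model before a classification head. The sketch below shows that generic pattern as our own illustration, not the actual CAALM-TC method.

```python
# A generic sketch of combining autoencoder and autoregressive LM features
# for text classification by concatenating their sentence representations.
# This illustrates the general idea only; CAALM-TC's integration may differ.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class DualEncoderClassifier(nn.Module):
    def __init__(self, num_labels: int):
        super().__init__()
        self.ae_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
        self.ae = AutoModel.from_pretrained("bert-base-uncased")   # autoencoder
        self.ar_tok = AutoTokenizer.from_pretrained("gpt2")
        self.ar_tok.pad_token = self.ar_tok.eos_token              # GPT-2 has no pad token
        self.ar = AutoModel.from_pretrained("gpt2")                # autoregressive
        hidden = self.ae.config.hidden_size + self.ar.config.n_embd
        self.head = nn.Linear(hidden, num_labels)

    def forward(self, texts):
        ae_in = self.ae_tok(texts, return_tensors="pt", padding=True, truncation=True)
        ae_vec = self.ae(**ae_in).last_hidden_state[:, 0]          # [CLS] embedding
        ar_in = self.ar_tok(texts, return_tensors="pt", padding=True, truncation=True)
        last = ar_in["attention_mask"].sum(dim=1) - 1              # last real token index
        ar_hidden = self.ar(**ar_in).last_hidden_state
        ar_vec = ar_hidden[torch.arange(ar_hidden.size(0)), last]
        return self.head(torch.cat([ae_vec, ar_vec], dim=-1))
```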
arXiv Detail & Related papers (2024-11-20T12:49:42Z) - Unified Generative and Discriminative Training for Multi-modal Large Language Models [88.84491005030316]
Generative training has enabled Vision-Language Models (VLMs) to tackle various complex tasks.
Discriminative training, exemplified by models like CLIP, excels in zero-shot image-text classification and retrieval.
This paper proposes a unified approach that integrates the strengths of both paradigms.
arXiv Detail & Related papers (2024-11-01T01:51:31Z) - Counterfactuals As a Means for Evaluating Faithfulness of Attribution Methods in Autoregressive Language Models [6.394084132117747]
We propose a technique that leverages counterfactual generation to evaluate the faithfulness of attribution methods for autoregressive language models. Our technique generates fluent, in-distribution counterfactuals, making the evaluation protocol more reliable.
arXiv Detail & Related papers (2024-08-21T00:17:59Z) - Open-Domain Text Evaluation via Contrastive Distribution Methods [75.59039812868681]
We introduce a novel method for evaluating open-domain text generation called Contrastive Distribution Methods (CDM).
Our experiments on coherence evaluation for multi-turn dialogue and commonsense evaluation for controllable generation demonstrate CDM's superior correlation with human judgments.
arXiv Detail & Related papers (2023-06-20T20:37:54Z) - PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model [37.2192243883707]
We propose PLANNER, a model that combines latent semantic diffusion with autoregressive generation to generate fluent text.
Results on semantic generation, text completion and summarization show its effectiveness in generating high-quality long-form text.
arXiv Detail & Related papers (2023-06-05T01:36:39Z) - Improving Non-autoregressive Generation with Mixup Training [51.61038444990301]
We present a non-autoregressive generation model based on pre-trained transformer models.
We propose a simple and effective iterative training method called MIx Source and pseudo Target.
Our experiments on three generation benchmarks including question generation, summarization and paraphrase generation, show that the proposed framework achieves the new state-of-the-art results.
arXiv Detail & Related papers (2021-10-21T13:04:21Z) - Investigating Methods to Improve Language Model Integration for Attention-based Encoder-Decoder ASR Models [107.86965028729517]
Attention-based encoder-decoder (AED) models learn an implicit internal language model (ILM) from the training transcriptions.
We propose several novel methods to estimate the ILM directly from the AED model.
arXiv Detail & Related papers (2021-04-12T15:16:03Z) - Contextualized Perturbation for Textual Adversarial Attack [56.370304308573274]
Adversarial examples expose the vulnerabilities of natural language processing (NLP) models.
This paper presents CLARE, a ContextuaLized AdversaRial Example generation model that produces fluent and grammatical outputs.
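For a flavor of the "contextualized" part, here is a minimal mask-and-infill replacement step of the kind CLARE builds on, using an off-the-shelf fill-mask pipeline; CLARE's full attack (replace/insert/merge actions searched against a victim model) is considerably more involved.

```python
# A minimal sketch of a contextualized word-replacement perturbation in the
# spirit of CLARE: mask a token and let a masked LM propose in-context
# substitutes. This shows only the infill step, not the full attack.
from transformers import pipeline

fill = pipeline("fill-mask", model="distilroberta-base")

def contextual_replacements(tokens, position, top_k=5):
    masked = tokens.copy()
    masked[position] = fill.tokenizer.mask_token
    candidates = fill(" ".join(masked), top_k=top_k)
    # Each candidate tends to keep the sentence fluent because the masked
    # LM conditions on the full surrounding context.
    return [c["token_str"].strip() for c in candidates]

print(contextual_replacements("the movie was absolutely wonderful".split(), 4))
```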
arXiv Detail & Related papers (2020-09-16T06:53:15Z) - Hybrid Autoregressive Transducer (HAT) [11.70833387055716]
This paper proposes and evaluates the hybrid autoregressive transducer (HAT) model.
It is a time-synchronous encoder-decoder model that preserves the modularity of conventional automatic speech recognition systems.
We evaluate our proposed model on a large-scale voice search task.
arXiv Detail & Related papers (2020-03-12T20:47:06Z)