Semantic Fusion with Fuzzy-Membership Features for Controllable Language Modelling
- URL: http://arxiv.org/abs/2509.13357v1
- Date: Sun, 14 Sep 2025 22:11:09 GMT
- Title: Semantic Fusion with Fuzzy-Membership Features for Controllable Language Modelling
- Authors: Yongchao Huang, Hassan Raza
- Abstract summary: Semantic fusion is a lightweight scheme that augments a Transformer language model (LM) with a fuzzy-membership feature channel. Each token is represented by a vector of interpretable features whose values are graded degrees from differentiable membership functions. This approach adds only small overhead, remains fully compatible with tied input-output embeddings, and provides an interpretable pathway for conditioned natural language generation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose semantic fusion, a lightweight scheme that augments a Transformer language model (LM) with a parallel, fuzzy-membership feature channel that encodes token-level semantics. Each token is represented by a vector of interpretable features (e.g. part-of-speech cues, shallow roles, boundary flags, sentiment polarity and strength) whose values are graded degrees from differentiable membership functions (e.g. power kernels). These per-token vectors form a sentence-level semantic matrix fused via a gated adapter into the LM. Training uses standard next-token prediction, an auxiliary loss that reconstructs the semantic features from hidden states, and a lightweight uniformizer that regularizes adjective-class distributions. On a synthetic two-clause corpus with held-out adjectives for out-of-distribution (OOD) control, semantic fusion improves perplexity and enables precise, user-controllable generation of polarity and punctuation while maintaining model simplicity. This approach adds only small overhead, remains fully compatible with tied input-output embeddings, and provides an interpretable pathway for conditioned natural language generation.
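As a concrete reading of the abstract, here is a minimal PyTorch sketch of the two core pieces: a differentiable power-kernel membership function that grades token features, and a gated adapter that fuses the per-token semantic matrix into the LM's hidden states. All names, shapes, and the exact gating form are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

def power_membership(x, center, width, p=2.0, eps=1e-6):
    # Graded degree in [0, 1] from a differentiable power kernel
    # (assumed form: 1 / (1 + |(x - center) / width|^p)).
    return 1.0 / (1.0 + torch.abs((x - center) / (width + eps)) ** p)

class GatedSemanticAdapter(nn.Module):
    # Hypothetical fusion layer: h' = h + g * W(s), with a learned sigmoid gate.
    def __init__(self, d_model, n_features):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)
        self.gate = nn.Linear(d_model + n_features, d_model)

    def forward(self, hidden, sem):   # hidden: (B, T, d), sem: (B, T, F)
        g = torch.sigmoid(self.gate(torch.cat([hidden, sem], dim=-1)))
        return hidden + g * self.proj(sem)   # gated residual fusion

# Toy usage: 4 tokens, 3 interpretable features (e.g. sentiment strength).
h = torch.randn(1, 4, 64)
s = power_membership(torch.randn(1, 4, 3), center=0.0, width=1.0)
out = GatedSemanticAdapter(64, 3)(h, s)   # (1, 4, 64)
# Auxiliary loss (sketch): reconstruct the semantic features from hidden states.
aux = nn.functional.mse_loss(nn.Linear(64, 3)(out), s)
```

The gated residual form keeps the backbone untouched when the gate closes, which is one simple way to stay compatible with tied input-output embeddings as the abstract claims.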
Related papers
- Zonkey: A Hierarchical Diffusion Language Model with Differentiable Tokenization and Probabilistic Attention [0.0]
Zonkey is a hierarchical diffusion model that addresses limitations through a fully trainable pipeline from raw characters to document-level representations. At its core is a differentiable tokenizer that learns probabilistic beginning-of-sequence (BOS) decisions. Zonkey generates coherent, variable-length text from noise, demonstrating emergent hierarchies.
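A hedged sketch of what a differentiable tokenizer with probabilistic BOS decisions could look like: a per-character sigmoid gives boundary probabilities, and a soft running summary is reset at probable boundaries so gradients flow through the segmentation. The module name and the exact relaxation are assumptions, not Zonkey's implementation.

```python
import torch
import torch.nn as nn

class SoftBoundaryTokenizer(nn.Module):
    # Illustrative differentiable tokenizer: every character position emits a
    # probabilistic beginning-of-sequence (BOS) score, and the running token
    # summary is softly reset wherever that probability is high.
    def __init__(self, d):
        super().__init__()
        self.char_enc = nn.GRU(d, d, batch_first=True)
        self.boundary = nn.Linear(d, 1)

    def forward(self, char_emb):   # (B, T, d) character embeddings
        h, _ = self.char_enc(char_emb)
        p_bos = torch.sigmoid(self.boundary(h))   # (B, T, 1) boundary probs
        out, acc = [], torch.zeros_like(h[:, 0])
        for t in range(h.size(1)):
            # p_bos ~ 1: start a fresh token; p_bos ~ 0: keep accumulating.
            acc = p_bos[:, t] * h[:, t] + (1 - p_bos[:, t]) * (acc + h[:, t]) / 2
            out.append(acc)
        return torch.stack(out, dim=1), p_bos

pooled, p = SoftBoundaryTokenizer(32)(torch.randn(2, 12, 32))
```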
arXiv Detail & Related papers (2026-01-29T14:17:37Z)
- S2Sent: Nested Selectivity Aware Sentence Representation Learning [5.284254208630281]
We propose S2Sent, a sentence representation selection mechanism. The selector performs spatial selection (SS) and nested frequency selection (FS) from a modular perspective. Extensive experiments have demonstrated that S2Sent achieves significant improvements over baseline methods.
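The summary names the two selectors but not their form; one plausible, explicitly speculative reading is sketched below: spatial selection as attention-weighted pooling over the token axis, and frequency selection as a learned gate applied in the Fourier domain of the sequence.

```python
import torch
import torch.nn as nn

class SpatialFrequencySelector(nn.Module):
    # Speculative reading of SS + FS: a learned low-pass gate in Fourier space
    # (frequency selection) followed by attention pooling over tokens (spatial).
    def __init__(self, d):
        super().__init__()
        self.score = nn.Linear(d, 1)                  # spatial selection weights
        self.freq_gate = nn.Parameter(torch.ones(d))  # per-channel frequency gate

    def forward(self, h):                    # h: (B, T, d) token states
        spec = torch.fft.rfft(h, dim=1)      # frequency selection along tokens
        spec = spec * torch.sigmoid(self.freq_gate)
        h = torch.fft.irfft(spec, n=h.size(1), dim=1)
        w = torch.softmax(self.score(h), dim=1)       # (B, T, 1) spatial weights
        return (w * h).sum(dim=1)                     # (B, d) sentence embedding

emb = SpatialFrequencySelector(64)(torch.randn(2, 10, 64))
```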
arXiv Detail & Related papers (2025-08-25T16:13:42Z)
- Evaluating Sparse Autoencoders: From Shallow Design to Matching Pursuit [16.996218963146788]
Sparse autoencoders (SAEs) have recently become central tools for interpretability. This paper evaluates SAEs in a controlled setting using MNIST. We introduce a multi-iteration SAE by unrolling Matching Pursuit (MP-SAE).
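Matching Pursuit is concrete enough to sketch as an unrolled encoder: each iteration correlates the residual with a dictionary, picks the best atom, and subtracts its contribution. Shapes and the iteration count below are assumed; this is a generic MP-SAE-style loop, not the paper's exact architecture.

```python
import torch

def mp_sae_encode(x, D, n_iters=3):
    # Unrolled Matching Pursuit encoder (sketch).
    # x: (B, d) inputs; D: (d, k) dictionary with unit-norm columns.
    z = torch.zeros(x.size(0), D.size(1), device=x.device)
    residual = x.clone()
    for _ in range(n_iters):
        corr = residual @ D                      # (B, k) correlations with atoms
        idx = corr.abs().argmax(dim=1)           # best atom per sample
        coef = corr.gather(1, idx.unsqueeze(1))  # its coefficient
        z.scatter_add_(1, idx.unsqueeze(1), coef)
        residual = residual - coef * D[:, idx].T # subtract atom contribution
    return z, x - residual                       # sparse codes, reconstruction

# Toy usage with a random unit-norm dictionary.
D = torch.nn.functional.normalize(torch.randn(16, 64), dim=0)
z, recon = mp_sae_encode(torch.randn(8, 16), D)
```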
arXiv Detail & Related papers (2025-06-05T16:57:58Z)
- Enhancing Latent Computation in Transformers with Latent Tokens [48.371764897314]
Augmenting large language models with auxiliary tokens has emerged as a promising strategy for enhancing model performance. We introduce a lightweight method termed latent tokens: dummy tokens that may be non-interpretable in natural language. The proposed latent tokens can be seamlessly integrated with a pre-trained Transformer, trained in a parameter-efficient manner, and applied flexibly at inference time.
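Since latent tokens are described as trainable dummy tokens attached to a frozen pre-trained Transformer, a parameter-efficient sketch is just a learnable embedding block prepended to the input sequence; the wrapper name and prepend position are assumptions.

```python
import torch
import torch.nn as nn

class LatentTokenWrapper(nn.Module):
    # Prepends n_latent trainable embeddings to the input of a frozen model.
    # Only the latent embeddings receive gradients (parameter-efficient).
    def __init__(self, model, d_model, n_latent=8):
        super().__init__()
        self.model = model
        for p in self.model.parameters():
            p.requires_grad = False               # keep the backbone frozen
        self.latent = nn.Parameter(torch.randn(n_latent, d_model) * 0.02)

    def forward(self, input_emb):                 # (B, T, d) token embeddings
        lat = self.latent.unsqueeze(0).expand(input_emb.size(0), -1, -1)
        return self.model(torch.cat([lat, input_emb], dim=1))

# Usage with any module mapping (B, T, d) -> (B, T, d), e.g. an encoder layer.
layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
y = LatentTokenWrapper(layer, d_model=64)(torch.randn(2, 10, 64))  # (2, 18, 64)
```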
arXiv Detail & Related papers (2025-05-19T02:35:53Z)
- Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence [6.991281327290525]
Language models lack the notion of interchangeable tokens. We formalize this machine learning problem and introduce alpha-covariance. Our findings establish a foundation for designing language models that can learn interchangeable token representations.
arXiv Detail & Related papers (2024-10-22T16:34:36Z)
- Activation Scaling for Steering and Interpreting Language Models [55.59689963561315]
We argue that successfully intervening on a model is a prerequisite for interpreting its internal workings.
We establish a three-term objective: a successful intervention should flip the model's prediction from the correct token to the wrong one, and vice versa.
Optimizing this objective with gradient-based methods lets us learn (and later evaluate) a specific kind of efficient and interpretable intervention.
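A minimal sketch of this kind of intervention under stated assumptions: learnable scalars rescale one layer's activations and are optimized by gradient descent until the wrong token outscores the correct one, with a penalty keeping the intervention close to identity. The margin form and penalty weight are illustrative, not the paper's exact three-term objective.

```python
import torch
import torch.nn as nn

def flip_loss(logits, correct_id, wrong_id):
    # Intervention objective (sketch): make the wrong token outscore the
    # correct one by a margin, mirroring the "flip" criterion.
    return torch.relu(1.0 + logits[:, correct_id] - logits[:, wrong_id]).mean()

# Hypothetical setup: scale the output of one hidden layer per dimension.
d, vocab = 32, 100
layer, head = nn.Linear(d, d), nn.Linear(d, vocab)
alpha = nn.Parameter(torch.ones(d))        # learnable activation scaling

opt = torch.optim.Adam([alpha], lr=0.1)    # only the scalars are trained
x = torch.randn(4, d)
for _ in range(50):
    h = layer(x) * alpha                   # scaled activations
    loss = flip_loss(head(h), correct_id=3, wrong_id=7)
    loss = loss + 0.01 * (alpha - 1.0).abs().sum()  # stay close to identity
    opt.zero_grad()
    loss.backward()
    opt.step()
```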
arXiv Detail & Related papers (2024-10-07T12:01:32Z)
- Spatial Semantic Recurrent Mining for Referring Image Segmentation [63.34997546393106]
We propose S2RM to achieve high-quality cross-modality fusion.
It follows a three-stage strategy: language feature distribution, spatial semantic recurrent co-parsing, and parsed-semantic balancing.
Our proposed method performs favorably against other state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-15T00:17:48Z)
- Token Fusion: Bridging the Gap between Token Pruning and Token Merging [71.84591084401458]
Vision Transformers (ViTs) have emerged as powerful backbones in computer vision, outperforming many traditional CNNs.
Their computational overhead, largely attributed to the self-attention mechanism, makes deployment on resource-constrained edge devices challenging.
We introduce "Token Fusion" (ToFu), a method that amalgamates the benefits of both token pruning and token merging.
arXiv Detail & Related papers (2023-12-02T04:29:19Z)
- Guiding the PLMs with Semantic Anchors as Intermediate Supervision: Towards Interpretable Semantic Parsing [57.11806632758607]
We propose to augment current pretrained language models with a hierarchical decoder network.
By taking the first-principle structures as the semantic anchors, we propose two novel intermediate supervision tasks.
We conduct extensive experiments on several semantic parsing benchmarks and demonstrate that our approach consistently outperforms the baselines.
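The training signal described here reduces to a multi-task objective: the main parsing loss plus an intermediate loss that predicts anchor structures from decoder states. The head, shapes, and weighting below are illustrative assumptions.

```python
import torch
import torch.nn as nn

def anchor_supervised_loss(dec_hidden, parse_logits, parse_tgt,
                           anchor_head, anchor_tgt, lam=0.5):
    # Multi-task objective (sketch): main semantic-parsing loss plus an
    # intermediate loss predicting anchor labels from decoder states.
    main = nn.functional.cross_entropy(
        parse_logits.flatten(0, 1), parse_tgt.flatten())
    aux = nn.functional.cross_entropy(
        anchor_head(dec_hidden).flatten(0, 1), anchor_tgt.flatten())
    return main + lam * aux

# Toy shapes: decoder states (B, T, d), 50 parse tokens, 10 anchor labels.
B, T, d = 2, 7, 32
loss = anchor_supervised_loss(
    torch.randn(B, T, d), torch.randn(B, T, 50),
    torch.randint(0, 50, (B, T)), nn.Linear(d, 10),
    torch.randint(0, 10, (B, T)))
```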
arXiv Detail & Related papers (2022-10-04T07:27:29Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
When paired with a strong auto-regressive decoder, VAEs tend to ignore their latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
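One standard way to realize a discrete latent bottleneck the decoder cannot ignore is vector quantization with a straight-through estimator, sketched below; the paper's exact mechanism may differ, so treat this as an assumption-laden illustration.

```python
import torch
import torch.nn as nn

class DiscreteBottleneck(nn.Module):
    # VQ-style discrete latent bottleneck (sketch). Quantizes encoder outputs
    # to the nearest codebook entry; a straight-through estimator lets
    # gradients pass to the encoder.
    def __init__(self, n_codes, d):
        super().__init__()
        self.codebook = nn.Embedding(n_codes, d)

    def forward(self, z_e):                    # (B, d) continuous latents
        dist = torch.cdist(z_e, self.codebook.weight)   # (B, n_codes)
        idx = dist.argmin(dim=1)
        z_q = self.codebook(idx)               # nearest discrete code
        commit = (z_e - z_q.detach()).pow(2).mean()     # commitment loss
        embed = (z_q - z_e.detach()).pow(2).mean()      # codebook loss
        z_st = z_e + (z_q - z_e).detach()      # straight-through estimator
        return z_st, commit + 0.25 * embed, idx

zq, vq_loss, idx = DiscreteBottleneck(64, 16)(torch.randn(8, 16))
```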
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.