A transfer learning approach for automatic conflicts detection in software requirement sentence pairs based on dual encoders
- URL: http://arxiv.org/abs/2511.23007v1
- Date: Fri, 28 Nov 2025 09:16:35 GMT
- Title: A transfer learning approach for automatic conflicts detection in software requirement sentence pairs based on dual encoders
- Authors: Yizheng Wang, Tao Jiang, Jinyan Bai, Zhengbin Zou, Tiancheng Xue, Nan Zhang, Jie Luan,
- Abstract summary: This paper proposes a Transferable Software Requirement Conflict Detection Framework based on SBERT and SimCSE.<n>The proposed framework achieves a 10.4% improvement in both macro-F1 and weighted-F1 scores in in-domain settings, and an 11.4% increase in macro-F1 in cross-domain scenarios.
- Score: 5.780557526613627
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Software Requirement Document (RD) typically contain tens of thousands of individual requirements, and ensuring consistency among these requirements is critical for the success of software engineering projects. Automated detection methods can significantly enhance efficiency and reduce costs; however, existing approaches still face several challenges, including low detection accuracy on imbalanced data, limited semantic extraction due to the use of a single encoder, and suboptimal performance in cross-domain transfer learning. To address these issues, this paper proposes a Transferable Software Requirement Conflict Detection Framework based on SBERT and SimCSE, termed TSRCDF-SS. First, the framework employs two independent encoders, Sentence-BERT (SBERT) and Simple Contrastive Sentence Embedding (SimCSE), to generate sentence embeddings for requirement pairs, followed by a six-element concatenation strategy. Furthermore, the classifier is enhanced by a two-layer fully connected feedforward neural network (FFNN) with a hybrid loss optimization strategy that integrates a variant of Focal Loss, domain-specific constraints, and a confidence-based penalty term. Finally, the framework synergistically integrates sequential and cross-domain transfer learning. Experimental results demonstrate that the proposed framework achieves a 10.4% improvement in both macro-F1 and weighted-F1 scores in in-domain settings, and an 11.4% increase in macro-F1 in cross-domain scenarios.
Related papers
- Refining Decision Boundaries In Anomaly Detection Using Similarity Search Within the Feature Space [3.3202103799131795]
We introduce SDA2E, a Sparse Dual Adversarial Attention-based AutoEncoder designed to learn compact and discriminative latent representations from imbalanced, high-dimensional data.<n>We propose a similarity-guided active learning framework that integrates three novel strategies to refine decision boundaries efficiently.<n>We evaluate SDA2E extensively across 52 imbalanced datasets, including multiple DARPA Transparent Computing scenarios, and benchmark it against 15 state-of-the-art anomaly detection methods.
arXiv Detail & Related papers (2026-02-02T23:55:08Z) - QoS-Aware Hierarchical Reinforcement Learning for Joint Link Selection and Trajectory Optimization in SAGIN-Supported UAV Mobility Management [52.15690855486153]
A space-air-ground integrated network (SAGIN) has emerged as an essential architecture for enabling ubiquitous UAV connectivity.<n>This paper formulates UAV mobility management in SAGIN as a constrained multiobjective joint optimization problem.
arXiv Detail & Related papers (2025-12-17T06:22:46Z) - CoT-Saliency: Unified Chain-of-Thought Reasoning for Heterogeneous Saliency Tasks [96.64597365827046]
We present the first unified framework that jointly handles three operationally heterogeneous saliency tasks.<n>We introduce a Chain-of-Thought (CoT) reasoning process in a Vision-Language Model (VLM) to bridge task heterogeneity.<n>We show our model matches or outperforms specialized SOTA methods and strong closed-source VLMs across all tasks.
arXiv Detail & Related papers (2025-11-01T04:37:01Z) - Localized Kernel Projection Outlyingness: A Two-Stage Approach for Multi-Modal Outlier Detection [0.0]
Two-Stage LKPLO is a novel multi-stage outlier detection framework.<n>It overcomes the coexisting limitations of conventional projection-based methods.<n>It achieves state-of-the-art performance on challenging datasets.
arXiv Detail & Related papers (2025-10-28T03:53:46Z) - Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning [71.30276778807068]
We propose a unified framework that strategically coordinates sample pruning and token pruning.<n>Q-Tuning achieves a +38% average improvement over the full-data SFT baseline using only 12.5% of the original training data.
arXiv Detail & Related papers (2025-09-28T13:27:38Z) - Domain Adaptation via Feature Refinement [0.3867363075280543]
We propose Domain Adaptation via Feature Refinement (DAFR2), a simple yet effective framework for unsupervised domain adaptation under distribution shift.<n>The proposed method combines three key components: adaptation of Batch Normalization statistics using unlabeled target data, feature distillation from a source-trained model and hypothesis transfer.
arXiv Detail & Related papers (2025-08-22T06:32:19Z) - Statistical Analysis of Conditional Group Distributionally Robust Optimization with Cross-Entropy Loss [16.1456465253627]
We study multi-source unsupervised domain adaptation, where labeled data are available from multiple source domains and only unlabeled data are observed from the target domain.<n>We propose a novel Group Distributionally Conditional Optimization framework that learns a classifier by minimizing the worst-case cross-entropy loss over the convex combinations of the conditional outcome distributions from sources domains.<n>We establish fast statistical convergence rates for the empirical CG-DRO estimator by constructing two surrogate minimax optimization problems that serve as theoretical bridges.
arXiv Detail & Related papers (2025-07-14T04:21:23Z) - NDCG-Consistent Softmax Approximation with Accelerated Convergence [67.10365329542365]
We propose novel loss formulations that align directly with ranking metrics.<n>We integrate the proposed RG losses with the highly efficient Alternating Least Squares (ALS) optimization method.<n> Empirical evaluations on real-world datasets demonstrate that our approach achieves comparable or superior ranking performance.
arXiv Detail & Related papers (2025-06-11T06:59:17Z) - Sparse Optimization for Transfer Learning: A L0-Regularized Framework for Multi-Source Domain Adaptation [9.605924781372849]
We propose a Sparse Optimization for Transfer Learning framework based on L0-regularization.<n> Simulations show that SOTL significantly improves both estimation accuracy and computational speed.<n> Empirical validation on the Community and Crime benchmarks demonstrates the statistical robustness of the SOTL method in cross-domain transfer.
arXiv Detail & Related papers (2025-04-07T08:06:16Z) - Communication-Efficient Federated Learning by Quantized Variance Reduction for Heterogeneous Wireless Edge Networks [55.467288506826755]
Federated learning (FL) has been recognized as a viable solution for local-privacy-aware collaborative model training in wireless edge networks.<n>Most existing communication-efficient FL algorithms fail to reduce the significant inter-device variance.<n>We propose a novel communication-efficient FL algorithm, named FedQVR, which relies on a sophisticated variance-reduced scheme.
arXiv Detail & Related papers (2025-01-20T04:26:21Z) - Disentangled Federated Learning for Tackling Attributes Skew via
Invariant Aggregation and Diversity Transferring [104.19414150171472]
Attributes skews the current federated learning (FL) frameworks from consistent optimization directions among the clients.
We propose disentangled federated learning (DFL) to disentangle the domain-specific and cross-invariant attributes into two complementary branches.
Experiments verify that DFL facilitates FL with higher performance, better interpretability, and faster convergence rate, compared with SOTA FL methods.
arXiv Detail & Related papers (2022-06-14T13:12:12Z) - Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.