Related papers: A Machine Learning Approach for Hierarchical Classification of Software Requirements

URL: http://arxiv.org/abs/2302.12599v1
Date: Fri, 24 Feb 2023 12:33:55 GMT
Title: A Machine Learning Approach for Hierarchical Classification of Software Requirements
Authors: Manal Binkhonain, Liping Zhao
Abstract summary: The paper proposes HC4RC, a novel ML approach for multiclass classification of requirements. We experimentally compare the effectiveness of HC4RC with three closely related approaches.
Score: 3.8377728124578856
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Context: Classification of software requirements into different categories is a critically important task in requirements engineering (RE). Developing machine learning (ML) approaches for requirements classification has attracted great interest in the RE community since the 2000s. Objective: This paper aims to address two related problems that have been challenging real-world applications of ML approaches: the problems of class imbalance and high dimensionality with low sample size data (HDLSS). These problems can greatly degrade the classification performance of ML methods. Method: The paper proposes HC4RC, a novel ML approach for multiclass classification of requirements. HC4RC solves the aforementioned problems through semantic-role-based feature selection, dataset decomposition and hierarchical classification. We experimentally compare the effectiveness of HC4RC with three closely related approaches - two of which are based on a traditional statistical classification model whereas one uses an advanced deep learning model. Results: Our experiment shows: 1) The class imbalance and HDLSS problems present a challenge to both traditional and advanced ML approaches. 2) The HC4RC approach is simple to use and can effectively address the class imbalance and HDLSS problems compared to similar approaches. Conclusion: This paper makes an important practical contribution to addressing the class imbalance and HDLSS problems in multiclass classification of software requirements.
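The hierarchical classification idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' HC4RC implementation: the abstract does not specify the hierarchy, the base model, or the features, so the two-level grouping ("functional" versus a hypothetical "quality" group), the toy requirements, and the bag-of-words `CentroidClassifier` below are all assumptions made for illustration. The semantic-role-based feature selection step is omitted entirely.

```python
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


class CentroidClassifier:
    """Toy bag-of-words classifier; a stand-in for any base model (assumption)."""

    def fit(self, texts, labels):
        doc_counts = Counter(labels)
        self.centroids = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.centroids[label].update(tokenize(text))
        # Average term counts per class so larger classes do not dominate.
        for label, centroid in self.centroids.items():
            for tok in centroid:
                centroid[tok] /= doc_counts[label]
        return self

    def predict(self, text):
        query = Counter(tokenize(text))

        def score(label):
            centroid = self.centroids[label]
            return sum(centroid[tok] * n for tok, n in query.items())

        # sorted() makes tie-breaking deterministic.
        return max(sorted(self.centroids), key=score)


class HierarchicalClassifier:
    """Two-level scheme: pick a parent group first, then a leaf class within it."""

    def __init__(self, hierarchy):
        # hierarchy maps each parent group to its leaf classes (hypothetical grouping).
        self.hierarchy = hierarchy
        self.leaf_to_group = {
            leaf: group for group, leaves in hierarchy.items() for leaf in leaves
        }

    def fit(self, texts, labels):
        groups = [self.leaf_to_group[label] for label in labels]
        self.root = CentroidClassifier().fit(texts, groups)
        self.children = {}
        for group, leaves in self.hierarchy.items():
            if len(leaves) > 1:
                # Dataset decomposition: each child sees only its own group's samples.
                subset = [
                    (t, l) for t, l in zip(texts, labels)
                    if self.leaf_to_group[l] == group
                ]
                sub_texts, sub_labels = zip(*subset)
                self.children[group] = CentroidClassifier().fit(sub_texts, sub_labels)
        return self

    def predict(self, text):
        group = self.root.predict(text)
        child = self.children.get(group)
        return child.predict(text) if child else self.hierarchy[group][0]


# Invented toy requirements, for illustration only.
train = [
    ("the system shall allow users to create an invoice", "functional"),
    ("the system shall export reports as pdf", "functional"),
    ("the system shall let admins delete an account", "functional"),
    ("all passwords must be encrypted", "security"),
    ("access must require authentication", "security"),
    ("pages must load within two seconds", "performance"),
    ("search must respond within one second", "performance"),
]
texts, labels = zip(*train)
hierarchy = {"functional": ["functional"], "quality": ["security", "performance"]}
clf = HierarchicalClassifier(hierarchy).fit(texts, labels)

for query in ("login must require authentication",
              "reports must load within one second"):
    print(query, "->", clf.predict(query))
```

Splitting the decision this way means each classifier only has to separate a few classes, and the small classes are compared against each other rather than against the dominant one, which is the structural intuition behind addressing class imbalance through decomposition.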

Related papers

Unbiased Max-Min Embedding Classification for Transductive Few-Shot Learning: Clustering and Classification Are All You Need [83.10178754323955]
Few-shot learning enables models to generalize from only a few labeled examples. We propose the Unbiased Max-Min Embedding Classification (UMMEC) Method, which addresses the key challenges in few-shot learning. Our method significantly improves classification performance with minimal labeled data, advancing the state of the art.
arXiv Detail & Related papers (2025-03-28T07:23:07Z)
Class-Independent Increment: An Efficient Approach for Multi-label Class-Incremental Learning [49.65841002338575]
This paper focuses on the challenging yet practical multi-label class-incremental learning (MLCIL) problem. We propose a novel class-independent incremental network (CINet) to extract multiple class-level embeddings for multi-label samples. It learns and preserves the knowledge of different classes by constructing class-specific tokens.
arXiv Detail & Related papers (2025-03-01T14:40:52Z)
The Multiplex Classification Framework: optimizing multi-label classifiers through problem transformation, ontology engineering, and model ensembling [0.0]
This paper introduces the Multiplex Classification Framework. The framework offers several advantages, including adaptability to any number of classes and logical constraints. Two experiments were conducted to compare the performance of conventional classification models with the Multiplex approach.
arXiv Detail & Related papers (2024-12-18T20:07:27Z)
ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection [60.297079601066784]
We introduce ErrorRadar, the first benchmark designed to assess MLLMs' capabilities in error detection. ErrorRadar evaluates two sub-tasks: error step identification and error categorization. It consists of 2,500 high-quality multimodal K-12 mathematical problems, collected from real-world student interactions. Results indicate that significant challenges remain, as GPT-4o, the best-performing model, is still around 10% behind human evaluation.
arXiv Detail & Related papers (2024-10-06T14:59:09Z)
Sequential Binary Classification for Intrusion Detection [0.0]
IDS datasets suffer from high class imbalance, which impacts the performance of standard ML models. This paper explores a structural approach to handling class imbalance in multi-class classification problems. Experiments on benchmark IDS datasets demonstrate that the structural approach to handling class-imbalance, as exemplified by SBC, is a viable approach to handling the issue.
arXiv Detail & Related papers (2024-06-10T08:34:13Z)
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF [82.73541793388]
We introduce the first principled algorithmic framework for solving bilevel RL problems through the lens of penalty formulation. We provide theoretical studies of the problem landscape and its penalty-based gradient (policy) algorithms. We demonstrate the effectiveness of our algorithms via simulations in the Stackelberg Markov game, RL from human feedback and incentive design.
arXiv Detail & Related papers (2024-02-10T04:54:15Z)
Classification, Challenges, and Automated Approaches to Handle Non-Functional Requirements in ML-Enabled Systems: A Systematic Literature Review [10.09767622002672]
We propose a systematic literature review targeting two key aspects: the classification of the non-functional requirements investigated so far, and the challenges to be faced when developing models in ML-enabled systems. We report that current research identified 30 different non-functional requirements, which can be grouped into six main classes. We also compiled a catalog of more than 23 software engineering challenges, based on which further research should consider the non-functional requirements of machine learning-enabled systems.
arXiv Detail & Related papers (2023-11-29T09:45:41Z)
Few-shot Class-incremental Learning: A Survey [16.729567512584822]
Few-shot Class-Incremental Learning (FSCIL) presents a unique challenge in Machine Learning (ML). This paper aims to provide a comprehensive and systematic review of FSCIL.
arXiv Detail & Related papers (2023-08-13T13:01:21Z)
A Survey of Methods for Addressing Class Imbalance in Deep-Learning Based Natural Language Processing [68.37496795076203]
We provide guidance for NLP researchers and practitioners dealing with imbalanced data. We first discuss various types of controlled and real-world class imbalance. We organize the methods by whether they are based on sampling, data augmentation, choice of loss function, staged learning, or model design.
arXiv Detail & Related papers (2022-10-10T13:26:40Z)
Class-Imbalanced Complementary-Label Learning via Weighted Loss [8.934943507699131]
Complementary-label learning (CLL) is widely used in weakly supervised classification. It faces a significant challenge in real-world datasets when confronted with class-imbalanced training samples. We propose a novel problem setting that enables learning from class-imbalanced complementary labels for multi-class classification.
arXiv Detail & Related papers (2022-09-28T16:02:42Z)
Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees [115.08148491584997]
We present the first theoretically grounded distributed methods for solving variational inequalities and saddle point problems using compressed communication: MASHA1 and MASHA2. The new algorithms support bidirectional compression and can also be modified for settings with batches and for federated learning with partial participation of clients.
arXiv Detail & Related papers (2021-10-07T10:04:32Z)
Learning with Multiclass AUC: Theory and Algorithms [141.63211412386283]
Area under the ROC curve (AUC) is a well-known ranking metric for problems such as imbalanced learning and recommender systems. In this paper, we start an early trial to consider the problem of learning multiclass scoring functions via optimizing multiclass AUC metrics.
arXiv Detail & Related papers (2021-07-28T05:18:10Z)
An Online Method for A Class of Distributionally Robust Optimization with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems. Our studies demonstrate important applications in machine learning for improving the robustness of networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z)
Combined Cleaning and Resampling Algorithm for Multi-Class Imbalanced Data with Label Noise [11.868507571027626]
In this paper, we propose a novel oversampling technique, a Multi-Class Combined Cleaning and Resampling algorithm. The proposed method uses an energy-based approach to model the regions suitable for oversampling, and is less affected by small disjuncts and outliers than SMOTE. It combines this with a simultaneous cleaning operation, which aims to reduce the effect of overlapping class distributions on the performance of the learning algorithms.
arXiv Detail & Related papers (2020-04-07T13:59:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.