Warnings: Violation Symptoms Indicating Architecture Erosion
- URL: http://arxiv.org/abs/2212.12168v2
- Date: Fri, 4 Aug 2023 14:45:10 GMT
- Title: Warnings: Violation Symptoms Indicating Architecture Erosion
- Authors: Ruiyin Li, Peng Liang, Paris Avgeriou
- Abstract summary: We investigated the characteristics of architecture violation symptoms in code review comments from the developers' perspective.
Ten categories of violation symptoms are discussed by developers during the code review process.
The most frequently-used linguistic pattern is Problem Discovery.
- Score: 2.6580082406002705
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As a software system evolves, its architecture tends to degrade, and
gradually impedes software maintenance and evolution activities and negatively
impacts the quality attributes of the system. The main root cause behind the
architecture erosion phenomenon derives from violation symptoms (such as
violations of architecture patterns). Previous studies focus on detecting
violations in software systems using architecture conformance checking
approaches. However, code review comments are also rich sources that may
contain extensive discussions regarding architecture violations. In this work,
we investigated the characteristics of architecture violation symptoms in code
review comments from the developers' perspective. We employed a set of keywords
related to violation symptoms to collect 606 (out of 21,583) code review
comments from four popular OSS projects in the OpenStack and Qt communities. We
manually analyzed the collected 606 review comments to provide the categories
and linguistic patterns of violation symptoms, as well as how developers
reacted to and addressed them. Our findings show that: (1) 10 categories of
violation symptoms are discussed by developers during the code review process;
(2) The terms frequently used to express violation symptoms are
"inconsistent" and "violate", and the most frequently-used linguistic pattern
is Problem Discovery; (3) Refactoring and removing code are the major measures
(90%) to tackle violation symptoms, while a few violation symptoms were ignored
by developers. Our findings suggest that the investigation of violation
symptoms can help researchers better understand the characteristics of
architecture erosion and facilitate the development and maintenance activities,
and developers should explicitly manage violation symptoms, not only for
addressing the existing architecture violations but also for preventing future
violations.
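The keyword-based collection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' tooling: only the terms "inconsistent" and "violate" come from the abstract, while the function name `filter_comments`, the prefix-matching rule, and the sample comments are assumptions made up for illustration.

```python
import re

# Only these two terms are named in the abstract; the paper's full keyword
# set is not reproduced here.
KEYWORDS = ["inconsistent", "violate"]

# Assumption: match each keyword as a word prefix, so that simple inflections
# such as "violates" or "violated" are also caught.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\w*",
    re.IGNORECASE,
)


def filter_comments(comments):
    """Return the comments that mention at least one keyword."""
    return [c for c in comments if PATTERN.search(c)]


# Made-up review comments, used only to exercise the filter.
sample_comments = [
    "This change is inconsistent with how the other drivers load config.",
    "Looks good to me, thanks for the quick fix.",
    "Calling the DB layer from the API module violates the layering we agreed on.",
]

matched = filter_comments(sample_comments)
for comment in matched:
    print(comment)
```

A filter of this kind only narrows the pool of candidates. In the study, the 606 comments retrieved by keywords (about 2.8% of 21,583) were then analyzed manually.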
Related papers
- A Causal Perspective on Measuring, Explaining and Mitigating Smells in LLM-Generated Code [49.09545217453401]
Propensity Smelly Score (PSC) is a metric that estimates the likelihood of generating particular smell types. We identify how generation strategy, model size, model architecture and prompt formulation shape the structural properties of generated code. PSC helps developers interpret model behavior and assess code quality, providing evidence that smell propensity signals can support human judgement.
arXiv Detail & Related papers (2025-11-19T19:18:28Z) - Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics [89.1999907891494]
We present WebDetective, a benchmark of hint-free multi-hop questions paired with a controlled Wikipedia sandbox. Our evaluation of 25 state-of-the-art models reveals systematic weaknesses across all architectures. We develop an agentic workflow, EvidenceLoop, that explicitly targets the challenges our benchmark identifies.
arXiv Detail & Related papers (2025-10-01T07:59:03Z) - On the need to perform comprehensive evaluations of automated program repair benchmarks: Sorald case study [4.968268396950843]
Automated program repair (APR) tools aim to improve code quality by automatically addressing violations detected by static analysis profilers. Previous research tends to evaluate APR tools only for their ability to clear violations. This study evaluates Sorald, a state-of-the-art APR tool, as a proof of concept.
arXiv Detail & Related papers (2025-08-21T00:12:14Z) - Defects4Log: Benchmarking LLMs for Logging Code Defect Detection and Reasoning [17.585929362588555]
Logging code is written by developers to capture system runtime behavior. Defects in logging code can undermine the usefulness of logs and lead to misinterpretations. Large language models (LLMs) have demonstrated promising generalization and reasoning capabilities.
arXiv Detail & Related papers (2025-08-15T08:20:09Z) - FaultLine: Automated Proof-of-Vulnerability Generation Using LLM Agents [17.658431034176065]
FaultLine is an agent workflow that automatically generates proof-of-vulnerability (PoV) test cases. It does not use language-specific static or dynamic analysis components, which enables it to be used across programming languages. On a dataset of 100 known vulnerabilities in Java, C and C++ projects, FaultLine is able to generate PoV tests for 16 projects, compared to just 9 for CodeAct 2.1.
arXiv Detail & Related papers (2025-07-21T04:55:34Z) - Bugs in the Shadows: Static Detection of Faulty Python Refactorings [44.115219601924856]
Python's dynamic type system poses significant challenges for automated code transformations. Our analysis uncovered 29 bugs across four types from a total of 1,152 attempts. These results highlight the need to improve the robustness of current Python tools to ensure the correctness of automated code transformations.
arXiv Detail & Related papers (2025-07-01T18:03:56Z) - Detecting the Root Cause Code Lines in Bug-Fixing Commits by Heterogeneous Graph Learning [1.5213722322518697]
Automated defect prediction tools can proactively identify software changes prone to defects within software projects. Existing work in heterogeneous and complex software projects continues to face challenges, such as struggling with heterogeneous commit structures and ignoring cross-line dependencies in code changes. We propose an approach called RC_Detector, which consists of three main components: the bug-fixing graph construction component, the code semantic aggregation component, and the cross-line semantic retention component.
arXiv Detail & Related papers (2025-05-02T05:39:50Z) - Seeker: Towards Exception Safety Code Generation with Intermediate Language Agents Framework [58.36391985790157]
In real world software development, improper or missing exception handling can severely impact the robustness and reliability of code.
We explore the use of large language models (LLMs) to improve exception handling in code.
We propose Seeker, a multi-agent framework inspired by expert developer strategies for exception handling.
arXiv Detail & Related papers (2024-12-16T12:35:29Z) - Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
We analyzed 2,401 code review comments from Java open-source projects on GitHub.
83.9% of suggestions for improvement were accepted and integrated, with fewer than 1% later reverted.
arXiv Detail & Related papers (2024-10-29T12:21:23Z) - Seeker: Enhancing Exception Handling in Code with LLM-based Multi-Agent Approach [54.03528377384397]
In real world software development, improper or missing exception handling can severely impact the robustness and reliability of code.
We explore the use of large language models (LLMs) to improve exception handling in code.
We propose Seeker, a multi-agent framework inspired by expert developer strategies for exception handling.
arXiv Detail & Related papers (2024-10-09T14:45:45Z) - Patch2QL: Discover Cognate Defects in Open Source Software Supply Chain With Auto-generated Static Analysis Rules [1.9591497166224197]
We propose a novel technique for detecting cognate defects in OSS through the automatic generation of SAST rules.
Specifically, it extracts key syntax and semantic information from pre- and post-patch versions of code.
We have implemented a prototype tool called Patch2QL and applied it to fundamental OSS in C/C++.
arXiv Detail & Related papers (2024-01-23T02:23:11Z) - Identifying Defect-Inducing Changes in Visual Code [54.20154707138088]
"SZZ Visual Code" (SZZ-VC) is an algorithm that finds changes in visual code based on the differences of graphical elements rather than differences of lines to detect defect-inducing changes.
We validated the algorithm for an industry-made AAA video game and 20 music visual programming defects across 12 open source projects.
arXiv Detail & Related papers (2023-09-07T00:12:28Z) - Security Defect Detection via Code Review: A Study of the OpenStack and Qt Communities [7.2944322548786715]
Security defects are not prevalently discussed in code review.
More than half of the reviewers provided explicit fixing strategies/solutions to help developers fix security defects.
Disagreements between the developer and the reviewer are the main causes for not resolving security defects.
arXiv Detail & Related papers (2023-07-05T14:30:41Z) - Towards Automated Identification of Violation Symptoms of Architecture Erosion [2.3649868749585874]
We developed 15 machine learning-based and 4 deep learning-based classifiers with three pre-trained word embeddings to identify violation symptoms of architecture erosion from developer discussions in code reviews.
We conducted a survey and semi-structured interviews to acquire feedback from involved participants who discussed architecture violations in code reviews.
The results show that the SVM classifier based on word2vec pre-trained word embedding performs the best with an F1-score of 0.779.
arXiv Detail & Related papers (2023-06-14T16:20:59Z) - A Hierarchical Deep Neural Network for Detecting Lines of Codes with Vulnerabilities [6.09170287691728]
Software vulnerabilities, caused by unintentional flaws in source code, are the main root cause of cyberattacks.
We propose a deep learning approach to detect vulnerabilities from their LLVM IR representations based on the techniques that have been used in natural language processing.
arXiv Detail & Related papers (2022-11-15T21:21:27Z) - Towards a Fair Comparison and Realistic Design and Evaluation Framework of Android Malware Detectors [63.75363908696257]
We analyze 10 influential research works on Android malware detection using a common evaluation framework.
We identify five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models.
We conclude that the studied ML-based detectors have been evaluated optimistically, which justifies the good published results.
arXiv Detail & Related papers (2022-05-25T08:28:08Z) - Early Detection of Security-Relevant Bug Reports using Machine Learning: How Far Are We? [6.438136820117887]
In a typical maintenance scenario, security-relevant bug reports are prioritised by the development team when preparing corrective patches.
Open security-relevant bug reports can become a critical leak of sensitive information that attackers can leverage to perform zero-day attacks.
In recent years, approaches for the detection of security-relevant bug reports based on machine learning have been reported with promising performance.
arXiv Detail & Related papers (2021-12-19T11:30:29Z) - No Need to Know Physics: Resilience of Process-based Model-free Anomaly Detection for Industrial Control Systems [95.54151664013011]
We present a novel framework to generate adversarial spoofing signals that violate physical properties of the system.
We analyze four anomaly detectors published at top security conferences.
arXiv Detail & Related papers (2020-12-07T11:02:44Z) - Deep Just-In-Time Inconsistency Detection Between Comments and Source Code [51.00904399653609]
In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code.
We develop a deep-learning approach that learns to correlate a comment with code changes.
We show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system.
arXiv Detail & Related papers (2020-10-04T16:49:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.