Detecting Semantic Conflicts with Unit Tests
- URL: http://arxiv.org/abs/2310.02395v1
- Date: Tue, 3 Oct 2023 19:36:28 GMT
- Title: Detecting Semantic Conflicts with Unit Tests
- Authors: Léuson Da Silva, Paulo Borba, Toni Maciel, Wardah Mahmood, Thorsten
Berger, João Moisakis, Aldiberg Gomes, Vinícius Leite
- Abstract summary: Branching and merging are common practices in software development, increasing developers' productivity.
Modern merge techniques can resolve textual conflicts automatically, but they fail when the conflict arises at the semantic level.
We propose SAM (SemAntic Merge), a semantic merge tool based on the automated generation of unit tests.
- Score: 5.273883263686449
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Branching and merging are common practices in collaborative software
development, increasing developers' productivity. Despite these benefits,
developers need to merge software and resolve merge conflicts. While modern
merge techniques can resolve textual conflicts automatically, they fail when
the conflict arises at the semantic level. Although semantic merge tools have
been proposed, they are usually based on heavyweight static analyses or need
explicit specifications of program behavior. In this work, we take a different
route and propose SAM (SemAntic Merge), a semantic merge tool based on the
automated generation of unit tests that are used as partial specifications. To
evaluate SAM's feasibility for detecting conflicts, we perform an empirical
study analyzing more than 80 pairs of changes integrated into common class
elements from 51 merge scenarios. Furthermore, we also assess how the four
unit-test generation tools used by SAM contribute to conflict identification.
We propose and assess the adoption of Testability Transformations and
Serialization. Our results show that SAM performs best when combining only the
tests generated by Differential EvoSuite and EvoSuite and using the proposed
Testability Transformations (detecting nine conflicts out of 28). These results
reinforce previous findings about the potential of using test-case generation
to detect semantic conflicts.
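The core idea of using generated tests as partial specifications can be sketched as a decision over pass/fail results on the four commits of a merge scenario. This is an illustrative simplification, not SAM's actual implementation: a test that captures behavior introduced by one parent (failing on base, passing on that parent) but failing on the merge commit suggests the other parent's change interfered with it.

```python
# Hedged sketch of a conflict criterion over test outcomes on the four
# commits of a merge scenario: base, left parent, right parent, merge.
# Each flag is True if the generated test passes on that commit.

def reveals_conflict(base: bool, left: bool, right: bool, merge: bool) -> bool:
    """A test reveals interference when it encodes behavior added by one
    parent (fails on base, passes on that parent) that the merged
    program no longer satisfies (fails on merge)."""
    left_behavior_lost = (not base) and left and (not merge)
    right_behavior_lost = (not base) and right and (not merge)
    return left_behavior_lost or right_behavior_lost

# Behavior added on the left branch is lost in the merge: conflict.
assert reveals_conflict(base=False, left=True, right=False, merge=False)
# The merge preserves the behavior: no conflict.
assert not reveals_conflict(base=False, left=True, right=False, merge=True)
```

In the paper's setting, the tests themselves come from generators such as EvoSuite and Differential EvoSuite; the criterion above only shows how their outcomes can be interpreted.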
Related papers
- Evaluation of Version Control Merge Tools [3.1969855247377836]
A version control system, such as Git, requires a way to integrate changes from different developers or branches.
A merge tool either outputs a clean integration of the changes, or it outputs a conflict for manual resolution.
New merge tools have been proposed, but they have not yet been evaluated against one another.
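The clean-integration-or-conflict behavior described above can be sketched with a simplified, line-aligned three-way merge. This is a toy illustration under a strong assumption (all three versions have the same number of lines); real tools such as `git merge-file` align changed regions with a diff3-style algorithm first.

```python
# Toy line-aligned three-way merge: take a side's change when only that
# side edited a line; report a conflict when both sides edited it
# differently. Assumes base, left, and right have equal line counts.

def merge_lines(base, left, right):
    merged, conflicts = [], []
    for i, (b, l, r) in enumerate(zip(base, left, right)):
        if l == r:            # both sides agree (or neither changed)
            merged.append(l)
        elif l == b:          # only the right side changed this line
            merged.append(r)
        elif r == b:          # only the left side changed this line
            merged.append(l)
        else:                 # both changed it differently: conflict
            merged.append(f"<<<<<<< {l} ======= {r} >>>>>>>")
            conflicts.append(i)
    return merged, conflicts

# Clean integration: each side edits a different line.
merged, conflicts = merge_lines(["a", "b", "c"], ["a", "B", "c"], ["a", "b", "C"])
# merged == ["a", "B", "C"], conflicts == []
```

Note that a clean textual merge like this one says nothing about semantic correctness, which is exactly the gap the main paper targets.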
arXiv Detail & Related papers (2024-10-13T17:35:14Z)
- CONGRA: Benchmarking Automatic Conflict Resolution [3.9910625211670485]
ConGra is a benchmarking scheme designed to evaluate the performance of software merging tools.
We build a large-scale evaluation dataset based on 44,948 conflicts from 34 real-world projects.
arXiv Detail & Related papers (2024-09-21T12:21:41Z)
- AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge [57.66282463340297]
Knowledge conflict arises from discrepancies between information in the context of a large language model (LLM) and the knowledge stored in its parameters.
We propose a fine-grained, instance-level approach called AdaCAD, which dynamically infers the weight of adjustment based on the degree of conflict.
arXiv Detail & Related papers (2024-09-11T16:35:18Z)
- Leveraging Large Language Models for Enhancing the Understandability of Generated Unit Tests [4.574205608859157]
We introduce UTGen, which combines search-based software testing and large language models to enhance the understandability of automatically generated test cases.
We observe that participants working on assignments with UTGen test cases fix up to 33% more bugs and use up to 20% less time when compared to baseline test cases.
arXiv Detail & Related papers (2024-08-21T15:35:34Z)
- SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation [88.80792308991867]
The Segment Anything Model (SAM) has shown the ability to group image pixels into patches, but applying it to semantic-aware segmentation still faces major challenges.
This paper presents SAM-CP, a simple approach that establishes two types of composable prompts beyond SAM and composes them for versatile segmentation.
Experiments show that SAM-CP achieves semantic, instance, and panoptic segmentation in both open and closed domains.
arXiv Detail & Related papers (2024-07-23T17:47:25Z)
- Observation-based unit test generation at Meta [52.4716552057909]
TestGen automatically generates unit tests, carved from serialized observations of complex objects seen during app execution.
TestGen has landed 518 tests into production, which have been executed 9,617,349 times in continuous integration, finding 5,702 faults.
Our evaluation reveals that, when carving its observations from 4,361 reliable end-to-end tests, TestGen was able to generate tests for at least 86% of the classes covered by end-to-end tests.
arXiv Detail & Related papers (2024-02-09T00:34:39Z)
- Code-Aware Prompting: A study of Coverage Guided Test Generation in Regression Setting using LLM [32.44432906540792]
We present SymPrompt, a code-aware prompting strategy for large language models in test generation.
SymPrompt enhances correct test generations by a factor of 5 and bolsters relative coverage by 26% for CodeGen2.
Notably, when applied to GPT-4, SymPrompt improves coverage by over 2x compared to baseline prompting strategies.
arXiv Detail & Related papers (2024-01-31T18:21:49Z)
- Detecting Semantic Conflicts using Static Analysis [1.201626478128059]
We propose a technique that explores the use of static analysis to detect interference when merging contributions from two developers.
We evaluate our technique using a dataset of 99 experimental units extracted from merge scenarios.
arXiv Detail & Related papers (2023-10-06T14:13:16Z)
- Towards Automatic Generation of Amplified Regression Test Oracles [44.45138073080198]
We propose a test oracle derivation approach to amplify regression test oracles.
The approach monitors the object state during test execution and compares it to the previous version to detect any changes in relation to the SUT's intended behaviour.
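The state-comparison idea behind such amplified oracles can be sketched as snapshotting the SUT's observable object state under each version and diffing the snapshots. All names below are illustrative, not the paper's API.

```python
# Hedged sketch of regression-oracle amplification by state comparison:
# capture an object's public attributes after a test run on the old and
# new versions, then surface every attribute whose value changed.

def snapshot(obj) -> dict:
    """Capture an object's public attributes as a comparable dict."""
    return {k: v for k, v in vars(obj).items() if not k.startswith("_")}

def state_changes(old_state: dict, new_state: dict) -> dict:
    """Return attributes whose values differ between the two versions."""
    keys = old_state.keys() | new_state.keys()
    return {k: (old_state.get(k), new_state.get(k))
            for k in keys if old_state.get(k) != new_state.get(k)}

class AccountV1:  # hypothetical SUT, old version
    def __init__(self):
        self.balance = 100
        self.currency = "USD"

class AccountV2:  # new version silently changes a default
    def __init__(self):
        self.balance = 100
        self.currency = "EUR"

diff = state_changes(snapshot(AccountV1()), snapshot(AccountV2()))
# diff == {"currency": ("USD", "EUR")}
```

Each flagged attribute becomes a candidate oracle: a human (or a policy) then decides whether the change matches the SUT's intended behaviour or is a regression.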
arXiv Detail & Related papers (2023-07-28T12:38:44Z)
- Generalizable Metric Network for Cross-domain Person Re-identification [55.71632958027289]
The cross-domain (i.e., domain generalization) setting presents a challenge in Re-ID tasks.
Most existing methods aim to learn domain-invariant or robust features for all domains.
We propose a Generalizable Metric Network (GMN) to explore sample similarity in the sample-pair space.
arXiv Detail & Related papers (2023-06-21T03:05:25Z)
- Semantic-Aligned Matching for Enhanced DETR Convergence and Multi-Scale Feature Fusion [95.7732308775325]
The DEtection TRansformer (DETR) has established a fully end-to-end paradigm for object detection.
DETR suffers from slow training convergence, which hinders its applicability to various detection tasks.
We design Semantic-Aligned-Matching DETR++ to accelerate DETR's convergence and improve detection performance.
arXiv Detail & Related papers (2022-07-28T15:34:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.