Detecting Semantic Clones of Unseen Functionality
- URL: http://arxiv.org/abs/2510.04143v1
- Date: Sun, 05 Oct 2025 10:45:52 GMT
- Title: Detecting Semantic Clones of Unseen Functionality
- Authors: Konstantinos Kitsios, Francesco Sovrano, Earl T. Barr, Alberto Bacchelli
- Abstract summary: We re-evaluate six state-of-the-art models, including both task-specific models and generative LLMs, on the task of detecting clones of unseen functionality. We propose and evaluate the use of contrastive learning to improve the performance of existing models on clones of unseen functionality.
- Score: 7.660632979515074
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic code clone detection is the task of detecting whether two snippets of code implement the same functionality (e.g., Sort Array). Recently, many neural models achieved near-perfect performance on this task. These models seek to make inferences based on their training data. Consequently, they better detect clones similar to those they have seen during training and may struggle to detect those they have not. Developers seeking clones are, of course, interested in both types of clones. We confirm this claim through a literature review, identifying three practical clone detection tasks in which the model's goal is to detect clones of a functionality even if it was trained on clones of different functionalities. In light of this finding, we re-evaluate six state-of-the-art models, including both task-specific models and generative LLMs, on the task of detecting clones of unseen functionality. Our experiments reveal a drop in F1 of up to 48% (average 31%) for task-specific models. LLMs perform on par with task-specific models without explicit training for clone detection, but generalize better to unseen functionalities, where F1 drops up to 5% (average 3%) instead. We propose and evaluate the use of contrastive learning to improve the performance of existing models on clones of unseen functionality. We draw inspiration from the computer vision and natural language processing fields where contrastive learning excels at measuring similarity between two objects, even if they come from classes unseen during training. We replace the final classifier of the task-specific models with a contrastive classifier, while for the generative LLMs we propose contrastive in-context learning, guiding the LLMs to focus on the differences between clones and non-clones. The F1 on clones of unseen functionality is improved by up to 26% (average 9%) for task-specific models and up to 5% (average 3%) for LLMs.
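To make the proposed contrastive classifier concrete, the following is a minimal sketch of the decision rule it implies: clone/non-clone is decided by embedding similarity rather than by a binary head tied to the functionalities seen in training. The encoder interface, threshold, and function names are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a contrastive clone classifier: the decision rule
# compares embeddings directly, so it does not depend on a classifier head
# trained on specific functionality labels.
import torch
import torch.nn.functional as F

def contrastive_clone_score(encoder, code_a: str, code_b: str) -> float:
    """Return the cosine similarity between the embeddings of two snippets."""
    emb_a = encoder(code_a)  # assumed to return a 1-D embedding tensor
    emb_b = encoder(code_b)
    return F.cosine_similarity(emb_a, emb_b, dim=-1).item()

def is_clone(encoder, code_a, code_b, threshold: float = 0.8) -> bool:
    # The threshold would be tuned on a validation split; 0.8 is illustrative.
    return contrastive_clone_score(encoder, code_a, code_b) >= threshold
```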
Related papers
- HyClone: Bridging LLM Understanding and Dynamic Execution for Semantic Code Clone Detection [3.2167919219391474]
Code clone detection is a critical task in software engineering, aimed at identifying duplicated or similar code fragments within or across software systems. Recent advances in large language models (LLMs) have shown promise in understanding code semantics. We propose a novel two-stage framework that combines LLM-based screening with execution-based validation for detecting semantic clones in Python programs.
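A minimal sketch of such a two-stage pipeline, assuming an LLM judge that returns a boolean verdict on two source strings; the names and the validation loop are illustrative, not HyClone's actual interface:

```python
import inspect

def detect_semantic_clone(llm_judge, fn_a, fn_b, test_inputs):
    """fn_a and fn_b are Python callables; llm_judge is an assumed screener."""
    # Stage 1: cheap LLM-based screening on the source text.
    if not llm_judge(inspect.getsource(fn_a), inspect.getsource(fn_b)):
        return False
    # Stage 2: execution-based validation on concrete inputs.
    for args in test_inputs:
        try:
            if fn_a(*args) != fn_b(*args):
                return False
        except Exception:
            return False  # divergent runtime behaviour counts as non-clone
    return True
```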
arXiv Detail & Related papers (2025-08-02T13:11:56Z)
- On the Use of Deep Learning Models for Semantic Clone Detection [4.796947520072581]
We propose a multi-step evaluation approach for five state-of-the-art clone detection models, leveraging existing benchmark datasets. Specifically, we examine three highly performing single-language models (ASTNN, GMN, CodeBERT) on BigCloneBench, SemanticCloneBench, and GPTCloneBench. While the single-language models show high F1 scores on BigCloneBench, their performance on SemanticCloneBench varies by up to 20%. Interestingly, the cross-language model (C4) outperforms the other models on SemanticCloneBench by around 7%.
arXiv Detail & Related papers (2024-12-19T11:15:02Z)
- Mitigating Copy Bias in In-Context Learning through Neuron Pruning [74.91243772654519]
Large language models (LLMs) have demonstrated impressive few-shot in-context learning abilities.
They are sometimes prone to a 'copying bias', where they copy answers from the provided examples instead of learning the underlying patterns.
We propose a novel and simple method to mitigate such copying bias.
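The abstract does not specify the method's details; the sketch below shows only generic pruning mechanics (zeroing selected neuron activations via a forward hook) with placeholder indices. How the copying-related neurons are identified is the paper's contribution and is not reproduced here.

```python
import torch

def prune_neurons(module: torch.nn.Module, neuron_indices: list[int]):
    """Zero the given activation dimensions of `module` during inference."""
    def hook(mod, inputs, output):
        output[..., neuron_indices] = 0.0  # silence the selected neurons
        return output
    return module.register_forward_hook(hook)

# Hypothetical usage: handle = prune_neurons(model.layers[12].mlp, [7, 42])
# ... evaluate in-context learning ...; handle.remove() restores the model.
```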
arXiv Detail & Related papers (2024-10-02T07:18:16Z)
- The Struggles of LLMs in Cross-lingual Code Clone Detection [3.5202378300682162]
Cross-lingual code clone detection has gained traction within the software engineering community. Inspired by significant advances in machine learning, this paper revisits cross-lingual code clone detection. We evaluate the performance of five Large Language Models (LLMs) and eight prompts for the identification of cross-lingual code clones.
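For illustration, a zero-shot prompt for a Java/Python pair might look like the sketch below; this is an assumption about the setup, not one of the paper's eight prompts:

```python
def build_prompt(java_code: str, python_code: str) -> str:
    """Assemble a hypothetical zero-shot cross-lingual clone-detection prompt."""
    return (
        "You are given a Java function and a Python function.\n"
        "Answer 'yes' if they implement the same functionality, else 'no'.\n\n"
        f"Java:\n{java_code}\n\nPython:\n{python_code}\n\nAnswer:"
    )
```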
arXiv Detail & Related papers (2024-08-08T12:57:14Z)
- Assessing the Code Clone Detection Capability of Large Language Models [0.0]
The evaluation involves testing the models on a variety of code pairs of different clone types and levels of similarity.
Findings indicate that GPT-4 consistently surpasses GPT-3.5 across all clone types.
arXiv Detail & Related papers (2024-07-02T16:20:44Z)
- AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging).
It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
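A minimal sketch of the task-wise variant under the usual task-arithmetic formulation (merged weights = pretrained weights plus one learned coefficient per task vector); the data-free optimization of the coefficients, e.g. entropy minimization on unlabeled test samples, is omitted here:

```python
import torch

def adamerge_task_wise(pretrained: dict, task_vectors: list[dict],
                       lambdas: torch.Tensor) -> dict:
    """Combine task vectors (fine-tuned minus pretrained weights) with one
    learnable coefficient per task. Shapes and names are illustrative."""
    merged = {}
    for name, weight in pretrained.items():
        merged[name] = weight + sum(
            lam * tv[name] for lam, tv in zip(lambdas, task_vectors)
        )
    return merged
```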
arXiv Detail & Related papers (2023-10-04T04:26:33Z)
- Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls within the metric learning paradigm, yet directly optimizes the L2 metric without the need to generate pairs.
We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
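One plausible reading of such a repeller-attractor objective, sketched over learnable class anchors: samples are pulled (in L2) toward their own class anchor and pushed away from the other anchors beyond a margin. The paper's exact formulation may differ.

```python
import torch

def class_anchor_margin_loss(emb, labels, anchors, margin: float = 1.0):
    """emb: (B, D) embeddings; labels: (B,); anchors: (C, D) learnable."""
    d = torch.cdist(emb, anchors)                    # (B, C) L2 distances
    rows = torch.arange(len(labels))
    attract = d[rows, labels]                        # distance to own anchor
    mask = torch.ones_like(d, dtype=torch.bool)
    mask[rows, labels] = False
    repel = torch.relu(margin - d[mask].view(len(labels), -1))  # other anchors
    return attract.mean() + repel.mean()
```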
arXiv Detail & Related papers (2023-06-01T12:53:10Z)
- CodeGen2: Lessons for Training LLMs on Programming and Natural Languages [116.74407069443895]
We unify encoder- and decoder-based models into a single prefix-LM.
For learning methods, we explore the claim of a "free lunch" hypothesis.
For data distributions, the effect of a mixture distribution and multi-epoch training of programming and natural languages on model performance is explored.
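The unification rests on the prefix-LM attention pattern: bidirectional attention within the prefix (encoder-like) and causal attention over the remainder (decoder-like). A minimal mask construction, written here as an illustration rather than CodeGen2's actual code:

```python
import torch

def prefix_lm_mask(seq_len: int, prefix_len: int) -> torch.Tensor:
    """Boolean (seq_len, seq_len) mask; True where attention is allowed."""
    mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
    mask[:, :prefix_len] = True  # every position sees the full prefix
    return mask
```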
arXiv Detail & Related papers (2023-05-03T17:55:25Z)
- Partial Network Cloning [58.83278629019384]
PNC conducts partial parametric "cloning" from a source network and then injects the cloned module into the target.
Our method yields a significant improvement of 5% in accuracy and 50% in locality when compared with parameter-tuning based methods.
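A toy illustration of the transfer step alone (copying one sub-module's parameters from the source into the target); PNC additionally learns where to extract the module and where to inject it, which this sketch does not capture:

```python
import copy
import torch

def inject_cloned_module(source: torch.nn.Module, target: torch.nn.Module,
                         module_name: str) -> None:
    """Deep-copy `module_name` from `source` and overwrite it in `target`.
    The dotted module path is assumed to exist in both networks."""
    cloned = copy.deepcopy(dict(source.named_modules())[module_name])
    parent_path, _, child = module_name.rpartition(".")
    parent = dict(target.named_modules())[parent_path] if parent_path else target
    setattr(parent, child, cloned)

# Hypothetical usage: inject_cloned_module(src_net, tgt_net, "encoder.block3")
```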
arXiv Detail & Related papers (2023-03-19T08:20:31Z)
- Evaluation of Contrastive Learning with Various Code Representations for Code Clone Detection [3.699097874146491]
We evaluate contrastive learning for detecting semantic clones of code snippets.
We use CodeTransformator to create a dataset that mimics plagiarised code based on competitive programming solutions.
The results of our evaluation show that the models perform differently across tasks; however, the graph-based models generally outperform the others.
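As an example of the kind of contrastive objective commonly used in such evaluations, here is an NT-Xent-style loss over a batch of clone pairs; this is a standard formulation, not necessarily the one used in the paper:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.07):
    """z1[i] and z2[i] embed two snippets of the same functionality;
    all other in-batch snippets serve as negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau                            # (B, B) similarities
    targets = torch.arange(z1.size(0), device=z1.device)  # diagonal positives
    return F.cross_entropy(logits, targets)
```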
arXiv Detail & Related papers (2022-06-17T12:25:44Z)
- Semantic Clone Detection via Probabilistic Software Modeling [69.43451204725324]
This article contributes a semantic clone detection approach that detects clones with 0% syntactic similarity.
We present SCD-PSM as a stable and precise solution to semantic clone detection.
arXiv Detail & Related papers (2020-08-11T17:54:20Z)