Related papers: Discovering and exploring cases of educational source code plagiarism with Dolos

URL: http://arxiv.org/abs/2402.10853v2
Date: Wed, 21 Feb 2024 10:51:12 GMT
Title: Discovering and exploring cases of educational source code plagiarism with Dolos
Authors: Rien Maertens, Maarten Van Neyghem, Maxiem Geldhof, Charlotte Van Petegem, Niko Strijbol, Peter Dawyndt, Bart Mesuere
Abstract summary: Dolos is an ecosystem of tools for detecting and preventing plagiarism in educational source code. Educators can now run the entire plagiarism detection pipeline from a new web app in their browser. New dashboards provide an instant assessment of whether a collection of source files contains suspected cases of plagiarism.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Source code plagiarism is a significant issue in educational practice, and educators need user-friendly tools to cope with such academic dishonesty. This article introduces the latest version of Dolos, a state-of-the-art ecosystem of tools for detecting and preventing plagiarism in educational source code. In this new version, the primary focus has been on enhancing the user experience. Educators can now run the entire plagiarism detection pipeline from a new web app in their browser, eliminating the need for any installation or configuration. Completely redesigned analytics dashboards provide an instant assessment of whether a collection of source files contains suspected cases of plagiarism and how widespread plagiarism is within the collection. The dashboards support hierarchically structured navigation to facilitate zooming in and out of suspect cases. Clusters are an essential new component of the dashboard design, reflecting the observation that plagiarism can occur among larger groups of students. To meet various user needs, the Dolos software stack for source code plagiarism detection now includes a web interface, a JSON application programming interface (API), a command line interface (CLI), a JavaScript library and a preconfigured Docker container. Clear documentation and a free-to-use instance of the web app can be found at https://dolos.ugent.be. The source code is also available on GitHub.
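
The abstract's notion of clusters (groups of more than two submissions that appear to share code) can be illustrated with a minimal sketch. It assumes that pairwise similarity scores are already available and treats a cluster as a connected component of the graph whose edges are pairs at or above a threshold. The function name `cluster_submissions`, the 0.8 threshold, and the sample file names and scores are hypothetical and chosen for illustration only; this is not taken from the Dolos source code or API.

```python
from collections import defaultdict


def cluster_submissions(pairs, threshold=0.8):
    """Group files into clusters (hypothetical helper, not part of Dolos).

    `pairs` is an iterable of (file_a, file_b, similarity) tuples.
    Two files land in the same cluster when a chain of pairs with
    similarity >= threshold connects them. Files without any pair at or
    above the threshold are left out of the result.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for file_a, file_b, similarity in pairs:
        if similarity >= threshold:
            union(file_a, file_b)

    groups = defaultdict(set)
    for name in list(parent):
        groups[find(name)].add(name)

    # Sort members and clusters so the output is deterministic.
    return sorted((sorted(group) for group in groups.values()),
                  key=lambda group: group[0])


# Made-up similarity scores for five submissions.
pairs = [
    ("alice.py", "bob.py", 0.93),
    ("bob.py", "carol.py", 0.88),
    ("dave.py", "erin.py", 0.91),
    ("alice.py", "dave.py", 0.12),
]
clusters = cluster_submissions(pairs)
for cluster in clusters:
    print(cluster)
```

With these made-up scores, alice.py and carol.py end up in the same cluster even though they were never compared directly, because both are linked to bob.py. This is the sense in which a cluster view can surface plagiarism among larger groups rather than only isolated pairs.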

Related papers

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning [57.09163579304332]
We introduce PaperCoder, a framework that transforms machine learning papers into functional code repositories. PaperCoder operates in three stages; in the planning stage, it designs the system architecture with diagrams, identifies file dependencies, and generates configuration files. We then evaluate PaperCoder on generating code implementations from machine learning papers using both model-based and human evaluations.
arXiv Detail & Related papers (2025-04-24T01:57:01Z)
LLMs Plagiarize: Ensuring Responsible Sourcing of Large Language Model Training Data Through Knowledge Graph Comparison [0.0]
We propose a novel system, a variant of a plagiarism detection system, that assesses whether a knowledge source has been used in the training or fine-tuning of a large language model. Unlike current methods, we utilize an approach that uses Resource Description Framework (RDF) triples to create knowledge graphs from both a source document and an LLM continuation of that document. These graphs are then analyzed with respect to content using cosine similarity and with respect to structure using a normalized version of graph edit distance that shows the degree of isomorphism.
arXiv Detail & Related papers (2024-07-02T20:49:21Z)
CONCORD: Clone-aware Contrastive Learning for Source Code [64.51161487524436]
Self-supervised pre-training has gained traction for learning generic code representations valuable for many downstream SE tasks. We argue that it is also essential to factor in how developers code day-to-day for general-purpose representation learning. In particular, we propose CONCORD, a self-supervised, contrastive learning strategy to place benign clones closer in the representation space while moving deviants further apart.
arXiv Detail & Related papers (2023-06-05T20:39:08Z)
Deep learning for table detection and structure recognition: A survey [49.09628624903334]
The goal of this survey is to provide a profound comprehension of the major developments in the field of Table Detection. We provide an analysis of both classic and new applications in the field. The datasets and source code of the existing models are organized to provide the reader with a compass on this vast literature.
arXiv Detail & Related papers (2022-11-15T19:42:27Z)
Repro: An Open-Source Library for Improving the Reproducibility and Usability of Publicly Available Research Code [74.28810048824519]
Repro is an open-source library which aims at improving the usability of research code. It provides a lightweight Python API for running software released by researchers within Docker containers.
arXiv Detail & Related papers (2022-04-29T01:54:54Z)
Autoregressive Search Engines: Generating Substrings as Document Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de-facto standard for generating answers. Previous work has explored ways to partition the search space into hierarchical structures. In this work we propose an alternative that doesn't force any structure in the search space: using all ngrams in a passage as its possible identifiers.
arXiv Detail & Related papers (2022-04-22T10:45:01Z)
A Survey of Plagiarism Detection Systems: Case of Use with English, French and Arabic Languages [0.0]
This paper presents an overview of plagiarism detection systems for use in Arabic, French, and English academic and educational settings. An in-depth examination of technical forms of plagiarism was also performed in the context of this study.
arXiv Detail & Related papers (2022-01-10T16:11:54Z)
Hamtajoo: A Persian Plagiarism Checker for Academic Manuscripts [0.0]
Hamtajoo is a Persian plagiarism detection system for academic manuscripts. We describe the overall structure of the system along with the algorithms used in each stage. To evaluate the performance of the proposed system, we used a plagiarism detection corpus that complies with the PAN standards.
arXiv Detail & Related papers (2021-12-27T15:45:35Z)
The Struggle with Academic Plagiarism: Approaches based on Semantic Similarity [0.0]
We present a report on how semantic similarity measures can be used in the plagiarism detection task. Current software has proven to be successful; however, the problem of identifying paraphrasing or obfuscation plagiarism remains unresolved.
arXiv Detail & Related papers (2021-06-02T20:00:33Z)
COSEA: Convolutional Code Search with Layer-wise Attention [90.35777733464354]
We propose a new deep learning architecture, COSEA, which leverages convolutional neural networks with layer-wise attention to capture the code's intrinsic structural logic. COSEA can achieve significant improvements over state-of-the-art methods on code search tasks.
arXiv Detail & Related papers (2020-10-19T13:53:38Z)
Mossad: Defeating Software Plagiarism Detection [0.48225981108928456]
This paper presents an entirely automatic program transformation approach, Mossad, that defeats popular software plagiarism detection tools. It comprises a framework that couples techniques inspired by genetic programming with domain-specific knowledge to effectively undermine plagiarism detectors. Mossad is both fast and effective: it can, in minutes, generate modified versions of programs that are likely to escape detection.
arXiv Detail & Related papers (2020-10-04T22:02:38Z)
Learning to map source code to software vulnerability using code-as-a-graph [67.62847721118142]
We explore the applicability of Graph Neural Networks in learning the nuances of source code from a security perspective. We show that a code-as-graph encoding is more meaningful for vulnerability detection than existing code-as-photo and linear sequence encoding approaches.
arXiv Detail & Related papers (2020-06-15T16:05:27Z)