Applying the Chinese Wall Reverse Engineering Technique to Large Language Model Code Editing
- URL: http://arxiv.org/abs/2507.15599v1
- Date: Mon, 21 Jul 2025 13:21:29 GMT
- Title: Applying the Chinese Wall Reverse Engineering Technique to Large Language Model Code Editing
- Authors: Manatsawin Hanmongkolchai
- Abstract summary: We propose an application of the "Chinese Wall" technique, inspired by the reverse engineering technique of the same name. A weaker but ethically aligned model can then perform complicated tasks that otherwise only more powerful models could complete.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models for code (Code LLMs) are increasingly utilized in programming environments. Despite their utility, the training datasets of top LLMs remain undisclosed, raising concerns about potential copyright violations. Some models, such as Pleias and Comma, put emphasis on data curation and licensing; however, with limited training data these models are not competitive and serve only as proofs of concept. To improve the utility of these models, we propose an application of the "Chinese Wall" technique, inspired by the reverse engineering technique of the same name: a high-quality model is used to generate detailed instructions for a weaker model. By doing so, a weaker but ethically aligned model can perform complicated tasks that otherwise only more powerful models could complete. In our evaluation, we found that this technique improves the performance of Comma v0.1 1T on the CanItEdit benchmark by over 66%, and that of Starcoder2 Instruct by roughly 20%, compared to running each model on the benchmark alone. The practical application of this technique today, however, may be limited by the lack of models trained on public domain content without copyright restrictions.
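The core of the approach can be sketched as a two-stage pipeline: the stronger model only produces a plain-language edit plan, and the weaker, permissively trained model applies that plan to the code. The sketch below is illustrative only; the generate() helper, prompt wording, and model names are assumptions, not the paper's exact implementation.

```python
# Illustrative two-stage "Chinese Wall" editing pipeline (not the paper's exact code).
# generate() is a hypothetical helper that sends a prompt to a named model and
# returns its text output; plug in whatever inference backend you actually use.

def generate(model: str, prompt: str) -> str:
    raise NotImplementedError("wire this up to your own inference API")

def chinese_wall_edit(instruction: str, code: str,
                      strong_model: str = "frontier-model",      # assumed name
                      weak_model: str = "comma-v0.1-1t") -> str:  # assumed name
    # Stage 1: the strong model never edits code directly; it only writes a
    # detailed, step-by-step plan describing how the edit should be made.
    plan = generate(
        strong_model,
        "Describe, step by step and without writing any code, how to apply "
        f"this change request to the code below.\n\nRequest: {instruction}\n\n"
        f"Code:\n{code}",
    )

    # Stage 2: the weaker, ethically aligned model performs the actual edit,
    # guided by the detailed plan from stage 1.
    edited_code = generate(
        weak_model,
        "Apply the following plan to the code and return the full edited "
        f"file.\n\nPlan:\n{plan}\n\nCode:\n{code}",
    )
    return edited_code
```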
Related papers
- Matryoshka Model Learning for Improved Elastic Student Models [62.154536258259384]
MatTA is a framework for training multiple accurate Student models using a novel Teacher-TA-Student recipe. We demonstrate our method on GPT-2 Medium, a public model, and achieve relative improvements of over 24% on SAT Math and over 10% on the LAMBADA benchmark.
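As rough background on the teacher-to-student transfer that such recipes build on, the snippet below shows a standard soft-target distillation loss; this is plain knowledge distillation, not the specific Teacher-TA-Student recipe, whose details are in the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    """Generic soft-target knowledge-distillation loss (shown only as background)."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_student = F.log_softmax(student_logits / t, dim=-1)
    # KL divergence between teacher and student distributions, scaled by t^2
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * (t * t)
```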
arXiv Detail & Related papers (2025-05-29T10:54:58Z)
- DistilQwen2.5: Industrial Practices of Training Distilled Open Lightweight Language Models [10.34623505096336]
We present DistilQwen2.5, a family of distilled, lightweight language models (LLMs) derived from the public Qwen2.5 models. These models exhibit enhanced instruction-following capabilities compared to the original models. To facilitate practical use, we have released all the DistilQwen2.5 models to the open-source community.
arXiv Detail & Related papers (2025-04-21T11:26:02Z)
- Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric [99.56567010306807]
Large Language Models (LLMs) have become indispensable across academia, industry, and daily applications. One core challenge of evaluation in the large language model (LLM) era is the generalization issue. We propose the Model Utilization Index (MUI), a mechanism-interpretability-enhanced metric that complements traditional performance scores.
arXiv Detail & Related papers (2025-04-10T04:09:47Z)
- Jasper and Stella: distillation of SOTA embedding models [8.708650717134008]
We propose a novel multi-stage distillation framework that enables a smaller student embedding model to distill multiple teacher embedding models. We utilize Matryoshka Representation Learning (MRL) to reduce the vector dimensionality of the student embedding model effectively. Our student model, named Jasper, with 2 billion parameters and built upon the Stella embedding model, obtained the No. 3 position on the Massive Text Embedding Benchmark leaderboard.
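For context, MRL trains embeddings so that leading sub-vectors remain useful on their own, which is what allows the student's output dimensionality to be reduced. A typical inference-time use (an illustration, not the paper's exact pipeline) is to truncate and re-normalize:

```python
import numpy as np

def truncate_embedding(embedding: np.ndarray, target_dim: int) -> np.ndarray:
    """Keep the first target_dim dimensions of an MRL-trained embedding and
    re-normalize, so cosine similarity still behaves sensibly."""
    truncated = embedding[:target_dim]
    norm = np.linalg.norm(truncated)
    return truncated / norm if norm > 0 else truncated

# Example: shrink a 1024-d embedding to 256 dimensions.
vec = np.random.randn(1024).astype(np.float32)
small = truncate_embedding(vec, 256)
```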
arXiv Detail & Related papers (2024-12-26T04:05:28Z)
- Have You Merged My Model? On The Robustness of Large Language Model IP Protection Methods Against Model Merging [25.327483618051378]
We conduct the first study on the robustness of IP protection methods under model merging scenarios.
Experimental results indicate that current Large Language Model (LLM) watermarking techniques cannot survive in the merged models.
Our research aims to highlight that model merging should be an indispensable consideration in the robustness assessment of model IP protection techniques.
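As context, the simplest merging scenario studied in this line of work is plain parameter averaging of models that share an architecture. The sketch below is an assumed, minimal example (not the paper's exact setup) of why a watermark carried only in one contributor's weights can be diluted:

```python
import torch

def average_merge(state_dicts):
    """Uniformly average the parameters of several same-architecture models.
    A watermark embedded in only one contributor is attenuated by the average."""
    merged = {}
    for name in state_dicts[0]:
        merged[name] = torch.stack([sd[name].float() for sd in state_dicts]).mean(dim=0)
    return merged

# Usage: merged = average_merge([model_a.state_dict(), model_b.state_dict()])
```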
arXiv Detail & Related papers (2024-04-08T04:30:33Z)
- Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning [79.53130089003986]
Large Language Models (LLMs) have become a feasible solution for handling tasks in various domains.
In this paper, we introduce how to fine-tune an LLM that can be privately deployed for content moderation.
arXiv Detail & Related papers (2023-10-05T09:09:44Z)
- Who's Harry Potter? Approximate Unlearning in LLMs [4.821438899378393]
Large language models (LLMs) are trained on massive internet corpora that often contain copyrighted content.
This poses legal and ethical challenges for the developers and users of these models, as well as the original authors and publishers.
We propose a novel technique for unlearning a subset of the training data from an LLM, without having to retrain it from scratch.
arXiv Detail & Related papers (2023-10-03T17:48:14Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
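The bookkeeping behind such compute-matched comparisons is simple: fix a budget in accelerator hours and convert it to a per-model token budget using that model's measured throughput. A minimal sketch of that conversion (names and numbers are illustrative, not taken from the paper):

```python
def token_budget(accelerator_hours: float, tokens_per_second: float) -> int:
    """Tokens a model can train on within a fixed accelerator-hour budget."""
    return int(accelerator_hours * 3600 * tokens_per_second)

# Two models compared at the same 6 accelerator-hour budget: the model with
# higher measured throughput simply gets to see more tokens.
print(token_budget(6, 25_000))  # e.g. a GPT-2-style feed-forward model
print(token_budget(6, 40_000))  # e.g. a higher-throughput recurrent model
```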
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- Matching Pairs: Attributing Fine-Tuned Models to their Pre-Trained Large Language Models [11.57282859281814]
We consider different knowledge levels and attribution strategies, and find that we can correctly trace back 8 out of the 10 fine-tuned models with our best method.
arXiv Detail & Related papers (2023-06-15T17:42:48Z)
- What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization? [50.84738303888189]
We present a large-scale evaluation of modeling choices and their impact on zero-shot generalization.
We train models with over 5 billion parameters for more than 170 billion tokens.
We find that pretrained causal decoder models can be efficiently adapted into non-causal decoder models.
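The causal-to-non-causal adaptation mentioned above comes down to the attention mask: a causal decoder masks future positions, while a non-causal (prefix-LM) decoder attends bidirectionally over a prefix. A small, generic illustration of the two masks (not tied to the paper's exact setup):

```python
import numpy as np

def causal_mask(seq_len: int) -> np.ndarray:
    """1 where attention is allowed: each position sees itself and the past."""
    return np.tril(np.ones((seq_len, seq_len), dtype=np.int8))

def prefix_lm_mask(seq_len: int, prefix_len: int) -> np.ndarray:
    """Non-causal over the first prefix_len tokens, causal afterwards."""
    mask = causal_mask(seq_len)
    mask[:prefix_len, :prefix_len] = 1  # prefix tokens attend to each other freely
    return mask

print(causal_mask(4))
print(prefix_lm_mask(4, 2))
```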
arXiv Detail & Related papers (2022-04-12T14:19:49Z)
- Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese [33.83704598544326]
Mengzi stands for a family of discriminative, generative, domain-specific, and multimodal pre-trained model variants.
Compared with public Chinese PLMs, Mengzi is simple but more powerful.
Our lightweight model has achieved new state-of-the-art results on the widely-used CLUE benchmark.
arXiv Detail & Related papers (2021-10-13T13:14:32Z)
- Model Reuse with Reduced Kernel Mean Embedding Specification [70.044322798187]
We present a two-phase framework for finding helpful models for a current application.
In the upload phase, when a model is uploaded into the pool, we construct a reduced kernel mean embedding (RKME) as a specification for the model.
Then, in the deployment phase, the relatedness between the current task and the pre-trained models is measured based on the value of the RKME specification.
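For reference, the underlying object is the kernel mean embedding of a model's training data, approximated by a small weighted point set; the standard formulation is stated generically below (details may differ from the paper).

```latex
% Empirical kernel mean embedding of the upload-phase data x_1,\dots,x_n
\mu = \frac{1}{n}\sum_{i=1}^{n} k(x_i,\cdot)
% Reduced specification: m \ll n weighted points (\beta_j, z_j) chosen to
% minimize the RKHS distance to \mu
\min_{\beta, z}\;\Big\| \frac{1}{n}\sum_{i=1}^{n} k(x_i,\cdot)
  - \sum_{j=1}^{m} \beta_j\, k(z_j,\cdot) \Big\|_{\mathcal{H}}^{2}
```

At deployment time, the current task's own empirical embedding can then be compared against each stored specification in the same RKHS norm.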
arXiv Detail & Related papers (2020-01-20T15:15:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.