Tady: A Neural Disassembler without Structural Constraint Violations
- URL: http://arxiv.org/abs/2506.13323v1
- Date: Mon, 16 Jun 2025 10:11:43 GMT
- Title: Tady: A Neural Disassembler without Structural Constraint Violations
- Authors: Siliang Qin, Fengrui Yang, Hao Wang, Bolun Zhang, Zeyu Gao, Chao Zhang, Kai Chen
- Abstract summary: We introduce Tady, a novel neural disassembler featuring an improved model architecture and a dedicated post-processing algorithm. We show that Tady effectively eliminates structural constraint violations and functions with high efficiency, while maintaining instruction-level accuracy.
- Score: 14.794789423601552
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Disassembly is a crucial yet challenging step in binary analysis. While emerging neural disassemblers show promise for efficiency and accuracy, they frequently generate outputs violating fundamental structural constraints, which significantly compromise their practical usability. To address this critical problem, we regularize the disassembly solution space by formalizing and applying key structural constraints based on post-dominance relations. This approach systematically detects widespread errors in existing neural disassemblers' outputs. These errors often originate from models' limited context modeling and instruction-level decoding that neglect global structural integrity. We introduce Tady, a novel neural disassembler featuring an improved model architecture and a dedicated post-processing algorithm, specifically engineered to address these deficiencies. Comprehensive evaluations on diverse binaries demonstrate that Tady effectively eliminates structural constraint violations and functions with high efficiency, while maintaining instruction-level accuracy.
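To make the idea of structural constraints concrete, below is a minimal, hypothetical sketch of two simple consistency checks over a predicted instruction set: a retained instruction that can fall through must have its fall-through successor retained, and no two retained instructions may overlap. This is only an illustration in the spirit of the abstract, not the paper's post-dominance-based formalization or Tady's actual post-processing algorithm; the `Insn` record and function names are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Insn:
    addr: int            # instruction start address (hypothetical record)
    length: int          # encoded length in bytes
    falls_through: bool  # False for unconditional jumps, returns, halts, ...

def structural_violations(predicted: dict[int, Insn]) -> list[str]:
    """Report two simple structural inconsistencies in a predicted disassembly.

    Illustrative checks only (not the paper's formalization):
      1. fall-through: a kept instruction that can fall through must be
         followed by a kept instruction at addr + length;
      2. overlap: two kept instructions must not occupy overlapping bytes.
    """
    violations: list[str] = []
    covered: dict[int, int] = {}  # byte address -> owning instruction address
    for insn in sorted(predicted.values(), key=lambda i: i.addr):
        succ = insn.addr + insn.length
        # Fall-through consistency: the successor must also be decoded as code.
        if insn.falls_through and succ not in predicted:
            violations.append(f"{insn.addr:#x}: fall-through target {succ:#x} not decoded")
        # Overlap consistency: no byte may belong to two kept instructions.
        clash = next((covered[b] for b in range(insn.addr, succ) if b in covered), None)
        if clash is not None:
            violations.append(f"{insn.addr:#x}: overlaps instruction at {clash:#x}")
        for b in range(insn.addr, succ):
            covered[b] = insn.addr
    return violations

# Example: 0x0 falls through to 0x3, which is missing -> one violation reported.
preds = {0x0: Insn(0x0, 3, True), 0x5: Insn(0x5, 2, False)}
print(structural_violations(preds))
```

A real post-processing pass would resolve such violations rather than merely report them, e.g. by pruning or re-decoding the offending predictions, but that logic is beyond this sketch.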
Related papers
- Determination Of Structural Cracks Using Deep Learning Frameworks [0.0]
This study introduces a novel deep-learning architecture designed to enhance the accuracy and efficiency of structural crack detection. In this research, various configurations of residual U-Net models were utilized. The ensemble model achieved the highest scores, signifying superior accuracy.
arXiv Detail & Related papers (2025-07-03T08:24:47Z) - $φ^{\infty}$: Clause Purification, Embedding Realignment, and the Total Suppression of the Em Dash in Autoregressive Language Models [0.0]
We identify a critical vulnerability in autoregressive transformer language models where the em dash token induces semantic drift. We propose a novel solution combining symbolic clause purification via the phi-infinity operator with targeted embedding matrix realignment.
arXiv Detail & Related papers (2025-06-22T18:27:39Z) - Towards Understanding Bugs in Distributed Training and Inference Frameworks for Large Language Models [7.486731499255164]
This paper conducts the first large-scale empirical analysis of 308 fixed bugs across three popular distributed training/inference frameworks: DeepSpeed, Megatron-LM, and Colossal-AI. We examine bug symptoms, root causes, bug identification and fixing efforts, and common low-effort fixing strategies.
arXiv Detail & Related papers (2025-06-12T07:24:59Z) - Weight Spectra Induced Efficient Model Adaptation [54.8615621415845]
Fine-tuning large-scale foundation models incurs prohibitive computational costs. We show that fine-tuning predominantly amplifies the top singular values while leaving the remainder largely intact. We propose a novel method that leverages learnable rescaling of top singular directions.
arXiv Detail & Related papers (2025-05-29T05:03:29Z) - Q-function Decomposition with Intervention Semantics with Factored Action Spaces [51.01244229483353]
We consider Q-functions defined over a lower-dimensional projected subspace of the original action space, and study the condition for the unbiasedness of decomposed Q-functions. This leads to a general scheme, which we call action decomposed reinforcement learning, that uses the projected Q-functions to approximate the Q-function in standard model-free reinforcement learning algorithms.
arXiv Detail & Related papers (2025-04-30T05:26:51Z) - Model Hemorrhage and the Robustness Limits of Large Language Models [119.46442117681147]
Large language models (LLMs) demonstrate strong performance across natural language processing tasks, yet undergo significant performance degradation when modified for deployment. We define this phenomenon as model hemorrhage: performance decline caused by parameter alterations and architectural changes.
arXiv Detail & Related papers (2025-03-31T10:16:03Z) - ConsistentFeature: A Plug-and-Play Component for Neural Network Regularization [0.32885740436059047]
Over-parameterized neural network models often lead to significant performance discrepancies between training and test sets. We introduce a simple perspective on overfitting: models learn different representations in different i.i.d. datasets. We propose an adaptive method, ConsistentFeature, that regularizes the model by constraining feature differences across random subsets of the same training set.
arXiv Detail & Related papers (2024-12-02T13:21:31Z) - SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch tokens they extract.
We propose a novel fine-tuning smooth regularization that rectifies structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z) - Model order reduction of deep structured state-space models: A system-theoretic approach [0.0]
Deep structured state-space models offer high predictive performance.
The learned representations often suffer from excessively large model orders, which render them unsuitable for control design purposes.
We introduce two regularization terms which can be incorporated into the training loss for improved model order reduction.
The presented regularizers lead to advantages in terms of parsimonious representations and faster inference resulting from the reduced order models.
arXiv Detail & Related papers (2024-03-21T21:05:59Z) - Outlier-robust neural network training: variation regularization meets trimmed loss to prevent functional breakdown [2.5628953713168685]
We tackle the challenge of outlier-robust predictive modeling using highly expressive neural networks. Our approach integrates two key components: (1) a transformed trimmed loss (TTL), and (2) higher-order variation regularization (HOVR), which imposes smoothness constraints on the prediction function.
arXiv Detail & Related papers (2023-08-04T12:57:13Z) - Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model.
A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations.
We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
arXiv Detail & Related papers (2022-06-08T04:09:13Z) - Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z)