Bidirectional Representations Augmented Autoregressive Biological Sequence Generation: Application in De Novo Peptide Sequencing
- URL: http://arxiv.org/abs/2510.08169v2
- Date: Fri, 17 Oct 2025 01:38:43 GMT
- Title: Bidirectional Representations Augmented Autoregressive Biological Sequence Generation: Application in De Novo Peptide Sequencing
- Authors: Xiang Zhang, Jiaqi Wei, Zijie Qiu, Sheng Xu, Zhi Jin, ZhiQiang Gao, Nanqing Dong, Siqi Sun
- Abstract summary: Non-autoregressive (NAR) models offer holistic, bidirectional representations but face challenges with generative coherence and scalability. We propose a hybrid framework that enhances autoregressive (AR) generation by dynamically integrating rich contextual information from non-autoregressive mechanisms. A novel cross-decoder attention module enables the AR decoder to iteratively query and integrate these bidirectional features.
- Score: 51.12821379640881
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autoregressive (AR) models, common in sequence generation, are limited in many biological tasks such as de novo peptide sequencing and protein modeling by their unidirectional nature, failing to capture crucial global bidirectional token dependencies. Non-Autoregressive (NAR) models offer holistic, bidirectional representations but face challenges with generative coherence and scalability. To move beyond this trade-off, we propose a hybrid framework enhancing AR generation by dynamically integrating rich contextual information from non-autoregressive mechanisms. Our approach couples a shared input encoder with two decoders: a non-autoregressive one learning latent bidirectional biological features, and an AR decoder synthesizing the biological sequence by leveraging these bidirectional features. A novel cross-decoder attention module enables the AR decoder to iteratively query and integrate these bidirectional features, enriching its predictions. This synergy is cultivated via a tailored training strategy with importance annealing for balanced objectives and cross-decoder gradient blocking for stable, focused learning. Evaluations on a demanding nine-species benchmark of de novo peptide sequencing show that our model substantially surpasses AR and NAR baselines. It uniquely harmonizes AR stability with NAR contextual awareness, delivering robust, superior performance on diverse downstream data. This research advances biological sequence modeling techniques and contributes a novel architectural paradigm for augmenting AR models with enhanced bidirectional understanding for complex sequence generation. Code is available at https://github.com/BEAM-Labs/denovo.
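The core mechanisms described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: all names, shapes, and the linear annealing schedule are assumptions. It shows the AR decoder's hidden states attending over the NAR decoder's bidirectional features (cross-decoder attention, with the NAR features treated as constants to mimic gradient blocking) and a simple importance-annealing weight for blending the two training objectives.

```python
# Hypothetical sketch of cross-decoder attention and importance annealing.
# d_model, W_q/W_k/W_v, and the annealing schedule are illustrative
# assumptions, not details from the paper.
import numpy as np

rng = np.random.default_rng(0)
d_model = 8          # shared hidden size (assumption)
T_ar, T_nar = 5, 7   # AR steps generated so far, NAR sequence length

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_decoder_attention(ar_states, nar_features, W_q, W_k, W_v):
    """AR hidden states query the NAR decoder's bidirectional features.

    During training, nar_features would be detached here (cross-decoder
    gradient blocking): gradients from the AR loss must not flow back
    into the NAR decoder.
    """
    Q = ar_states @ W_q                  # (T_ar, d_model)
    K = nar_features @ W_k               # (T_nar, d_model)
    V = nar_features @ W_v               # (T_nar, d_model)
    scores = Q @ K.T / np.sqrt(d_model)  # (T_ar, T_nar): full view of the sequence
    attn = softmax(scores, axis=-1)      # each AR step weights all NAR positions
    return ar_states + attn @ V          # residual enrichment of the AR states

ar_states = rng.standard_normal((T_ar, d_model))
nar_features = rng.standard_normal((T_nar, d_model))
W_q, W_k, W_v = (rng.standard_normal((d_model, d_model)) for _ in range(3))

enriched = cross_decoder_attention(ar_states, nar_features, W_q, W_k, W_v)
print(enriched.shape)  # (5, 8)

def annealed_weight(step, total_steps):
    """Assumed linear schedule: the NAR objective's weight decays to zero,
    shifting training emphasis toward the AR decoder."""
    return max(0.0, 1.0 - step / total_steps)

# total_loss = annealed_weight(t, T) * nar_loss + (1 - annealed_weight(t, T)) * ar_loss
```

The residual connection lets the AR decoder fall back on its own causal representation when the bidirectional features are uninformative, which is one plausible reading of how the hybrid retains AR stability.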
Related papers
- Towards Robust Universal Perturbation Attacks: A Float-Coded, Penalty-Driven Evolutionary Approach [10.211523683826004]
Universal adversarial noise perturbations (UAPs) have garnered significant attention due to their ability to undermine deep neural networks. We introduce a single-objective-driven framework for generating such perturbations. Our framework consistently produces perturbations that are smaller, more effective at inducing misclassification, and faster to generate than existing methods.
arXiv Detail & Related papers (2026-01-18T23:31:12Z) - UniX: Unifying Autoregression and Diffusion for Chest X-Ray Understanding and Generation [98.93314262366681]
We present UniX, a next-generation unified medical foundation model for chest X-ray understanding and generation. UniX decouples the two tasks into an autoregressive branch for understanding and a diffusion branch for high-fidelity generation. On two representative benchmarks, UniX achieves a 46.1% improvement in understanding performance and a 24.2% gain in generation quality.
arXiv Detail & Related papers (2026-01-16T18:59:58Z) - DeepThinkVLA: Enhancing Reasoning Capability of Vision-Language-Action Models [51.76664843721462]
DeepThinkVLA is a new architecture for Vision-Language-Action models. It generates sequential chain-of-thought (CoT) reasoning with causal attention and switches to bidirectional attention for fast decoding of action vectors. It achieves a 97.0% success rate on the LIBERO benchmark.
arXiv Detail & Related papers (2025-10-31T05:26:16Z) - IAR2: Improving Autoregressive Visual Generation with Semantic-Detail Associated Token Prediction [77.06211178777939]
IAR2 is an advanced autoregressive framework that enables a hierarchical semantic-detail synthesis process. We show that IAR2 sets a new state of the art for autoregressive image generation, achieving an FID of 1.50 on ImageNet.
arXiv Detail & Related papers (2025-10-08T12:08:21Z) - CAME-AB: Cross-Modality Attention with Mixture-of-Experts for Antibody Binding Site Prediction [9.316793780511917]
CAME-AB is a novel cross-modality attention framework with mixture-of-experts for antibody binding site prediction. It integrates raw amino acid encodings, BLOSUM substitution profiles, pretrained language-model embeddings, structure-aware features, and biochemical graphs. It consistently outperforms strong baselines on multiple metrics, including precision, recall, F1-score, AUC-ROC, and MCC.
arXiv Detail & Related papers (2025-09-08T09:24:09Z) - Integrating Dynamical Systems Learning with Foundational Models: A Meta-Evolutionary AI Framework for Clinical Trials [0.0]
NetraAI is a system-based framework engineered for stability and interpretability on small clinical trial datasets. We formalize NetraAI's foundations, combining contraction mappings, information geometry, and evolutionary algorithms to identify predictive patient cohorts. By prioritizing reliable, explainable knowledge, NetraAI offers a new generation of adaptive, self-reflective AI to accelerate clinical discovery.
arXiv Detail & Related papers (2025-05-25T03:34:33Z) - UniGenX: a unified generative foundation model that couples sequence, structure and function to accelerate scientific design across proteins, molecules and materials [62.72989417755985]
We present UniGenX, a unified generative model for function in natural systems. UniGenX represents heterogeneous inputs as a mixed stream of symbolic and numeric tokens. It achieves state-of-the-art or competitive performance in function-aware generation across domains.
arXiv Detail & Related papers (2025-03-09T16:43:07Z) - STAR: Synthesis of Tailored Architectures [61.080157488857516]
We propose a new approach for the synthesis of tailored architectures (STAR). Our approach combines a novel search space based on the theory of linear input-varying systems with a hierarchical numerical encoding into architecture genomes. STAR genomes are automatically refined and recombined with gradient-free evolutionary algorithms to optimize for multiple model-quality and efficiency metrics. Using STAR, we optimize large populations of new architectures, leveraging diverse computational units and interconnection patterns, improving over highly optimized Transformers and striped hybrid models on the frontier of quality, parameter size, and inference cache for autoregressive language modeling.
arXiv Detail & Related papers (2024-11-26T18:42:42Z) - OneProt: Towards Multi-Modal Protein Foundation Models [5.440531199006399]
We introduce OneProt, a multi-modal AI model for proteins that integrates structural, sequence, text, and binding-site data. Using the ImageBind framework, OneProt aligns the latent spaces of protein modality encoders in a lightweight fine-tuning scheme. This work expands the horizons of multi-modal protein models, paving the way for transformative applications in drug discovery, biocatalytic reaction planning, and protein engineering.
arXiv Detail & Related papers (2024-11-07T16:54:54Z) - CycleIK: Neuro-inspired Inverse Kinematics [12.29529468290859]
CycleIK is a neuro-robotic approach that wraps two novel neuro-inspired methods for the inverse kinematics (IK) task.
We show how embedding these into a hybrid neuro-genetic IK pipeline allows for further optimization.
arXiv Detail & Related papers (2023-07-21T13:03:27Z) - Diformer: Directional Transformer for Neural Machine Translation [13.867255817435705]
Autoregressive (AR) and Non-autoregressive (NAR) models have their own superiority on the performance and latency.
We propose the Directional Transformer (Diformer) by jointly modelling AR and NAR into three generation directions.
Experiments on 4 WMT benchmarks demonstrate that Diformer outperforms current unified-modelling works by more than 1.5 BLEU points for both AR and NAR decoding.
arXiv Detail & Related papers (2021-12-22T02:35:29Z) - AutoBERT-Zero: Evolving BERT Backbone from Scratch [94.89102524181986]
We propose an Operation-Priority Neural Architecture Search (OP-NAS) algorithm to automatically search for promising hybrid backbone architectures.
We optimize both the search algorithm and evaluation of candidate models to boost the efficiency of our proposed OP-NAS.
Experiments show that the searched architecture (named AutoBERT-Zero) significantly outperforms BERT and its variants of different model capacities in various downstream tasks.
arXiv Detail & Related papers (2021-07-15T16:46:01Z) - Automated and Formal Synthesis of Neural Barrier Certificates for Dynamical Models [70.70479436076238]
We introduce an automated, formal, counterexample-based approach to synthesise Barrier Certificates (BCs).
The approach is underpinned by an inductive framework, which manipulates a candidate BC structured as a neural network, and a sound verifier, which either certifies the candidate's validity or generates counter-examples.
The outcomes show that we can synthesise sound BCs up to two orders of magnitude faster, with a particularly stark speedup on the verification engine.
arXiv Detail & Related papers (2020-07-07T07:39:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.