Cliqueformer: Model-Based Optimization with Structured Transformers
- URL: http://arxiv.org/abs/2410.13106v3
- Date: Wed, 05 Feb 2025 21:16:48 GMT
- Title: Cliqueformer: Model-Based Optimization with Structured Transformers
- Authors: Jakub Grudzien Kuba, Pieter Abbeel, Sergey Levine
- Abstract summary: Large neural networks excel at prediction tasks, but their application to design problems, such as protein engineering or materials discovery, requires solving offline model-based optimization (MBO) problems.
We present Cliqueformer, a transformer-based architecture that learns the black-box function's structure through functional graphical models (FGM).
Across various domains, including chemical and genetic design tasks, Cliqueformer demonstrates superior performance compared to existing methods.
- Score: 102.55764949282906
- Abstract: Large neural networks excel at prediction tasks, but their application to design problems, such as protein engineering or materials discovery, requires solving offline model-based optimization (MBO) problems. While predictive models may not directly translate to effective design, recent MBO algorithms incorporate reinforcement learning and generative modeling approaches. Meanwhile, theoretical work suggests that exploiting the target function's structure can enhance MBO performance. We present Cliqueformer, a transformer-based architecture that learns the black-box function's structure through functional graphical models (FGM), addressing distribution shift without relying on explicit conservative approaches. Across various domains, including chemical and genetic design tasks, Cliqueformer demonstrates superior performance compared to existing methods.
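As a rough illustration of the idea in the abstract, the sketch below builds a toy transformer surrogate whose prediction decomposes into a sum of clique-wise terms over (possibly overlapping) subsets of variables, mirroring an FGM factorization. This is only our reading of the abstract: all class names and layer sizes are invented, and the cliques are applied directly to raw inputs here, whereas the paper learns the structure in a latent representation.

```python
import torch
import torch.nn as nn

class FGMSurrogate(nn.Module):
    """Toy surrogate that decomposes f(x) as a sum of clique-wise terms.

    `cliques` lists (possibly overlapping) index subsets of the input,
    mirroring the functional-graphical-model decomposition described in
    the abstract. All layer sizes here are illustrative.
    """

    def __init__(self, dim_x, cliques, d_model=64):
        super().__init__()
        assert all(max(c) < dim_x for c in cliques)
        self.cliques = cliques
        # One embedding + transformer token per clique.
        self.embed = nn.ModuleList(
            nn.Linear(len(c), d_model) for c in cliques
        )
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Each clique contributes a scalar term; f(x) is their sum.
        self.heads = nn.ModuleList(
            nn.Linear(d_model, 1) for _ in cliques
        )

    def forward(self, x):
        tokens = torch.stack(
            [emb(x[:, c]) for emb, c in zip(self.embed, self.cliques)],
            dim=1,
        )  # (batch, n_cliques, d_model)
        h = self.encoder(tokens)
        terms = [head(h[:, i]) for i, head in enumerate(self.heads)]
        return torch.stack(terms, dim=0).sum(dim=0)  # (batch, 1)

# Usage: two overlapping cliques over a 6-dimensional design space.
model = FGMSurrogate(dim_x=6, cliques=[[0, 1, 2], [2, 3, 4, 5]])
y = model(torch.randn(8, 6))  # (8, 1)
```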
Related papers
- The Buffer Mechanism for Multi-Step Information Reasoning in Language Models [52.77133661679439]
Investigating internal reasoning mechanisms of large language models can help us design better model architectures and training strategies.
In this study, we constructed a symbolic dataset to investigate the mechanisms by which Transformer models employ a vertical thinking strategy.
We proposed a random matrix-based algorithm to enhance the model's reasoning ability, resulting in a 75% reduction in the training time required for the GPT-2 model.
arXiv Detail & Related papers (2024-05-24T07:41:26Z)
- Aligning Optimization Trajectories with Diffusion Models for Constrained Design Generation [17.164961143132473]
We introduce a learning framework that demonstrates the efficacy of aligning the sampling trajectory of diffusion models with the optimization trajectory derived from traditional physics-based methods.
Our method allows for generating feasible and high-performance designs in as few as two steps without the need for expensive preprocessing, external surrogate models, or additional labeled data.
Our results demonstrate that Trajectory Alignment (TA) outperforms state-of-the-art deep generative models on in-distribution configurations and halves the inference computational cost.
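A minimal sketch of what aligning the two trajectories could look like, assuming a denoising model that predicts the clean design and a precomputed list of iterates from a physics-based optimizer; the pairing of noise levels to optimizer iterates and the plain MSE penalty are our assumptions, not the paper's exact objective.

```python
import torch

def alignment_loss(model, x0, optimizer_traj, alpha_bars):
    """Hypothetical alignment objective: at each noise level t, the
    model's denoised estimate is pulled toward the t-th iterate of a
    classical physics-based optimizer.

    optimizer_traj: list of T design iterates, coarse to converged.
    alpha_bars:     1-D tensor of T cumulative noise-schedule terms.
    """
    loss = 0.0
    for t, target in enumerate(optimizer_traj):
        a = alpha_bars[t]
        # Forward-noise the clean design to level t.
        x_t = a.sqrt() * x0 + (1.0 - a).sqrt() * torch.randn_like(x0)
        x0_hat = model(x_t, torch.tensor(t))  # model's clean-design estimate
        loss = loss + torch.mean((x0_hat - target) ** 2)
    return loss / len(optimizer_traj)
```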
arXiv Detail & Related papers (2023-05-29T09:16:07Z)
- Robust Graph Representation Learning via Predictive Coding [46.22695915912123]
Predictive coding is a message-passing framework initially developed to model information processing in the brain.
In this work, we build models that rely on the message-passing rule of predictive coding.
We show that the proposed models are comparable to standard ones in terms of performance in both inductive and transductive tasks.
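Since the summary hinges on the message-passing rule of predictive coding, here is a generic NumPy sketch of that style of inference on a graph: node states iteratively descend a squared prediction-error energy while observed nodes stay clamped. The energy, weights, and update rule are textbook predictive-coding stand-ins, not the paper's exact models.

```python
import numpy as np

def pc_inference(A, W, x_obs, mask, steps=100, lr=0.1):
    """Minimal predictive-coding style inference on a graph (a generic
    sketch, not the paper's exact update rule).

    Each node value is predicted from its neighbors, z_hat = (A * W) z,
    and node states descend the squared prediction-error energy
    E = 0.5 * ||z - z_hat||^2, with observed entries clamped.
    """
    z = np.where(mask, x_obs, 0.0).astype(float)
    P = A * W  # masked weights: only edges in A carry predictions
    for _ in range(steps):
        z_hat = P @ z
        eps = z - z_hat                  # prediction errors
        grad = eps - P.T @ eps           # dE/dz for E = 0.5 * ||eps||^2
        z = z - lr * grad
        z = np.where(mask, x_obs, z)     # clamp observed nodes
    return z

# Usage: 4-node chain, infer the two unobserved node values.
A = np.array([[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]], float)
W = 0.5 * A
x_obs = np.array([1.0, 0.0, 0.0, -1.0])
mask = np.array([True, False, False, True])
print(pc_inference(A, W, x_obs, mask))
```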
arXiv Detail & Related papers (2022-12-09T03:58:22Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee in model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- TopoDiff: A Performance and Constraint-Guided Diffusion Model for Topology Optimization [4.091593765662773]
TopoDiff is a conditional diffusion-model-based architecture for performance-aware and manufacturability-aware topology optimization.
Our method significantly outperforms a state-of-the-art conditional GAN by reducing the average error on physical performance by a factor of eight.
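For intuition on performance- and constraint-guided diffusion, below is a generic regressor-guidance sketch of one reverse step: the denoising mean is nudged by the gradient of a performance surrogate minus a constraint-violation penalty. The function names and the exact guidance form are assumptions; TopoDiff's conditioning details differ.

```python
import torch

def guided_reverse_step(x_t, t, denoiser, perf_model, violation_model,
                        sigma_t, w_perf=1.0, w_con=1.0):
    """One hypothetical guided reverse-diffusion step: classifier-style
    guidance that favors high predicted performance and low predicted
    constraint violation. All models here are assumed differentiable.
    """
    x_t = x_t.detach().requires_grad_(True)
    mu = denoiser(x_t, t)  # predicted posterior mean of x_{t-1}
    score = w_perf * perf_model(x_t).sum() - w_con * violation_model(x_t).sum()
    g = torch.autograd.grad(score, x_t)[0]
    # Shift the mean along the guidance gradient, then add sampling noise.
    return mu + (sigma_t ** 2) * g + sigma_t * torch.randn_like(mu)
```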
arXiv Detail & Related papers (2022-08-20T03:26:00Z)
- Re-parameterizing Your Optimizers rather than Architectures [119.08740698936633]
We propose a novel paradigm of incorporating model-specific prior knowledge into the optimizers and using them to train generic (simple) models.
As an implementation, we propose a novel methodology to add prior knowledge by modifying the gradients according to a set of model-specific hyper-parameters.
We focus on a VGG-style plain model and showcase that such a simple model trained with our method, referred to as RepOpt-VGG, performs on par with recent well-designed models.
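The summary's "modifying the gradients according to model-specific hyper-parameters" admits a very small sketch: scale each parameter's gradient by a fixed multiplier before a plain SGD step. The multipliers below are placeholders; how RepOptimizers actually derive them from the prior structure is not shown here.

```python
import torch

def step_with_grad_mults(params, grad_mults, lr=0.1):
    """Sketch of gradient re-parameterization: scale each parameter's
    gradient by a model-specific multiplier before a plain SGD step
    (the multipliers stand in for the paper's constant-scale
    hyper-parameters).
    """
    with torch.no_grad():
        for p, m in zip(params, grad_mults):
            if p.grad is not None:
                p -= lr * m * p.grad

# Usage: a toy linear model whose weight gradients are rescaled.
w = torch.randn(3, requires_grad=True)
x, y = torch.randn(8, 3), torch.randn(8)
loss = ((x @ w - y) ** 2).mean()
loss.backward()
step_with_grad_mults([w], [torch.full((3,), 2.0)])
```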
arXiv Detail & Related papers (2022-05-30T16:55:59Z)
- Conservative Objective Models for Effective Offline Model-Based Optimization [78.19085445065845]
Computational design problems arise in a number of settings, from synthetic biology to computer architectures.
We propose a method that learns a model of the objective function that lower bounds the actual value of the ground-truth objective on out-of-distribution inputs.
Conservative objective models (COMs) are simple to implement and outperform a number of existing methods on a wide range of MBO problems.
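A compact sketch of the conservative training signal described above, assuming a scalar-output surrogate f_model with output shape (batch, 1): fit the data with MSE, then penalize the model's values on inputs found by a few steps of gradient ascent, which stand in for out-of-distribution designs. Step counts and sizes are illustrative, not the paper's settings.

```python
import torch

def com_loss(f_model, x_data, y_data, alpha=1.0, ascent_steps=5, eta=0.05):
    """Sketch of a conservative objective model loss: fit the data, and
    additionally push down the model's values on inputs found by gradient
    ascent (stand-ins for out-of-distribution designs).
    """
    # Adversarial inputs: ascend the learned objective from data points.
    x_adv = x_data.clone().detach().requires_grad_(True)
    for _ in range(ascent_steps):
        val = f_model(x_adv).sum()
        g = torch.autograd.grad(val, x_adv)[0]
        x_adv = (x_adv + eta * g).detach().requires_grad_(True)
    mse = torch.mean((f_model(x_data).squeeze(-1) - y_data) ** 2)
    # Penalize high predictions off-distribution, reward them on data,
    # so the model lower-bounds the true objective away from the data.
    conservatism = f_model(x_adv).mean() - f_model(x_data).mean()
    return mse + alpha * conservatism
```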
arXiv Detail & Related papers (2021-07-14T17:55:28Z)
- High-Dimensional Bayesian Optimization via Tree-Structured Additive Models [40.497123136157946]
We consider generalized additive models in which low-dimensional functions with overlapping subsets of variables are composed to model a high-dimensional target function.
Our goal is to lower the computational resources required and facilitate faster model learning.
We demonstrate and discuss the efficacy of our approach via a range of experiments on synthetic functions and real-world datasets.
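To make the additive decomposition concrete, here is a small NumPy sketch of a surrogate f(x) = sum_c f_c(x_c) over overlapping variable subsets, with each component fit by kernel ridge regression as a stand-in for the paper's GP components; splitting the targets evenly across cliques is a simplification of ours.

```python
import numpy as np

def fit_additive_krr(X, y, cliques, gamma=1.0, lam=1e-3):
    """Sketch of a generalized additive surrogate f(x) = sum_c f_c(x_c)
    over overlapping variable subsets, each f_c a kernel ridge regressor.
    """
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    models = []
    y_c = y / len(cliques)  # naively split the target across cliques
    for c in cliques:
        K = rbf(X[:, c], X[:, c])
        alpha = np.linalg.solve(K + lam * np.eye(len(X)), y_c)
        models.append((c, X[:, c], alpha))
    return models

def predict_additive(models, Xq, gamma=1.0):
    out = np.zeros(len(Xq))
    for c, Xc, alpha in models:
        d2 = ((Xq[:, c][:, None, :] - Xc[None, :, :]) ** 2).sum(-1)
        out += np.exp(-gamma * d2) @ alpha
    return out

# Usage: two overlapping cliques on a 4-d function.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = np.sin(X[:, 0] * X[:, 1]) + X[:, 2] ** 2 - X[:, 3]
m = fit_additive_krr(X, y, cliques=[[0, 1], [1, 2, 3]])
print(predict_additive(m, X[:5]))
```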
arXiv Detail & Related papers (2020-12-24T03:56:44Z)
- Optimization-Inspired Learning with Architecture Augmentations and Control Mechanisms for Low-Level Vision [74.9260745577362]
This paper proposes a unified optimization-inspired learning framework to aggregate Generative, Discriminative, and Corrective (GDC) principles.
We construct three propagative modules to effectively solve the optimization models with flexible combinations.
Experiments across varied low-level vision tasks validate the efficacy and adaptability of GDC.
arXiv Detail & Related papers (2020-12-10T03:24:53Z)