Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems
- URL: http://arxiv.org/abs/2110.15926v1
- Date: Fri, 29 Oct 2021 17:20:53 GMT
- Title: Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems
- Authors: Wenqing Zheng, Qiangqiang Guo, Hao Yang, Peihao Wang, Zhangyang Wang
- Abstract summary: Multi-agent control is a central theme in Cyber-Physical Systems.
This paper presents a new transformer-based model that specializes in the global modeling of CPS.
With physical constraint inductive bias baked into its design, our DePT is ready to plug and play for a broad class of multi-agent systems.
- Score: 68.75717332928205
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multi-agent control is a central theme in Cyber-Physical Systems (CPS).
However, current control methods either receive non-Markovian states due to
insufficient sensing and decentralized design, or suffer from poor convergence.
This paper presents the Delayed Propagation Transformer (DePT), a new
transformer-based model that specializes in the global modeling of CPS while
taking into account the immutable constraints from the physical world. DePT
induces a cone-shaped spatial-temporal attention prior, which injects the
information propagation and aggregation principles and enables a global view.
With physical constraint inductive bias baked into its design, our DePT is
ready to plug and play for a broad class of multi-agent systems. The
experimental results on one of the most challenging CPS -- network-scale
traffic signal control system in the open world -- show that our model
outperformed the state-of-the-art expert methods on synthetic and real-world
datasets. Our codes are released at: https://github.com/VITA-Group/DePT.
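The cone-shaped spatial-temporal prior can be pictured as a causality cone: a query agent at time t should mainly attend to other agents at earlier time steps from which information could already have propagated to it. Below is a minimal PyTorch sketch of one plausible way to realize such a prior as an additive attention bias; the function cone_attention_bias, its parameters, and the soft linear penalty are illustrative assumptions, not DePT's actual implementation.

```python
import torch

def cone_attention_bias(coords, num_steps, prop_speed, sharpness=1.0):
    """Illustrative cone-shaped spatio-temporal attention prior (not DePT's exact form).

    coords:     (N, 2) agent positions, e.g. intersections in a road network
    num_steps:  T, number of past time steps in the token sequence
    prop_speed: assumed information propagation speed (distance per step)

    Returns an additive bias of shape (T*N, T*N): a query token (t_q, agent i)
    is penalized for attending to a key token (t_k, agent j) whose signal could
    not yet have reached agent i under the finite propagation speed.
    """
    N = coords.shape[0]
    dist = torch.cdist(coords, coords)               # (N, N) pairwise distances
    t = torch.arange(num_steps, dtype=coords.dtype)
    lag = t.view(-1, 1) - t.view(1, -1)              # (T, T) time lag: query step minus key step

    # Radius reachable after `lag` steps; future keys (negative lag) reach nothing.
    reach = lag.clamp(min=0.0) * prop_speed          # (T, T)

    # Broadcast to (T, N, T, N) and penalize pairs lying outside the cone.
    overshoot = dist.view(1, N, 1, N) - reach.view(num_steps, 1, num_steps, 1)
    bias = -sharpness * overshoot.clamp(min=0.0)

    # Tokens are assumed ordered time-major: token index = t * N + agent.
    return bias.reshape(num_steps * N, num_steps * N)

# Usage: add the bias to the raw attention logits before the softmax, e.g.
#   scores = q @ k.transpose(-2, -1) / d ** 0.5
#   scores = scores + cone_attention_bias(coords, T, prop_speed)
```

In a full model such a bias would be added to the attention logits of every layer, so key tokens that are too far away for their age are softly suppressed rather than hard-masked.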
Related papers
- One-shot World Models Using a Transformer Trained on a Synthetic Prior [37.027893127637036]
One-Shot World Model (OSWM) is a transformer world model that is learned in an in-context learning fashion from purely synthetic data.
OSWM is able to quickly adapt to the dynamics of a simple grid world, as well as the CartPole gym and a custom control environment.
arXiv Detail & Related papers (2024-09-21T09:39:32Z)
- Generalizable Implicit Neural Representation As a Universal Spatiotemporal Traffic Data Learner [46.866240648471894]
Spatiotemporal Traffic Data (STTD) measures the complex dynamical behaviors of the multiscale transportation system.
We present a novel paradigm to address the STTD learning problem by parameterizing STTD as an implicit neural representation.
We validate its effectiveness through extensive experiments in real-world scenarios, showcasing applications from corridor to network scales.
arXiv Detail & Related papers (2024-06-13T02:03:22Z)
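To make the implicit-neural-representation idea in the entry above concrete, here is a small PyTorch sketch of a coordinate MLP that maps a continuous (x, y, t) query to a traffic quantity; the class name TrafficINR, the Fourier-feature encoding, and all sizes are assumptions for illustration rather than the paper's architecture.

```python
import math
import torch
import torch.nn as nn

class TrafficINR(nn.Module):
    """Illustrative implicit neural representation of spatiotemporal traffic data.

    A coordinate MLP maps a continuous (x, y, t) query to a traffic state such
    as speed or flow; Fourier features help the MLP capture high-frequency
    spatiotemporal patterns. Everything here is an assumed toy configuration.
    """
    def __init__(self, num_freqs=16, hidden=128, out_dim=1):
        super().__init__()
        freqs = 2.0 ** torch.arange(num_freqs, dtype=torch.float32) * math.pi
        self.register_buffer("freqs", freqs)
        in_dim = 3 * 2 * num_freqs  # sin and cos per frequency per coordinate
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, coords):                      # coords: (..., 3) = (x, y, t)
        proj = coords.unsqueeze(-1) * self.freqs    # (..., 3, num_freqs)
        feats = torch.cat([proj.sin(), proj.cos()], dim=-1).flatten(-2)
        return self.mlp(feats)

# Fit the field by regressing observed sensor readings at known coordinates,
# then query it at unobserved locations and times for continuous reconstruction.
model = TrafficINR()
coords = torch.rand(1024, 3)     # normalized (x, y, t) query points
pred = model(coords)             # (1024, 1) predicted traffic state
```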
- PIDformer: Transformer Meets Control Theory [28.10913642120948]
We unveil self-attention as an autonomous state-space model that inherently promotes smoothness in its solutions.
We incorporate a Proportional-Integral-Derivative (PID) closed-loop feedback control system with a reference point into the model to improve robustness and representation capacity.
Motivated by this control framework, we derive a novel class of transformers, the PID-controlled Transformer (PIDformer).
arXiv Detail & Related papers (2024-02-25T05:04:51Z)
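For the control-theory side of the PIDformer entry above, the sketch below shows the textbook discrete PID loop that the feedback mechanism is named after: a state is nudged toward a reference by proportional, integral, and derivative corrections. The function pid_step, the gains, and the toy loop are illustrative assumptions, not the paper's transformer layer.

```python
import torch

def pid_step(state, reference, integral, prev_error, kp=1.0, ki=0.1, kd=0.05, dt=1.0):
    """One step of a discrete PID feedback loop (illustrative only)."""
    error = reference - state                  # proportional term sees the current error
    integral = integral + error * dt           # integral term accumulates past error
    derivative = (error - prev_error) / dt     # derivative term reacts to error change
    control = kp * error + ki * integral + kd * derivative
    return state + control * dt, integral, error

# Drive a noisy feature vector toward a fixed reference point via feedback.
torch.manual_seed(0)
state = torch.randn(4)           # hypothetical token/feature state
reference = torch.zeros(4)       # reference signal the loop should track
integral = torch.zeros(4)
prev_error = torch.zeros(4)
for _ in range(50):
    state, integral, prev_error = pid_step(state, reference, integral, prev_error)
print(state)                     # close to the reference after 50 feedback steps
```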
- Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation [53.04781510348416]
Video-based 3D human pose and shape estimations are evaluated by intra-frame accuracy and inter-frame smoothness.
We propose to structurally decouple the modeling of long-term and short-term correlations in an end-to-end framework, the Global-to-Local Transformer (GLoT).
Our GLoT surpasses previous state-of-the-art methods with the fewest model parameters on popular benchmarks, i.e., 3DPW, MPI-INF-3DHP, and Human3.6M.
arXiv Detail & Related papers (2023-03-26T14:57:49Z)
- PhysFormer++: Facial Video-based Physiological Measurement with SlowFast Temporal Difference Transformer [76.40106756572644]
Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields.
In this paper, we propose two end-to-end video transformer based architectures, PhysFormer and PhysFormer++, to adaptively aggregate both local and global features for rPPG representation enhancement.
Comprehensive experiments are performed on four benchmark datasets to show our superior performance on both intra-temporal and cross-dataset testing.
arXiv Detail & Related papers (2023-02-07T15:56:03Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates detailed spatial information from CNNs with the global context provided by transformers for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
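As a rough picture of the sampling-and-recovery split in the CSformer entry above, the toy PyTorch module below learns block-wise measurements with a strided convolution and reconstructs the image with an initial linear recovery plus a small refinement network; the class TinyCompressiveSensing, the layer choices, and all sizes are assumptions for illustration, not CSformer's CNN-plus-Transformer design.

```python
import torch
import torch.nn as nn

class TinyCompressiveSensing(nn.Module):
    """Minimal sketch of learned sampling plus recovery for compressive imaging."""
    def __init__(self, block=32, ratio=0.1):
        super().__init__()
        m = max(1, int(block * block * ratio))          # measurements per image block
        self.sample = nn.Conv2d(1, m, block, stride=block, bias=False)              # adaptive sampling
        self.init_rec = nn.ConvTranspose2d(m, 1, block, stride=block, bias=False)   # initial recovery
        self.refine = nn.Sequential(                     # stand-in for the deep recovery network
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, x):
        y = self.sample(x)              # compressed block-wise measurements
        x0 = self.init_rec(y)           # coarse image estimate from measurements
        return x0 + self.refine(x0)     # residual refinement of the estimate

model = TinyCompressiveSensing()
img = torch.rand(1, 1, 96, 96)          # grayscale image whose side is divisible by the block size
out = model(img)                        # reconstructed image, same shape as the input
```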
- PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer [55.936527926778695]
Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields.
In this paper, we propose the PhysFormer, an end-to-end video transformer based architecture.
arXiv Detail & Related papers (2021-11-23T18:57:11Z)
- Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper that applies transformers to pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.