Related papers: AutoLayout: Closed-Loop Layout Synthesis via Slow-Fast Collaborative Reasoning

URL: http://arxiv.org/abs/2507.04293v1
Date: Sun, 06 Jul 2025 08:35:22 GMT
Title: AutoLayout: Closed-Loop Layout Synthesis via Slow-Fast Collaborative Reasoning
Authors: Weixing Chen, Dafeng Chi, Yang Liu, Yuxi Yang, Yexin Zhang, Yuzheng Zhuang, Xingyue Quan, Jianye Hao, Guanbin Li, Liang Lin
Abstract summary: AutoLayout is a fully automated method that integrates a closed-loop self-validation process within a dual-system framework. The effectiveness of AutoLayout was validated across 8 distinct scenarios, where it demonstrated a significant 10.1% improvement over SOTA methods.
Score: 102.71841660031065
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The automated generation of layouts is vital for embodied intelligence and autonomous systems, supporting applications from virtual environment construction to home robot deployment. Current approaches, however, suffer from spatial hallucination and struggle with balancing semantic fidelity and physical plausibility, often producing layouts with deficits such as floating or overlapping objects and misaligned stacking relations. In this paper, we propose AutoLayout, a fully automated method that integrates a closed-loop self-validation process within a dual-system framework. Specifically, a slow system harnesses detailed reasoning with a Reasoning-Reflection-Generation (RRG) pipeline to extract object attributes and spatial constraints. Then, a fast system generates discrete coordinate sets and a topological relation set that are jointly validated. To mitigate the limitations of handcrafted rules, we further introduce an LLM-based Adaptive Relation Library (ARL) for generating and evaluating layouts. Through the implementation of Slow-Fast Collaborative Reasoning, AutoLayout efficiently generates layouts after thorough deliberation, effectively mitigating spatial hallucination. Its self-validation mechanism establishes a closed-loop process that iteratively corrects potential errors, achieving a balance between physical stability and semantic consistency. The effectiveness of AutoLayout was validated across 8 distinct scenarios, where it demonstrated a significant 10.1% improvement over SOTA methods in terms of physical plausibility, semantic consistency, and functional completeness.
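The closed-loop idea in the abstract (generate a layout, validate it, correct errors, repeat) can be illustrated with a minimal sketch. This is not the AutoLayout implementation: the paper uses LLM-based reasoning and an Adaptive Relation Library, whereas the sketch below substitutes simple handcrafted geometric checks for overlapping and floating objects. All names (`Box`, `find_violations`, `correct`, `closed_loop`) and the correction rules are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Box:
    """Axis-aligned box: (x, y, z) is the minimum corner, (w, d, h) the extents."""
    name: str
    x: float
    y: float
    z: float
    w: float
    d: float
    h: float

def footprint_overlaps(a, b):
    # True when the two boxes overlap in the horizontal (x, y) plane.
    return a.x < b.x + b.w and b.x < a.x + a.w and a.y < b.y + b.d and b.y < a.y + a.d

def overlaps(a, b):
    # True when the two boxes intersect in all three dimensions.
    return footprint_overlaps(a, b) and a.z < b.z + b.h and b.z < a.z + a.h

def is_supported(box, others, eps=1e-6):
    # A box is supported if it rests on the floor or on top of another box.
    if abs(box.z) <= eps:
        return True
    return any(abs(o.z + o.h - box.z) <= eps and footprint_overlaps(box, o) for o in others)

def find_violations(layout):
    # Validation step: collect overlap and floating violations.
    violations = []
    for i, a in enumerate(layout):
        others = layout[:i] + layout[i + 1:]
        if not is_supported(a, others):
            violations.append(("floating", a.name, None))
        for b in layout[i + 1:]:
            if overlaps(a, b):
                violations.append(("overlap", a.name, b.name))
    return violations

def correct(layout, violation, eps=1e-6):
    # Correction step (assumed rules): drop a floating box onto the highest
    # surface beneath it, or shift the second box of an overlapping pair
    # to the far x edge of the first.
    kind, first, second = violation
    by_name = {b.name: b for b in layout}
    fixed = []
    for box in layout:
        if kind == "floating" and box.name == first:
            tops = [o.z + o.h for o in layout
                    if o.name != box.name and footprint_overlaps(box, o)
                    and o.z + o.h <= box.z + eps]
            box = replace(box, z=max(tops, default=0.0))
        elif kind == "overlap" and box.name == second:
            anchor = by_name[first]
            box = replace(box, x=anchor.x + anchor.w)
        fixed.append(box)
    return fixed

def closed_loop(layout, max_rounds=10):
    # Validate, fix the first violation, and repeat until clean or out of rounds.
    for _ in range(max_rounds):
        violations = find_violations(layout)
        if not violations:
            return layout, True
        layout = correct(layout, violations[0])
    return layout, not find_violations(layout)

initial = [
    Box("table", 0.0, 0.0, 0.0, 2.0, 1.0, 1.0),
    Box("lamp", 0.5, 0.2, 1.5, 0.3, 0.3, 0.5),   # floats above the table
    Box("chair", 1.5, 0.0, 0.0, 1.0, 1.0, 1.0),  # overlaps the table
]
final, ok = closed_loop(initial)
```

In this toy run the loop first moves the chair clear of the table and then lowers the lamp onto the tabletop, after which validation reports no violations. The one-violation-per-round policy is a simplification that keeps each correction easy to re-validate.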

Related papers

Leveraging Large Language Model for Intelligent Log Processing and Autonomous Debugging in Cloud AI Platforms [1.819979627431298]
This paper proposes an intelligent log processing and automatic debugging framework based on a Large Language Model (LLM), named Intelligent Debugger (LLM-ID). Experiments on the cloud platform log dataset show that LLM-ID improves the fault location accuracy by 16.2%, which is significantly better than the current mainstream methods.
arXiv Detail & Related papers (2025-06-22T04:58:37Z)
DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving [62.62464518137153]
DriveTransformer is a simplified E2E-AD framework for the ease of scaling up. It is composed of three unified operations: task self-attention, sensor cross-attention, temporal cross-attention. It achieves state-of-the-art performance in both simulated closed-loop benchmark Bench2Drive and real world open-loop benchmark nuScenes with high FPS.
arXiv Detail & Related papers (2025-03-07T11:41:18Z)
InVDriver: Intra-Instance Aware Vectorized Query-Based Autonomous Driving Transformer [12.441180142943328]
InVDriver is a novel vectorized query-based system for intra-instance spatial dependencies. We show that InVDriver achieves state-of-the-art performance, surpassing prior methods in both accuracy and safety.
arXiv Detail & Related papers (2025-02-25T08:20:16Z)
Cross Space and Time: A Spatio-Temporal Unitized Model for Traffic Flow Forecasting [16.782154479264126]
Predicting spatio-temporal traffic flow presents challenges due to complex interactions between spatial and temporal factors. Existing approaches address these dimensions in isolation, neglecting their critical interdependencies. In this paper, we introduce the Adaptive Spatio-Temporal Unitized Cell (ASTUC), a unified framework designed to capture both spatial and temporal dependencies.
arXiv Detail & Related papers (2024-11-14T07:34:31Z)
DiFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Self-Driving [55.53171248839489]
We propose an ego-centric fully sparse paradigm, named DiFSD, for end-to-end self-driving. Specifically, DiFSD mainly consists of sparse perception, hierarchical interaction and iterative motion planner. Experiments conducted on nuScenes and Bench2Drive datasets demonstrate the superior planning performance and great efficiency of DiFSD.
arXiv Detail & Related papers (2024-09-15T15:55:24Z)
LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning [91.95362946266577]
Path planning is a fundamental scientific problem in robotics and autonomous navigation. Traditional algorithms like A* and its variants are capable of ensuring path validity but suffer from significant computational and memory inefficiencies as the state space grows. We propose a new LLM based route planning method that synergistically combines the precise pathfinding capabilities of A* with the global reasoning capability of LLMs. This hybrid approach aims to enhance pathfinding efficiency in terms of time and space complexity while maintaining the integrity of path validity, especially in large-scale scenarios.
arXiv Detail & Related papers (2024-06-20T01:24:30Z)
LoRA-Ensemble: Efficient Uncertainty Modelling for Self-Attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient ensembling method for self-attention networks. The method not only outperforms state-of-the-art implicit techniques like BatchEnsemble, but even matches or exceeds the accuracy of an Explicit Ensemble.
arXiv Detail & Related papers (2024-05-23T11:10:32Z)
Responsible Composition and Optimization of Integration Processes under Correctness Preserving Guarantees [0.7366405857677227]
Enterprise Application Integration deals with the problem of connecting heterogeneous applications. We formalize compositions of integration patterns based on their characteristics. We describe optimization strategies that help to reduce the model complexity.
arXiv Detail & Related papers (2023-05-30T16:40:18Z)
A Cooperative Perception System Robust to Localization Errors [8.65435011972241]
We propose a distributed object-level cooperative perception system called OptiMatch. The detected 3D bounding boxes and local state information are shared between the connected vehicles. Experiment results show that the proposed framework outperforms the state-of-the-art benchmark fusion schemes.
arXiv Detail & Related papers (2022-10-12T15:07:24Z)
Higher Performance Visual Tracking with Dual-Modal Localization [106.91097443275035]
Visual Object Tracking (VOT) has synchronous needs for both robustness and accuracy. We propose a dual-modal framework for target localization, consisting of robust localization that suppresses distractors via ONR and accurate localization that attends precisely to the target center via OFC.
arXiv Detail & Related papers (2021-03-18T08:47:56Z)
Deep Multi-Task Learning for Joint Localization, Perception, and Prediction [68.50217234419922]
This paper investigates the issues that arise in state-of-the-art autonomy stacks under localization error. We design a system that jointly performs perception, prediction, and localization. Our architecture is able to reuse computation between both tasks, and is thus able to correct localization errors efficiently.
arXiv Detail & Related papers (2021-01-17T17:20:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.