Related papers: Geo-ORBIT: A Federated Digital Twin Framework for Scene-Adaptive Lane Geometry Detection

URL: http://arxiv.org/abs/2507.08743v1
Date: Fri, 11 Jul 2025 16:45:59 GMT
Title: Geo-ORBIT: A Federated Digital Twin Framework for Scene-Adaptive Lane Geometry Detection
Authors: Rei Tamaru, Pei Li, Bin Ran
Abstract summary: Geo-ORBIT is a unified framework that combines real-time lane detection, DT synchronization, and federated meta-learning. We extend this model through Meta-GeoLane, which learns to personalize detection parameters for local entities. Our system is integrated with CARLA and SUMO to create a high-fidelity DT that renders highway scenarios and captures traffic flows in real-time.
Score: 17.09138102827048
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Digital Twins (DT) have the potential to transform traffic management and operations by creating dynamic, virtual representations of transportation systems that sense conditions, analyze operations, and support decision-making. A key component for DT of the transportation system is dynamic roadway geometry sensing. However, existing approaches often rely on static maps or costly sensors, limiting scalability and adaptability. Additionally, large-scale DTs that collect and analyze data from multiple sources face challenges in privacy, communication, and computational efficiency. To address these challenges, we introduce Geo-ORBIT (Geometrical Operational Roadway Blueprint with Integrated Twin), a unified framework that combines real-time lane detection, DT synchronization, and federated meta-learning. At the core of Geo-ORBIT is GeoLane, a lightweight lane detection model that learns lane geometries from vehicle trajectory data using roadside cameras. We extend this model through Meta-GeoLane, which learns to personalize detection parameters for local entities, and FedMeta-GeoLane, a federated learning strategy that ensures scalable and privacy-preserving adaptation across roadside deployments. Our system is integrated with CARLA and SUMO to create a high-fidelity DT that renders highway scenarios and captures traffic flows in real-time. Extensive experiments across diverse urban scenes show that FedMeta-GeoLane consistently outperforms baseline and meta-learning approaches, achieving lower geometric error and stronger generalization to unseen locations while drastically reducing communication overhead. This work lays the foundation for flexible, context-aware infrastructure modeling in DTs. The framework is publicly available at https://github.com/raynbowy23/FedMeta-GeoLane.git.
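
As a rough illustration of the federated step the abstract describes, the sketch below averages per-site parameter sets weighted by local sample counts (generic FedAvg-style aggregation). The abstract does not state the aggregation rule FedMeta-GeoLane uses, so the function name `federated_average`, the parameter names, and the weighting scheme are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List, Tuple

# Hypothetical illustration only: sample-weighted averaging of per-site
# parameters. This is generic FedAvg-style aggregation, NOT the
# FedMeta-GeoLane algorithm, whose details the abstract does not give.

Params = Dict[str, float]


def federated_average(updates: List[Tuple[Params, int]]) -> Params:
    """Combine (parameters, sample_count) pairs from several roadside sites."""
    if not updates:
        raise ValueError("need at least one client update")
    total = sum(count for _, count in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    averaged: Params = {}
    # Every site is assumed to report the same parameter keys.
    for key in updates[0][0]:
        weighted_sum = sum(params[key] * count for params, count in updates)
        averaged[key] = weighted_sum / total
    return averaged


if __name__ == "__main__":
    # Parameter names below are invented placeholders.
    site_a = ({"lane_width_prior": 3.5, "cluster_eps": 0.5}, 100)
    site_b = ({"lane_width_prior": 3.7, "cluster_eps": 1.0}, 300)
    print(federated_average([site_a, site_b]))
```

Only the parameter dictionaries and sample counts are exchanged in this sketch; the raw camera data would stay at each site, which is the usual privacy argument for federated setups.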

Related papers

Dynamic Topology Awareness: Breaking the Granularity Rigidity in Vision-Language Navigation [22.876516699004814]
Vision-Language Navigation in Continuous Environments (VLN-CE) presents a core challenge: grounding high-level linguistic instructions into precise, safe, and long-horizon spatial actions. Explicit topological maps have proven to be a vital solution for providing robust spatial memory in such tasks. Existing topological planning methods suffer from a "Granularity Rigidity" problem. We propose DGNav, a framework for Dynamic Topological Navigation, introducing a context-aware mechanism to modulate map density and connectivity on-the-fly.
arXiv Detail & Related papers (2026-01-29T14:06:23Z)
Wireless Traffic Prediction with Large Language Model [54.07581399989292]
TIDES is a novel framework that captures spatial-temporal correlations for wireless traffic prediction. TIDES achieves efficient adaptation to domain-specific patterns without incurring excessive training overhead. Our results indicate that integrating spatial awareness into LLM-based predictors is the key to unlocking scalable and intelligent network management in future 6G systems.
arXiv Detail & Related papers (2025-12-19T04:47:40Z)
3dSAGER: Geospatial Entity Resolution over 3D Objects (Technical Report) [7.378893412842889]
3dSAGER is an end-to-end pipeline for geospatial entity resolution over 3D objects. We present a novel, spatial-reference-independent featurization mechanism that captures intricate geometric characteristics of matching pairs. We also propose a new lightweight and interpretable blocking method, BKAFI, that leverages a trained model to efficiently generate high-recall candidate sets.
arXiv Detail & Related papers (2025-11-09T09:35:45Z)
Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method [54.461213497603154]
Occupancy-centric methods have recently achieved state-of-the-art results by offering consistent conditioning across frames and modalities. Nuplan-Occ is the largest occupancy dataset to date, constructed from the widely used Nuplan benchmark. We develop a unified framework that jointly synthesizes high-quality occupancy, multi-view videos, and LiDAR point clouds.
arXiv Detail & Related papers (2025-10-27T03:52:45Z)
Towards Intelligent Transportation with Pedestrians and Vehicles In-the-Loop: A Surveillance Video-Assisted Federated Digital Twin Framework [62.47416496137193]
We propose a surveillance video assisted federated digital twin (SV-FDT) framework to empower ITSs with pedestrians and vehicles in-the-loop. The architecture consists of three layers: (i) the end layer, which collects traffic surveillance videos from multiple sources; (ii) the edge layer, responsible for semantic segmentation-based visual understanding, twin agent-based interaction modeling, and local digital twin system (LDTS) creation in local regions; and (iii) the cloud layer, which integrates LDTSs across different regions to construct a global DT model in real time.
arXiv Detail & Related papers (2025-03-06T07:36:06Z)
Swarm Intelligence in Geo-Localization: A Multi-Agent Large Vision-Language Model Collaborative Framework [51.26566634946208]
We introduce smileGeo, a novel visual geo-localization framework. Through inter-agent communication, smileGeo integrates the inherent knowledge of these agents with additional retrieved information. Results show that our approach significantly outperforms current state-of-the-art methods.
arXiv Detail & Related papers (2024-08-21T03:31:30Z)
NLP-enabled Trajectory Map-matching in Urban Road Networks using a Transformer-based Encoder-decoder [1.3812010983144802]
This study introduces a data-driven, deep learning-based map-matching framework, formulating the task as machine translation, inspired by NLP. A transformer-based encoder-decoder model learns contextual representations of noisy GPS points to infer trajectory behavior and road structures in an end-to-end manner. Experiments on synthetic trajectories show that this approach outperforms conventional methods by integrating contextual awareness.
arXiv Detail & Related papers (2024-04-18T18:39:23Z)
Probabilistic Image-Driven Traffic Modeling via Remote Sensing [8.234589405189187]
We introduce a multi-modal, multi-task transformer-based segmentation architecture that can be used to create dense city-scale traffic models. We evaluate our method extensively using the Dynamic Traffic Speeds benchmark dataset and significantly improve the state-of-the-art.
arXiv Detail & Related papers (2024-03-08T18:43:28Z)
Elastic Interaction Energy-Informed Real-Time Traffic Scene Perception [8.429178814528617]
A topology-aware energy loss function-based network training strategy named EIEGSeg is proposed. EIEGSeg is designed for multi-class segmentation on real-time traffic scene perception. Our results demonstrate that EIEGSeg consistently improves the performance, especially on real-time, lightweight networks.
arXiv Detail & Related papers (2023-10-02T01:30:42Z)
Continuous Self-Localization on Aerial Images Using Visual and Lidar Sensors [25.87104194833264]
We propose a novel method for geo-tracking in outdoor environments by registering a vehicle's sensor information with aerial imagery of an unseen target region. We train a model in a metric learning setting to extract visual features from ground and aerial images. Our method is the first to utilize on-board cameras in an end-to-end differentiable model for metric self-localization on unseen orthophotos.
arXiv Detail & Related papers (2022-03-07T12:25:44Z)
Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [59.60483620730437]
We propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention. Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.
arXiv Detail & Related papers (2021-04-19T11:48:13Z)
Multi-Agent Routing Value Iteration Network [88.38796921838203]
We propose a graph neural network based model that is able to perform multi-agent routing based on learned value in a sparsely connected graph. We show that our model trained with only two agents on graphs with a maximum of 25 nodes can easily generalize to situations with more agents and/or nodes.
arXiv Detail & Related papers (2020-07-09T22:16:45Z)
Constructing Geographic and Long-term Temporal Graph for Traffic Forecasting [88.5550074808201]
We propose Geographic and Long term Temporal Graph Convolutional Recurrent Neural Network (GLT-GCRNN) for traffic forecasting. In this work, we propose a novel framework for traffic forecasting that learns the rich interactions between roads sharing similar geographic or long-term temporal patterns.
arXiv Detail & Related papers (2020-04-23T03:50:46Z)
Learning to Move with Affordance Maps [57.198806691838364]
The ability to autonomously explore and navigate a physical space is a fundamental requirement for virtually any mobile autonomous agent. Traditional SLAM-based approaches for exploration and navigation largely focus on leveraging scene geometry. We show that learned affordance maps can be used to augment traditional approaches for both exploration and navigation, providing significant improvements in performance.
arXiv Detail & Related papers (2020-01-08T04:05:11Z)

This list is automatically generated from the titles and abstracts of the papers on this site.