Federated Multi-View Synthesizing for Metaverse
- URL: http://arxiv.org/abs/2401.00859v1
- Date: Mon, 18 Dec 2023 13:51:56 GMT
- Title: Federated Multi-View Synthesizing for Metaverse
- Authors: Yiyu Guo, Zhijin Qin, Xiaoming Tao, Geoffrey Ye Li
- Abstract summary: The metaverse is expected to provide immersive entertainment, education, and business applications.
Virtual reality (VR) transmission over wireless networks is data- and computation-intensive.
We have developed a novel multi-view synthesizing framework that can efficiently provide computation, storage, and communication resources for wireless content delivery in the metaverse.
- Score: 52.59476179535153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The metaverse is expected to provide immersive entertainment, education, and
business applications. However, virtual reality (VR) transmission over wireless
networks is data- and computation-intensive, making it critical to introduce
novel solutions that meet stringent quality-of-service requirements. With
recent advances in edge intelligence and deep learning, we have developed a
novel multi-view synthesizing framework that can efficiently provide
computation, storage, and communication resources for wireless content delivery
in the metaverse. We propose a three-dimensional (3D)-aware generative model
that uses collections of single-view images. These single-view images are
transmitted to a group of users with overlapping fields of view, which avoids
massive content transmission compared to transmitting tiles or whole 3D models.
We then present a federated learning approach to guarantee an efficient
learning process. The training performance can be improved by characterizing
the vertical and horizontal data samples with a large latent feature space,
while low-latency communication can be achieved with a reduced number of
transmitted parameters during federated learning. We also propose a federated
transfer learning framework to enable fast domain adaptation to different
target domains. Simulation results have demonstrated the effectiveness of our
proposed federated multi-view synthesizing framework for VR content delivery.
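Below is a minimal sketch, not the authors' code, of the communication pattern described in the abstract: each client trains a local generator, but only a reduced subset of its parameters is uploaded and averaged FedAvg-style at the server, cutting uplink traffic. The `TinyGenerator`, `local_update`, and `server_aggregate` names, the choice of the latent-mapping layers as the transmitted subset, and the placeholder MSE objective are all illustrative assumptions rather than details from the paper.

```python
# Hypothetical sketch of federated learning with a reduced set of transmitted
# parameters (not the paper's actual model or training objective).
import copy
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    """Stand-in for a 3D-aware generator: a latent mapping net plus a decoder."""
    def __init__(self, latent_dim=64, out_dim=128):
        super().__init__()
        self.mapping = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, 256))
        self.decoder = nn.Sequential(nn.Linear(256, 512), nn.ReLU(),
                                     nn.Linear(512, out_dim))

    def forward(self, z):
        return self.decoder(self.mapping(z))

def local_update(model, steps=10, lr=1e-3):
    """One round of local training on placeholder client data."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        z = torch.randn(32, 64)
        target = torch.randn(32, 128)  # placeholder for the real synthesis loss
        loss = nn.functional.mse_loss(model(z), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Transmit only the mapping-network weights to reduce uplink traffic.
    return {k: v.detach().clone()
            for k, v in model.state_dict().items() if k.startswith("mapping")}

def server_aggregate(global_model, client_payloads):
    """FedAvg over the reduced parameter set; the remaining weights stay put."""
    avg = {k: torch.stack([p[k] for p in client_payloads]).mean(0)
           for k in client_payloads[0]}
    state = global_model.state_dict()
    state.update(avg)
    global_model.load_state_dict(state)

global_model = TinyGenerator()
clients = [copy.deepcopy(global_model) for _ in range(4)]
payloads = [local_update(c) for c in clients]
server_aggregate(global_model, payloads)
```

Only the averaged subset needs to cross the wireless link each round, which is one simple way to realize the "reduced number of transmitted parameters" mentioned above; the actual parameter partition used in the paper may differ.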
Related papers
- LaVin-DiT: Large Vision Diffusion Transformer [99.98106406059333]
LaVin-DiT is a scalable and unified foundation model designed to tackle over 20 computer vision tasks in a generative framework.
We introduce key innovations to optimize generative performance for vision tasks.
The model is scaled from 0.1B to 3.4B parameters, demonstrating substantial scalability and state-of-the-art performance across diverse vision tasks.
arXiv Detail & Related papers (2024-11-18T12:05:27Z)
- Augmented Reality based Simulated Data (ARSim) with multi-view consistency for AV perception networks [47.07188762367792]
We present ARSim, a framework designed to enhance real multi-view image data with 3D synthetic objects of interest.
We construct a simplified virtual scene using real data and strategically place 3D synthetic assets within it.
The resulting augmented multi-view consistent dataset is used to train a multi-camera perception network for autonomous vehicles.
arXiv Detail & Related papers (2024-03-22T17:49:11Z)
- OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning [29.798780069556074]
Federated learning (FL) has emerged as a promising approach to collaboratively train machine learning models across multiple edge devices.
We propose OnDev-LCT: Lightweight Convolutional Transformers for On-Device vision tasks with limited training data and resources.
arXiv Detail & Related papers (2024-01-22T02:17:36Z)
- MuRF: Multi-Baseline Radiance Fields [117.55811938988256]
We present Multi-Baseline Radiance Fields (MuRF), a feed-forward approach to solving sparse view synthesis.
MuRF achieves state-of-the-art performance across multiple different baseline settings.
We also show promising zero-shot generalization abilities on the Mip-NeRF 360 dataset.
arXiv Detail & Related papers (2023-12-07T18:59:56Z)
- Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training [44.790636524264]
Point Prompt Training is a novel framework for multi-dataset synergistic learning in the context of 3D representation learning.
It can overcome the negative transfer associated with synergistic learning and produce generalizable representations.
It achieves state-of-the-art performance on each dataset using a single weight-shared model with supervised multi-dataset training.
arXiv Detail & Related papers (2023-08-18T17:59:57Z)
- Asynchronous Hybrid Reinforcement Learning for Latency and Reliability Optimization in the Metaverse over Wireless Communications [8.513938423514636]
Real-time digital twinning of real-world scenes is becoming increasingly common.
The disparity in transmitted scene dimension (2D as opposed to 3D) leads to asymmetric data sizes in uplink (UL) and downlink (DL).
We design a novel multi-agent reinforcement learning algorithm structure, namely Asynchronous Actors Hybrid Critic (AAHC).
arXiv Detail & Related papers (2022-12-30T14:40:00Z)
- Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z)
- Applying Deep-Learning-Based Computer Vision to Wireless Communications: Methodologies, Opportunities, and Challenges [100.45137961106069]
Deep learning (DL) has seen great success in the computer vision (CV) field.
This article introduces ideas about applying DL-based CV in wireless communications.
arXiv Detail & Related papers (2020-06-10T11:37:49Z)
- Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.