Large Generative Model Assisted 3D Semantic Communication
- URL: http://arxiv.org/abs/2403.05783v1
- Date: Sat, 9 Mar 2024 03:33:07 GMT
- Title: Large Generative Model Assisted 3D Semantic Communication
- Authors: Feibo Jiang, Yubo Peng, Li Dong, Kezhi Wang, Kun Yang, Cunhua Pan,
Xiaohu You
- Abstract summary: We propose a Generative AI Model assisted 3D SC (GAM-3DSC) system.
First, we introduce a 3D Semantic Extractor (3DSE) to extract key semantics from a 3D scenario based on user requirements.
We then present an Adaptive Semantic Compression Model (ASCM) for encoding these multi-perspective images.
Finally, we design a conditional Generative adversarial network and Diffusion model aided-Channel Estimation (GDCE) to estimate and refine the Channel State Information (CSI) of physical channels.
- Score: 51.17527319441436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic Communication (SC) is a novel paradigm for data transmission in 6G.
However, there are several challenges posed when performing SC in 3D scenarios:
1) 3D semantic extraction; 2) Latent semantic redundancy; and 3) Uncertain
channel estimation. To address these issues, we propose a Generative AI Model
assisted 3D SC (GAM-3DSC) system. Firstly, we introduce a 3D Semantic Extractor
(3DSE), which employs generative AI models, including Segment Anything Model
(SAM) and Neural Radiance Field (NeRF), to extract key semantics from a 3D
scenario based on user requirements. The extracted 3D semantics are represented
as multi-perspective images of the goal-oriented 3D object. Then, we present an
Adaptive Semantic Compression Model (ASCM) for encoding these multi-perspective
images, in which we use a semantic encoder with two output heads to perform
semantic encoding and mask redundant semantics in the latent semantic space,
respectively. Next, we design a conditional Generative adversarial network and
Diffusion model aided-Channel Estimation (GDCE) to estimate and refine the
Channel State Information (CSI) of physical channels. Finally, simulation
results demonstrate the advantages of the proposed GAM-3DSC system in
effectively transmitting the goal-oriented 3D scenario.
Related papers
- Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting [27.974762304763694]
We introduce Semantic Gaussians, a novel open-vocabulary scene understanding approach based on 3D Gaussian Splatting.
Unlike existing methods, we design a versatile projection approach that maps various 2D semantic features into a novel semantic component of 3D Gaussians.
We build a 3D semantic network that directly predicts the semantic component from raw 3D Gaussians for fast inference.
arXiv Detail & Related papers (2024-03-22T21:28:19Z) - LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation [73.36690511083894]
This paper introduces a novel framework called LN3Diff to address a unified 3D diffusion pipeline.
Our approach harnesses a 3D-aware architecture and variational autoencoder to encode the input image into a structured, compact, and 3D latent space.
It achieves state-of-the-art performance on ShapeNet for 3D generation and demonstrates superior performance in monocular 3D reconstruction and conditional 3D generation.
arXiv Detail & Related papers (2024-03-18T17:54:34Z) - Spice-E : Structural Priors in 3D Diffusion using Cross-Entity Attention [9.52027244702166]
Spice-E is a neural network that adds structural guidance to 3D diffusion models.
We show that our approach supports a variety of applications, including 3D stylization, semantic shape editing and text-conditional abstraction-to-3D.
arXiv Detail & Related papers (2023-11-29T17:36:49Z) - DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z) - 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z) - VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic
Scene Graph Prediction in Point Cloud [51.063494002003154]
3D semantic scene graph (3DSSG) prediction in the point cloud is challenging since the 3D point cloud only captures geometric structures with limited semantics compared to 2D images.
We propose Visual-Linguistic Semantics Assisted Training scheme that can significantly empower 3DSSG prediction models with discrimination about long-tailed and ambiguous semantic relations.
arXiv Detail & Related papers (2023-03-25T09:14:18Z) - Improving 3D Object Detection with Channel-wise Transformer [58.668922561622466]
We propose a two-stage 3D object detection framework (CT3D) with minimal hand-crafted design.
CT3D simultaneously performs proposal-aware embedding and channel-wise context aggregation.
It achieves the AP of 81.77% in the moderate car category on the KITTI test 3D detection benchmark.
arXiv Detail & Related papers (2021-08-23T02:03:40Z) - S3Net: 3D LiDAR Sparse Semantic Segmentation Network [1.330528227599978]
S3Net is a novel convolutional neural network for LiDAR point cloud semantic segmentation.
It adopts an encoder-decoder backbone that consists of Sparse Intra-channel Attention Module (SIntraAM) and Sparse Inter-channel Attention Module (SInterAM)
arXiv Detail & Related papers (2021-03-15T22:15:24Z) - Exploring Deep 3D Spatial Encodings for Large-Scale 3D Scene
Understanding [19.134536179555102]
We propose an alternative approach to overcome the limitations of CNN based approaches by encoding the spatial features of raw 3D point clouds into undirected graph models.
The proposed method achieves on par state-of-the-art accuracy with improved training time and model stability thus indicating strong potential for further research.
arXiv Detail & Related papers (2020-11-29T12:56:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.