TextSLAM: Visual SLAM with Semantic Planar Text Features
- URL: http://arxiv.org/abs/2305.10029v2
- Date: Mon, 3 Jul 2023 12:06:12 GMT
- Title: TextSLAM: Visual SLAM with Semantic Planar Text Features
- Authors: Boying Li, Danping Zou, Yuan Huang, Xinghan Niu, Ling Pei, Wenxian Yu
- Abstract summary: We propose a novel visual SLAM method that integrates text objects tightly by treating them as semantic features.
We tested our method in various scenes with ground-truth data.
The results show that integrating texture features leads to a superior SLAM system that can match images across day and night.
- Score: 8.8100408194584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel visual SLAM method that integrates text objects tightly by
treating them as semantic features, fully exploiting their geometric and
semantic priors. Each text object is modeled as a texture-rich planar patch whose
semantic meaning is extracted and updated on the fly for better data
association. With the full exploration of locally planar characteristics and
semantic meaning of text objects, the SLAM system becomes more accurate and
robust even under challenging conditions such as image blurring, large
viewpoint changes, and significant illumination variations (day and night). We
tested our method in various scenes with ground-truth data. The results
show that integrating texture features leads to a superior SLAM system
that can match images across day and night. The reconstructed semantic 3D text
map could be useful for navigation and scene understanding in robotic and mixed
reality applications. Our project page: https://github.com/SJTU-ViSYS/TextSLAM .
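The abstract's planar-patch idea has a standard geometric core: points on a plane θᵀX = 1 (with θ = n/d for unit normal n and plane distance d) transfer between two views through the plane-induced homography H = R + t·θᵀ. A minimal sketch of that transfer, assuming this generic parameterization rather than the paper's exact formulation (the function names are illustrative):

```python
import numpy as np

def plane_homography(R, t, theta):
    """Plane-induced homography between two camera views.

    For points X on the plane theta^T X = 1 (theta = n / d), we have
    X' = R X + t = (R + t theta^T) X, so normalized image coordinates
    transfer linearly through H = R + t theta^T.
    """
    return R + np.outer(t, theta)

def warp_pixel(K, R, t, theta, uv):
    """Transfer a pixel lying on the planar patch from host to target view."""
    m = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])  # back-project to normalized coords
    m2 = plane_homography(R, t, theta) @ m                # plane-induced transfer
    p = K @ m2                                            # reproject into the target image
    return p[:2] / p[2]
```

With such a warp, the appearance of a text patch seen in one keyframe can be compared directly against another view, which is the kind of plane-aware data association the abstract describes.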
Related papers
- HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting [53.6394928681237]
Holistic understanding of urban scenes from RGB images is a challenging yet important problem.
Our main idea involves the joint optimization of geometry, appearance, semantics, and motion using a combination of static and dynamic 3D Gaussians.
Our approach offers the ability to render new viewpoints in real-time, yielding 2D and 3D semantic information with high accuracy.
arXiv Detail & Related papers (2024-03-19T13:39:05Z)
- TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion [64.49276500129092]
TextureDreamer is an image-guided texture synthesis method.
It can transfer relightable textures from a small number of input images to target 3D shapes across arbitrary categories.
arXiv Detail & Related papers (2024-01-17T18:55:49Z)
- DNS SLAM: Dense Neural Semantic-Informed SLAM [92.39687553022605]
DNS SLAM is a novel neural RGB-D semantic SLAM approach featuring a hybrid representation.
Our method integrates multi-view geometry constraints with image-based feature extraction to improve appearance details.
Our experimental results achieve state-of-the-art performance on both synthetic data and real-world data tracking.
arXiv Detail & Related papers (2023-11-30T21:34:44Z)
- Directional Texture Editing for 3D Models [51.31499400557996]
ITEM3D is designed for automatic 3D object editing according to text instructions.
Leveraging the diffusion models and the differentiable rendering, ITEM3D takes the rendered images as the bridge of text and 3D representation.
arXiv Detail & Related papers (2023-09-26T12:01:13Z)
- TANGO: Text-driven Photorealistic and Robust 3D Stylization via Lighting Decomposition [39.312567993736025]
We propose TANGO, which transfers the appearance style of a given 3D shape according to a text prompt in a photorealistic manner.
We show that TANGO outperforms existing methods of text-driven 3D style transfer in terms of photorealistic quality, consistency of 3D geometry, and robustness when stylizing low-quality meshes.
arXiv Detail & Related papers (2022-10-20T13:52:18Z)
- Semantic Visual Simultaneous Localization and Mapping: A Survey [18.372996585079235]
This paper first reviews the development of semantic vSLAM, focusing on its strengths and on how it differs from classical vSLAM.
Secondly, we explore three main issues of semantic vSLAM: the extraction and association of semantic information, the application of semantic information, and the advantages of semantic vSLAM.
Finally, we discuss future directions that will provide a blueprint for the future development of semantic vSLAM.
arXiv Detail & Related papers (2022-09-14T05:45:26Z)
- Zero-Shot Text-Guided Object Generation with Dream Fields [111.06026544180398]
We combine neural rendering with multi-modal image and text representations to synthesize diverse 3D objects.
Our method, Dream Fields, can generate the geometry and color of a wide range of objects without 3D supervision.
In experiments, Dream Fields produce realistic, multi-view consistent object geometry and color from a variety of natural language captions.
arXiv Detail & Related papers (2021-12-02T17:53:55Z)
- SSC: Semantic Scan Context for Large-Scale Place Recognition [13.228580954956342]
We explore the use of high-level features, namely semantics, to improve the representation ability of descriptors.
We propose a novel global descriptor, Semantic Scan Context, which explores semantic information to represent scenes more effectively.
Our approach outperforms state-of-the-art methods by a large margin.
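A scan-context-style descriptor enriched with semantics can be pictured as a polar (ring × sector) grid over a LiDAR scan whose cells store a point's class label rather than, say, maximum height. The sketch below is a simplified illustration under assumed parameters (ring/sector counts, last-write-wins labeling), not the authors' exact construction:

```python
import numpy as np

def semantic_scan_context(points_xy, labels, num_rings=20, num_sectors=60, max_range=50.0):
    """Build a ring x sector polar grid; each cell stores one semantic label.

    points_xy : (N, 2) array of x, y coordinates in the sensor frame.
    labels    : (N,) integer class labels (0 is reserved for 'empty').
    """
    desc = np.zeros((num_rings, num_sectors), dtype=np.int64)
    for (x, y), lab in zip(points_xy, labels):
        r = np.hypot(x, y)
        if r >= max_range:
            continue  # discard points beyond the descriptor's radius
        ring = int(r / max_range * num_rings)
        sector = int((np.arctan2(y, x) + np.pi) / (2.0 * np.pi) * num_sectors) % num_sectors
        desc[ring, sector] = lab  # last write wins here; real systems rank labels by priority
    return desc
```

Two scans can then be compared by column-shifting one descriptor (to account for yaw) and counting cells whose labels agree.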
arXiv Detail & Related papers (2021-07-01T11:51:19Z)
- TediGAN: Text-Guided Diverse Face Image Generation and Manipulation [52.83401421019309]
TediGAN is a framework for multi-modal image generation and manipulation with textual descriptions.
Its StyleGAN inversion module maps real images to the latent space of a well-trained StyleGAN.
Visual-linguistic similarity learning matches text and images by mapping both into a common embedding space.
Instance-level optimization preserves identity during manipulation.
arXiv Detail & Related papers (2020-12-06T16:20:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.