TextSLAM: Visual SLAM with Semantic Planar Text Features
- URL: http://arxiv.org/abs/2305.10029v2
- Date: Mon, 3 Jul 2023 12:06:12 GMT
- Title: TextSLAM: Visual SLAM with Semantic Planar Text Features
- Authors: Boying Li, Danping Zou, Yuan Huang, Xinghan Niu, Ling Pei, Wenxian Yu
- Abstract summary: We propose a novel visual SLAM method that integrates text objects tightly by treating them as semantic features.
We tested our method in various scenes with ground truth data.
The results show that integrating texture features leads to a superior SLAM system that can match images across day and night.
- Score: 8.8100408194584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel visual SLAM method that integrates text objects tightly by
treating them as semantic features and fully exploiting their geometric and
semantic priors. Each text object is modeled as a texture-rich planar patch whose
semantic meaning is extracted and updated on the fly for better data
association. With the full exploration of locally planar characteristics and
semantic meaning of text objects, the SLAM system becomes more accurate and
robust even under challenging conditions such as image blurring, large
viewpoint changes, and significant illumination variations (day and night). We
tested our method in various scenes with ground truth data. The results
show that integrating texture features leads to a superior SLAM system
that can match images across day and night. The reconstructed semantic 3D text
map could be useful for navigation and scene understanding in robotic and mixed
reality applications. Our project page: https://github.com/SJTU-ViSYS/TextSLAM .
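The key geometric idea above, treating each text object as a locally planar, texture-rich patch, can be illustrated with the standard plane-induced homography. The sketch below is a minimal, hedged illustration rather than the authors' code: assuming a text plane parameterized as theta = n / d in its host camera frame, pixels of the patch are warped into a second frame, which is the basic operation behind planar-patch alignment under viewpoint change. All names and numbers are illustrative.

```python
# Minimal sketch (not TextSLAM's code): warp pixels of a planar text patch
# from a host frame into a target frame with the plane-induced homography
# H = K (R + t * theta^T) K^-1, where theta = n / d describes the plane
# n^T X = d in the host camera frame.
import numpy as np

def plane_homography(K, R, t, theta):
    """Homography mapping host-frame pixels to target-frame pixels."""
    H = K @ (R + np.outer(t, theta)) @ np.linalg.inv(K)
    return H / H[2, 2]                       # scale so H[2, 2] == 1

def warp_pixels(H, pixels):
    """Apply H to an (N, 2) array of pixel coordinates."""
    uv1 = np.hstack([pixels, np.ones((len(pixels), 1))])   # homogeneous
    warped = (H @ uv1.T).T
    return warped[:, :2] / warped[:, 2:3]

# Toy example: a fronto-parallel text plane 2 m in front of the host camera,
# observed again after a small sideways motion between the two frames.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
R, t = np.eye(3), np.array([0.1, 0.0, 0.0])
theta = np.array([0.0, 0.0, 1.0]) / 2.0      # n = (0, 0, 1), d = 2 m
corners = np.array([[300.0, 220], [340, 220], [340, 260], [300, 260]])
print(warp_pixels(plane_homography(K, R, t, theta), corners))
```

In a full system the plane parameters, the camera poses, and the recognized text semantics would all be refined jointly; the warp above is only the geometric building block.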
Related papers
- PanoSLAM: Panoptic 3D Scene Reconstruction via Gaussian SLAM [105.01907579424362]
PanoSLAM is the first SLAM system to integrate geometric reconstruction, 3D semantic segmentation, and 3D instance segmentation within a unified framework.
For the first time, it achieves panoptic 3D reconstruction of open-world environments directly from RGB-D video.
arXiv Detail & Related papers (2024-12-31T08:58:10Z)
- Textured Mesh Saliency: Bridging Geometry and Texture for Human Perception in 3D Graphics [50.23625950905638]
We present a new dataset for textured mesh saliency, created through an innovative eye-tracking experiment in a six degrees of freedom (6-DOF) VR environment.
Our proposed model predicts saliency maps for textured mesh surfaces by treating each triangular face as an individual unit and assigning a saliency density value to reflect the importance of each local surface region.
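For intuition only, the following hypothetical sketch (not the dataset's pipeline) shows one way to obtain a per-face saliency density: count gaze fixation hits per triangular face and normalize by face area; the helper names and the toy mesh are invented for illustration.

```python
# Hypothetical sketch: per-face "saliency density" as fixation hits per unit
# surface area of each triangular face.
import numpy as np

def face_areas(vertices, faces):
    """Area of each triangle; vertices (V, 3) floats, faces (F, 3) int indices."""
    a, b, c = (vertices[faces[:, i]] for i in range(3))
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)

def saliency_density(vertices, faces, hit_face_ids):
    """hit_face_ids: face index of every recorded fixation hit."""
    hits = np.bincount(hit_face_ids, minlength=len(faces)).astype(float)
    density = hits / np.maximum(face_areas(vertices, faces), 1e-12)
    return density / density.max() if density.max() > 0 else density

# Toy mesh with two triangles; three fixations land on face 0.
V = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]])
F = np.array([[0, 1, 2], [1, 3, 2]])
print(saliency_density(V, F, np.array([0, 0, 0])))   # -> [1. 0.]
```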
arXiv Detail & Related papers (2024-12-11T08:27:33Z)
- Hier-SLAM: Scaling-up Semantics in SLAM with a Hierarchically Categorical Gaussian Splatting [28.821276113559346]
We propose a semantic 3D Gaussian Splatting SLAM method featuring a novel hierarchical categorical representation.
Our Hier-SLAM outperforms existing dense SLAM methods in both mapping and tracking accuracy, while achieving a 2x operation speed-up.
It handles complex real-world scenes with more than 500 semantic classes, highlighting its valuable scaling-up capability.
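The scaling claim can be made concrete with a speculative sketch (this is not Hier-SLAM's implementation): if each class is stored as a short path of branch choices down a category tree, the per-primitive semantic code grows with the tree depth rather than with the total number of leaf classes; the toy tree below is invented.

```python
# Speculative sketch of a hierarchical class code: a class is a tuple of
# branch indices along the root-to-leaf path of a category tree.
TREE = {
    "furniture": {"chair": {}, "table": {}, "sofa": {}},
    "appliance": {"fridge": {}, "oven": {}},
}

def encode(tree, target, path=()):
    """Return the tuple of child indices leading to `target`, or None."""
    for i, (name, sub) in enumerate(tree.items()):
        if name == target:
            return path + (i,)
        found = encode(sub, target, path + (i,))
        if found is not None:
            return found
    return None

def decode(tree, code):
    """Follow the branch indices back down to a class name."""
    name = None
    for i in code:
        name, tree = list(tree.items())[i]
    return name

code = encode(TREE, "oven")
print(code, decode(TREE, code))   # (1, 1) oven
```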
arXiv Detail & Related papers (2024-09-19T07:18:41Z)
- TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion [64.49276500129092]
TextureDreamer is an image-guided texture synthesis method.
It can transfer relightable textures from a small number of input images to target 3D shapes across arbitrary categories.
arXiv Detail & Related papers (2024-01-17T18:55:49Z)
- DNS SLAM: Dense Neural Semantic-Informed SLAM [92.39687553022605]
DNS SLAM is a novel neural RGB-D semantic SLAM approach featuring a hybrid representation.
Our method integrates multi-view geometry constraints with image-based feature extraction to improve appearance details.
Our method achieves state-of-the-art tracking performance on both synthetic and real-world data.
arXiv Detail & Related papers (2023-11-30T21:34:44Z)
- Directional Texture Editing for 3D Models [51.31499400557996]
ITEM3D is designed for automatic 3D object editing according to text instructions.
Leveraging diffusion models and differentiable rendering, ITEM3D uses rendered images as the bridge between text and the 3D representation.
arXiv Detail & Related papers (2023-09-26T12:01:13Z)
- Semantic Visual Simultaneous Localization and Mapping: A Survey [18.372996585079235]
This paper first reviews the development of semantic vSLAM, explicitly focusing on its strengths and differences.
Secondly, we explore three main issues of semantic vSLAM: the extraction and association of semantic information, the application of semantic information, and the advantages of semantic vSLAM.
Finally, we discuss future directions that will provide a blueprint for the future development of semantic vSLAM.
arXiv Detail & Related papers (2022-09-14T05:45:26Z)
- SSC: Semantic Scan Context for Large-Scale Place Recognition [13.228580954956342]
We explore the use of high-level features, namely semantics, to improve the representation ability of descriptors.
We propose a novel global descriptor, Semantic Scan Context, which explores semantic information to represent scenes more effectively.
Our approach outperforms state-of-the-art methods by a large margin.
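For intuition, here is a simplified, hedged sketch of a scan-context-style descriptor that keeps one semantic label per polar-grid cell and compares two descriptors with a rotation-invariant sector shift; it is inspired by, not copied from, the SSC paper, and the grid sizes and tie-breaking rule are arbitrary choices.

```python
# Simplified semantic scan-context-style descriptor (illustrative only).
import numpy as np

def semantic_scan_context(points, labels, n_rings=20, n_sectors=60, max_range=50.0):
    """points: (N, 3) LiDAR points; labels: (N,) integer classes, 0 = unlabeled."""
    desc = np.zeros((n_rings, n_sectors), dtype=int)
    rng = np.linalg.norm(points[:, :2], axis=1)
    ang = np.arctan2(points[:, 1], points[:, 0])
    ring = np.clip((rng / max_range * n_rings).astype(int), 0, n_rings - 1)
    sector = ((ang + np.pi) / (2 * np.pi) * n_sectors).astype(int) % n_sectors
    for r, s, l in zip(ring, sector, labels):
        desc[r, s] = l          # last point wins; real systems pick a dominant label
    return desc

def similarity(d1, d2):
    """Best fraction of matching non-empty cells over all sector (yaw) shifts."""
    valid = np.count_nonzero(d1) or 1
    return max(
        np.count_nonzero((d1 == np.roll(d2, s, axis=1)) & (d1 != 0)) / valid
        for s in range(d1.shape[1])
    )

# Toy check: a scan compared against itself scores 1.0.
pts = np.random.rand(1000, 3) * 40 - 20
lbl = np.random.randint(1, 5, size=1000)
d = semantic_scan_context(pts, lbl)
print(similarity(d, d))
```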
arXiv Detail & Related papers (2021-07-01T11:51:19Z)
- TediGAN: Text-Guided Diverse Face Image Generation and Manipulation [52.83401421019309]
TediGAN is a framework for multi-modal image generation and manipulation with textual descriptions.
A StyleGAN inversion module maps real images into the latent space of a well-trained StyleGAN.
Visual-linguistic similarity learning performs text-image matching by mapping images and text into a common embedding space.
Instance-level optimization preserves identity during manipulation.
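As a generic, hedged sketch of the common-embedding-space idea (the projection layers and feature dimensions below are illustrative assumptions, not TediGAN's architecture), text-image matching reduces to cosine similarity between projected features:

```python
# Generic visual-linguistic similarity sketch: project image and text
# features into a shared space and score pairs by cosine similarity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CommonEmbedding(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=768, joint_dim=256):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, joint_dim)   # image features -> joint space
        self.txt_proj = nn.Linear(txt_dim, joint_dim)   # text features  -> joint space

    def forward(self, img_feat, txt_feat):
        z_img = F.normalize(self.img_proj(img_feat), dim=-1)
        z_txt = F.normalize(self.txt_proj(txt_feat), dim=-1)
        return z_img @ z_txt.T                          # (batch_img, batch_txt) cosine scores

# Toy usage: 4 images against 4 captions; after training with a matching
# loss, the true pairs should dominate the diagonal of this matrix.
model = CommonEmbedding()
scores = model(torch.randn(4, 2048), torch.randn(4, 768))
print(scores.shape)   # torch.Size([4, 4])
```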
arXiv Detail & Related papers (2020-12-06T16:20:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.