Visualizing Routes with AI-Discovered Street-View Patterns
- URL: http://arxiv.org/abs/2404.00431v1
- Date: Sat, 30 Mar 2024 17:32:26 GMT
- Title: Visualizing Routes with AI-Discovered Street-View Patterns
- Authors: Tsung Heng Wu, Md Amiruzzaman, Ye Zhao, Deepshikha Bhati, Jing Yang,
- Abstract summary: We propose a solution of using semantic latent vectors for quantifying visual appearance features.
We calculate image similarities among a large set of street-view images and then discover spatial imagery patterns.
We present VivaRoutes, an interactive visualization prototype, to show how visualizations leveraged with these discovered patterns can help users effectively and interactively explore multiple routes.
- Score: 4.153397474276339
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Street-level visual appearances play an important role in studying social systems, such as understanding the built environment, driving routes, and associated social and economic factors. However, they have not been integrated into typical geographical visualization interfaces (e.g., map services) for planning driving routes. In this paper, we study this new visualization task with several new contributions. First, we experiment with a set of AI techniques and propose a solution of using semantic latent vectors for quantifying visual appearance features. Second, we calculate image similarities among a large set of street-view images and then discover spatial imagery patterns. Third, we integrate these discovered patterns into driving route planners with new visualization techniques. Finally, we present VivaRoutes, an interactive visualization prototype, to show how visualizations leveraged with these discovered patterns can help users effectively and interactively explore multiple routes. Furthermore, we conducted a user study to assess the usefulness and utility of VivaRoutes.
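The abstract mentions computing image similarities from semantic latent vectors but does not say which similarity measure is used. As a minimal sketch only, assuming cosine similarity and using made-up three-dimensional vectors with hypothetical image names, the idea could look like this:

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two latent vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(vectors: Dict[str, List[float]], query: str) -> List[Tuple[str, float]]:
    """Rank every other image by its similarity to the query image, best first."""
    q = vectors[query]
    scores = [
        (name, cosine_similarity(q, v))
        for name, v in vectors.items()
        if name != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical latent vectors; real ones would come from an image encoder
# and have far more dimensions.
latents = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.9, 0.1, 0.0],
    "img_c": [0.0, 1.0, 0.0],
}

ranking = most_similar(latents, "img_a")
print(ranking)
```

Here `cosine_similarity`, `most_similar`, and the `latents` dictionary are illustrative names, not part of VivaRoutes. Grouping images whose pairwise similarity is high is one plausible way such scores could feed into discovering spatial patterns along a route.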
Related papers
- TopView: Vectorising road users in a bird's eye view from uncalibrated street-level imagery with deep learning [2.7195102129095003]
We introduce a simple approach for estimating a bird's eye view from images without prior knowledge of a given camera's intrinsic and extrinsic parameters.
The framework has been applied to several applications to generate a live map from camera feeds and to analyse social distancing violations at the city scale.
arXiv Detail & Related papers (2024-12-18T21:55:58Z)
- VISTA: A Panoramic View of Neural Representations [0.6993026261767287]
We present VISTA (Visualization of Internal States and Their Associations), a novel pipeline for visually exploring and interpreting neural network representations.
We address the challenge of analyzing vast multidimensional spaces in modern machine learning models by mapping representations into a semantic 2D space.
We demonstrate VISTA's utility by applying it to sparse autoencoder latents uncovering new properties and interpretations.
arXiv Detail & Related papers (2024-12-03T12:12:03Z)
- Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigational-specific visual representation learning method by contrasting the agent's egocentric views and semantic maps.
Ego$^2$-Map learning transfers the compact and rich information from a map, such as objects, structure and transition, to the agent's egocentric representations for navigation.
arXiv Detail & Related papers (2023-07-23T14:01:05Z)
- Peripheral Vision Transformer [52.55309200601883]
We take a biologically inspired approach and explore modeling peripheral vision in deep neural networks for visual recognition.
We propose to incorporate peripheral position encoding into the multi-head self-attention layers to let the network learn to partition the visual field into diverse peripheral regions given training data.
We evaluate the proposed network, dubbed PerViT, on the large-scale ImageNet dataset and systematically investigate the inner workings of the model for machine perception.
arXiv Detail & Related papers (2022-06-14T12:47:47Z)
- Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images [128.881857704338]
We study the problem of extracting a directed graph representing the local road network in BEV coordinates, from a single onboard camera image.
We show that the method can be extended to detect dynamic objects on the BEV plane.
We validate our approach against powerful baselines and show that our network achieves superior performance.
arXiv Detail & Related papers (2021-10-05T12:40:33Z)
- Deep Learning for Embodied Vision Navigation: A Survey [108.13766213265069]
The "embodied visual navigation" problem requires an agent to navigate in a 3D environment, relying mainly on its first-person observations.
This paper attempts to establish an outline of the current works in the field of embodied visual navigation by providing a comprehensive literature survey.
arXiv Detail & Related papers (2021-07-07T12:09:04Z)
- Exploring Visual Engagement Signals for Representation Learning [56.962033268934015]
We present VisE, a weakly supervised learning approach, which maps social images to pseudo labels derived by clustered engagement signals.
We then study how models trained in this way benefit subjective downstream computer vision tasks such as emotion recognition or political bias detection.
arXiv Detail & Related papers (2021-04-15T20:50:40Z)
- MaAST: Map Attention with Semantic Transformers for Efficient Visual Navigation [4.127128889779478]
This work focuses on performing better than or comparably to existing learning-based solutions for visual navigation for autonomous agents.
We propose a method to encode vital scene semantics into a semantically informed, top-down egocentric map representation.
We conduct experiments on 3-D reconstructed indoor PointGoal visual navigation and demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2021-03-21T12:01:23Z)
- Geometry-Guided Street-View Panorama Synthesis from Satellite Imagery [80.6282101835164]
We present a new approach for synthesizing a novel street-view panorama given an overhead satellite image.
Our method generates a Google-style omnidirectional street-view panorama, as if it were captured from the same geographical location as the center of the satellite patch.
arXiv Detail & Related papers (2021-03-02T10:27:05Z)
- SpotNet: Self-Attention Multi-Task Network for Object Detection [11.444576186559487]
We produce foreground/background segmentation labels in a semi-supervised way, using background subtraction or optical flow.
We use those segmentation maps inside the network as a self-attention mechanism to weight the feature map used to produce the bounding boxes.
We show that by using this method, we obtain a significant mAP improvement on two traffic surveillance datasets.
arXiv Detail & Related papers (2020-02-13T14:43:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.