Exploring Urban Factors with Autoencoders: Relationship Between Static and Dynamic Features
- URL: http://arxiv.org/abs/2509.06167v1
- Date: Sun, 07 Sep 2025 18:37:04 GMT
- Title: Exploring Urban Factors with Autoencoders: Relationship Between Static and Dynamic Features
- Authors: Ximena Pocco, Waqar Hassan, Karelia Salinas, Vladimir Molchanov, Luis G. Nonato,
- Abstract summary: We develop a visualization-assisted framework to analyze whether fused latent data representations are more effective than separate representations. The analysis reveals that combined latent representations produce more structured patterns, while separate ones are useful in particular cases.
- Score: 0.9623578875486182
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Urban analytics utilizes extensive datasets with diverse urban information to simulate, predict trends, and uncover complex patterns within cities. While these data enable advanced analysis, they also present challenges due to their granularity, heterogeneity, and multimodality. To address these challenges, visual analytics tools have been developed to support the exploration of latent representations of fused heterogeneous and multimodal data, discretized at street-level detail. However, visualization-assisted tools seldom explore the extent to which fused data can offer deeper insights than examining each data source independently within an integrated visualization framework. In this work, we developed a visualization-assisted framework to analyze whether fused latent data representations are more effective than separate representations in uncovering patterns from dynamic and static urban data. The analysis reveals that combined latent representations produce more structured patterns, while separate ones are useful in particular cases.
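To make the comparison concrete, here is a minimal sketch (not the authors' implementation; the use of PyTorch, the feature dimensions, and the layer sizes are illustrative assumptions) contrasting one autoencoder over concatenated static and dynamic street-level features with a separate autoencoder per modality; the fused latent and the per-modality latents are the representations such a framework would visualize and compare.

import torch
import torch.nn as nn

class AE(nn.Module):
    # Plain fully connected autoencoder; returns the reconstruction and the latent code.
    def __init__(self, in_dim: int, latent_dim: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, in_dim))
    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

# Toy street-level data (hypothetical shapes): static descriptors vs. hourly dynamic signals.
static_x = torch.randn(256, 10)   # 256 street segments, 10 static features (e.g., land use)
dynamic_x = torch.randn(256, 24)  # 24 hourly dynamic features (e.g., traffic counts)

fused_ae = AE(in_dim=10 + 24)                         # one latent space for the fused data
static_ae, dynamic_ae = AE(in_dim=10), AE(in_dim=24)  # one latent space per modality

fused_in = torch.cat([static_x, dynamic_x], dim=1)
recon, z_fused = fused_ae(fused_in)
nn.functional.mse_loss(recon, fused_in).backward()    # reconstruction loss for the fused model

z_static = static_ae.encoder(static_x)    # separate latents, analyzed independently
z_dynamic = dynamic_ae.encoder(dynamic_x)

Projecting z_fused versus z_static and z_dynamic (e.g., with UMAP or t-SNE) is one way to inspect whether the fused representation yields the more structured patterns reported in the abstract.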
Related papers
- Multi-dimensional Data Analysis and Applications Basing on LLM Agents and Knowledge Graph Interactions [22.880788190504827]
Large Language Models (LLMs) perform well in natural language understanding and generation, but suffer from "hallucination" issues when processing structured knowledge. This paper proposes a multi-dimensional data analysis method based on the interactions between LLM agents and Knowledge Graphs.
arXiv Detail & Related papers (2025-10-17T02:38:44Z) - Urbanite: A Dataflow-Based Framework for Human-AI Interactive Alignment in Urban Visual Analytics [4.107382739138796]
Urban visual analytics has become essential for deriving insights into pressing real-world problems. The need to manage diverse datasets, distill intricate patterns, and integrate various analytical methods presents a high barrier to entry. We propose Urbanite, a framework for human-AI collaboration in urban visual analytics.
arXiv Detail & Related papers (2025-08-10T15:44:37Z) - Interactive dense pixel visualizations for time series and model attribution explanations [8.24039921933289]
DAVOTS is an interactive visual analytics approach to explore raw time series data, activations of neural networks, and attributions in a dense-pixel visualization.
We apply clustering approaches to the visualized data domains to highlight groups and present ordering strategies for individual and combined data exploration.
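As a rough illustration of the idea (a generic sketch, not the DAVOTS implementation; the k-means clustering, array shapes, and plotting choices are assumptions), rows of a dense-pixel display can be reordered by cluster so that similar time series, activations, or attributions appear as contiguous blocks:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

series = np.random.rand(200, 96)   # 200 time series with 96 time steps each (placeholder data)
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(series)
order = np.argsort(labels)         # group rows by cluster label for the dense-pixel layout

plt.imshow(series[order], aspect="auto", cmap="viridis")  # one pixel per value
plt.xlabel("time step")
plt.ylabel("series (cluster-ordered)")
plt.show()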
arXiv Detail & Related papers (2024-08-27T14:02:21Z) - Knowledge-Aware Reasoning over Multimodal Semi-structured Tables [85.24395216111462]
This study investigates whether current AI models can perform knowledge-aware reasoning on multimodal structured data.
We introduce MMTabQA, a new dataset designed for this purpose.
Our experiments highlight substantial challenges for current AI models in effectively integrating and interpreting multiple text and image inputs.
arXiv Detail & Related papers (2024-08-25T15:17:43Z) - Beyond DAGs: A Latent Partial Causal Model for Multimodal Learning [80.44084021062105]
We propose a novel latent partial causal model for multimodal data, featuring two latent coupled variables, connected by an undirected edge, to represent the transfer of knowledge across modalities. Under specific statistical assumptions, we establish an identifiability result, demonstrating that representations learned by multimodal contrastive learning correspond to the latent coupled variables up to a trivial transformation. Experiments show that a pre-trained CLIP model embodies disentangled representations, enabling few-shot learning and improving domain generalization across diverse real-world datasets.
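For reference, the multimodal contrastive objective that such identifiability results analyze is usually the symmetric InfoNCE loss popularized by CLIP; in a generic form (the similarity function sim, temperature \tau, and batch size N are notational assumptions, not taken from the abstract), with u_i and v_i the paired embeddings from the two modalities:

\mathcal{L} = -\frac{1}{2N}\sum_{i=1}^{N}\left[\log\frac{\exp(\mathrm{sim}(u_i,v_i)/\tau)}{\sum_{j=1}^{N}\exp(\mathrm{sim}(u_i,v_j)/\tau)} + \log\frac{\exp(\mathrm{sim}(u_i,v_i)/\tau)}{\sum_{j=1}^{N}\exp(\mathrm{sim}(u_j,v_i)/\tau)}\right]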
arXiv Detail & Related papers (2024-02-09T07:18:06Z) - StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z) - SGED: A Benchmark dataset for Performance Evaluation of Spiking Gesture Emotion Recognition [12.396844568607522]
We label a new homogeneous multimodal gesture emotion recognition dataset based on the analysis of the existing data sets.
We propose a pseudo dual-flow network based on this dataset, and verify the application potential of this dataset in the affective computing community.
arXiv Detail & Related papers (2023-04-28T09:32:09Z) - Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z) - TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z) - A graph representation based on fluid diffusion model for multimodal data analysis: theoretical aspects and enhanced community detection [14.601444144225875]
We introduce a novel model for graph definition based on fluid diffusion.
Our method is able to strongly outperform state-of-the-art schemes for community detection in multimodal data analysis.
arXiv Detail & Related papers (2021-12-07T16:30:03Z) - A Variational Information Bottleneck Approach to Multi-Omics Data Integration [98.6475134630792]
We propose a deep variational information bottleneck (IB) approach for incomplete multi-view observations.
Our method applies the IB framework on marginal and joint representations of the observed views to focus on intra-view and inter-view interactions that are relevant for the target.
Experiments on real-world datasets show that our method consistently achieves gain from data integration and outperforms state-of-the-art benchmarks.
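The variational information bottleneck objective that such methods build on is commonly written as the following trade-off (a generic single-view form; the encoder q_\phi, decoder p_\theta, prior r(z), and weight \beta are standard notation, not taken from the abstract, and the paper applies the bound to marginal and joint multi-view representations):

\min_{\theta,\phi}\; \mathbb{E}_{p(x,y)}\,\mathbb{E}_{q_\phi(z\mid x)}\!\left[-\log p_\theta(y\mid z)\right] \;+\; \beta\, D_{\mathrm{KL}}\!\left(q_\phi(z\mid x)\,\|\,r(z)\right)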
arXiv Detail & Related papers (2021-02-05T06:05:39Z) - Informative Scene Decomposition for Crowd Analysis, Comparison and Simulation Guidance [10.000622844914272]
Crowd simulation is a central topic in several fields including graphics.
With the fast-growing volume of crowd data, such a bottleneck needs to be addressed.
We propose a new framework which comprehensively tackles this problem.
arXiv Detail & Related papers (2020-04-29T12:03:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.