A Multidimensional AI-powered Framework for Analyzing Tourist Perception in Historic Urban Quarters: A Case Study in Shanghai
- URL: http://arxiv.org/abs/2509.03830v1
- Date: Thu, 04 Sep 2025 02:35:14 GMT
- Title: A Multidimensional AI-powered Framework for Analyzing Tourist Perception in Historic Urban Quarters: A Case Study in Shanghai
- Authors: Kaizhen Tan, Yufan Wu, Yuxuan Liu, Haoran Zeng
- Abstract summary: This study proposes a multidimensional AI-powered framework for analyzing tourist perception in historic urban quarters. Applied to twelve historic quarters in central Shanghai, the framework integrates focal point extraction, color theme analysis, and sentiment mining.
- Score: 5.077286019454655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Historic urban quarters play a vital role in preserving cultural heritage while serving as vibrant spaces for tourism and everyday life. Understanding how tourists perceive these environments is essential for sustainable, human-centered urban planning. This study proposes a multidimensional AI-powered framework for analyzing tourist perception in historic urban quarters using multimodal data from social media. Applied to twelve historic quarters in central Shanghai, the framework integrates focal point extraction, color theme analysis, and sentiment mining. Visual focus areas are identified from tourist-shared photos using a fine-tuned semantic segmentation model. To assess aesthetic preferences, dominant colors are extracted using a clustering method, and their spatial distribution across quarters is analyzed. Color themes are further compared between social media photos and real-world street views, revealing notable shifts. This divergence highlights potential gaps between visual expectations and the built environment, reflecting both stylistic preferences and perceptual bias. Tourist reviews are evaluated through a hybrid sentiment analysis approach combining a rule-based method and a multi-task BERT model. Satisfaction is assessed across four dimensions: tourist activities, built environment, service facilities, and business formats. The results reveal spatial variations in aesthetic appeal and emotional response. Rather than focusing on a single technical innovation, this framework offers an integrated, data-driven approach to decoding tourist perception and contributes to informed decision-making in tourism, heritage conservation, and the design of aesthetically engaging public spaces.
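The abstract says only that dominant colors are extracted "using a clustering method" and does not name the algorithm. As a minimal sketch, assuming plain k-means over RGB pixel tuples (an assumption, not the authors' confirmed method), the step might look like the following. The function names `nearest` and `dominant_colors` and the parameters `k`, `iters`, and `seed` are hypothetical and chosen only for illustration.

```python
import random


def nearest(pixel, centroids):
    """Index of the centroid closest to `pixel` (squared Euclidean distance in RGB)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, centroids[i])),
    )


def dominant_colors(pixels, k=3, iters=20, seed=0):
    """Cluster RGB tuples with plain k-means (assumed algorithm, see lead-in).

    Returns a list of (rounded_centroid, share_of_pixels), largest cluster first.
    Raises ValueError if the image has fewer than k distinct colors.
    """
    rng = random.Random(seed)
    # Start from k distinct colors so no two centroids coincide initially.
    centroids = rng.sample(sorted(set(pixels)), k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in pixels:
            clusters[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        updated = [
            tuple(sum(channel) / len(c) for channel in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:  # converged
            break
        centroids = updated
    # Final assignment against the final centroids gives each color's share.
    counts = [0] * k
    for p in pixels:
        counts[nearest(p, centroids)] += 1
    ranked = sorted(zip(centroids, counts), key=lambda pair: -pair[1])
    return [(tuple(round(v) for v in c), n / len(pixels)) for c, n in ranked]


if __name__ == "__main__":
    # Toy "photo": 60% red, 30% blue, 10% near-white pixels.
    toy = [(200, 30, 30)] * 60 + [(30, 30, 200)] * 30 + [(240, 240, 240)] * 10
    for color, share in dominant_colors(toy, k=3):
        print(color, share)
```

Running the same function on pixels sampled from social media photos and from street-view images would give two palettes per quarter that can then be compared, which is the kind of comparison the abstract describes. A production pipeline would more likely cluster in a perceptual color space and use an optimized library implementation.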
Related papers
- Exploring Sidewalk Sheds in New York City through Chatbot Surveys and Human Computer Interaction [47.311965900698084]
We develop an AI-based survey that collects image-based annotations and route choices from pedestrians. This paper conducts a grid-based analysis of entrance annotations and applies logistic mixed-effects modeling to assess sidewalk choice patterns. By integrating generative AI into urban research, this study demonstrates a novel method for evaluating sidewalk shed designs.
arXiv Detail & Related papers (2026-01-30T15:41:44Z) - Measuring Social Bias in Vision-Language Models with Face-Only Counterfactuals from Real Photos [79.03150233804458]
Real-world images entangle race and gender with correlated factors such as background and clothing, obscuring attribution. We propose a face-only counterfactual evaluation paradigm. We generate counterfactual variants by editing only facial attributes related to race and gender, keeping all other visual factors fixed.
arXiv Detail & Related papers (2026-01-11T14:35:06Z) - Bridging Cognitive Gap: Hierarchical Description Learning for Artistic Image Aesthetics Assessment [51.40989269202702]
The aesthetic quality assessment task is crucial for developing a human-aligned quantitative evaluation system for AIGC. We propose ArtQuant, an aesthetics assessment framework for artistic images which couples isolated aesthetic dimensions through description generation. Our approach achieves state-of-the-art performance on several datasets while requiring only 33% of conventional training epochs.
arXiv Detail & Related papers (2025-12-29T12:18:26Z) - Street Review: A Participatory AI-Based Framework for Assessing Streetscape Inclusivity [0.0]
This study presents Street Review, a mixed-methods approach that combines participatory research with AI-based analysis to assess streetscape inclusivity. In Montréal, Canada, 28 residents participated in semi-directed interviews and image evaluations, supported by the analysis of 45,000 street-view images from Mapillary. Findings reveal variations in perceptions of inclusivity and accessibility across demographic groups, demonstrating that incorporating diverse user feedback can enhance machine learning models.
arXiv Detail & Related papers (2025-08-14T02:40:56Z) - Street View Sociability: Interpretable Analysis of Urban Social Behavior Across 15 Cities [1.256245863497516]
We analyzed 2,998 street view images from 15 cities using a multimodal large language model. Results align with long-standing urban planning theory. Further research could establish street view imagery as a scalable, privacy-preserving tool for studying urban sociability.
arXiv Detail & Related papers (2025-08-08T14:15:58Z) - Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers [90.4459196223986]
A similar evolution is now unfolding in AI, marking a paradigm shift from models that merely think about images to those that can truly think with images. This emerging paradigm is characterized by models leveraging visual information as intermediate steps in their thought process, transforming vision from a passive input into a dynamic, manipulable cognitive workspace.
arXiv Detail & Related papers (2025-06-30T14:48:35Z) - Interpretable Multimodal Framework for Human-Centered Street Assessment: Integrating Visual-Language Models for Perceptual Urban Diagnostics [0.0]
This study introduces a novel Multimodal Street Evaluation Framework (MSEF). We fine-tune the framework using LoRA and P-Tuning v2 for parameter-efficient adaptation. The model achieves an F1 score of 0.84 on objective features and 89.3 percent agreement with aggregated resident perceptions.
arXiv Detail & Related papers (2025-06-05T14:34:04Z) - SITE: towards Spatial Intelligence Thorough Evaluation [121.1493852562597]
Spatial intelligence (SI) represents a cognitive ability encompassing the visualization, manipulation, and reasoning about spatial relationships. We introduce SITE, a benchmark dataset towards SI Thorough Evaluation. Our approach to curating the benchmark combines a bottom-up survey about 31 existing datasets and a top-down strategy drawing upon three classification systems in cognitive science.
arXiv Detail & Related papers (2025-05-08T17:45:44Z) - InclusiViz: Visual Analytics of Human Mobility Data for Understanding and Mitigating Urban Segregation [41.758626973743525]
InclusiViz is a novel visual analytics system for multi-level analysis of urban segregation. We developed a deep learning model to predict mobility patterns across social groups using environmental features, augmented with explainable AI. The system integrates innovative visualizations that allow users to explore segregation patterns from broad overviews to fine-grained detail.
arXiv Detail & Related papers (2025-01-07T07:50:36Z) - When Does Perceptual Alignment Benefit Vision Representations? [76.32336818860965]
We investigate how aligning vision model representations to human perceptual judgments impacts their usability.
We find that aligning models to perceptual judgments yields representations that improve upon the original backbones across many downstream tasks.
Our results suggest that injecting an inductive bias about human perceptual knowledge into vision models can contribute to better representations.
arXiv Detail & Related papers (2024-10-14T17:59:58Z) - Towards Geographic Inclusion in the Evaluation of Text-to-Image Models [25.780536950323683]
We study how much annotators in Africa, Europe, and Southeast Asia vary in their perception of geographic representation, visual appeal, and consistency in real and generated images.
For example, annotators in different locations often disagree on whether exaggerated, stereotypical depictions of a region are considered geographically representative.
We recommend steps for improved automatic and human evaluations.
arXiv Detail & Related papers (2024-05-07T16:23:06Z) - Affective Image Content Analysis: Two Decades Review and New
Perspectives [132.889649256384]
We comprehensively review the development of affective image content analysis (AICA) over the past two decades.
We focus on state-of-the-art methods with respect to three main challenges: the affective gap, perception subjectivity, and label noise and absence.
We discuss some challenges and promising research directions in the future, such as image content and context understanding, group emotion clustering, and viewer-image interaction.
arXiv Detail & Related papers (2021-06-30T15:20:56Z) - Placepedia: Comprehensive Place Understanding with Multi-Faceted
Annotations [79.80036503792985]
We contribute Placepedia, a large-scale place dataset with more than 35M photos from 240K unique places.
Besides the photos, each place also comes with massive multi-faceted information, e.g. GDP, population, etc.
This dataset, with its large amount of data and rich annotations, allows various studies to be conducted.
arXiv Detail & Related papers (2020-07-07T20:17:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.