How to Build Robust, Scalable Models for GSV-Based Indicators in Neighborhood Research
- URL: http://arxiv.org/abs/2601.06443v1
- Date: Sat, 10 Jan 2026 06:00:09 GMT
- Title: How to Build Robust, Scalable Models for GSV-Based Indicators in Neighborhood Research
- Authors: Xiaoya Tang, Xiaohe Yue, Heran Mane, Dapeng Li, Quynh Nguyen, Tolga Tasdizen
- Abstract summary: We show how to select and adapt foundation models for datasets with limited size and labels, while leveraging larger, unlabeled datasets through unsupervised training. Our study includes comprehensive quantitative and visual analyses comparing model performance before and after unsupervised adaptation.
- Score: 5.236003339365069
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A substantial body of health research demonstrates a strong link between neighborhood environments and health outcomes. Recently, there has been increasing interest in leveraging advances in computer vision to enable large-scale, systematic characterization of neighborhood built environments. However, the generalizability of vision models across fundamentally different domains remains uncertain, for example when transferring knowledge from ImageNet to the distinct visual characteristics of Google Street View (GSV) imagery. In applied fields such as social health research, several critical questions arise: which models are most appropriate, whether to adopt unsupervised training strategies, what training scale is feasible under computational constraints, and how much such strategies benefit downstream performance. These decisions are often costly and require specialized expertise. In this paper, we answer these questions through empirical analysis and provide practical insights into how to select and adapt foundation models for datasets with limited size and labels, while leveraging larger, unlabeled datasets through unsupervised training. Our study includes comprehensive quantitative and visual analyses comparing model performance before and after unsupervised adaptation.
Related papers
- Generalist vs Specialist Time Series Foundation Models: Investigating Potential Emergent Behaviors in Assessing Human Health Using PPG Signals [22.364607686570384]
Foundation models are large-scale machine learning models that are pre-trained on massive amounts of data. Recent works, such as MOMENT, train a generalist time series foundation model with data from multiple domains. This paper aims to conduct a comprehensive benchmarking study to compare the performance of generalist and specialist models.
arXiv Detail & Related papers (2025-10-16T03:13:04Z) - A Comparative Study of Scanpath Models in Graph-Based Visualization [7.592272924252313]
Eye-tracking (ET) data presents challenges related to cost, privacy, and scalability. In our study, we conducted an ET experiment with 40 participants who analyzed graphs. We compared human scanpaths with synthetic ones generated by models such as DeepGaze, UMSS, and Gazeformer.
arXiv Detail & Related papers (2025-03-31T14:43:42Z) - Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z) - GenBench: A Benchmarking Suite for Systematic Evaluation of Genomic Foundation Models [56.63218531256961]
We introduce GenBench, a benchmarking suite specifically tailored for evaluating the efficacy of Genomic Foundation Models.
GenBench offers a modular and expandable framework that encapsulates a variety of state-of-the-art methodologies.
We provide a nuanced analysis of the interplay between model architecture and dataset characteristics on task-specific performance.
arXiv Detail & Related papers (2024-06-01T08:01:05Z) - BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation [57.40024206484446]
We introduce the BEHAVIOR Vision Suite (BVS), a set of tools and assets to generate fully customized synthetic data for systematic evaluation of computer vision models.
BVS supports a large number of adjustable parameters at the scene level.
We showcase three example application scenarios.
arXiv Detail & Related papers (2024-05-15T17:57:56Z) - Comprehensive Exploration of Synthetic Data Generation: A Survey [4.485401662312072]
This work surveys 417 Synthetic Data Generation models over the last decade.
The findings reveal increased model performance and complexity, with neural network-based approaches prevailing.
Computer vision dominates, with GANs as primary generative models, while diffusion models, transformers, and RNNs compete.
arXiv Detail & Related papers (2024-01-04T20:23:51Z) - Representation Learning for Person or Entity-centric Knowledge Graphs: An Application in Healthcare [0.757843972001219]
This paper presents an end-to-end representation learning framework to extract entity-centric KGs from structured and unstructured data.
We introduce a star-shaped classifier to represent the multiple facets of a person and use it to guide KG creation.
We highlight that this approach has several potential applications across domains and is open-sourced.
arXiv Detail & Related papers (2023-05-09T17:39:45Z) - Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z) - Towards Reliable Assessments of Demographic Disparities in Multi-Label Image Classifiers [11.973749734226852]
We consider multi-label image classification and, specifically, object categorization tasks.
Design choices and trade-offs for measurement involve more nuance than discussed in prior computer vision literature.
We identify several design choices that look merely like implementation details but significantly impact the conclusions of assessments.
arXiv Detail & Related papers (2023-02-16T20:34:54Z) - Fine-Grained Image Analysis with Deep Learning: A Survey [146.22351342315233]
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer vision and pattern recognition.
This paper attempts to re-define and broaden the field of FGIA by consolidating two fundamental fine-grained research areas -- fine-grained image recognition and fine-grained image retrieval.
arXiv Detail & Related papers (2021-11-11T09:43:56Z) - Automatic Gaze Analysis: A Survey of Deep Learning based Approaches [61.32686939754183]
Eye gaze analysis is an important research problem in the field of computer vision and Human-Computer Interaction.
There are several open questions including what are the important cues to interpret gaze direction in an unconstrained environment.
We review the progress across a range of gaze analysis tasks and applications to shed light on these fundamental questions.
arXiv Detail & Related papers (2021-08-12T00:30:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.