Localization in the Crowd with Topological Constraints
- URL: http://arxiv.org/abs/2012.12482v1
- Date: Wed, 23 Dec 2020 04:33:48 GMT
- Title: Localization in the Crowd with Topological Constraints
- Authors: Shahira Abousamra and Minh Hoai and Dimitris Samaras and Chao Chen
- Abstract summary: We introduce a topological constraint that teaches the model to reason about the spatial arrangement of dots.
Topological reasoning improves the quality of the localization algorithm especially near cluttered regions.
- Score: 47.51300472171983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the problem of crowd localization, i.e., the prediction of dots
corresponding to people in a crowded scene. Due to various challenges, a
localization method is prone to spatial semantic errors, i.e., predicting
multiple dots within the same person or collapsing multiple dots in a cluttered
region. We propose a topological approach targeting these semantic errors. We
introduce a topological constraint that teaches the model to reason about the
spatial arrangement of dots. To enforce this constraint, we define a
persistence loss based on the theory of persistent homology. The loss compares
the topographic landscape of the likelihood map and the topology of the ground
truth. Topological reasoning improves the quality of the localization algorithm
especially near cluttered regions. On multiple public benchmarks, our method
outperforms previous localization methods. Additionally, we demonstrate the
potential of our method in improving the performance in the crowd counting
task.
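The abstract's notion of comparing the "topographic landscape" of a likelihood map against the number of ground-truth dots can be illustrated with 0-dimensional persistence of local maxima. The following is a minimal sketch, not the authors' persistence loss: the function name `maxima_persistence`, the 4-connectivity, the superlevel-set sweep with union-find, and the toy map `demo` are all assumptions made for illustration. Each local maximum is "born" at its own value and "dies" at the value where its region merges into a region with a higher peak; the highest peak is given a death of negative infinity by convention here.

```python
import numpy as np

def maxima_persistence(likelihood):
    """Return (row, col, birth, death) per local maximum, most persistent first.

    Hypothetical helper: sweeps pixels from high to low value and tracks
    connected components (4-connectivity) with union-find.
    """
    grid = np.asarray(likelihood, dtype=float)
    h, w = grid.shape
    flat = grid.ravel()
    order = np.argsort(-flat, kind="stable")  # highest value first
    parent = [-1] * (h * w)                   # -1 means "pixel not added yet"
    peak = {}                                 # component root -> index of its highest pixel
    pairs = []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]     # path halving
            i = parent[i]
        return i

    for idx in order:
        idx = int(idx)
        r, c = divmod(idx, w)
        parent[idx] = idx
        peak[idx] = idx
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if not (0 <= rr < h and 0 <= cc < w):
                continue
            n = rr * w + cc
            if parent[n] == -1:
                continue
            a, b = find(idx), find(n)
            if a == b:
                continue
            # Keep the component with the higher peak as the survivor `a`.
            if flat[peak[a]] < flat[peak[b]]:
                a, b = b, a
            # Record the dying peak only if it has non-zero persistence.
            if flat[peak[b]] > flat[idx]:
                pr, pc = divmod(peak[b], w)
                pairs.append((pr, pc, float(flat[peak[b]]), float(flat[idx])))
            parent[b] = a
            del peak[b]

    # Components that never merged (the global maximum on a connected grid).
    for p in peak.values():
        pr, pc = divmod(p, w)
        pairs.append((pr, pc, float(flat[p]), float("-inf")))

    pairs.sort(key=lambda t: t[2] - t[3], reverse=True)
    return pairs

# Toy likelihood map (assumed data): two strong peaks and one weak bump.
demo = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.2, 0.8, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.0, 0.0],
])
result = maxima_persistence(demo)
top_two = [(r, c) for r, c, _, _ in result[:2]]
```

If the ground truth for this toy map contained two dots, the two most persistent maxima in `top_two` would be the candidates to keep, and the remaining low-persistence bump would be the kind of spurious mode a topological constraint is meant to suppress.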
Related papers
- Spatial regularisation for improved accuracy and interpretability in keypoint-based registration [5.286949071316761]
Recent approaches based on unsupervised keypoint detection stand out as very promising for interpretability.
Here, we propose a three-fold loss to regularise the spatial distribution of the features.
Our loss considerably improves the interpretability of the features, which now correspond to precise and anatomically meaningful landmarks.
arXiv Detail & Related papers (2025-03-06T14:48:25Z)
- Interpreting Object-level Foundation Models via Visual Precision Search [53.807678972967224]
We propose a Visual Precision Search method that generates accurate attribution maps with fewer regions.
Our method bypasses internal model parameters to overcome attribution issues from multimodal fusion.
Our method can interpret failures in visual grounding and object detection tasks, surpassing existing methods across multiple evaluation metrics.
arXiv Detail & Related papers (2024-11-25T08:54:54Z)
- Topograph: An efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation [78.54656076915565]
Topological correctness plays a critical role in many image segmentation tasks.
Most networks are trained using pixel-wise loss functions, such as Dice, neglecting topological accuracy.
We propose a novel, graph-based framework for topologically accurate image segmentation.
arXiv Detail & Related papers (2024-11-05T16:20:14Z)
- Disentangled Representation Learning with the Gromov-Monge Gap [65.73194652234848]
Learning disentangled representations from unlabelled data is a fundamental challenge in machine learning.
We introduce a novel approach to disentangled representation learning based on quadratic optimal transport.
We demonstrate the effectiveness of our approach for quantifying disentanglement across four standard benchmarks.
arXiv Detail & Related papers (2024-07-10T16:51:32Z)
- Dynamical localization in 2D topological quantum random walks [0.0]
We study the dynamical localization of discrete time evolution of topological split-step quantum random walk (QRW) on a single-site defect.
By investigating the spectral properties of the discrete-time evolution operators, we show that trapped states have large overlap with the initial uniformly distributed state.
We show that the mechanism of localization we identified is robust against the influence of disorder.
arXiv Detail & Related papers (2024-06-26T21:36:47Z)
- Multi-Resolution Planar Region Extraction for Uneven Terrains [6.482137641059034]
This paper studies the problem of extracting planar regions in uneven terrains from unordered point cloud measurements.
We propose a multi-resolution planar region extraction strategy that balances the accuracy in boundaries and computational efficiency.
arXiv Detail & Related papers (2023-11-21T12:17:51Z)
- Disentanglement Learning via Topology [22.33086299021419]
We propose TopDis, a method for learning disentangled representations via adding a multi-scale topological loss term.
Disentanglement is a crucial property of data representations, essential for the explainability and robustness of deep learning models.
We show how to use the proposed topological loss to find disentangled directions in a trained GAN.
arXiv Detail & Related papers (2023-08-24T10:29:25Z)
- Consistency-Aware Anchor Pyramid Network for Crowd Localization [167.93943981468348]
Crowd localization aims to predict the spatial position of humans in a crowd scenario.
We propose an anchor pyramid scheme to adaptively determine the anchor density in each image region.
arXiv Detail & Related papers (2022-12-08T04:32:01Z)
- A Cluster-based Approach for Improving Isotropy in Contextual Embedding Space [18.490856440975996]
The representation degeneration problem in Contextual Word Representations (CWRs) hurts the expressiveness of the embedding space.
We propose a local cluster-based method to address the degeneration issue in contextual embedding spaces.
We show that removing dominant directions of verb representations can transform the space to better suit semantic applications.
arXiv Detail & Related papers (2021-06-02T14:26:37Z)
- Discrete Variational Attention Models for Language Generation [51.88612022940496]
We propose a discrete variational attention model with a categorical distribution over the attention mechanism, owing to the discrete nature of language.
Thanks to the property of discreteness, the training of our proposed approach does not suffer from posterior collapse.
arXiv Detail & Related papers (2020-04-21T05:49:04Z)
- Multi-View Optimization of Local Feature Geometry [70.18863787469805]
We address the problem of refining the geometry of local image features from multiple views without known scene or camera geometry.
Our proposed method naturally complements the traditional feature extraction and matching paradigm.
We show that our method consistently improves the triangulation and camera localization performance for both hand-crafted and learned local features.
arXiv Detail & Related papers (2020-03-18T17:22:11Z)
- Focus on Semantic Consistency for Cross-domain Crowd Understanding [34.560447389853614]
Some domain adaptation algorithms try to liberate it by training models with synthetic data.
We found that a large number of estimation errors in the background areas impede the performance of the existing methods.
In this paper, we propose a domain adaptation method to eliminate these errors.
arXiv Detail & Related papers (2020-02-20T08:51:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.