TensorAnalyzer: Identification of Urban Patterns in Big Cities using
Non-Negative Tensor Factorization
- URL: http://arxiv.org/abs/2210.02623v1
- Date: Thu, 6 Oct 2022 01:04:02 GMT
- Title: TensorAnalyzer: Identification of Urban Patterns in Big Cities using
Non-Negative Tensor Factorization
- Authors: Jaqueline Silveira, Germain García, Afonso Paiva, Marcelo Nery,
Sergio Adorno, Luis Gustavo Nonato
- Abstract summary: This paper presents a new approach to detecting the most relevant urban patterns from multiple data sources based on tensor decomposition.
We developed a generic framework named TensorAnalyzer, where the effectiveness and usefulness of the proposed methodology are tested.
- Score: 8.881421521529198
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Extracting relevant urban patterns from multiple data sources can be
difficult using classical clustering algorithms since we have to make a
suitable setup of the hyperparameters of the algorithms and deal with outliers.
It should be addressed correctly to help urban planners in the decision-making
process for the further development of a big city. For instance, experts' main
interest in criminology is comprehending the relationship between crimes and
the socio-economic characteristics at specific georeferenced locations. In
addition, the classical clustering algorithms take little notice of the
intricate spatial correlations in georeferenced data sources. This paper
presents a new approach to detecting the most relevant urban patterns from
multiple data sources based on tensor decomposition. Compared to classical
methods, the proposed approach's performance is attested to validate the
identified patterns' quality. The result indicates that the approach can
effectively identify functional patterns to characterize the data set for
further analysis in achieving good clustering quality. Furthermore, we
developed a generic framework named TensorAnalyzer, where the effectiveness and
usefulness of the proposed methodology are tested by a set of experiments and a
real-world case study showing the relationship between the crime events around
schools and students' performance and other variables involved in the analysis.
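The abstract does not spell out the factorization itself, so the following is only a minimal sketch of one common form of non-negative tensor factorization: a rank-R CP model fitted with multiplicative updates. It is not the TensorAnalyzer implementation. The function name `nonneg_cp`, the three-way layout (for example location x time x variable), and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def nonneg_cp(X, rank, n_iter=200, eps=1e-12, seed=0):
    """Non-negative CP factorization of a 3-way tensor (hypothetical helper).

    Uses multiplicative updates, so factors that start non-negative stay
    non-negative. Returns A (I x R), B (J x R), C (K x R) and the relative
    reconstruction error recorded after each iteration.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    # Strictly positive random initialization.
    A = rng.random((I, rank)) + 0.1
    B = rng.random((J, rank)) + 0.1
    C = rng.random((K, rank)) + 0.1
    norm_X = np.linalg.norm(X)
    errors = []
    for _ in range(n_iter):
        # Each factor is scaled by (gradient's negative part) / (positive part).
        A *= np.einsum("ijk,jr,kr->ir", X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum("ijk,ir,kr->jr", X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum("ijk,ir,jr->kr", X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
        X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
        errors.append(np.linalg.norm(X - X_hat) / norm_X)
    return A, B, C, errors


# Synthetic non-negative tensor with two planted patterns (assumed data,
# standing in for a location x time x variable array).
rng = np.random.default_rng(42)
A0, B0, C0 = rng.random((6, 2)), rng.random((5, 2)), rng.random((4, 2))
X = np.einsum("ir,jr,kr->ijk", A0, B0, C0)

A, B, C, errors = nonneg_cp(X, rank=2)
print(f"relative error: first iteration {errors[0]:.4f}, last {errors[-1]:.4f}")
```

In a sketch like this, each column of `A` would weight the locations taking part in one pattern, while the matching columns of `B` and `C` describe that pattern over the other two modes. Because the factors are non-negative, every pattern contributes additively, which is what makes the components readable as parts of the data.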
Related papers
- Comprehensive Review and Empirical Evaluation of Causal Discovery Algorithms for Numerical Data [3.9523536371670045]
Causal analysis has become an essential component in understanding the underlying causes of phenomena across various fields.
Existing literature on causal discovery algorithms is fragmented, with inconsistent methodologies.
Comprehensive evaluations are also lacking: data characteristics are often not analyzed jointly when benchmarking algorithms.
arXiv Detail & Related papers (2024-07-17T23:47:05Z)
- Hierarchical Bayes Approach to Personalized Federated Unsupervised
Learning [7.8583640700306585]
We develop algorithms based on optimization criteria inspired by a hierarchical Bayesian statistical framework.
We develop adaptive algorithms that discover the balance between using limited local data and collaborative information.
We evaluate our proposed algorithms using synthetic and real data, demonstrating the effective sample amplification for personalized tasks.
arXiv Detail & Related papers (2024-02-19T20:53:27Z)
- Geospatial Disparities: A Case Study on Real Estate Prices in Paris [0.3495246564946556]
We propose a toolkit for identifying and mitigating biases arising from geospatial data.
We incorporate an ordinal regression case with spatial attributes, deviating from the binary classification focus.
Illustrating our methodology, we showcase practical applications and scrutinize the implications of choosing geographical aggregation levels for fairness and calibration measures.
arXiv Detail & Related papers (2024-01-29T14:53:14Z)
- Improving Heterogeneous Model Reuse by Density Estimation [105.97036205113258]
This paper studies multiparty learning, aiming to learn a model using the private data of different participants.
Model reuse is a promising solution for multiparty learning, assuming that a local model has been trained for each party.
arXiv Detail & Related papers (2023-05-23T09:46:54Z)
- Hub-VAE: Unsupervised Hub-based Regularization of Variational
Autoencoders [11.252245456934348]
We propose an unsupervised, data-driven regularization of the latent space with a mixture of hub-based priors and a hub-based contrastive loss.
Our algorithm achieves superior cluster separability in the embedding space, and accurate data reconstruction and generation.
arXiv Detail & Related papers (2022-11-18T19:12:15Z)
- Detection and Evaluation of Clusters within Sequential Data [58.720142291102135]
Clustering algorithms for Block Markov Chains possess theoretical optimality guarantees.
In particular, our sequential data is derived from human DNA, written text, animal movement data and financial markets.
It is found that the Block Markov Chain model assumption can indeed produce meaningful insights in exploratory data analyses.
arXiv Detail & Related papers (2022-10-04T15:22:39Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- Selecting the suitable resampling strategy for imbalanced data
classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
- Riemannian classification of EEG signals with missing values [67.90148548467762]
This paper proposes two strategies to handle missing data for the classification of electroencephalograms.
The first approach estimates the covariance from imputed data with the $k$-nearest neighbors algorithm; the second relies on the observed data by leveraging the observed-data likelihood within an expectation-maximization algorithm.
As the results show, the proposed strategies perform better than classification based on observed data alone and maintain high accuracy even as the missing data ratio increases.
arXiv Detail & Related papers (2021-10-19T14:24:50Z)
- Learning while Respecting Privacy and Robustness to Distributional
Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- On clustering uncertain and structured data with Wasserstein barycenters
and a geodesic criterion for the number of clusters [0.0]
This work considers the notion of Wasserstein barycenters, accompanied by appropriate clustering indices based on the intrinsic geometry of the Wasserstein space where the clustering task is performed.
Such clustering approaches are highly valued in many fields where the observational/experimental error is significant.
Under this perspective, each observation is identified by an appropriate probability measure and the proposed clustering schemes rely on discrimination criteria.
arXiv Detail & Related papers (2019-12-26T08:46:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.