Modern approaches to building interpretable models of the property market using machine learning on the base of mass cadastral valuation
- URL: http://arxiv.org/abs/2506.15723v2
- Date: Sun, 13 Jul 2025 00:09:43 GMT
- Title: Modern approaches to building interpretable models of the property market using machine learning on the base of mass cadastral valuation
- Authors: Irina G. Tanashkina, Alexey S. Tanashkin, Alexander S. Maksimchuik, Anna Yu. Poshivailo,
- Abstract summary: This paper covers all stages of modeling: the collection of initial data, identification of outliers, the search and analysis of patterns in the data, the formation and final choice of price factors, the building of the model, and the evaluation of its efficiency.<n>We show that the combination of classical linear regression with the methods of geostatistics allows to build an effective model for land parcels.
- Score: 41.94295877935867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this article, we review modern approaches to building interpretable models of property markets using machine learning on the base of mass valuation of property in the Primorye region, Russia. The researcher, lacking expertise in this topic, encounters numerous difficulties in the effort to build a good model. The main source of this is the huge difference between noisy real market data and ideal data which is very common in all types of tutorials on machine learning. This paper covers all stages of modeling: the collection of initial data, identification of outliers, the search and analysis of patterns in the data, the formation and final choice of price factors, the building of the model, and the evaluation of its efficiency. For each stage, we highlight potential issues and describe sound methods for overcoming emerging difficulties on actual examples. We show that the combination of classical linear regression with interpolation methods of geostatistics allows to build an effective model for land parcels. For flats, when many objects are attributed to one spatial point the application of geostatistical methods is difficult. Therefore we suggest linear regression with automatic generation and selection of additional rules on the base of decision trees, so called the RuleFit method. Thus we show, that despite such a strong restriction as the requirement of interpretability which is important in practical aspects, for example, legal matters, it is still possible to build effective models of real property markets.
Related papers
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of
General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z) - Are Neural Topic Models Broken? [81.15470302729638]
We study the relationship between automated and human evaluation of topic models.
We find that neural topic models fare worse in both respects compared to an established classical method.
arXiv Detail & Related papers (2022-10-28T14:38:50Z) - Synthetic Model Combination: An Instance-wise Approach to Unsupervised
Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
Give access to a set of expert models and their predictions alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z) - Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z) - Bias-inducing geometries: an exactly solvable data model with fairness implications [12.532003449620607]
We introduce an exactly solvable high-dimensional model of data imbalance.<n>We analytically unpack the typical properties of learning models trained in this synthetic framework.<n>We obtain exact predictions for the observables that are commonly employed for fairness assessment.
arXiv Detail & Related papers (2022-05-31T16:27:57Z) - Beyond Explaining: Opportunities and Challenges of XAI-Based Model
Improvement [75.00655434905417]
Explainable Artificial Intelligence (XAI) is an emerging research field bringing transparency to highly complex machine learning (ML) models.
This paper offers a comprehensive overview over techniques that apply XAI practically for improving various properties of ML models.
We show empirically through experiments on toy and realistic settings how explanations can help improve properties such as model generalization ability or reasoning.
arXiv Detail & Related papers (2022-03-15T15:44:28Z) - Similarity learning for wells based on logging data [8.265576412171702]
We propose a novel framework to solve the geological profile similarity estimation based on a deep learning model.
Our similarity model takes well-logging data as input and provides the similarity of wells as output.
For model testing, we used two open datasets originating in New Zealand and Norway.
arXiv Detail & Related papers (2022-02-11T12:47:56Z) - A Scalable Inference Method For Large Dynamic Economic Systems [19.757929782329892]
We present a novel Variational Bayesian Inference approach to incorporate a time-varying parameter auto-regressive model.
Our model is applied to a large blockchain dataset containing prices, transactions of individual actors, analyzing transactional flows and price movements.
We further improve the simple state-space modelling by introducing non-linearities in the forward model with the help of machine learning architectures.
arXiv Detail & Related papers (2021-10-27T10:52:17Z) - A Topological-Framework to Improve Analysis of Machine Learning Model
Performance [5.3893373617126565]
We propose a framework for evaluating machine learning models in which a dataset is treated as a "space" on which a model operates.
We describe a topological data structure, presheaves, which offer a convenient way to store and analyze model performance between different subpopulations.
arXiv Detail & Related papers (2021-07-09T23:11:13Z) - Accurate and Intuitive Contextual Explanations using Linear Model Trees [0.0]
Local post hoc model explanations have gained massive adoption.
Current state of the art methods use rudimentary methods to generate synthetic data around the point to be explained.
We use a Generative Adversarial Network for synthetic data generation and train a piecewise linear model in the form of Linear Model Trees.
arXiv Detail & Related papers (2020-09-11T10:13:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.