Related papers: Multimodal Machine Learning for Real Estate Appraisal: A Comprehensive Survey

URL: http://arxiv.org/abs/2503.22119v1
Date: Fri, 28 Mar 2025 03:47:06 GMT
Title: Multimodal Machine Learning for Real Estate Appraisal: A Comprehensive Survey
Authors: Chenya Huang, Zhidong Li, Fang Chen, Bin Liang
Abstract summary: A novel approach to automated valuation, multimodal machine learning, has taken shape. Multimodal machine learning significantly outperforms single-modality or fewer-modality approaches in terms of prediction accuracy.
Score: 8.250749654561423
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Real estate appraisal has undergone a significant transition from manual to automated valuation and is entering a new phase of evolution. Leveraging comprehensive attention to various data sources, a novel approach to automated valuation, multimodal machine learning, has taken shape. This approach integrates multimodal data to deeply explore the diverse factors influencing housing prices. Furthermore, multimodal machine learning significantly outperforms single-modality or fewer-modality approaches in terms of prediction accuracy, with enhanced interpretability. However, systematic and comprehensive survey work on the application in the real estate domain is still lacking. In this survey, we aim to bridge this gap by reviewing the research efforts. We begin by reviewing the background of real estate appraisal and propose two research questions from the perspective of performance and fusion aimed at improving the accuracy of appraisal results. Subsequently, we explain the concept of multimodal machine learning and provide a comprehensive classification and definition of modalities used in real estate appraisal for the first time. To ensure clarity, we explore works related to data and techniques, along with their evaluation methods, under the framework of these two research questions. Furthermore, specific application domains are summarized. Finally, we present insights into future research directions including multimodal complementarity, technology and modality contribution.

Related papers

Decoding the Multimodal Maze: A Systematic Review on the Adoption of Explainability in Multimodal Attention-based Models [0.0]
This systematic literature review analyzes research published between January 2020 and early 2024 that focuses on the explainability of multimodal models. We find that evaluation methods for XAI in multimodal settings are largely non-systematic, lacking consistency, robustness, and consideration for modality-specific cognitive and contextual factors.
arXiv Detail & Related papers (2025-08-06T13:14:20Z)
MTR-Bench: A Comprehensive Benchmark for Multi-Turn Reasoning Evaluation [56.87891213797931]
We present MTR-Bench for Large Language Models' Multi-Turn Reasoning evaluation. Comprising 4 classes, 40 tasks, and 3600 instances, MTR-Bench covers diverse reasoning capabilities. MTR-Bench features a fully automated framework spanning both dataset construction and model evaluation.
arXiv Detail & Related papers (2025-05-21T17:59:12Z)
Composed Multi-modal Retrieval: A Survey of Approaches and Applications [81.54640206021757]
Composed Multi-modal Retrieval (CMR) emerges as a pivotal next-generation technology. CMR enables users to query images or videos by integrating a reference visual input with textual modifications. This paper provides a comprehensive survey of CMR, covering its fundamental challenges, technical advancements, and applications.
arXiv Detail & Related papers (2025-03-03T09:18:43Z)
Survey on AI-Generated Media Detection: From Non-MLLM to MLLM [51.91311158085973]
Methods for detecting AI-generated media have evolved rapidly. General-purpose detectors based on MLLMs integrate authenticity verification, explainability, and localization capabilities. Ethical and security considerations have emerged as critical global concerns.
arXiv Detail & Related papers (2025-02-07T12:18:20Z)
Multimodal Alignment and Fusion: A Survey [7.250878248686215]
Multimodal integration enables improved model accuracy and broader applicability. We systematically categorize and analyze existing alignment and fusion techniques. This survey focuses on applications in domains like social media analysis, medical imaging, and emotion recognition.
arXiv Detail & Related papers (2024-11-26T02:10:27Z)
MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs [97.94579295913606]
Multimodal Large Language Models (MLLMs) have garnered increased attention from both industry and academia. In the development process, evaluation is critical since it provides intuitive feedback and guidance on improving models. This work aims to offer researchers an easy grasp of how to effectively evaluate MLLMs according to different needs and to inspire better evaluation methods.
arXiv Detail & Related papers (2024-11-22T18:59:54Z)
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models [71.36392373876505]
We introduce MMIE, a large-scale benchmark for evaluating interleaved multimodal comprehension and generation in Large Vision-Language Models (LVLMs). MMIE comprises 20K meticulously curated multimodal queries, spanning 3 categories, 12 fields, and 102 subfields, including mathematics, coding, physics, literature, health, and arts. It supports both interleaved inputs and outputs, offering a mix of multiple-choice and open-ended question formats to evaluate diverse competencies.
arXiv Detail & Related papers (2024-10-14T04:15:00Z)
A Survey on Multimodal Benchmarks: In the Era of Large AI Models [13.299775710527962]
Multimodal Large Language Models (MLLMs) have brought substantial advancements in artificial intelligence. This survey systematically reviews 211 benchmarks that assess MLLMs across four core domains: understanding, reasoning, generation, and application.
arXiv Detail & Related papers (2024-09-21T15:22:26Z)
Surveying the MLLM Landscape: A Meta-Review of Current Surveys [17.372501468675303]
Multimodal Large Language Models (MLLMs) have become a transformative force in the field of artificial intelligence. This survey aims to provide a systematic review of benchmark tests and evaluation methods for MLLMs.
arXiv Detail & Related papers (2024-09-17T14:35:38Z)
Attribution Regularization for Multimodal Paradigms [7.1262539590168705]
Multimodal machine learning can integrate information from multiple modalities to enhance learning and decision-making processes. It is commonly observed that unimodal models outperform multimodal models, despite the latter having access to richer information. This research project proposes a novel regularization term that encourages multimodal models to effectively utilize information from all modalities when making decisions.
arXiv Detail & Related papers (2024-04-02T23:05:56Z)
A Survey on Interpretable Cross-modal Reasoning [64.37362731950843]
Cross-modal reasoning (CMR) has emerged as a pivotal area with applications spanning from multimedia analysis to healthcare diagnostics. This survey delves into the realm of interpretable cross-modal reasoning (I-CMR). It presents a comprehensive overview of the typical methods with a three-level taxonomy for I-CMR.
arXiv Detail & Related papers (2023-09-05T05:06:48Z)
Single-Modal Entropy based Active Learning for Visual Question Answering [75.1682163844354]
We address Active Learning in the multi-modal setting of Visual Question Answering (VQA). In light of the multi-modal inputs, image and question, we propose a novel method for effective sample acquisition. Our novel idea is simple to implement, cost-efficient, and readily adaptable to other multi-modal tasks.
arXiv Detail & Related papers (2021-10-21T05:38:45Z)
Scaling up Search Engine Audits: Practical Insights for Algorithm Auditing [68.8204255655161]
We set up experiments for eight search engines with hundreds of virtual agents placed in different regions. We demonstrate the successful performance of our research infrastructure across multiple data collections. We conclude that virtual agents are a promising venue for monitoring the performance of algorithms across long periods of time.
arXiv Detail & Related papers (2021-06-10T15:49:58Z)
The Multimodal Sentiment Analysis in Car Reviews (MuSe-CaR) Dataset: Collection, Insights and Improvements [14.707930573950787]
We present MuSe-CaR, a first-of-its-kind multimodal dataset. The data is publicly available, as it recently served as the testing bed for the 1st Multimodal Sentiment Analysis Challenge.
arXiv Detail & Related papers (2021-01-15T10:40:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.