Images in Discrete Choice Modeling: Addressing Data Isomorphism in
Multi-Modality Inputs
- URL: http://arxiv.org/abs/2312.14724v1
- Date: Fri, 22 Dec 2023 14:33:54 GMT
- Title: Images in Discrete Choice Modeling: Addressing Data Isomorphism in
Multi-Modality Inputs
- Authors: Brian Sifringer, Alexandre Alahi
- Abstract summary: This paper explores the intersection of Discrete Choice Modeling (DCM) and machine learning.
We investigate the consequences of embedding high-dimensional image data that shares isomorphic information with traditional tabular inputs within a DCM framework.
- Score: 77.54052164713394
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper explores the intersection of Discrete Choice Modeling (DCM) and
machine learning, focusing on the integration of image data into DCM's utility
functions and its impact on model interpretability. We investigate the
consequences of embedding high-dimensional image data that shares isomorphic
information with traditional tabular inputs within a DCM framework. Our study
reveals that neural network (NN) components learn and replicate tabular
variable representations from images when co-occurrences exist, thereby
compromising the interpretability of DCM parameters. We propose and benchmark
two methodologies to address this challenge: architectural design adjustments
to segregate redundant information, and isomorphic information mitigation
through source information masking and inpainting. Our experiments, conducted
on a semi-synthetic dataset, demonstrate that while architectural modifications
prove inconclusive, direct mitigation at the data source is a more
effective strategy in maintaining the integrity of DCM's interpretable
parameters. The paper concludes with insights into the applicability of our
findings in real-world settings and discusses the implications for future
research in hybrid modeling that combines complex data modalities. Full control
of tabular and image data congruence is attained by using the MIT moral machine
dataset, and both inputs are merged into a choice model by deploying the
Learning Multinomial Logit (L-MNL) framework.
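As a rough illustration of the hybrid utility the abstract refers to, the sketch below assumes an L-MNL-style specification in which each alternative's utility is the sum of an interpretable linear term over tabular attributes and a learned term derived from the image. The function name, the toy numbers, and the `r_image` values (which stand in for the output of a neural network) are all hypothetical and not taken from the paper.

```python
import math

def lmnl_choice_probabilities(x_tab, beta, r_image):
    """Softmax choice probabilities for an assumed L-MNL-style utility.

    x_tab:   one list of tabular attributes per alternative
    beta:    interpretable linear parameters, one per attribute
    r_image: one hypothetical learned term per alternative, standing in
             for a neural network applied to that alternative's image
    """
    # Utility = interpretable linear part + learned image part.
    utilities = [
        sum(b * x for b, x in zip(beta, row)) + r
        for row, r in zip(x_tab, r_image)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(utilities)
    exp_u = [math.exp(u - shift) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

# Toy example with two alternatives and two tabular attributes.
x_tab = [[1.0, 0.0], [0.0, 1.0]]
beta = [0.5, -0.5]

# Image term carries no information: the choice is driven by beta alone.
probs_tabular_only = lmnl_choice_probabilities(x_tab, beta, [0.0, 0.0])

# Image term encodes the same attribute with opposite sign and cancels
# the linear effect, so beta no longer explains the predicted choice.
probs_with_image = lmnl_choice_probabilities(x_tab, beta, [-1.0, 0.0])
```

In this toy setting the second call yields equal probabilities for both alternatives even though `beta` is unchanged, which is one way to picture how an image term that duplicates tabular information can undermine the interpretation of the linear parameters.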
Related papers
- Knowledge-Aware Reasoning over Multimodal Semi-structured Tables [85.24395216111462]
This study investigates whether current AI models can perform knowledge-aware reasoning on multimodal structured data.
We introduce MMTabQA, a new dataset designed for this purpose.
Our experiments highlight substantial challenges for current AI models in effectively integrating and interpreting multiple text and image inputs.
arXiv Detail & Related papers (2024-08-25T15:17:43Z)
- Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment [20.902935570581207]
We introduce a Multimodal Alignment and Reconstruction Network (MARNet) to enhance the model's resistance to visual noise.
MARNet includes a cross-modal diffusion reconstruction module for smoothly and stably blending information across different domains.
Experiments conducted on two benchmark datasets, Vireo-Food172 and Ingredient-101, demonstrate that MARNet effectively improves the quality of image information extracted by the model.
arXiv Detail & Related papers (2024-07-26T16:30:18Z)
- Adaptive Affinity-Based Generalization For MRI Imaging Segmentation Across Resource-Limited Settings [1.5703963908242198]
This paper introduces a novel relation-based knowledge framework by seamlessly combining adaptive affinity-based and kernel-based distillation.
To validate our innovative approach, we conducted experiments on publicly available multi-source prostate MRI data.
arXiv Detail & Related papers (2024-04-03T13:35:51Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- Zero-shot Composed Text-Image Retrieval [72.43790281036584]
We consider the problem of composed image retrieval (CIR).
It aims to train a model that can fuse multi-modal information, e.g., text and images, to accurately retrieve images that match the query, extending the user's expression ability.
arXiv Detail & Related papers (2023-06-12T17:56:01Z)
- RDA-INR: Riemannian Diffeomorphic Autoencoding via Implicit Neural Representations [3.9858496473361402]
In this work, we focus on a limitation of neural network-based atlas building and statistical latent modeling methods.
We overcome this limitation by designing a novel encoder based on resolution-independent implicit neural representations.
arXiv Detail & Related papers (2023-05-22T09:27:17Z)
- Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z)
- Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z)