Table2Image: Interpretable Tabular Data Classification with Realistic Image Transformations
- URL: http://arxiv.org/abs/2412.06265v2
- Date: Thu, 23 Jan 2025 06:59:03 GMT
- Authors: Seungeun Lee, Il-Youp Kwak, Kihwan Lee, Subin Bae, Sangjun Lee, Seulbin Lee, Seungsang Oh
- Abstract summary: This paper introduces Table2Image, a novel framework that transforms tabular data into realistic and diverse image representations.
We also present an interpretability framework that integrates insights from both the original data and its transformed image representations.
- Score: 5.62508658491325
- License:
- Abstract: Recent advancements in deep learning for tabular data have shown promise, but challenges remain in achieving interpretable and lightweight models. This paper introduces Table2Image, a novel framework that transforms tabular data into realistic and diverse image representations, enabling deep learning methods to achieve competitive classification performance. To address multicollinearity in tabular data, we propose a variance inflation factor (VIF) initialization, which enhances model stability and robustness by incorporating statistical feature relationships. Additionally, we present an interpretability framework that integrates insights from both the original tabular data and its transformed image representations, by leveraging Shapley additive explanations (SHAP) and methods to minimize distributional discrepancies. Experiments on benchmark datasets demonstrate the efficacy of our approach, achieving superior accuracy, area under the curve, and interpretability compared to recent leading deep learning models. Our lightweight method provides a scalable and reliable solution for tabular data classification.
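The abstract's VIF initialization builds on the standard variance inflation factor, which quantifies how strongly each feature is linearly predicted by the others. As a hedged illustration of that statistic (not the paper's actual initialization scheme), VIF can be computed from the auxiliary regression of each column on the rest:

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of a feature matrix.

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on all the other columns (with an intercept).
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other features
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        ss_res = resid @ resid
        ss_tot = ((y - y.mean()) ** 2).sum()
        r2 = 1.0 - ss_res / ss_tot
        vifs.append(1.0 / max(1.0 - r2, 1e-12))  # guard against R^2 == 1
    return np.array(vifs)

# Two nearly collinear features inflate each other's VIF,
# while an independent feature stays close to 1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)  # almost a copy of x1
x3 = rng.normal(size=200)              # independent
print(vif(np.column_stack([x1, x2, x3])))
```

A VIF well above 10 is conventionally read as severe multicollinearity; such per-feature scores are the kind of statistical relationship the VIF initialization could incorporate.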
Related papers
- Tab2Visual: Overcoming Limited Data in Tabular Data Classification Using Deep Learning with Visual Representations [0.09999629695552192]
We propose Tab2Visual, a novel approach that transforms heterogeneous tabular data into visual representations.
We extensively evaluate the proposed approach on diverse datasets, comparing its performance against a wide range of machine learning algorithms.
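Both Table2Image and Tab2Visual hinge on mapping a feature vector onto a pixel grid. As a toy sketch only (neither paper's actual transformation), a minimal version scales each row to [0, 1] and reshapes it into a square grayscale image:

```python
import numpy as np

def row_to_image(row, side=8):
    """Map one tabular row to a side x side grayscale image.

    A toy illustration: features are min-max scaled to [0, 1],
    zero-padded to side*side values, and reshaped into a 2-D grid.
    """
    row = np.asarray(row, dtype=float)
    lo, hi = row.min(), row.max()
    scaled = (row - lo) / (hi - lo) if hi > lo else np.zeros_like(row)
    pixels = np.zeros(side * side)
    pixels[: len(scaled)] = scaled[: side * side]
    return pixels.reshape(side, side)

img = row_to_image([3.2, 1.5, 0.0, 7.8, 2.1])
print(img.shape)  # (8, 8)
```

An image produced this way can be fed to any off-the-shelf CNN classifier, which is what makes such transformations attractive despite their simplicity.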
arXiv Detail & Related papers (2025-02-11T02:12:29Z) - TabDPT: Scaling Tabular Foundation Models [20.00390825519329]
We show how to harness the power of real data to improve performance and generalization.
Our model achieves state-of-the-art performance on the CC18 (classification) and CTR23 (regression) benchmarks.
TabDPT also demonstrates strong scaling as both model size and amount of available data increase.
arXiv Detail & Related papers (2024-10-23T18:00:00Z) - Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z) - The Common Stability Mechanism behind most Self-Supervised Learning Approaches [64.40701218561921]
We provide a framework to explain the stability mechanism of different self-supervised learning techniques.
We discuss the working mechanisms of contrastive techniques such as SimCLR and of non-contrastive techniques such as BYOL, SwAV, SimSiam, Barlow Twins, and DINO.
We formulate different hypotheses and test them using the Imagenet100 dataset.
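The contrastive family discussed above is typified by SimCLR's NT-Xent objective, which pulls embeddings of two augmented views of the same sample together and pushes all other pairs apart. A minimal NumPy sketch of that loss (an illustration, not the paper's formulation):

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """Minimal NT-Xent (SimCLR-style) contrastive loss.

    z1[i] and z2[i] are embeddings of two augmented views of sample i.
    Each embedding's positive is its counterpart in the other view;
    every other embedding in the batch acts as a negative.
    """
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / tau
    n = len(z1)
    np.fill_diagonal(sim, -np.inf)  # exclude self-similarity
    # Index of each embedding's positive counterpart.
    targets = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return -(sim[np.arange(2 * n), targets] - logsumexp).mean()
```

When the two views are identical the positive similarities dominate and the loss is low; as the views decorrelate, the loss rises, which is the gradient signal these methods train on.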
arXiv Detail & Related papers (2024-02-22T20:36:24Z) - Images in Discrete Choice Modeling: Addressing Data Isomorphism in Multi-Modality Inputs [77.54052164713394]
This paper explores the intersection of Discrete Choice Modeling (DCM) and machine learning.
We investigate the consequences of embedding high-dimensional image data that shares isomorphic information with traditional tabular inputs within a DCM framework.
arXiv Detail & Related papers (2023-12-22T14:33:54Z) - Flow Factorized Representation Learning [109.51947536586677]
We introduce a generative model which specifies a distinct set of latent probability paths that define different input transformations.
We show that our model achieves higher likelihoods on standard representation learning benchmarks while simultaneously being closer to approximately equivariant models.
arXiv Detail & Related papers (2023-09-22T20:15:37Z) - Continual Vision-Language Representation Learning with Off-Diagonal Information [112.39419069447902]
Multi-modal contrastive learning frameworks like CLIP typically require a large amount of image-text samples for training.
This paper discusses the feasibility of continual CLIP training using streaming data.
arXiv Detail & Related papers (2023-05-11T08:04:46Z) - PTab: Using the Pre-trained Language Model for Modeling Tabular Data [5.791972449406902]
Recent studies show that neural models are effective at learning contextual representations of tabular data.
We propose PTab, a novel framework that uses a pre-trained language model to model tabular data.
Our method achieves a higher average AUC score in supervised settings than state-of-the-art baselines.
arXiv Detail & Related papers (2022-09-15T08:58:42Z) - Robust Cross-Modal Representation Learning with Progressive Self-Distillation [7.676408770854477]
The learning objective of CLIP's vision-language approach does not effectively account for the noisy many-to-many correspondences found in web-harvested image captioning datasets.
We introduce a novel training framework based on cross-modal contrastive learning that uses progressive self-distillation and soft image-text alignments to more efficiently learn robust representations from noisy data.
arXiv Detail & Related papers (2022-04-10T03:28:18Z) - Lightweight Data Fusion with Conjugate Mappings [11.760099863897835]
We present an approach to data fusion that combines the interpretability of structured probabilistic graphical models with the flexibility of neural networks.
The proposed method, lightweight data fusion (LDF), emphasizes posterior analysis over latent variables using two types of information.
arXiv Detail & Related papers (2020-11-20T19:47:13Z) - Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.