PyTorch Frame: A Modular Framework for Multi-Modal Tabular Learning
- URL: http://arxiv.org/abs/2404.00776v1
- Date: Sun, 31 Mar 2024 19:15:09 GMT
- Title: PyTorch Frame: A Modular Framework for Multi-Modal Tabular Learning
- Authors: Weihua Hu, Yiwen Yuan, Zecheng Zhang, Akihiro Nitta, Kaidi Cao, Vid Kocijan, Jure Leskovec, Matthias Fey
- Abstract summary: We present PyTorch Frame, a PyTorch-based framework for deep learning over multi-modal tabular data.
We demonstrate the usefulness of PyTorch Frame by implementing diverse models in a modular way.
We integrate PyTorch Frame with PyTorch Geometric, a PyTorch library for Graph Neural Networks (GNNs), to perform end-to-end learning over relational databases.
- Score: 54.912520425218496
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present PyTorch Frame, a PyTorch-based framework for deep learning over multi-modal tabular data. PyTorch Frame makes tabular deep learning easy by providing a PyTorch-based data structure to handle complex tabular data, introducing a model abstraction to enable modular implementation of tabular models, and allowing external foundation models to be incorporated to handle complex columns (e.g., LLMs for text columns). We demonstrate the usefulness of PyTorch Frame by implementing diverse tabular models in a modular way, successfully applying these models to complex multi-modal tabular data, and integrating our framework with PyTorch Geometric, a PyTorch library for Graph Neural Networks (GNNs), to perform end-to-end learning over relational databases.
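The abstract's central idea (group columns by semantic type, encode each type with its own module, and let an external model handle text columns) can be illustrated with a small sketch. This is plain Python under stated assumptions, not PyTorch Frame's API: the names `group_columns`, `encode_table`, and `toy_text_encoder`, as well as the type labels, are hypothetical, and the text encoder is a trivial stand-in for an external foundation model.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical semantic-type labels; not PyTorch Frame's actual identifiers.
NUMERICAL, CATEGORICAL, TEXT = "numerical", "categorical", "text"

Encoder = Callable[[Sequence], List[List[float]]]


def group_columns(col_types: Dict[str, str]) -> Dict[str, List[str]]:
    """Group column names by their semantic type, preserving order."""
    groups: Dict[str, List[str]] = {}
    for name, stype in col_types.items():
        groups.setdefault(stype, []).append(name)
    return groups


def encode_numerical(values: Sequence[float]) -> List[List[float]]:
    """Standardize a numerical column into one feature per row."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = var ** 0.5 or 1.0  # fall back to 1.0 for constant columns
    return [[(v - mean) / std] for v in values]


def encode_categorical(values: Sequence[str]) -> List[List[float]]:
    """One-hot encode a categorical column over its sorted unique values."""
    vocab = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in vocab] for v in values]


def toy_text_encoder(values: Sequence[str]) -> List[List[float]]:
    """Stand-in for an external text model: [character count, word count]."""
    return [[float(len(v)), float(len(v.split()))] for v in values]


def encode_table(
    table: Dict[str, Sequence],
    col_types: Dict[str, str],
    encoders: Dict[str, Encoder],
) -> List[List[float]]:
    """Encode each column with its type's encoder and concatenate per row."""
    num_rows = len(next(iter(table.values())))
    rows: List[List[float]] = [[] for _ in range(num_rows)]
    for stype, names in group_columns(col_types).items():
        for name in names:
            for row, feats in zip(rows, encoders[stype](table[name])):
                row.extend(feats)
    return rows


table = {
    "price": [10.0, 20.0, 30.0],
    "color": ["red", "blue", "red"],
    "review": ["great value", "too small", "ok"],
}
col_types = {"price": NUMERICAL, "color": CATEGORICAL, "review": TEXT}
encoders = {
    NUMERICAL: encode_numerical,
    CATEGORICAL: encode_categorical,
    TEXT: toy_text_encoder,
}
features = encode_table(table, col_types, encoders)
```

Swapping `toy_text_encoder` for a function that calls a real embedding model would leave the rest of the pipeline unchanged, which is the kind of modularity the abstract describes.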
Related papers
- MALPOLON: A Framework for Deep Species Distribution Modeling [3.1457219084519004]
MALPOLON aims to facilitate training and inference of deep species distribution models (deep-SDM).
It is written in Python and built upon the PyTorch library.
The framework is open-sourced on GitHub and PyPI.
arXiv Detail & Related papers (2024-09-26T17:45:10Z)
- LaTable: Towards Large Tabular Models [63.995130144110156]
Tabular generative foundation models are hard to build due to the heterogeneous feature spaces of different datasets.
LaTable is a novel diffusion model that addresses these challenges and can be trained across different datasets.
We find that LaTable outperforms baselines on in-distribution generation, and that finetuning LaTable can generate out-of-distribution datasets better with fewer samples.
arXiv Detail & Related papers (2024-06-25T16:03:50Z)
- PyTorch-IE: Fast and Reproducible Prototyping for Information Extraction [6.308539010172309]
PyTorch-IE is a framework designed to enable swift, reproducible, and reusable implementations of Information Extraction models.
We propose task modules to decouple the concerns of data representation and model-specific representations.
PyTorch-IE also extends support for widely used libraries such as PyTorch-Lightning for training, HuggingFace datasets for dataset reading, and Hydra for experiment configuration.
arXiv Detail & Related papers (2024-05-16T12:23:37Z)
- pyvene: A Library for Understanding and Improving PyTorch Models via Interventions [79.72930339711478]
pyvene is an open-source library that supports customizable interventions on a range of different PyTorch modules.
We show how pyvene provides a unified framework for performing interventions on neural models and sharing the intervened-upon models with others.
arXiv Detail & Related papers (2024-03-12T16:46:54Z)
- API-Assisted Code Generation for Question Answering on Varied Table Structures [18.65003956496509]
A persistent challenge to table question answering (TableQA) by generating executable programs has been adapting to varied table structures.
This paper introduces a unified TableQA framework that provides a unified representation for structured tables as multi-index Pandas data frames.
To answer complex relational questions with extended program functionality and external knowledge, our framework allows customized APIs that Python programs can call.
arXiv Detail & Related papers (2023-10-23T08:26:28Z)
- PyHHMM: A Python Library for Heterogeneous Hidden Markov Models [63.01207205641885]
PyHHMM is an object-oriented Python implementation of Heterogeneous-Hidden Markov Models (HHMMs).
PyHHMM emphasizes features not supported in similar available frameworks: a heterogeneous observation model, missing data inference, different model order selection criteria, and semi-supervised training.
PyHHMM relies on the numpy, scipy, scikit-learn, and seaborn Python packages, and is distributed under the Apache-2.0 License.
arXiv Detail & Related papers (2022-01-12T07:32:36Z)
- Latte: Cross-framework Python Package for Evaluation of Latent-Based Generative Models [65.51757376525798]
Latte is a Python library for evaluation of latent-based generative models.
Latte is compatible with both PyTorch and TensorFlow/Keras, and provides both functional and modular APIs.
arXiv Detail & Related papers (2021-12-20T16:00:28Z)
- Amazon SageMaker Model Parallelism: A General and Flexible Framework for Large Model Training [10.223511922625065]
We present Amazon SageMaker model parallelism, a software library that integrates with PyTorch.
It enables easy training of large models using model parallelism and other memory-saving features.
We evaluate performance over GPT-3, RoBERTa, BERT, and neural collaborative filtering.
arXiv Detail & Related papers (2021-11-10T22:30:21Z)
- Learning Feature Aggregation for Deep 3D Morphable Models [57.1266963015401]
We propose an attention-based module to learn mapping matrices for better feature aggregation across hierarchical levels.
Our experiments show that through the end-to-end training of the mapping matrices, we achieve state-of-the-art results on a variety of 3D shape datasets.
arXiv Detail & Related papers (2021-04-28T08:50:08Z)
- PyTorch Tabular: A Framework for Deep Learning with Tabular Data [0.0]
PyTorch Tabular is a new deep learning library built on top of PyTorch and PyTorch Lightning.
It works on pandas dataframes directly.
Many SOTA models like NODE and TabNet are already integrated and implemented in the library with a unified API.
arXiv Detail & Related papers (2021-04-28T08:50:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.