A Partial Replication of MaskFormer in TensorFlow on TPUs for the TensorFlow Model Garden
- URL: http://arxiv.org/abs/2404.18801v1
- Date: Mon, 29 Apr 2024 15:40:40 GMT
- Title: A Partial Replication of MaskFormer in TensorFlow on TPUs for the TensorFlow Model Garden
- Authors: Vishal Purohit, Wenxin Jiang, Akshath R. Ravikiran, James C. Davis
- Abstract summary: This paper undertakes the task of replicating the MaskFormer model, originally developed using the PyTorch framework, within the TensorFlow ecosystem.
We address key challenges encountered during the replication, including non-convergence issues, slow training, adaptation of loss functions, and the integration of TPU-specific functionalities.
- Score: 3.259700715934023
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper undertakes the task of replicating the MaskFormer model, a universal image segmentation model originally developed using the PyTorch framework, within the TensorFlow ecosystem, specifically optimized for execution on Tensor Processing Units (TPUs). Our implementation exploits the modular constructs available within the TensorFlow Model Garden (TFMG), encompassing elements such as the data loader, training orchestrator, and various architectural components, tailored and adapted to meet the specifications of the MaskFormer model. We address key challenges encountered during the replication, including non-convergence issues, slow training, adaptation of loss functions, and the integration of TPU-specific functionalities. We verify our reproduced implementation and present qualitative results on the COCO dataset. Although our implementation meets some of the objectives for end-to-end reproducibility, we encountered challenges in replicating the PyTorch version of MaskFormer in TensorFlow. This replication process is not straightforward and requires substantial engineering effort. Specifically, it necessitates the customization of various components within the TFMG, alongside thorough verification and hyper-parameter tuning. The replication is available at: https://github.com/PurdueDualityLab/tf-maskformer/tree/main/official/projects/maskformer
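As one concrete illustration of the TPU-specific adaptations the abstract alludes to: TPUs require static tensor shapes, so the variable number of ground-truth instance masks per COCO image must be padded to a fixed size before batching. The sketch below shows that generic pattern; `MAX_INSTANCES` and `pad_masks` are hypothetical names for illustration, not the repository's API.

```python
import tensorflow as tf

# Hypothetical sketch (not the repo's API): TPUs need static shapes, so a
# variable number of per-image instance masks is padded to a fixed budget.
MAX_INSTANCES = 100

def pad_masks(masks, classes):
    """masks: (n, H, W) float32, classes: (n,) int32, with n <= MAX_INSTANCES."""
    n = tf.shape(masks)[0]
    pad = MAX_INSTANCES - n
    masks = tf.pad(masks, [[0, pad], [0, 0], [0, 0]])
    # Padded slots get a sentinel "no object" class the loss can ignore.
    classes = tf.pad(classes, [[0, pad]], constant_values=-1)
    return masks, classes

m, c = pad_masks(tf.ones([3, 64, 64]), tf.constant([1, 7, 2]))
print(m.shape, c.shape)  # (100, 64, 64) (100,)
```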
Related papers
- Synthetic dual image generation for reduction of labeling efforts in semantic segmentation of micrographs with a customized metric function [0.0]
Training semantic segmentation models for material analysis requires micrographs and their corresponding masks.
We demonstrate a workflow for the improvement of semantic segmentation models through the generation of synthetic microstructural images in conjunction with masks.
The approach can be generalized to various types of image data, serving as a user-friendly solution for training models with a small number of real images.
arXiv Detail & Related papers (2024-08-01T16:54:11Z)
- Cross Entropy in Deep Learning of Classifiers Is Unnecessary -- ISBE Error is All You Need [0.0]
In deep learning classifiers, the cost function usually takes the form of a combination of SoftMax and CrossEntropy functions.
This work introduces the ISBE functionality, justifying the thesis about the redundancy of cross entropy computation.
arXiv Detail & Related papers (2023-11-27T22:40:02Z)
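The redundancy thesis in the entry above rests on a standard identity: the gradient of the combined SoftMax-plus-CrossEntropy cost with respect to the logits is simply softmax(z) - y, the raw error signal. A minimal TensorFlow check of that identity (an illustration, not the paper's code):

```python
import tensorflow as tf

logits = tf.Variable(tf.random.normal([4, 10]))
labels = tf.one_hot(tf.random.uniform([4], maxval=10, dtype=tf.int32), depth=10)

with tf.GradientTape() as tape:
    loss = tf.reduce_sum(
        tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))
grad = tape.gradient(loss, logits)

# The backward signal is just the "error" softmax(z) - y; the scalar
# cross-entropy value itself is never needed to update the network.
error = tf.nn.softmax(logits) - labels
print(float(tf.reduce_max(tf.abs(grad - error))))  # ~0 up to float rounding
```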
- piHyFlow Operational Semantics [65.268245109828]
piHyFlow is a formalism for representing hybrid models using a set of communicating processes.
Processes are encapsulated into piHyFlow base models and communicate through shared memory.
piHyFlow can guarantee modularity by enforcing that models can only communicate by input and output interfaces.
arXiv Detail & Related papers (2023-10-20T17:37:39Z)
- MatFormer: Nested Transformer for Elastic Inference [94.1789252941718]
MatFormer is a nested Transformer architecture designed to offer elasticity in a variety of deployment constraints.
We show that a 2.6B decoder-only MatFormer language model (MatLM) allows us to extract smaller models spanning from 1.5B to 2.6B.
We also observe that smaller encoders extracted from a universal MatFormer-based ViT (MatViT) encoder preserve the metric-space structure for adaptive large-scale retrieval.
arXiv Detail & Related papers (2023-10-11T17:57:14Z)
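A hedged sketch of the nesting idea behind the MatFormer entry above: one set of feed-forward weights can serve several model sizes if each submodel reads only a prefix of the hidden units. Names and sizes below are made up for illustration; this is not the authors' code.

```python
import numpy as np

# Illustrative sketch of nesting: submodels of growing capacity share the
# same weight matrices by slicing a prefix of the FFN hidden dimension.
rng = np.random.default_rng(0)
d_model, d_ff = 64, 256
W1 = rng.normal(size=(d_model, d_ff)) / np.sqrt(d_model)
W2 = rng.normal(size=(d_ff, d_model)) / np.sqrt(d_ff)

def nested_ffn(x, frac):
    m = int(d_ff * frac)                  # this submodel's hidden width
    h = np.maximum(x @ W1[:, :m], 0.0)    # ReLU over a prefix of hidden units
    return h @ W2[:m, :]

x = rng.normal(size=(2, d_model))
for frac in (0.25, 0.5, 1.0):             # extract smaller models for free
    print(frac, nested_ffn(x, frac).shape)
```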
- Sequence Modeling with Multiresolution Convolutional Memory [27.218134279968062]
We introduce a new building block for sequence modeling called a MultiresLayer.
The key component of our model is the multiresolution convolution, capturing multiscale trends in the input sequence.
Our model yields state-of-the-art performance on a number of sequence classification and autoregressive density estimation tasks.
arXiv Detail & Related papers (2023-05-02T17:50:54Z)
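For the MultiresLayer entry above, the stated key component is a convolution that captures multiscale trends. One common way to realize multiscale convolution is parallel branches with growing dilation rates; the Keras layer below is a generic sketch of that pattern, not the authors' exact construction.

```python
import tensorflow as tf

class MultiresConv(tf.keras.layers.Layer):
    """Hypothetical sketch: parallel dilated causal convolutions see the
    input at several time scales and are summed (not the authors' layer)."""
    def __init__(self, filters, kernel_size=3, depth=4):
        super().__init__()
        self.branches = [
            tf.keras.layers.Conv1D(filters, kernel_size, padding="causal",
                                   dilation_rate=2 ** i)
            for i in range(depth)  # dilations 1, 2, 4, 8: coarser each level
        ]

    def call(self, x):
        return tf.add_n([branch(x) for branch in self.branches])

x = tf.random.normal([1, 128, 16])   # (batch, time, channels)
print(MultiresConv(32)(x).shape)     # (1, 128, 32)
```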
- Masked Autoencoding for Scalable and Generalizable Decision Making [93.84855114717062]
MaskDP is a simple and scalable self-supervised pretraining method for reinforcement learning and behavioral cloning.
We find that a MaskDP model gains the capability of zero-shot transfer to new BC tasks, such as single and multiple goal reaching.
arXiv Detail & Related papers (2022-11-23T07:04:41Z)
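The MaskDP entry above applies masked autoencoding to decision-making data. A minimal sketch of the generic recipe, assuming trajectories batched as (batch, timesteps, token dim) and using a stand-in network rather than the authors' architecture:

```python
import tensorflow as tf

def random_mask(tokens, mask_ratio=0.5):
    """Zero out a random subset of timesteps; return the keep indicator."""
    keep = tf.cast(tf.random.uniform(tf.shape(tokens)[:2]) > mask_ratio,
                   tokens.dtype)
    return tokens * keep[..., None], keep

traj = tf.random.normal([8, 32, 16])     # (batch, timesteps, token dim)
masked, keep = random_mask(traj)
decoder = tf.keras.layers.Dense(16)      # stand-in for a Transformer
recon = decoder(masked)

# Reconstruction loss is taken only on the masked positions.
masked_frac = 1.0 - keep
loss = (tf.reduce_sum(tf.square(recon - traj) * masked_frac[..., None])
        / (tf.reduce_sum(masked_frac) + 1e-8))
```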
- Parameter-Efficient Masking Networks [61.43995077575439]
Advanced network designs often contain a large number of repetitive structures (e.g., Transformer).
In this study, we are the first to investigate the representative potential of fixed random weights with limited unique values by learning masks.
This leads to a new paradigm for model compression that diminishes the model size.
arXiv Detail & Related papers (2022-10-13T03:39:03Z)
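The masking-network entry above learns which fixed random weights to keep. A generic sketch of that recipe, using a straight-through estimator so the binary mask remains trainable (illustrative, not the paper's implementation):

```python
import tensorflow as tf

class MaskedDense(tf.keras.layers.Layer):
    """Sketch of the idea: weights stay fixed and random; only a binary
    mask over them is learned (not the paper's implementation)."""
    def __init__(self, units):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="glorot_uniform",
                                 trainable=False, name="fixed_w")
        self.score = self.add_weight(shape=(input_shape[-1], self.units),
                                     initializer="random_normal",
                                     trainable=True, name="score")

    def call(self, x):
        hard = tf.cast(self.score > 0.0, tf.float32)  # forward: binary mask
        soft = tf.sigmoid(self.score)                 # backward: smooth proxy
        mask = soft + tf.stop_gradient(hard - soft)   # straight-through trick
        return x @ (self.w * mask)

layer = MaskedDense(4)
print(layer(tf.random.normal([2, 8])).shape)  # (2, 4)
```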
- Squeezeformer: An Efficient Transformer for Automatic Speech Recognition [99.349598600887]
Conformer is the de facto backbone model for various downstream speech tasks, thanks to its hybrid attention-convolution architecture.
We propose the Squeezeformer model, which consistently outperforms the state-of-the-art ASR models under the same training schemes.
arXiv Detail & Related papers (2022-06-02T06:06:29Z)
- OneFlow: Redesign the Distributed Deep Learning Framework from Scratch [17.798586916628174]
OneFlow is a novel distributed training framework based on an SBP (split, broadcast and partial-value) abstraction and the actor model.
SBP enables much easier programming of data parallelism and model parallelism than existing frameworks.
OneFlow outperforms many well-known customized libraries built on top of the state-of-the-art frameworks.
arXiv Detail & Related papers (2021-10-28T11:32:14Z)
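The SBP abstraction in the OneFlow entry above names three ways a logical tensor can live on multiple devices: split along an axis, broadcast in full, or held as partial values that must be summed. A NumPy sketch of the three placements for one matmul (conceptual only, not OneFlow's API):

```python
import numpy as np

# Conceptual sketch of SBP (not OneFlow's API): the same logical matmul
# under three placements of its operands across two "devices".
x = np.arange(12.0).reshape(4, 3)
w = np.arange(6.0).reshape(3, 2)

# split: each device holds half of x's rows; results concatenate.
y_split = np.concatenate([x[:2] @ w, x[2:] @ w])

# broadcast: every device holds the full w (as used on both shards above).

# partial-value: split x by columns and w by rows; each device computes a
# partial product, and the logical result is their sum.
y_partial = x[:, :2] @ w[:2] + x[:, 2:] @ w[2:]

assert np.allclose(y_split, x @ w) and np.allclose(y_partial, x @ w)
```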
- Structured Model Pruning of Convolutional Networks on Tensor Processing Units [0.0]
Structured model pruning is a promising approach to alleviate the memory and computation requirements of deep neural networks.
We measure the accuracy-efficiency trade-off for various structured model pruning methods and datasets.
We show that structured model pruning can significantly improve model memory usage and speed on TPUs without losing accuracy.
arXiv Detail & Related papers (2021-07-09T03:41:31Z)
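For the structured-pruning entry above: unlike unstructured sparsity, structured pruning removes whole channels, so the remaining computation stays dense, which is what TPUs execute efficiently. A generic magnitude-based sketch in TensorFlow (not the paper's method or code):

```python
import tensorflow as tf

# Sketch of magnitude-based structured pruning: drop entire output channels
# of a conv layer, keeping shapes dense as TPUs prefer (generic illustration).
conv = tf.keras.layers.Conv2D(16, 3)
conv.build((None, 32, 32, 8))

kernel = conv.kernel                                    # (3, 3, 8, 16)
norms = tf.norm(tf.reshape(kernel, [-1, 16]), axis=0)   # one norm per channel
keep = tf.argsort(norms, direction="DESCENDING")[:8]    # keep the top half
pruned_kernel = tf.gather(kernel, keep, axis=3)          # (3, 3, 8, 8)
pruned_bias = tf.gather(conv.bias, keep)
# A new, narrower Conv2D can be built from pruned_kernel / pruned_bias.
```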
- Multi-layer Optimizations for End-to-End Data Analytics [71.05611866288196]
We introduce Iterative Functional Aggregate Queries (IFAQ), a framework that realizes an alternative, unified approach to end-to-end data analytics.
IFAQ treats the feature extraction query and the learning task as one program given in IFAQ's domain-specific language.
We show that a Scala implementation of IFAQ can outperform mlpack, Scikit, and specialization by several orders of magnitude for linear regression and regression tree models over several relational datasets.
arXiv Detail & Related papers (2020-01-10T16:14:44Z)
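The IFAQ entry above fuses the feature-extraction query with the learning task. The enabling observation for models such as linear regression is that training only needs aggregate statistics (X^T X and X^T y), which can be computed as queries over the data. A NumPy sketch of that reduction (conceptual, not IFAQ's DSL):

```python
import numpy as np

# Sketch of the aggregate-based idea behind IFAQ-style systems: linear
# regression reduced to sufficient statistics computable as aggregate
# queries, then solved directly (conceptual illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.01 * rng.normal(size=1000)

XtX = X.T @ X          # aggregates: sums of pairwise feature products
Xty = X.T @ y          # aggregates: sums of feature-target products
theta = np.linalg.solve(XtX, Xty)
print(theta)           # ~ [2, -1, 0.5]
```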
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.