torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free
Deep Learning Studies: A Case Study on NLP
- URL: http://arxiv.org/abs/2310.17644v1
- Date: Thu, 26 Oct 2023 17:57:15 GMT
- Title: torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free
Deep Learning Studies: A Case Study on NLP
- Authors: Yoshitomo Matsubara
- Abstract summary: We present a significantly upgraded version of torchdistill, a modular, coding-free deep learning framework.
We reproduce the GLUE benchmark results of BERT models using a script based on the upgraded torchdistill.
All 27 fine-tuned BERT models and the configurations to reproduce the results are published at Hugging Face.
- Score: 3.0875505950565856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reproducibility in scientific work has become increasingly important
in research communities such as machine learning, natural language processing,
and computer vision, owing to the rapid development of these research domains
supported by recent advances in deep learning. In this work, we present a
significantly upgraded version of torchdistill, a modular, coding-free deep
learning framework whose initial release supported only image classification
and object detection tasks for reproducible knowledge distillation experiments.
To demonstrate that the upgraded framework can support more tasks with
third-party libraries, we reproduce the GLUE benchmark results of BERT models
using a script based on the upgraded torchdistill, harmonizing with various
Hugging Face libraries. All 27 fine-tuned BERT models and the configurations
to reproduce the results are published at Hugging Face, and the model weights
have already been widely used in research communities. We also reimplement
popular small-sized models and new knowledge distillation methods and perform
additional experiments for computer vision tasks.
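As an illustration of how the published checkpoints can be consumed, the sketch
below loads one of the fine-tuned BERT models from the Hugging Face Hub and
classifies a GLUE (SST-2) style input with the transformers library. The
repository id is an assumption based on the author's Hub account, not a name
confirmed by the abstract; check the Hub for the exact ids of the 27 published
models.

    # Minimal sketch: load a fine-tuned BERT checkpoint from the Hugging Face
    # Hub and classify an SST-2 style sentence.
    # The repository id below is an assumption and may differ from the actual
    # names of the 27 published models.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    repo_id = "yoshitomo-matsubara/bert-base-uncased-sst2"  # assumed repo id
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForSequenceClassification.from_pretrained(repo_id)
    model.eval()

    inputs = tokenizer("a gripping, well-acted film", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    print(model.config.id2label[logits.argmax(dim=-1).item()])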
Related papers
- Retrieval-Enhanced Machine Learning: Synthesis and Opportunities [60.34182805429511]
Retrieval enhancement can be extended to a broader spectrum of machine learning (ML).
This work introduces a formal framework for this paradigm, Retrieval-Enhanced Machine Learning (REML), by synthesizing the literature across various ML domains with consistent notation, which is missing from the current literature.
The goal of this work is to equip researchers across various disciplines with a comprehensive, formally structured framework of retrieval-enhanced models, thereby fostering interdisciplinary future research.
arXiv Detail & Related papers (2024-07-17T20:01:21Z)
- Evolving Knowledge Distillation with Large Language Models and Active Learning [46.85430680828938]
Large language models (LLMs) have demonstrated remarkable capabilities across various NLP tasks.
Previous research has attempted to distill the knowledge of LLMs into smaller models by generating annotated data.
We propose EvoKD: Evolving Knowledge Distillation, which leverages the concept of active learning to interactively enhance the process of data generation using large language models.
arXiv Detail & Related papers (2024-03-11T03:55:24Z)
- ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse [59.500060790983994]
This paper introduces ZhiJian, a comprehensive and user-friendly toolbox for model reuse, utilizing the PyTorch backend.
ZhiJian presents a novel paradigm that unifies diverse perspectives on model reuse, encompassing target architecture construction with PTMs, tuning of the target model with PTMs, and PTM-based inference.
arXiv Detail & Related papers (2023-08-17T19:12:13Z)
- A Survey on Few-Shot Class-Incremental Learning [11.68962265057818]
Few-shot class-incremental learning (FSCIL) poses a significant challenge for deep neural networks, which must learn new tasks from only a few samples.
This paper provides a comprehensive survey on FSCIL.
FSCIL has achieved impressive results in various fields of computer vision.
arXiv Detail & Related papers (2023-04-17T10:15:08Z)
- CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks [62.22920673080208]
A single-step generative model can dramatically simplify the search process and be optimized in an end-to-end manner.
We name the pre-trained generative retrieval model CorpusBrain, as all information about the corpus is encoded in its parameters without the need to construct an additional index.
arXiv Detail & Related papers (2022-08-16T10:22:49Z)
- An Empirical Investigation of Representation Learning for Imitation [76.48784376425911]
Recent work in vision, reinforcement learning, and NLP has shown that auxiliary representation learning objectives can reduce the need for large amounts of expensive, task-specific data.
We propose a modular framework for constructing representation learning algorithms, then use our framework to evaluate the utility of representation learning for imitation.
arXiv Detail & Related papers (2022-05-16T11:23:42Z)
- What Makes Good Contrastive Learning on Small-Scale Wearable-based Tasks? [59.51457877578138]
We study contrastive learning on the wearable-based activity recognition task.
This paper presents an open-source PyTorch library, CL-HAR, which can serve as a practical tool for researchers.
arXiv Detail & Related papers (2022-02-12T06:10:15Z)
- DeepZensols: Deep Natural Language Processing Framework [23.56171046067646]
This work presents a framework that is able to reproduce consistent results.
It provides a means of easily creating, training, and evaluating natural language processing (NLP) deep learning (DL) models.
arXiv Detail & Related papers (2021-09-08T01:16:05Z)
- torchdistill: A Modular, Configuration-Driven Framework for Knowledge Distillation [1.8579693774597703]
We present our open-source framework, built on PyTorch and dedicated to knowledge distillation studies.
The framework is designed to enable users to design experiments with declarative PyYAML configuration files; see the configuration sketch after this list.
We reproduce some of the original experimental results reported at major machine learning conferences on the ImageNet and COCO datasets.
arXiv Detail & Related papers (2020-11-25T17:51:30Z)
- Bayesian active learning for production, a systematic study and a reusable library [85.32971950095742]
In this paper, we analyse the main drawbacks of current active learning techniques.
We conduct a systematic study of the effects of the most common real-world dataset issues on the deep active learning process.
We derive two techniques that can speed up the active learning loop: partial uncertainty sampling and larger query sizes.
arXiv Detail & Related papers (2020-06-17T14:51:11Z)
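To make the declarative, configuration-driven design of torchdistill described
above concrete, below is a minimal sketch of the kind of PyYAML experiment
description the framework parses. The keys and values are illustrative
assumptions, not torchdistill's exact schema; they only show how a distillation
experiment can be expressed as data rather than code.

    # Minimal sketch of a declarative, torchdistill-style experiment config.
    # The keys below are illustrative assumptions, not torchdistill's exact
    # schema; the point is that the experiment is fully described by data.
    import yaml

    CONFIG = """
    models:
      teacher_model:
        key: bert-large-uncased   # hypothetical teacher identifier
      student_model:
        key: bert-base-uncased    # hypothetical student identifier
    train:
      num_epochs: 3
      criterion:
        key: KDLoss               # hypothetical distillation loss entry
        kwargs:
          temperature: 4.0
          alpha: 0.9
    """

    config = yaml.safe_load(CONFIG)
    # A framework would instantiate models and losses from these entries;
    # here we only confirm the parsed structure.
    print(config["train"]["criterion"]["kwargs"]["temperature"])  # -> 4.0

Because the whole experiment lives in such a file, sharing the file is enough
to reproduce the run, which is the sense in which the framework is coding-free.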