autrainer: A Modular and Extensible Deep Learning Toolkit for Computer Audition Tasks
- URL: http://arxiv.org/abs/2412.11943v2
- Date: Thu, 10 Apr 2025 13:51:44 GMT
- Title: autrainer: A Modular and Extensible Deep Learning Toolkit for Computer Audition Tasks
- Authors: Simon Rampp, Andreas Triantafyllopoulos, Manuel Milling, Björn W. Schuller
- Abstract summary: autrainer is a PyTorch-based toolkit for training on computer audition tasks. We present an overview of its inner workings and key capabilities.
- Score: 42.4526628515253
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work introduces the key operating principles for autrainer, our new deep learning training framework for computer audition tasks. autrainer is a PyTorch-based toolkit that allows for rapid, reproducible, and easily extensible training on a variety of different computer audition tasks. Concretely, autrainer offers low-code training and supports a wide range of neural networks as well as preprocessing routines. In this work, we present an overview of its inner workings and key capabilities.
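As a rough illustration of what such a low-code toolkit abstracts away, the sketch below shows the kind of plain PyTorch boilerplate (synthetic spectrogram data, a placeholder CNN, an explicit training loop) that a framework like autrainer would instead drive from configuration. This is a hypothetical example and does not use autrainer's actual API.

```python
# Hypothetical sketch (not autrainer's actual API): the PyTorch boilerplate that a
# low-code, configuration-driven toolkit abstracts away for an audio classification task.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data: batches of log-Mel spectrograms (1 x 64 x 101) with 10 classes.
features = torch.randn(256, 1, 64, 101)
labels = torch.randint(0, 10, (256,))
loader = DataLoader(TensorDataset(features, labels), batch_size=32, shuffle=True)

# A small CNN stands in for the audio models such a toolkit would ship preconfigured.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for epoch in range(2):  # the training loop that the toolkit would run from a config
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
```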
Related papers
- Replacing thinking with tool usage enables reasoning in small language models [2.357055571094446]
Recent advances have established a new machine learning paradigm based on scaling up compute at inference time as well as at training time. In this paper, we propose to format these tokens as a multi-turn interaction trace with a stateful tool. At each turn, the new state of the tool is appended to the context of the model, whose job is to generate the tokens necessary to control the tool via a custom DSL.
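To make the summarized loop concrete, here is a minimal, hypothetical sketch of a multi-turn interaction trace with a stateful tool: a stub stands in for the language model, a tiny made-up DSL controls the tool, and the tool's new state is appended to the context after every turn. The DSL, tool, and model here are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of a multi-turn interaction trace with a stateful tool.
# The DSL ("ADD <n>" / "STOP"), the tool (a running counter), and the stubbed
# model are illustrative assumptions, not the paper's implementation.

def model_generate(context: str) -> str:
    """Stand-in for a small language model emitting one DSL command per turn."""
    return "ADD 2" if context.rstrip().endswith("state: 0") else "STOP"

def run_trace(question: str, max_turns: int = 8) -> str:
    state = 0  # the tool's internal state
    context = f"question: {question}\nstate: {state}\n"
    for _ in range(max_turns):
        command = model_generate(context)  # the model controls the tool via the DSL
        if command == "STOP":
            break
        op, arg = command.split()
        if op == "ADD":
            state += int(arg)  # the tool executes the command and updates its state
        # the tool's new state is appended to the model's context for the next turn
        context += f"command: {command}\nstate: {state}\n"
    return context

print(run_trace("start from 0 and add 2"))
```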
arXiv Detail & Related papers (2025-07-07T14:49:18Z) - Is Visual in-Context Learning for Compositional Medical Tasks within Reach? [68.56630652862293]
In this paper, we explore the potential of visual in-context learning to enable a single model to handle multiple tasks. We introduce a novel method for training in-context learners using a synthetic compositional task generation engine.
arXiv Detail & Related papers (2025-07-01T15:32:23Z) - NNTile: a machine learning framework capable of training extremely large GPT language models on a single node [83.9328245724548]
NNTile is based on the StarPU library, which implements task-based parallelism and schedules all provided tasks onto all available processing units.
It means that a particular operation, necessary to train a large neural network, can be performed on any of the CPU cores or GPU devices.
arXiv Detail & Related papers (2025-04-17T16:22:32Z) - Reinforcement Learning with Action Sequence for Data-Efficient Robot Learning [62.3886343725955]
We introduce a novel RL algorithm that learns a critic network that outputs Q-values over a sequence of actions.
By explicitly training the value functions to learn the consequence of executing a series of current and future actions, our algorithm allows for learning useful value functions from noisy trajectories.
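A minimal sketch of the idea as summarized, under assumed illustrative dimensions: a critic that scores a state together with a whole sequence of actions rather than a single action. This is not the paper's architecture, only an illustration of the interface.

```python
# Hypothetical sketch of a critic over action sequences: the Q-network scores a
# state plus a short sequence of future actions. Dimensions are assumptions.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, SEQ_LEN = 16, 4, 5

class SequenceCritic(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM * SEQ_LEN, 128),
            nn.ReLU(),
            nn.Linear(128, 1),  # Q-value for executing the whole action sequence
        )

    def forward(self, state: torch.Tensor, action_seq: torch.Tensor) -> torch.Tensor:
        # state: (B, STATE_DIM); action_seq: (B, SEQ_LEN, ACTION_DIM)
        flat_actions = action_seq.flatten(start_dim=1)
        return self.net(torch.cat([state, flat_actions], dim=-1))

critic = SequenceCritic()
q = critic(torch.randn(8, STATE_DIM), torch.randn(8, SEQ_LEN, ACTION_DIM))
print(q.shape)  # torch.Size([8, 1])
```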
arXiv Detail & Related papers (2024-11-19T01:23:52Z) - Deep Internal Learning: Deep Learning from a Single Input [88.59966585422914]
In many cases there is value in training a network just from the input at hand.
This is particularly relevant in many signal and image processing problems where training data is scarce and diversity is large.
This survey paper aims at covering deep internal-learning techniques that have been proposed in the past few years for these two important directions.
arXiv Detail & Related papers (2023-12-12T16:48:53Z) - A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision [93.90545426665999]
We take a close look at autoregressive decoders for multi-task learning in multimodal computer vision.
A key finding is that a small decoder learned on top of a frozen pretrained encoder works surprisingly well.
It can be seen as teaching a decoder to interact with a pretrained vision model via natural language.
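As a loose illustration of the summarized recipe, the sketch below freezes a placeholder encoder and trains only a small head on top. The paper's decoder is autoregressive and language-based; it is simplified here to a linear classifier purely to show the frozen-encoder training setup.

```python
# Hypothetical sketch of the frozen-encoder recipe: a stand-in pretrained vision
# encoder is frozen and only a small decoder head is trained. The modules are
# placeholders, not the paper's architecture.
import torch
import torch.nn as nn

encoder = nn.Sequential(  # stand-in for a pretrained vision encoder
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
for p in encoder.parameters():
    p.requires_grad = False          # freeze the pretrained encoder

decoder = nn.Linear(32, 100)         # small task head trained on top

optimizer = torch.optim.Adam(decoder.parameters(), lr=1e-3)
images = torch.randn(8, 3, 64, 64)
targets = torch.randint(0, 100, (8,))

with torch.no_grad():                # encoder features are treated as fixed
    features = encoder(images)
loss = nn.functional.cross_entropy(decoder(features), targets)
loss.backward()
optimizer.step()
```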
arXiv Detail & Related papers (2023-03-30T13:42:58Z) - On Efficient Transformer and Image Pre-training for Low-level Vision [74.22436001426517]
Pre-training has set numerous state-of-the-art results in high-level computer vision.
We present an in-depth study of image pre-training.
We find pre-training plays strikingly different roles in low-level tasks.
arXiv Detail & Related papers (2021-12-19T15:50:48Z) - Deep Learning Tools for Audacity: Helping Researchers Expand the Artist's Toolkit [8.942168855247548]
We present a software framework that integrates neural networks into the popular open-source audio editing software, Audacity.
We showcase some example use cases for both end-users and neural network developers.
arXiv Detail & Related papers (2021-10-25T23:56:38Z) - Explaining Deep Learning Representations by Tracing the Training Process [10.774699463547439]
We propose a novel method for explaining the decisions of a deep neural network.
We investigate how the intermediate representations at each layer of the deep network were refined during the training process.
We show that our method identifies highly representative training instances that can be used as an explanation.
arXiv Detail & Related papers (2021-09-13T11:29:04Z) - Meta-learning for downstream aware and agnostic pretraining [7.2051162210119495]
We propose using meta-learning to select tasks that provide the most informative learning signals in each episode of pretraining.
We discuss the algorithm of the method and its two variants, downstream-aware and downstream-agnostic pretraining.
arXiv Detail & Related papers (2021-06-06T23:08:09Z) - Cockpit: A Practical Debugging Tool for Training Deep Neural Networks [27.96164890143314]
We present a collection of instruments that enable a closer look into the inner workings of a learning machine.
These instruments leverage novel higher-order information about the gradient distribution and curvature.
arXiv Detail & Related papers (2021-02-12T16:28:49Z) - Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)