SCENIC: A JAX Library for Computer Vision Research and Beyond
- URL: http://arxiv.org/abs/2110.11403v1
- Date: Mon, 18 Oct 2021 08:41:17 GMT
- Title: SCENIC: A JAX Library for Computer Vision Research and Beyond
- Authors: Mostafa Dehghani and Alexey Gritsenko and Anurag Arnab and Matthias
Minderer and Yi Tay
- Abstract summary: Scenic is an open-source JAX library with a focus on Transformer-based models for computer vision research and beyond.
The goal of this toolkit is to facilitate rapid experimentation, prototyping, and research of new vision architectures and models.
- Score: 44.21002948898551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scenic is an open-source JAX library with a focus on Transformer-based models
for computer vision research and beyond. The goal of this toolkit is to
facilitate rapid experimentation, prototyping, and research of new vision
architectures and models. Scenic supports a diverse range of vision tasks
(e.g., classification, segmentation, detection) and facilitates working on
multi-modal problems, along with GPU/TPU support for multi-host, multi-device
large-scale training. Scenic also offers optimized implementations of
state-of-the-art research models spanning a wide range of modalities. Scenic
has been successfully used for numerous projects and published papers and
continues serving as the library of choice for quick prototyping and
publication of new research ideas.
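To make the abstract concrete, the following is a minimal sketch of the kind of JAX training step a library like Scenic builds on: a jitted, gradient-based update for a toy classification model. The model, parameter names, and shapes are illustrative assumptions for this sketch, not Scenic's actual API; in Scenic, analogous steps are compiled and sharded across hosts and devices for large-scale training.

```python
import jax
import jax.numpy as jnp

def init_params(key, num_features=8, num_classes=3):
    # Toy linear classifier standing in for a vision model.
    w_key, _ = jax.random.split(key)
    return {
        "w": jax.random.normal(w_key, (num_features, num_classes)) * 0.01,
        "b": jnp.zeros((num_classes,)),
    }

def loss_fn(params, x, y):
    # Softmax cross-entropy over flattened toy "images".
    logits = x @ params["w"] + params["b"]
    log_probs = jax.nn.log_softmax(logits)
    return -jnp.mean(jnp.take_along_axis(log_probs, y[:, None], axis=1))

@jax.jit  # compiled once; multi-device variants replicate this step
def train_step(params, x, y, lr=0.1):
    loss, grads = jax.value_and_grad(loss_fn)(params, x, y)
    params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
    return params, loss

key = jax.random.PRNGKey(0)
params = init_params(key)
x = jax.random.normal(key, (16, 8))       # toy batch of flattened inputs
y = jax.random.randint(key, (16,), 0, 3)  # toy class labels

losses = []
for _ in range(20):
    params, loss = train_step(params, x, y)
    losses.append(float(loss))
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The functional style here (pure functions over explicit parameter pytrees) is what lets JAX transformations such as `jax.jit` and `jax.pmap` compile and parallelize the same step across devices.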
Related papers
- ZenSVI: An Open-Source Software for the Integrated Acquisition, Processing and Analysis of Street View Imagery Towards Scalable Urban Science [1.5494074223643037]
Street view imagery (SVI) has been instrumental in many studies in the past decade to understand and characterize street features and the built environment.
We develop ZenSVI, a free and open-source Python package that integrates and implements the entire process of SVI analysis.
arXiv Detail & Related papers (2024-12-24T07:13:17Z)
- Collage: Decomposable Rapid Prototyping for Information Extraction on Scientific PDFs [15.610004991273005]
We present Collage, a tool designed for rapid prototyping, visualization, and evaluation of different information extraction models on scientific PDFs.
We enable both developers and users of NLP-based tools to inspect, debug, and better understand modeling pipelines by providing granular views of intermediate states of processing.
arXiv Detail & Related papers (2024-10-30T22:00:34Z)
- A Survey of Small Language Models [104.80308007044634]
Small Language Models (SLMs) have become increasingly important due to their efficiency and their ability to perform various language tasks with minimal computational resources.
We present a comprehensive survey on SLMs, focusing on their architectures, training techniques, and model compression techniques.
arXiv Detail & Related papers (2024-10-25T23:52:28Z)
- Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs [61.143381152739046]
We introduce Cambrian-1, a family of multimodal LLMs (MLLMs) designed with a vision-centric approach.
Our study uses LLMs and visual instruction tuning as an interface to evaluate various visual representations.
We provide model weights, code, supporting tools, datasets, and detailed instruction-tuning and evaluation recipes.
arXiv Detail & Related papers (2024-06-24T17:59:42Z)
- Enhancing Text Corpus Exploration with Post Hoc Explanations and Comparative Design [6.8863648800930655]
Text corpus exploration (TCE) spans the range of exploratory search tasks.
Current systems lack the flexibility to support the range of tasks encountered in practice.
We provide methods that enhance TCE tools with post hoc explanations and multiscale, comparative designs.
arXiv Detail & Related papers (2024-06-14T03:13:58Z)
- ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models [51.35570730554632]
ESPnet-SPK is a toolkit for training speaker embedding extractors.
We provide several models, ranging from x-vector to the recent SKA-TDNN.
We also aspire to bridge developed models with other domains.
arXiv Detail & Related papers (2024-01-30T18:18:27Z)
- torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP [3.0875505950565856]
We present a significantly upgraded version of torchdistill, a modular, coding-free deep learning framework.
We reproduce the GLUE benchmark results of BERT models using a script based on the upgraded torchdistill.
All 27 fine-tuned BERT models and the configurations needed to reproduce the results are published on Hugging Face.
arXiv Detail & Related papers (2023-10-26T17:57:15Z)
- Automatic Image Content Extraction: Operationalizing Machine Learning in Humanistic Photographic Studies of Large Visual Archives [81.88384269259706]
We introduce Automatic Image Content Extraction framework for machine learning-based search and analysis of large image archives.
The proposed framework can be applied in several domains in humanities and social sciences.
arXiv Detail & Related papers (2022-04-05T12:19:24Z)
- X-modaler: A Versatile and High-performance Codebase for Cross-modal Analytics [99.03895740754402]
X-modaler encapsulates the state-of-the-art cross-modal analytics into several general-purpose stages.
X-modaler is Apache-licensed, and its source code, sample projects, and pre-trained models are available online.
arXiv Detail & Related papers (2021-08-18T16:05:30Z)
- LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis [3.4253416336476246]
This paper introduces layoutparser, an open-source library for streamlining the usage of deep learning (DL) models in document image analysis (DIA) research and applications.
layoutparser comes with a set of simple and intuitive interfaces for applying and customizing DL models for layout detection, character recognition, and many other document processing tasks.
We demonstrate that layoutparser is helpful for both lightweight and large-scale pipelines in real-world use cases.
arXiv Detail & Related papers (2021-03-29T05:55:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.