SciWING -- A Software Toolkit for Scientific Document Processing
- URL: http://arxiv.org/abs/2004.03807v2
- Date: Fri, 23 Oct 2020 07:27:01 GMT
- Title: SciWING -- A Software Toolkit for Scientific Document Processing
- Authors: Abhinav Ramesh Kashyap, Min-Yen Kan
- Abstract summary: SciWING provides access to pre-trained models for scientific document processing tasks.
It includes ready-to-use web and terminal-based applications and demonstrations.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce SciWING, an open-source software toolkit which provides access
to pre-trained models for scientific document processing tasks, inclusive of
citation string parsing and logical structure recovery. SciWING enables
researchers to rapidly experiment with different models by swapping and
stacking different modules. It also enables them to declare and run models from a
configuration file. It enables researchers to perform production-ready transfer
learning from general, pre-trained transformers (e.g., BERT, SciBERT), and
aids development of end-user applications. It includes ready-to-use web and
terminal-based applications and demonstrations (Available from
http://sciwing.io).
Related papers
- VISION: A Modular AI Assistant for Natural Human-Instrument Interaction at Scientific User Facilities [0.19736111241221438]
Generative AI presents an opportunity to bridge this knowledge gap.
We present a modular architecture for the Virtual Scientific Companion (VISION).
With VISION, we performed LLM-based operation on the beamline workstation with low latency and demonstrated the first voice-controlled experiment at an X-ray scattering beamline.
arXiv Detail & Related papers (2024-12-24T04:37:07Z) - Darkit: A User-Friendly Software Toolkit for Spiking Large Language Model [50.37090759139591]
Large language models (LLMs) have been widely applied in various practical applications, typically comprising billions of parameters.
The human brain, employing bio-plausible spiking mechanisms, can accomplish the same tasks while significantly reducing energy consumption.
We are releasing a software toolkit named DarwinKit (Darkit) to accelerate the adoption of brain-inspired large language models.
arXiv Detail & Related papers (2024-12-20T07:50:08Z) - Collage: Decomposable Rapid Prototyping for Information Extraction on Scientific PDFs [15.610004991273005]
We present Collage, a tool designed for rapid prototyping, visualization, and evaluation of different information extraction models on scientific PDFs.
We enable both developers and users of NLP-based tools to inspect, debug, and better understand modeling pipelines by providing granular views of intermediate states of processing.
arXiv Detail & Related papers (2024-10-30T22:00:34Z) - ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models [51.35570730554632]
ESPnet-SPK is a toolkit for training speaker embedding extractors.
We provide several models, ranging from x-vector to recent SKA-TDNN.
We also aspire to bridge developed models with other domains.
arXiv Detail & Related papers (2024-01-30T18:18:27Z) - CodeTF: One-stop Transformer Library for State-of-the-art Code LLM [72.1638273937025]
We present CodeTF, an open-source Transformer-based library for state-of-the-art Code LLMs and code intelligence.
Our library supports a collection of pretrained Code LLM models and popular code benchmarks.
We hope CodeTF is able to bridge the gap between machine learning/generative AI and software engineering.
arXiv Detail & Related papers (2023-05-31T05:24:48Z) - ConvLab-3: A Flexible Dialogue System Toolkit Based on a Unified Data Format [88.33443450434521]
Task-oriented dialogue (TOD) systems function as digital assistants, guiding users through various tasks such as booking flights or finding restaurants.
Existing toolkits for building TOD systems often fall short in delivering comprehensive arrays of data, models, and experimental environments.
We introduce ConvLab-3: a multifaceted dialogue system toolkit crafted to bridge this gap.
arXiv Detail & Related papers (2022-11-30T16:37:42Z) - SIERRA: A Modular Framework for Research Automation and Reproducibility [6.1678491628787455]
We present SIERRA, a novel framework for accelerating research development and improving results.
SIERRA accelerates research by automating the process of generating executable experiments from queries over independent variables.
It employs a modular architecture enabling easy customization and extension for the needs of individual researchers.
arXiv Detail & Related papers (2022-08-16T15:36:34Z) - SIERRA: A Modular Framework for Research Automation [5.220940151628734]
We present SIERRA, a novel framework for accelerating research developments and improving results.
SIERRA makes it easy to quickly specify the independent variable(s) for an experiment, generate experimental inputs, automatically run the experiment, and process the results to generate deliverables such as graphs and videos.
It employs a deeply modular approach that allows easy customization and extension of automation for the needs of individual researchers.
arXiv Detail & Related papers (2022-03-03T23:45:46Z) - SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach, however, does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production-grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z) - EasyTransfer -- A Simple and Scalable Deep Transfer Learning Platform for NLP Applications [65.87067607849757]
EasyTransfer is a platform to develop deep Transfer Learning algorithms for Natural Language Processing (NLP) applications.
EasyTransfer supports various NLP models in the ModelZoo, including mainstream PLMs and multi-modality models.
EasyTransfer is currently deployed at Alibaba to support a variety of business scenarios.
arXiv Detail & Related papers (2020-11-18T18:41:27Z) - Collective Knowledge: organizing research projects as a database of reusable components and portable workflows with common APIs [0.2538209532048866]
This article provides the motivation and overview of the Collective Knowledge framework (CK or cKnowledge).
The CK concept is to decompose research projects into reusable components that encapsulate research artifacts.
The long-term goal is to accelerate innovation by connecting researchers and practitioners to share and reuse all their knowledge.
arXiv Detail & Related papers (2020-11-02T17:42:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.