Related papers: Thresh: A Unified, Customizable and Deployable Platform for Fine-Grained Text Evaluation

URL: http://arxiv.org/abs/2308.06953v3
Date: Mon, 16 Oct 2023 14:51:08 GMT
Title: Thresh: A Unified, Customizable and Deployable Platform for Fine-Grained Text Evaluation
Authors: David Heineman, Yao Dou, Wei Xu
Abstract summary: We introduce Thresh, a unified, customizable and deployable platform for fine-grained evaluation. Thresh provides a community hub that hosts a collection of fine-grained frameworks and corresponding annotations made and collected by the community. For deployment, Thresh offers multiple options for any scale of annotation projects from small manual inspections to large crowdsourcing ones.
Score: 11.690442820401453
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Fine-grained, span-level human evaluation has emerged as a reliable and robust method for evaluating text generation tasks such as summarization, simplification, machine translation and news generation, and the derived annotations have been useful for training automatic metrics and improving language models. However, existing annotation tools implemented for these evaluation frameworks lack the adaptability to be extended to different domains or languages, or to have their annotation settings modified according to user needs; moreover, the absence of a unified annotated data format inhibits research in multi-task learning. In this paper, we introduce Thresh, a unified, customizable and deployable platform for fine-grained evaluation. With a single YAML configuration file, users can build and test an annotation interface for any framework within minutes -- all in one web browser window. To facilitate collaboration and sharing, Thresh provides a community hub that hosts a collection of fine-grained frameworks and corresponding annotations made and collected by the community, covering a wide range of NLP tasks. For deployment, Thresh offers multiple options for any scale of annotation projects from small manual inspections to large crowdsourcing ones. Additionally, we introduce a Python library to streamline the entire process from typology design and deployment to annotation processing. Thresh is publicly accessible at https://thresh.tools.

Related papers

DocSpiral: A Platform for Integrated Assistive Document Annotation through Human-in-the-Spiral [11.336757553731639]
Acquiring structured data from domain-specific, image-based documents is crucial for many downstream tasks. Many documents exist as images rather than as machine-readable text, which requires human annotation to train automated extraction systems. We present DocSpiral, the first Human-in-the-Spiral assistive document annotation platform.
arXiv Detail & Related papers (2025-05-06T06:02:42Z)
Efficient Annotator Reliability Assessment with EffiARA [1.5145272476388434]
EffiARA is a framework to support the whole annotation pipeline, from understanding the resources required for an annotation task to compiling the annotated dataset. The framework's efficacy is supported by two previous studies: one improving classification performance through annotator-reliability-based soft label aggregation and sample weighting, and the other increasing the overall agreement among annotators. This work introduces the EffiARA Python package and its accompanying webtool, which provides an accessible graphical user interface for the system.
arXiv Detail & Related papers (2025-04-01T09:48:09Z)
Generative Compositor for Few-Shot Visual Information Extraction [60.663887314625164]
We propose a novel generative model, named Generative Compositor, to address the challenge of few-shot VIE. Generative Compositor is a hybrid pointer-generator network that emulates the operations of a compositor by retrieving words from the source text. The proposed method achieves highly competitive results in full-sample training, while notably outperforming the baseline in the 1-shot, 5-shot, and 10-shot settings.
arXiv Detail & Related papers (2025-03-21T04:56:24Z)
AutoGUI: Scaling GUI Grounding with Automatic Functionality Annotations from LLMs [54.58905728115257]
We propose the AutoGUI pipeline for automatically annotating UI elements with detailed functionality descriptions at scale. Specifically, we leverage large language models (LLMs) to infer element functionality by comparing the UI content changes before and after simulated interactions with specific UI elements. We construct an AutoGUI-704k dataset using the proposed pipeline, featuring multi-resolution, multi-device screenshots, diverse data domains, and detailed functionality annotations that have never been provided by previous datasets.
arXiv Detail & Related papers (2025-02-04T03:39:59Z)
COMMENTATOR: A Code-mixed Multilingual Text Annotation Framework [1.114560772534785]
We introduce a code-mixed multilingual text annotation framework, COMMENTATOR, specifically designed for annotating code-mixed text. The tool demonstrates its effectiveness in token-level and sentence-level language annotation tasks for Hinglish text.
arXiv Detail & Related papers (2024-08-06T11:56:26Z)
CMULAB: An Open-Source Framework for Training and Deployment of Natural Language Processing Models [59.91221728187576]
This paper introduces the CMU Linguistic Annotation Backend, an open-source framework that simplifies model deployment and continuous human-in-the-loop fine-tuning of NLP models. CMULAB enables users to leverage the power of multilingual models to quickly adapt and extend existing tools for speech recognition, OCR, translation, and syntactic analysis to new languages.
arXiv Detail & Related papers (2024-04-03T02:21:46Z)
Unitxt: Flexible, Shareable and Reusable Data Preparation and Evaluation for Generative AI [15.220987187105607]
Unitxt is an innovative library for customizable textual data preparation and evaluation tailored to generative language models. Unitxt integrates with common libraries like HuggingFace and LM-eval-harness, enabling easy customization and sharing between practitioners. Beyond being a tool, Unitxt is a community-driven platform, empowering users to build, share, and advance their pipelines.
arXiv Detail & Related papers (2024-01-25T08:57:33Z)
Antarlekhaka: A Comprehensive Tool for Multi-task Natural Language Annotation [0.0]
Antarlekhaka is a tool for manual annotation of a comprehensive set of tasks relevant to Natural Language Processing. The tool is Unicode-compatible, language-agnostic, Web-deployable and supports distributed annotation by multiple simultaneous annotators. It has been used for two real-life annotation tasks on two different languages, namely, Sanskrit and Bengali.
arXiv Detail & Related papers (2023-10-11T19:09:07Z)
TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture. TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling. It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
Summary Workbench: Unifying Application and Evaluation of Text Summarization Models [24.40171915438056]
New models and evaluation measures can be easily integrated as Docker-based plugins. Visual analyses combining multiple measures provide insights into the models' strengths and weaknesses.
arXiv Detail & Related papers (2022-10-18T04:47:25Z)
Selective Annotation Makes Language Models Better Few-Shot Learners [97.07544941620367]
Large language models can perform in-context learning, where they learn a new task from a few task demonstrations. This work examines the implications of in-context learning for the creation of datasets for new natural language tasks. We propose an unsupervised, graph-based selective annotation method, vote-k, to select diverse, representative examples to annotate.
arXiv Detail & Related papers (2022-09-05T14:01:15Z)
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers [66.9176610388952]
We present the fundamentals behind the next version of the Perspective API from Google Jigsaw. At the heart of the approach is a single multilingual token-free Charformer model. We demonstrate that by forgoing static vocabularies, we gain flexibility across a variety of settings.
arXiv Detail & Related papers (2022-02-22T20:55:31Z)
OPAD: An Optimized Policy-based Active Learning Framework for Document Content Analysis [6.159771892460152]
We propose OPAD, a novel framework using reinforcement policy for active learning in content detection tasks for documents. The framework learns the acquisition function to decide the samples to be selected while optimizing performance metrics. We show superior performance of the proposed OPAD framework for active learning for various tasks related to document understanding.
arXiv Detail & Related papers (2021-10-01T07:40:56Z)
A Data-Centric Framework for Composable NLP Workflows [109.51144493023533]
Empirical natural language processing systems in application domains (e.g., healthcare, finance, education) involve interoperation among multiple components. We establish a unified open-source framework to support fast development of such sophisticated NLP in a composable manner.
arXiv Detail & Related papers (2021-03-02T16:19:44Z)
UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large scale instance-level annotations for training. We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision.
arXiv Detail & Related papers (2020-06-12T22:45:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.