Joint Multi-Dimensional Model for Global and Time-Series Annotations
- URL: http://arxiv.org/abs/2005.03117v1
- Date: Wed, 6 May 2020 20:08:46 GMT
- Title: Joint Multi-Dimensional Model for Global and Time-Series Annotations
- Authors: Anil Ramakrishna, Rahul Gupta, Shrikanth Narayanan
- Abstract summary: Crowdsourcing is a popular approach to collect annotations for unlabeled data instances.
It involves collecting a large number of annotations for each data instance from several, often naive and untrained, annotators, which are then combined to estimate the ground truth.
Annotations for constructs such as affect are often multi-dimensional, yet most annotation fusion schemes ignore this and model each dimension separately.
We propose a generative model for multi-dimensional annotation fusion, which models the dimensions jointly, leading to more accurate ground-truth estimates.
- Score: 48.159050222769494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crowdsourcing is a popular approach to collect annotations for unlabeled data
instances. It involves collecting a large number of annotations for each data
instance from several, often naive and untrained, annotators, which are then
combined to estimate the ground truth. Further, annotations for constructs such
as affect are often multi-dimensional, with annotators rating multiple
dimensions, such as valence and arousal, for each instance. Most annotation
fusion schemes, however, ignore this aspect and model each dimension
separately. In this work we address this by proposing a generative model for
multi-dimensional annotation fusion, which models the dimensions jointly,
leading to more accurate ground-truth estimates. The proposed model is
applicable to both global and time-series annotation fusion problems and treats
the ground truth as a latent variable distorted by the annotators. The model
parameters are estimated using the Expectation-Maximization algorithm, and we
evaluate its performance on synthetic data, real emotion corpora, and an
artificial task with human annotations.
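The abstract describes the estimation procedure only at a high level. As an illustration, the following is a minimal, hypothetical sketch of EM-based fusion of global (non-time-series) multi-dimensional annotations, assuming a linear-Gaussian annotator model: a latent D-dimensional ground truth with a standard-normal prior, and each annotator applying a distortion matrix, a bias, and isotropic noise. The function `em_fusion` and all parameter choices are illustrative; the paper's actual global and time-series models may be parameterized differently.

```python
# Hypothetical sketch (not taken verbatim from the paper):
#   x_i  ~ N(0, I_D)                       latent D-dimensional ground truth
#   y_ia = W_a x_i + b_a + N(0, s2_a I)    annotation by annotator a on instance i
import numpy as np

def em_fusion(Y, n_iters=100, seed=0):
    """Y: (N, A, D) array of N instances, A annotators, D dimensions.
    Returns the (N, D) posterior-mean estimate of the ground truth."""
    N, A, D = Y.shape
    rng = np.random.default_rng(seed)
    W = np.stack([np.eye(D) + 0.01 * rng.normal(size=(D, D)) for _ in range(A)])
    b = np.zeros((A, D))
    s2 = np.ones(A)

    for _ in range(n_iters):
        # E-step: Gaussian posterior N(mu_i, Sigma) over each latent x_i.
        prec = np.eye(D) + sum(W[a].T @ W[a] / s2[a] for a in range(A))
        Sigma = np.linalg.inv(prec)    # same for every i: all annotators rate all items
        mu = np.empty((N, D))
        for i in range(N):
            h = sum(W[a].T @ (Y[i, a] - b[a]) / s2[a] for a in range(A))
            mu[i] = Sigma @ h
        Exx = Sigma + np.einsum('ni,nj->nij', mu, mu)   # E[x_i x_i^T]

        # M-step: per-annotator least squares on the augmented vector z = [x; 1].
        Ez = np.concatenate([mu, np.ones((N, 1))], axis=1)
        Ezz = np.zeros((N, D + 1, D + 1))
        Ezz[:, :D, :D] = Exx
        Ezz[:, :D, D] = mu
        Ezz[:, D, :D] = mu
        Ezz[:, D, D] = 1.0
        S_zz = Ezz.sum(axis=0)
        for a in range(A):
            S_yz = Y[:, a].T @ Ez                  # sum_i y_ia E[z_i]^T
            C = S_yz @ np.linalg.inv(S_zz)         # [W_a | b_a]
            W[a], b[a] = C[:, :D], C[:, D]
            resid = (np.einsum('nd,nd->', Y[:, a], Y[:, a])
                     - 2.0 * np.einsum('nd,nd->', Y[:, a], Ez @ C.T)
                     + np.trace(C @ S_zz @ C.T))
            s2[a] = max(resid / (N * D), 1e-6)     # expected residual variance
    return mu

# Toy check: two dimensions (e.g. valence and arousal), three annotators.
rng = np.random.default_rng(1)
truth = rng.normal(size=(200, 2))
Y = np.stack([truth @ (np.eye(2) + 0.2 * rng.normal(size=(2, 2))).T
              + rng.normal(size=2) + 0.3 * rng.normal(size=(200, 2))
              for _ in range(3)], axis=1)
est = em_fusion(Y)
print("per-dimension correlation with simulated truth:",
      [round(np.corrcoef(truth[:, d], est[:, d])[0, 1], 3) for d in range(2)])
```

Initializing and simulating the distortions near the identity reflects the usual assumption that annotators roughly report the ground truth; without some such constraint, the latent dimensions would only be recoverable up to a linear transform.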
Related papers
- MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large
Language Models [70.92847554971065]
We introduce MT-Eval, a comprehensive benchmark designed to evaluate multi-turn conversational abilities.
By analyzing human-LLM conversations, we categorize interaction patterns into four types: recollection, expansion, refinement, and follow-up.
Our evaluation of 11 well-known LLMs shows that while closed-source models generally surpass open-source ones, certain open-source models exceed GPT-3.5-Turbo in specific tasks.
arXiv Detail & Related papers (2024-01-30T04:50:28Z) - A General Model for Aggregating Annotations Across Simple, Complex, and
Multi-Object Annotation Tasks [51.14185612418977]
A strategy to improve label quality is to ask multiple annotators to label the same item and aggregate their labels.
While a variety of bespoke models have been proposed for specific tasks, our work is the first to introduce aggregation methods that generalize across many diverse complex tasks.
This article extends our prior work with an investigation of three new research questions.
arXiv Detail & Related papers (2023-12-20T21:28:35Z) - A Federated Data Fusion-Based Prognostic Model for Applications with Multi-Stream Incomplete Signals [1.2277343096128712]
This article proposes a federated prognostic model that allows multiple users to jointly construct a failure time prediction model.
Numerical studies indicate that the performance of the proposed model is the same as that of classic non-federated prognostic models.
arXiv Detail & Related papers (2023-11-13T17:08:34Z) - Unveiling the Multi-Annotation Process: Examining the Influence of
Annotation Quantity and Instance Difficulty on Model Performance [1.7343894615131372]
We show how performance scores can vary when a dataset expands from a single annotation per instance to multiple annotations.
We propose a novel multi-annotator simulation process to generate datasets with varying annotation budgets.
arXiv Detail & Related papers (2023-10-23T05:12:41Z) - Multidimensional Item Response Theory in the Style of Collaborative
Filtering [0.8057006406834467]
This paper presents a machine learning approach to multidimensional item response theory (MIRT).
Inspired by collaborative filtering, we define a general class of models that includes many MIRT models.
We discuss the use of penalized joint maximum likelihood (JML) to estimate individual models and cross-validation to select the best performing model.
arXiv Detail & Related papers (2023-01-03T00:56:27Z) - Towards Understanding and Mitigating Dimensional Collapse in Heterogeneous Federated Learning [112.69497636932955]
Federated learning aims to train models across different clients without the sharing of data for privacy considerations.
We study how data heterogeneity affects the representations of the globally aggregated models.
We propose FedDecorr, a novel method that can effectively mitigate dimensional collapse in federated learning.
arXiv Detail & Related papers (2022-10-01T09:04:17Z) - Attention Bottlenecks for Multimodal Fusion [90.75885715478054]
Machine perception models are typically modality-specific and optimised for unimodal benchmarks.
We introduce a novel transformer-based architecture that uses 'fusion bottlenecks' for modality fusion at multiple layers.
We conduct thorough ablation studies, and achieve state-of-the-art results on multiple audio-visual classification benchmarks.
arXiv Detail & Related papers (2021-06-30T22:44:12Z) - Self-Supervision based Task-Specific Image Collection Summarization [3.115375810642661]
We propose a novel approach to task-specific image corpus summarization using semantic information and self-supervision.
Our method uses a classification-based Wasserstein generative adversarial network (WGAN) as a feature generating network.
The model then generates a summary at inference time by using K-means clustering in the semantic embedding space.
arXiv Detail & Related papers (2020-12-19T10:58:04Z) - Evaluating the Disentanglement of Deep Generative Models through
Manifold Topology [66.06153115971732]
We present a method for quantifying disentanglement that only uses the generative model.
We empirically evaluate several state-of-the-art models across multiple datasets.
arXiv Detail & Related papers (2020-06-05T20:54:11Z) - Predicting Multidimensional Data via Tensor Learning [0.0]
We develop a model that retains the intrinsic multidimensional structure of the dataset.
To estimate the model parameters, an Alternating Least Squares algorithm is developed.
The proposed model is able to outperform benchmark models present in the forecasting literature.
arXiv Detail & Related papers (2020-02-11T11:57:07Z)