Active Learning and Multi-label Classification for Ellipsis and
Coreference Detection in Conversational Question-Answering
- URL: http://arxiv.org/abs/2207.03145v1
- Date: Thu, 7 Jul 2022 08:14:54 GMT
- Title: Active Learning and Multi-label Classification for Ellipsis and
Coreference Detection in Conversational Question-Answering
- Authors: Quentin Brabant, Lina Maria Rojas-Barahona and Claire Gardent
- Abstract summary: Ellipsis and coreference are commonly occurring linguistic phenomena.
We propose to use a multi-label classifier based on DistilBERT.
We show that these methods greatly enhance the performance of the classifier for detecting these phenomena on a manually labeled dataset.
- Score: 5.984693203400407
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In human conversations, ellipsis and coreference are commonly occurring
linguistic phenomena. Although these phenomena are a means of making
human-machine conversations more fluent and natural, only a few dialogue corpora
contain explicit indications of which turns contain ellipses and/or
coreferences. In this paper we address the task of automatically detecting
ellipsis and coreferences in conversational question answering. We propose to
use a multi-label classifier based on DistilBERT. Multi-label classification
and active learning are employed to compensate for the limited amount of labeled
data. We show that these methods greatly enhance the performance of the
classifier for detecting these phenomena on a manually labeled dataset.
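The paper's model itself is not reproduced here, but the core idea of multi-label classification is that each label gets an independent sigmoid score instead of competing in a softmax, so a single question turn can be flagged as containing both an ellipsis and a coreference. A minimal sketch under that assumption (label names and threshold are illustrative, not the authors' code):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def multilabel_predict(logits, labels, threshold=0.5):
    """Turn per-label logits into a set of predicted labels.

    Each label is scored with its own sigmoid, so zero, one,
    or several labels can fire for the same input."""
    probs = [sigmoid(z) for z in logits]
    return [lab for lab, p in zip(labels, probs) if p >= threshold]

labels = ["ellipsis", "coreference"]
print(multilabel_predict([2.1, 0.4], labels))   # both sigmoids >= 0.5
print(multilabel_predict([-1.5, 3.0], labels))  # only coreference fires
```

In the paper's setting, the logits would come from a classification head on top of DistilBERT; the thresholding logic is the same.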
Related papers
- HuBERTopic: Enhancing Semantic Representation of HuBERT through
Self-supervision Utilizing Topic Model [62.995175485416]
We propose a new approach to enrich the semantic representation of HuBERT.
An auxiliary topic classification task is added to HuBERT by using topic labels as teachers.
Experimental results demonstrate that our method achieves comparable or better performance than the baseline in most tasks.
arXiv Detail & Related papers (2023-10-06T02:19:09Z)
- Description-Enhanced Label Embedding Contrastive Learning for Text
Classification [65.01077813330559]
The authors incorporate Self-Supervised Learning (SSL) into the model learning process and design a novel self-supervised Relation of Relation (R2) classification task.
They propose a Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as joint optimization targets.
They also use external knowledge from WordNet to obtain multi-aspect descriptions for label semantic learning.
arXiv Detail & Related papers (2023-06-15T02:19:34Z)
- Extending an Event-type Ontology: Adding Verbs and Classes Using
Fine-tuned LLMs Suggestions [0.0]
We have investigated the use of advanced machine learning methods for pre-annotating data for a lexical extension task.
We have examined the correlation of the automatic scores with the human annotation.
While the correlation turned out to be strong, its influence on the annotation proper is modest due to its near linearity.
arXiv Detail & Related papers (2023-06-03T14:57:47Z)
- ActiveLab: Active Learning with Re-Labeling by Multiple Annotators [19.84626033109009]
ActiveLab is a method to decide what to label next in batch active learning.
It automatically estimates when it is more informative to re-label examples vs. labeling entirely new ones.
It reliably trains more accurate classifiers with far fewer annotations than a wide variety of popular active learning methods.
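ActiveLab's actual estimator, which weighs re-labeling existing examples against labeling new ones using annotator agreement, is defined in the paper. The sketch below shows only the generic uncertainty-sampling core of batch active learning that such methods build on; all names and data are invented for illustration:

```python
import math

def entropy(probs):
    """Predictive entropy of a class distribution: higher
    means the current model is less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def pick_next_batch(pool, batch_size):
    """pool maps example id -> predicted class probabilities.
    Select the examples the model is least certain about."""
    ranked = sorted(pool, key=lambda k: entropy(pool[k]), reverse=True)
    return ranked[:batch_size]

pool = {
    "q1": [0.95, 0.05],  # confident prediction
    "q2": [0.55, 0.45],  # highly uncertain
    "q3": [0.80, 0.20],
}
print(pick_next_batch(pool, 2))  # → ['q2', 'q3']
```

ActiveLab extends this kind of scoring so that a noisy already-labeled example can outrank a fresh unlabeled one when a second annotation would be more informative.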
arXiv Detail & Related papers (2023-01-27T17:00:11Z)
- IDEA: Interactive DoublE Attentions from Label Embedding for Text
Classification [4.342189319523322]
We propose a novel model structure via siamese BERT and interactive double attentions named IDEA to capture the information exchange of text and label names.
Our proposed method significantly outperforms state-of-the-art methods that use label texts, with more stable results.
arXiv Detail & Related papers (2022-09-23T04:50:47Z)
- Speaker Embedding-aware Neural Diarization for Flexible Number of
Speakers with Textual Information [55.75018546938499]
We propose the speaker embedding-aware neural diarization (SEND) method, which predicts the power set encoded labels.
Our method achieves a lower diarization error rate than target-speaker voice activity detection.
arXiv Detail & Related papers (2021-11-28T12:51:04Z)
- R$^2$-Net: Relation of Relation Learning Network for Sentence Semantic
Matching [58.72111690643359]
We propose a Relation of Relation Learning Network (R2-Net) for sentence semantic matching.
We first employ BERT to encode the input sentences from a global perspective.
Then a CNN-based encoder is designed to capture keywords and phrase information from a local perspective.
To fully leverage labels for better relation information extraction, we introduce a self-supervised relation of relation classification task.
arXiv Detail & Related papers (2020-12-16T13:11:30Z)
- Automatically Identifying Words That Can Serve as Labels for Few-Shot
Text Classification [12.418532541734193]
A recent approach for few-shot text classification is to convert textual inputs to cloze questions that contain some form of task description, process them with a pretrained language model and map the predicted words to labels.
To mitigate this issue, we devise an approach that automatically finds such a mapping given small amounts of training data.
For a number of tasks, the mapping found by our approach performs almost as well as hand-crafted label-to-word mappings.
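As a hedged illustration of the label-to-word ("verbalizer") idea only: the hand-written mapping below is a stand-in for what the paper's method would find automatically from a small amount of training data, and the words and labels are invented for a sentiment-style task.

```python
# Hypothetical verbalizer: cloze-predicted words mapped to task labels.
# The paper's contribution is finding such a mapping automatically;
# this one is hand-written purely for illustration.
VERBALIZER = {
    "great": "positive", "good": "positive",
    "terrible": "negative", "bad": "negative",
}

def classify(predicted_words):
    """predicted_words: words the language model ranks highest for
    the [MASK] slot, best first. Return the label of the first word
    the verbalizer knows, or None if no word matches."""
    for word in predicted_words:
        if word in VERBALIZER:
            return VERBALIZER[word]
    return None

print(classify(["terrible", "movie"]))  # → negative
```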
arXiv Detail & Related papers (2020-10-26T14:56:22Z)
- Dynamic Semantic Matching and Aggregation Network for Few-shot Intent
Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances.
Semantic components are distilled from utterances via multi-head self-attention.
Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
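A minimal, dependency-free sketch of the scaled dot-product self-attention computation the summary refers to; this omits the learned projection matrices and multiple heads of the paper's model, so it shows only the bare mechanism by which each token's representation becomes a weighted mixture of all tokens:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(X):
    """Single-head self-attention over token vectors X, with
    queries = keys = values = X (no learned projections).
    Each output row is an attention-weighted average of all rows."""
    d = len(X[0])
    out = []
    for q in X:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in X]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, X))
                    for j in range(d)])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = self_attention(tokens)  # 3 context-mixed vectors of dim 2
```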
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
- Interaction Matching for Long-Tail Multi-Label Classification [57.262792333593644]
We present an elegant and effective approach for addressing limitations in existing multi-label classification models.
By performing soft n-gram interaction matching, we match labels with natural language descriptions.
arXiv Detail & Related papers (2020-05-18T15:27:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences arising from its use.