Multitasking Framework for Unsupervised Simple Definition Generation
- URL: http://arxiv.org/abs/2203.12926v1
- Date: Thu, 24 Mar 2022 08:16:04 GMT
- Title: Multitasking Framework for Unsupervised Simple Definition Generation
- Authors: Cunliang Kong, Yun Chen, Hengyuan Zhang, Liner Yang, Erhong Yang
- Abstract summary: We propose a novel task of Simple Definition Generation to help language learners and low literacy readers.
A significant challenge of this task is the lack of learner's dictionaries in many languages.
We propose a multitasking framework SimpDefiner that only requires a standard dictionary with complex definitions and a corpus containing arbitrary simple texts.
- Score: 5.2221935174520056
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The definition generation task can help language learners by providing
explanations for unfamiliar words. This task has attracted much attention in
recent years. We propose a novel task of Simple Definition Generation (SDG) to
help language learners and low literacy readers. A significant challenge of
this task is the lack of learner's dictionaries in many languages, and
therefore the lack of data for supervised training. We explore this task and
propose a multitasking framework SimpDefiner that only requires a standard
dictionary with complex definitions and a corpus containing arbitrary simple
texts. We disentangle the complexity factors from the text by carefully
designing a parameter sharing scheme between two decoders. By jointly training
these components, the framework can generate both complex and simple
definitions simultaneously. We demonstrate that the framework can generate
relevant, simple definitions for the target words through automatic and manual
evaluations on English and Chinese datasets. Our method outperforms the
baseline model by a 1.77 SARI score on the English dataset, and raises the
proportion of low-level (HSK levels 1-3) words in Chinese definitions by
3.87%.
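A minimal sketch of the parameter sharing idea, assuming a transformer-style encoder-decoder (the paper specifies the actual scheme; the module choices, sizes, and names below are illustrative): two decoders read the same encoder memory, share a content-bearing cross-attention module, and keep private feed-forward blocks intended to carry the complexity "style".

```python
import torch
import torch.nn as nn

# Illustrative sketch, not the paper's exact architecture: two decoder layers
# that share cross-attention over the encoder memory, while each keeps a
# private feed-forward block carrying the complexity "style".

class StyleDecoderLayer(nn.Module):
    def __init__(self, d_model, nhead, shared_cross_attn):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.cross_attn = shared_cross_attn          # shared across decoders
        self.ffn = nn.Sequential(                    # decoder-specific
            nn.Linear(d_model, 4 * d_model),
            nn.ReLU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, x, memory):
        x = self.norm1(x + self.self_attn(x, x, x, need_weights=False)[0])
        x = self.norm2(x + self.cross_attn(x, memory, memory, need_weights=False)[0])
        return self.norm3(x + self.ffn(x))

d_model, nhead = 512, 8
shared_cross = nn.MultiheadAttention(d_model, nhead, batch_first=True)
complex_decoder = StyleDecoderLayer(d_model, nhead, shared_cross)
simple_decoder = StyleDecoderLayer(d_model, nhead, shared_cross)

# Smoke test: both decoders attend to the same encoder memory.
memory = torch.randn(2, 16, d_model)   # (batch, src_len, d_model)
tgt = torch.randn(2, 10, d_model)      # (batch, tgt_len, d_model)
out_complex = complex_decoder(tgt, memory)
out_simple = simple_decoder(tgt, memory)
```

In this sketch, joint training would sum a definition-generation loss on the dictionary data (complex decoder) with a language-modeling loss on the simple-text corpus (simple decoder), so that shared parameters learn content while the private parameters absorb complexity.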
Related papers
- Generating Continuations in Multilingual Idiomatic Contexts [2.0849578298972835]
We test the ability of generative language models (LMs) to understand nuanced language containing non-compositional figurative text.
We conduct experiments using datasets in two distinct languages (English and Portuguese) under three different training settings.
Our results suggest that the models are only slightly better at generating continuations for literal contexts than idiomatic contexts, with exceedingly small margins.
arXiv Detail & Related papers (2023-10-31T05:40:33Z)
- Assisting Language Learners: Automated Trans-Lingual Definition Generation via Contrastive Prompt Learning [25.851611353632926]
The standard definition generation task requires models to automatically produce mono-lingual definitions.
We propose a novel task of Trans-Lingual Definition Generation (TLDG), which aims to generate definitions in another language.
arXiv Detail & Related papers (2023-06-09T17:32:45Z)
- Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learning [74.70157466822612]
We systematically study the role of task definitions in instruction learning.
We find that model performance drops substantially when removing contents describing the task output.
We propose two strategies to help models better leverage task instructions.
arXiv Detail & Related papers (2023-06-01T21:11:24Z)
- SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks [88.4408774253634]
Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community.
There are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers.
Recent work has begun to introduce such benchmarks for several tasks.
arXiv Detail & Related papers (2022-12-20T18:39:59Z)
- Prompting Language Models for Linguistic Structure [73.11488464916668]
We present a structured prompting approach for linguistic structured prediction tasks.
We evaluate this approach on part-of-speech tagging, named entity recognition, and sentence chunking.
We find that while PLMs contain significant prior knowledge of task labels due to task leakage into the pretraining corpus, structured prompting can also retrieve linguistic structure with arbitrary labels.
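As a toy sketch of the general recipe (the paper's exact prompt template and label handling may differ), a structured prompt for POS tagging interleaves tokens with their labels in the demonstrations and asks the model to continue the pattern:

```python
# Toy sketch of structured prompting for POS tagging; the "token/TAG" format
# and the tag set are illustrative assumptions, not the paper's template.

demos = [
    (["The", "cat", "sat"], ["DET", "NOUN", "VERB"]),
    (["Dogs", "bark"], ["NOUN", "VERB"]),
]

def build_prompt(query_tokens):
    lines = [
        " ".join(f"{tok}/{tag}" for tok, tag in zip(toks, tags))
        for toks, tags in demos
    ]
    lines.append(" ".join(query_tokens))  # unlabeled query to be completed
    return "\n".join(lines)

prompt = build_prompt(["Birds", "fly"])
print(prompt)
# `prompt` is then sent to a pretrained LM, and predicted tags are parsed
# from the "token/TAG" continuation it generates.
```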
arXiv Detail & Related papers (2022-11-15T01:13:39Z)
- Task Grouping for Multilingual Text Recognition [28.036892501896983]
We propose an automatic method for multilingual text recognition with a task grouping and assignment module using Gumbel-Softmax.
Experiments on MLT19 support our hypothesis that a middle ground between combining all tasks and fully separating them yields a better task grouping configuration.
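A minimal sketch of the assignment idea, with illustrative shapes: each task keeps learnable logits over candidate parameter groups, and Gumbel-Softmax sampling with hard=True yields a discrete yet differentiable assignment.

```python
import torch
import torch.nn.functional as F

# Toy sketch: learnable task-to-group assignment via Gumbel-Softmax.
# The numbers of tasks/groups are illustrative, not the paper's setup.
num_tasks, num_groups = 5, 3
assign_logits = torch.nn.Parameter(torch.zeros(num_tasks, num_groups))

# hard=True returns one-hot samples in the forward pass while gradients
# flow through the soft relaxation (straight-through estimator).
assignment = F.gumbel_softmax(assign_logits, tau=1.0, hard=True)
groups = assignment.argmax(dim=-1)  # group index chosen for each task
```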
arXiv Detail & Related papers (2022-10-13T23:54:23Z)
- Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding [51.31622274823167]
We propose a hierarchical framework with a coarse-to-fine paradigm: the bottom level is shared across all tasks, the mid-level is divided among different task groups, and the top level is assigned to each individual task.
This allows our model to learn basic language properties from all tasks, boost performance on relevant tasks, and reduce the negative impact from irrelevant tasks.
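A minimal sketch of that sharing pattern, with illustrative groupings, module types, and sizes:

```python
import torch
import torch.nn as nn

# Toy sketch of coarse-to-fine sharing: one bottom module for all tasks,
# one mid-level module per task group, one head per task. All choices
# here are illustrative assumptions.
d = 256
bottom = nn.Linear(d, d)                              # shared by every task
mid = {"group_a": nn.Linear(d, d), "group_b": nn.Linear(d, d)}
heads = {
    "task1": ("group_a", nn.Linear(d, 2)),
    "task2": ("group_a", nn.Linear(d, 3)),
    "task3": ("group_b", nn.Linear(d, 2)),
}

def forward(task, x):
    group, head = heads[task]
    return head(mid[group](bottom(x)))

logits = forward("task2", torch.randn(4, d))  # shape (4, 3)
```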
arXiv Detail & Related papers (2022-08-19T02:46:20Z)
- Learning to Follow Language Instructions with Compositional Policies [22.778677208048475]
We propose a framework that learns to execute natural language instructions in an environment consisting of goal-reaching tasks.
We train a reinforcement learning agent to learn value functions that can be subsequently composed through a Boolean algebra.
We fine-tune a seq2seq model pretrained on web-scale corpora to map language to logical expressions.
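Under the common convention for Boolean task algebras (the paper's formalism may differ in detail), conjunction and disjunction of goal-reaching tasks correspond to pointwise min and max of their value functions:

```python
import torch

# Toy sketch: composing goal-reaching value functions with a Boolean algebra.
# Convention assumed here: AND ~ pointwise min, OR ~ pointwise max.

def v_and(v1, v2):
    return torch.minimum(v1, v2)   # high only where both goals score high

def v_or(v1, v2):
    return torch.maximum(v1, v2)   # high where either goal scores high

# Example: values of 4 states under two goal tasks (illustrative numbers).
v_goal_a = torch.tensor([0.9, 0.1, 0.8, 0.2])
v_goal_b = torch.tensor([0.3, 0.7, 0.9, 0.1])
print(v_and(v_goal_a, v_goal_b))  # tensor([0.3, 0.1, 0.8, 0.1])
print(v_or(v_goal_a, v_goal_b))   # tensor([0.9, 0.7, 0.9, 0.2])
```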
arXiv Detail & Related papers (2021-10-09T21:28:26Z)
- Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding [101.24748444126982]
Decomposable tasks are complex and comprise a hierarchy of sub-tasks.
Existing benchmarks, however, typically hold out examples for only the surface-level sub-task.
We propose a framework to construct robust test sets using coordinate ascent over sub-task specific utility functions.
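As a toy illustration of coordinate ascent over a test set (the utility below is a stand-in, not the paper's sub-task-specific functions): treat each candidate example's inclusion as a binary coordinate and flip one at a time, keeping only flips that increase the utility.

```python
# Toy sketch: coordinate ascent over binary inclusion variables for
# test-set construction. The utility is an illustrative stand-in.
pool = list(range(20))
include = {i: False for i in pool}

def utility(inc):
    chosen = [i for i, flag in inc.items() if flag]
    # Stand-in: reward diversity under a soft size budget.
    return len({i % 5 for i in chosen}) - 0.1 * max(0, len(chosen) - 8)

improved = True
while improved:                       # sweep until no single flip helps
    improved = False
    for i in pool:                    # optimize one coordinate at a time
        before = utility(include)
        include[i] = not include[i]   # tentative flip
        if utility(include) > before:
            improved = True           # keep the improving flip
        else:
            include[i] = not include[i]  # revert
test_set = [i for i, flag in include.items() if flag]
```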
arXiv Detail & Related papers (2021-06-29T02:53:59Z)
- Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions [87.33156149634392]
We critically examine RefCOCOg, a standard benchmark for visual referring expression recognition.
We show that 83.7% of test instances do not require reasoning on linguistic structure.
We propose two methods, one based on contrastive learning and the other based on multi-task learning, to increase the robustness of ViLBERT.
arXiv Detail & Related papers (2020-05-04T17:09:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.