Mitigating Negative Transfer with Task Awareness for Sexism, Hate
Speech, and Toxic Language Detection
- URL: http://arxiv.org/abs/2307.03377v1
- Date: Fri, 7 Jul 2023 04:10:37 GMT
- Title: Mitigating Negative Transfer with Task Awareness for Sexism, Hate
Speech, and Toxic Language Detection
- Authors: Angel Felipe Magnossão de Paula, Paolo Rosso and Damiano Spina
- Abstract summary: This paper proposes a new approach to mitigate the negative transfer problem based on the task awareness concept.
The proposed approach reduces negative transfer while improving performance over the classic MTL solution.
The proposed approach has been implemented in two unified architectures to detect Sexism, Hate Speech, and Toxic Language in text comments.
- Score: 7.661927086611542
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper proposes a novel approach to mitigate the negative transfer
problem. In machine learning, the common strategy is to apply the
Single-Task Learning approach in order to train a supervised model to solve a
specific task. Training a robust model requires a lot of data and a significant
amount of computational resources, making this solution infeasible in cases
where data are unavailable or expensive to gather. Therefore, another solution,
based on the sharing of information between tasks, has been developed:
Multi-Task Learning (MTL). Despite recent developments in MTL, the
problem of negative transfer remains unsolved. Negative transfer is a
phenomenon that occurs when noisy information is shared between tasks,
resulting in a drop in performance. This paper proposes a new approach to
mitigate the negative transfer problem based on the task awareness concept. The
proposed approach reduces negative transfer while improving performance over
the classic MTL solution. Moreover, the proposed approach has been implemented
in two unified architectures to detect Sexism, Hate Speech, and Toxic Language
in text comments. The proposed architectures set a new state of the art on both
the EXIST-2021 and HatEval-2019 benchmarks.
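The abstract does not describe how task awareness is injected into the two unified architectures, so the following is only a minimal sketch of one common way to make a shared multi-task model task-aware: a learned task embedding added to the pooled representation of a shared encoder, followed by task-specific heads. All names (TaskAwareMTL, hidden_dim, etc.) and the HuggingFace-style encoder interface are assumptions, not the authors' implementation.

```python
# Minimal, illustrative sketch of a task-aware multi-task classifier.
# Assumes a HuggingFace-style encoder that returns `last_hidden_state`.
import torch
import torch.nn as nn

class TaskAwareMTL(nn.Module):
    def __init__(self, encoder: nn.Module, hidden_dim: int, num_tasks: int = 3):
        super().__init__()
        self.encoder = encoder                               # shared text encoder (e.g., a transformer)
        self.task_emb = nn.Embedding(num_tasks, hidden_dim)  # learned task-awareness vector per task
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_dim, 2) for _ in range(num_tasks)]  # one binary head per task
        )

    def forward(self, input_ids, attention_mask, task_id: int):
        # [CLS]-style pooled representation from the shared encoder
        h = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
        # Inject task awareness so the shared features are modulated per task
        h = h + self.task_emb.weight[task_id]
        return self.heads[task_id](h)
```

Sharing the encoder lets the three abuse-detection tasks reuse each other's signal, while the task embedding and separate heads give the model a way to keep task-specific information apart, which is the kind of mechanism the task-awareness idea targets to limit negative transfer.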
Related papers
- Beyond Anti-Forgetting: Multimodal Continual Instruction Tuning with Positive Forward Transfer [21.57847333976567]
Multimodal Continual Instruction Tuning (MCIT) enables Multimodal Large Language Models (MLLMs) to meet continuously emerging requirements without expensive retraining.
MCIT faces two major obstacles: catastrophic forgetting (where old knowledge is forgotten) and negative forward transfer.
We propose Prompt Tuning with Positive Forward Transfer (Fwd-Prompt) to address these issues.
arXiv Detail & Related papers (2024-01-17T12:44:17Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all the multi-task data for training.
At the task level, we aim to find the optimal task order to minimize the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task and then divide them into easy-to-difficult mini-batches for training.
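As a rough illustration of the instance-level step described above, the sketch below sorts instances by a precomputed difficulty score and packs them into easy-to-difficult mini-batches; the function name and the assumption that difficulty scores are already available are illustrative, not Data-CUBE's actual procedure.

```python
# Sketch of easy-to-difficult mini-batching (instance-level curriculum).
# `difficulty` is assumed to be one score per instance, e.g., a reference
# model's loss on that instance; higher means harder.
from typing import Any, List

def curriculum_batches(instances: List[Any],
                       difficulty: List[float],
                       batch_size: int) -> List[List[Any]]:
    order = sorted(range(len(instances)), key=lambda i: difficulty[i])  # easiest first
    ordered = [instances[i] for i in order]
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
```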
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
- Feature Decomposition for Reducing Negative Transfer: A Novel Multi-task Learning Method for Recommender System [35.165907482126464]
We propose a novel multi-task learning method termed Feature Decomposition Network (FDN).
The key idea of FDN is to reduce feature redundancy by explicitly decomposing features into task-specific features and task-shared features with carefully designed constraints.
Experimental results show that our proposed FDN can outperform the state-of-the-art (SOTA) methods by a noticeable margin.
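The following is a minimal sketch of the feature-decomposition idea, with a simple overlap penalty standing in for FDN's "carefully designed constraints"; it is illustrative only and not the published FDN architecture.

```python
# Sketch: split a representation into task-shared and task-specific parts.
import torch
import torch.nn as nn

class DecomposedFeatures(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, num_tasks: int):
        super().__init__()
        self.shared = nn.Linear(in_dim, out_dim)                    # task-shared projection
        self.specific = nn.ModuleList(
            [nn.Linear(in_dim, out_dim) for _ in range(num_tasks)]  # one projection per task
        )

    def forward(self, x: torch.Tensor, task_id: int):
        s = self.shared(x)
        t = self.specific[task_id](x)
        # Penalize overlap between the shared and task-specific parts,
        # a crude stand-in for FDN's decomposition constraints.
        overlap_penalty = (s * t).sum(dim=-1).pow(2).mean()
        return torch.cat([s, t], dim=-1), overlap_penalty
```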
arXiv Detail & Related papers (2023-02-10T03:08:37Z)
- ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [59.08197876733052]
Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks.
Sometimes, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, a phenomenon known as negative transfer.
ForkMerge is a novel approach that periodically forks the model into multiple branches and automatically searches for suitable task weights.
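A very rough sketch of the fork-and-merge idea follows: copies of the model are trained as branches under different task weightings, and their parameters are then combined with merge coefficients (e.g., chosen on validation data). The weight-search procedure itself is the core of ForkMerge and is not reproduced here; the function below only shows the parameter interpolation step.

```python
# Sketch: interpolate the parameters of several trained branches.
# `branches` are model state dicts; `alphas` are merge coefficients summing to 1.
from typing import Dict, List
import torch

def merge_branches(branches: List[Dict[str, torch.Tensor]],
                   alphas: List[float]) -> Dict[str, torch.Tensor]:
    return {name: sum(a * b[name] for a, b in zip(alphas, branches))
            for name in branches[0]}
```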
arXiv Detail & Related papers (2023-01-30T02:27:02Z)
- Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning [61.29879000628815]
We show that aligning gradients between tasks is crucial to maximize knowledge transfer.
We propose a simple yet effective method that can efficiently align gradients between tasks.
We extensively validate our method on various multi-task learning and zero-shot cross-lingual transfer tasks.
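The quantity being aligned can be illustrated with the small sketch below, which measures the cosine similarity between two tasks' gradients on the shared parameters; Sequential Reptile achieves alignment implicitly through its update scheme, so this only shows how inter-task conflict can be measured, not the method itself.

```python
# Sketch: cosine similarity between two task losses' gradients on the shared
# parameters -- negative values indicate conflicting (interfering) tasks.
import torch

def gradient_alignment(loss_a: torch.Tensor, loss_b: torch.Tensor, shared_params) -> float:
    g_a = torch.autograd.grad(loss_a, shared_params, retain_graph=True)
    g_b = torch.autograd.grad(loss_b, shared_params, retain_graph=True)
    flat_a = torch.cat([g.flatten() for g in g_a])
    flat_b = torch.cat([g.flatten() for g in g_b])
    return torch.nn.functional.cosine_similarity(flat_a, flat_b, dim=0).item()
```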
arXiv Detail & Related papers (2021-10-06T09:10:10Z)
- On-edge Multi-task Transfer Learning: Model and Practice with Data-driven Task Allocation [20.20889051697198]
We show that task allocation with task importance for Multi-task Transfer Learning (MTL), referred to as TATIM, is a variant of the NP-complete Knapsack problem.
We propose a Data-driven Cooperative Task Allocation (DCTA) approach to solve TATIM with high computational efficiency.
Our DCTA reduces processing time by a factor of 3.24 and saves 48.4% of energy consumption compared with the state of the art when solving TATIM.
arXiv Detail & Related papers (2021-07-06T08:24:25Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer when fine-tuning the target model.
Experiments on various real-world datasets show that our method consistently improves over standard fine-tuning by more than 2% on average.
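At a high level, the fine-tuning objective described above can be sketched as a task loss plus a feature-matching regularizer that keeps the target features close to the target-relevant part of the source representation. The disentanglement step that selects that relevant part is TRED's contribution and is not shown; all names below are hypothetical.

```python
# Sketch: fine-tuning loss with a regularizer toward target-relevant source knowledge.
import torch
import torch.nn.functional as F

def regularized_finetune_loss(logits: torch.Tensor, labels: torch.Tensor,
                              target_feat: torch.Tensor,
                              relevant_source_feat: torch.Tensor,
                              lam: float = 0.1) -> torch.Tensor:
    task_loss = F.cross_entropy(logits, labels)
    # Keep target features close to the source knowledge deemed relevant to the target task.
    reg = F.mse_loss(target_feat, relevant_source_feat)
    return task_loss + lam * reg
```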
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- Learning Boost by Exploiting the Auxiliary Task in Multi-task Domain [1.2183405753834562]
Learning two tasks in a single shared function has some benefits.
It helps the shared function generalize by exploiting information that applies to both tasks.
However, in real environments, tasks inevitably conflict with each other during learning, a phenomenon called negative transfer.
We introduce a novel approach that can drive positive transfer and suppress negative transfer by leveraging class-wise weights in the learning process.
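A minimal sketch of class-wise loss weighting is given below; how the per-class weights are set so that they promote positive transfer and suppress negative transfer is the method's contribution and is left as an input here.

```python
# Sketch: apply per-class weights to a task's cross-entropy loss.
from typing import Dict
import torch
import torch.nn.functional as F

def class_weighted_loss(logits: torch.Tensor, labels: torch.Tensor,
                        class_weights: Dict[int, torch.Tensor],
                        task_id: int) -> torch.Tensor:
    # `class_weights[task_id]` holds one weight per class for this task:
    # larger values amplify classes that transfer positively, smaller values
    # down-weight classes that conflict with the other task.
    return F.cross_entropy(logits, labels, weight=class_weights[task_id])
```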
arXiv Detail & Related papers (2020-08-05T10:56:56Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
We further extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)