Efficient strategies for hierarchical text classification: External
knowledge and auxiliary tasks
- URL: http://arxiv.org/abs/2005.02473v2
- Date: Fri, 22 May 2020 13:08:02 GMT
- Title: Efficient strategies for hierarchical text classification: External
knowledge and auxiliary tasks
- Authors: Kervy Rivas Rojas, Gina Bustamante, Arturo Oncevay, Marco A.
Sobrevilla Cabezudo
- Abstract summary: We perform a sequence of inference steps to predict the category of a document from top to bottom of a given class taxonomy.
With our efficient approaches, we outperform previous studies on two well-known English datasets while using a drastically reduced number of parameters.
- Score: 3.5557219875516655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In hierarchical text classification, we perform a sequence of inference steps
to predict the category of a document from top to bottom of a given class
taxonomy. Most studies have focused on developing novel neural network
architectures to deal with the hierarchical structure, but we prefer to look
for efficient ways to strengthen a baseline model. We first define the task as
a sequence-to-sequence problem. Afterwards, we propose an auxiliary synthetic
task of bottom-up classification. Then, from external dictionaries, we retrieve
textual definitions for the classes of all the hierarchy's layers, and map them
into the word vector space. We use the class-definition embeddings as an
additional input to condition the prediction of the next layer and in an
adapted beam search. Whereas the modified search did not provide large gains,
the combination of the auxiliary task and the additional input of
class-definitions significantly enhances the classification accuracy. With our
efficient approaches, we outperform previous studies on two well-known English
datasets while using a drastically reduced number of parameters.
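The top-down inference described in the abstract can be sketched as a greedy descent through the taxonomy, where each step is conditioned on the previous prediction's class-definition embedding. The toy taxonomy, random embeddings, and function below are illustrative stand-ins only; the paper uses a trained sequence-to-sequence model and an adapted beam search rather than this greedy loop.

```python
import numpy as np

# Toy two-level taxonomy (illustrative; not the paper's datasets).
TAXONOMY = {
    "root": ["science", "sports"],
    "science": ["physics", "biology"],
    "sports": ["soccer", "tennis"],
}

# Stand-in class-definition embeddings; the paper builds these by mapping
# dictionary definitions of each class into the word-vector space.
rng = np.random.default_rng(0)
CLASS_EMB = {c: rng.normal(size=8) for cs in TAXONOMY.values() for c in cs}

def predict_path(doc_vec, taxonomy=TAXONOMY, emb=CLASS_EMB):
    """Greedy top-down inference: at each level of the taxonomy, score the
    children of the previously predicted class and descend to the best one.
    The parent's definition embedding is added as extra conditioning input,
    mirroring the paper's use of class-definition embeddings."""
    path, node = [], "root"
    while node in taxonomy:
        # Condition the next prediction on the parent class definition
        # (the root has no definition embedding, so it contributes nothing).
        context = doc_vec + emb.get(node, 0.0)
        scores = {c: float(context @ emb[c]) for c in taxonomy[node]}
        node = max(scores, key=scores.get)
        path.append(node)
    return path
```

Replacing the `max` selection with a per-level top-k expansion would recover a beam-search variant of the same loop.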
Related papers
- TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision [41.05874642535256]
Hierarchical text classification aims to categorize each document into a set of classes in a label taxonomy.
Most earlier works focus on fully or semi-supervised methods that require a large amount of human annotated data.
We work on hierarchical text classification with the minimal amount of supervision: using the sole class name of each node as the only supervision.
arXiv Detail & Related papers (2024-02-29T22:26:07Z)
- Learning-to-Rank Meets Language: Boosting Language-Driven Ordering Alignment for Ordinal Classification [60.28913031192201]
We present a novel language-driven ordering alignment method for ordinal classification.
Recent developments in pre-trained vision-language models inspire us to leverage the rich ordinal priors in human language.
Experiments on three ordinal classification tasks, including facial age estimation, historical color image (HCI) classification, and aesthetic assessment demonstrate its promising performance.
arXiv Detail & Related papers (2023-06-24T04:11:31Z)
- TaxoKnow: Taxonomy as Prior Knowledge in the Loss Function of Multi-class Classification [1.130757825611188]
We introduce two methods to integrate the hierarchical taxonomy as an explicit regularizer into the loss function of learning algorithms.
By reasoning on a hierarchical taxonomy, a neural network adjusts its output distribution over the classes, allowing minority classes to be conditioned on their upper-level concepts.
arXiv Detail & Related papers (2023-05-24T08:08:56Z)
- AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning [53.32576252950481]
Continual learning aims to enable a model to incrementally learn knowledge from sequentially arrived data.
In this paper, we propose a non-incremental learner, named AttriCLIP, to incrementally extract knowledge of new classes or tasks.
arXiv Detail & Related papers (2023-05-19T07:39:17Z)
- Autoregressive Search Engines: Generating Substrings as Document Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de facto standard for generating answers.
Previous work has explored ways to partition the search space into hierarchical structures.
In this work we propose an alternative that does not force any structure on the search space: using all n-grams in a passage as its possible identifiers.
arXiv Detail & Related papers (2022-04-22T10:45:01Z)
- TopicNet: Semantic Graph-Guided Topic Discovery [51.71374479354178]
Existing deep hierarchical topic models are able to extract semantically meaningful topics from a text corpus in an unsupervised manner.
We introduce TopicNet as a deep hierarchical topic model that can inject prior structural knowledge as an inductive bias to influence learning.
arXiv Detail & Related papers (2021-10-27T09:07:14Z)
- TagRec: Automated Tagging of Questions with Hierarchical Learning Taxonomy [0.0]
Online educational platforms organize academic questions based on a hierarchical learning taxonomy (subject-chapter-topic).
This paper formulates the problem as a similarity-based retrieval task where we optimize the semantic relatedness between the taxonomy and the questions.
We demonstrate that our method helps to handle the unseen labels and hence can be used for taxonomy tagging in the wild.
arXiv Detail & Related papers (2021-07-03T11:50:55Z)
- Inducing a hierarchy for multi-class classification problems [11.58041597483471]
In applications where categorical labels follow a natural hierarchy, classification methods that exploit the label structure often outperform those that do not.
In this paper, we investigate a class of methods that induce a hierarchy that can similarly improve classification performance over flat classifiers.
We demonstrate the effectiveness of the class of methods both for discovering a latent hierarchy and for improving accuracy in principled simulation settings and three real data applications.
arXiv Detail & Related papers (2021-02-20T05:40:42Z)
- Automated Concatenation of Embeddings for Structured Prediction [75.44925576268052]
We propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks.
We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model.
arXiv Detail & Related papers (2020-10-10T14:03:20Z)
- Exploring the Hierarchy in Relation Labels for Scene Graph Generation [75.88758055269948]
Experiments show that the proposed simple yet effective method can improve several state-of-the-art baselines by a large margin (up to 33% relative gain) in terms of Recall@50.
arXiv Detail & Related papers (2020-09-12T17:36:53Z)
- Rank over Class: The Untapped Potential of Ranking in Natural Language Processing [8.637110868126546]
We argue that many tasks which are currently addressed using classification are in fact being shoehorned into a classification mould.
We propose a novel end-to-end ranking approach consisting of a Transformer network responsible for producing representations for a pair of text sequences.
In an experiment on a heavily-skewed sentiment analysis dataset, converting ranking results to classification labels yields an approximately 22% improvement over state-of-the-art text classification.
arXiv Detail & Related papers (2020-09-10T22:18:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.