HICL: Hashtag-Driven In-Context Learning for Social Media Natural
Language Understanding
- URL: http://arxiv.org/abs/2308.09985v1
- Date: Sat, 19 Aug 2023 11:31:45 GMT
- Authors: Hanzhuo Tan, Chunpu Xu, Jing Li, Yuqun Zhang, Zeyang Fang, Zeyu Chen,
Baohua Lai
- Abstract summary: In this paper, we propose a novel hashtag-driven in-context learning framework for natural language understanding on social media.
Our objective is to enable a model, #Encoder, to incorporate topic-related semantic information, allowing it to retrieve topic-related posts.
For empirical studies, we collected 45M tweets to set up an in-context NLU benchmark, and the experimental results on seven downstream tasks show that HICL substantially advances the previous state-of-the-art results.
- Score: 15.743523533234224
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Natural language understanding (NLU) is integral to various social media
applications. However, existing NLU models rely heavily on context for semantic
learning, resulting in compromised performance when faced with short and noisy
social media content. To address this issue, we leverage in-context learning
(ICL), wherein language models learn to make inferences by conditioning on a
handful of demonstrations, to enrich the context, and propose a novel
hashtag-driven in-context learning (HICL) framework. Concretely, we pre-train a
model #Encoder, which employs #hashtags (user-annotated topic labels) to drive
BERT-based pre-training through contrastive learning. Our objective here is to
enable #Encoder to incorporate topic-related semantic information, allowing it
to retrieve topic-related posts that enrich contexts and enhance social media
NLU on noisy content. To further integrate the
retrieved context with the source text, we employ a gradient-based method to
identify trigger terms useful in fusing information from both sources. For
empirical studies, we collected 45M tweets to set up an in-context NLU
benchmark, and the experimental results on seven downstream tasks show that
HICL substantially advances the previous state-of-the-art results. Furthermore,
we conducted extensive analyses and found that: (1) combining the source input
with a top-retrieved post from #Encoder is more effective than using semantically
similar posts; (2) trigger words substantially help to merge context from the
source and the retrieved posts.
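To make the pipeline above concrete, the sketch below retrieves a topic-related post with a sentence encoder and fuses it with the source tweet through gradient-based trigger terms. It is a minimal illustration under stated assumptions: the public bert-base-uncased checkpoint stands in for the pre-trained #Encoder, the saliency head is an untrained placeholder, and the fusion template, corpus, and example tweets are hypothetical rather than the paper's exact setup.

```python
# Minimal sketch of HICL-style context enrichment (not the authors' implementation).
# Assumptions: bert-base-uncased stands in for the hashtag-pretrained #Encoder,
# the saliency head is untrained, and the fusion template is illustrative.
import torch
from transformers import AutoModel, AutoModelForSequenceClassification, AutoTokenizer

ENCODER_NAME = "bert-base-uncased"  # placeholder for the released #Encoder weights
tokenizer = AutoTokenizer.from_pretrained(ENCODER_NAME)
encoder = AutoModel.from_pretrained(ENCODER_NAME)
# Hypothetical task model; in HICL the trigger terms come from the downstream task model.
task_model = AutoModelForSequenceClassification.from_pretrained(ENCODER_NAME, num_labels=2)


def embed(texts):
    """Mean-pooled sentence embeddings used for topic-based retrieval."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state           # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()      # (B, T, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)       # (B, H)


def retrieve_top_post(source, corpus):
    """Return the corpus post closest to the source post in embedding space."""
    sims = torch.nn.functional.cosine_similarity(embed([source]), embed(corpus))
    return corpus[int(sims.argmax())]


def trigger_terms(text, k=3):
    """Toy gradient-based saliency: rank tokens by the gradient norm of the top
    logit w.r.t. their input embeddings (the paper's selection differs in detail)."""
    batch = tokenizer(text, return_tensors="pt")
    embeds = task_model.get_input_embeddings()(batch["input_ids"])
    embeds.retain_grad()
    logits = task_model(inputs_embeds=embeds, attention_mask=batch["attention_mask"]).logits
    logits.max().backward()
    scores = embeds.grad.norm(dim=-1).squeeze(0)               # (T,)
    tokens = tokenizer.convert_ids_to_tokens(batch["input_ids"][0])
    top = scores.topk(min(k, len(tokens))).indices.tolist()
    return [tokens[i] for i in top if tokens[i] not in ("[CLS]", "[SEP]")]


source_tweet = "so hyped for the finals tonight #NBA"
corpus = ["the #NBA finals tip off tonight", "new #recipe for banana bread"]
retrieved = retrieve_top_post(source_tweet, corpus)
enriched_input = f"{source_tweet} {' '.join(trigger_terms(source_tweet))} {retrieved}"
print(enriched_input)  # enriched context fed to the downstream NLU model
```

In the actual framework, the retrieval corpus is the 45M-tweet collection and trigger terms are selected with the downstream task model, so this sketch only conveys the overall data flow.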
Related papers
- Text-Video Retrieval with Global-Local Semantic Consistent Learning [122.15339128463715]
We propose a simple yet effective method, Global-Local Semantic Consistent Learning (GLSCL).
GLSCL capitalizes on latent shared semantics across modalities for text-video retrieval.
Our method achieves performance comparable to the SOTA while being nearly 220 times faster in terms of computational cost.
arXiv Detail & Related papers (2024-05-21T11:59:36Z)
- Enhancing Contextual Understanding in Large Language Models through Contrastive Decoding [9.2433070542025]
Large language models (LLMs) tend to inadequately integrate input context during text generation.
We introduce a novel approach integrating contrastive decoding with adversarial irrelevant passages as negative samples.
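As a rough illustration of the idea summarized above, the sketch below contrasts next-token log-probabilities conditioned on a relevant passage against those conditioned on an irrelevant passage used as a negative, so context-grounded tokens are boosted. The model (gpt2), the prompt format, and the weight alpha are illustrative assumptions, not that paper's exact method.

```python
# Hedged sketch of contrastive decoding with an irrelevant passage as the negative
# (illustrative only; gpt2, the prompt template, and alpha are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder causal LM
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
lm = AutoModelForCausalLM.from_pretrained(MODEL_NAME)


def next_token_logprobs(passage, question):
    """Log-probabilities of the next token given a passage-conditioned prompt."""
    ids = tok(f"{passage}\nQuestion: {question}\nAnswer:", return_tensors="pt").input_ids
    with torch.no_grad():
        logits = lm(ids).logits[:, -1, :]                     # (1, V)
    return torch.log_softmax(logits, dim=-1)


def contrastive_next_token(question, relevant, irrelevant, alpha=0.5):
    """Boost tokens supported by the relevant passage and penalize tokens the
    irrelevant (negative) passage would also produce."""
    pos = next_token_logprobs(relevant, question)
    neg = next_token_logprobs(irrelevant, question)
    return tok.decode((pos - alpha * neg).argmax(dim=-1))


print(contrastive_next_token(
    "What color is the sky at noon?",
    relevant="On a clear day the midday sky appears blue.",
    irrelevant="Bananas are yellow and rich in potassium.",
))
```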
arXiv Detail & Related papers (2024-05-04T20:38:41Z)
- Vocabulary-Defined Semantics: Latent Space Clustering for Improving In-Context Learning [32.178931149612644]
In-context learning enables language models to adapt to downstream data or new tasks using a few samples as demonstrations within the prompts.
However, the performance of in-context learning can be unstable depending on the quality, format, or order of demonstrations.
We propose a novel approach, "vocabulary-defined semantics".
arXiv Detail & Related papers (2024-01-29T14:29:48Z)
- Generative Context-aware Fine-tuning of Self-supervised Speech Models [54.389711404209415]
We study the use of context information generated by large language models (LLMs).
We propose an approach to distill the generated information during fine-tuning of self-supervised speech models.
We evaluate the proposed approach using the SLUE and Libri-light benchmarks for several downstream tasks: automatic speech recognition, named entity recognition, and sentiment analysis.
arXiv Detail & Related papers (2023-12-15T15:46:02Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- IERL: Interpretable Ensemble Representation Learning -- Combining CrowdSourced Knowledge and Distributed Semantic Representations [11.008412414253662]
Large Language Models (LLMs) encode meanings of words in the form of distributed semantics.
Recent studies have shown that LLMs tend to generate unintended, inconsistent, or wrong texts as outputs.
We propose a novel ensemble learning method, Interpretable Ensemble Representation Learning (IERL), that systematically combines LLM and crowdsourced knowledge representations.
arXiv Detail & Related papers (2023-06-24T05:02:34Z)
- Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning [51.90524745663737]
A key innovation is our use of explanations as features, which can be used to boost GNN performance on downstream tasks.
Our method achieves state-of-the-art results on well-established TAG datasets.
Our method significantly speeds up training, achieving a 2.88 times improvement over the closest baseline on ogbn-arxiv.
arXiv Detail & Related papers (2023-05-31T03:18:03Z)
- Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning [77.7070536959126]
In-context learning (ICL) has emerged as a promising capability of large language models (LLMs).
In this paper, we investigate the working mechanism of ICL through an information flow lens.
We introduce an anchor re-weighting method to improve ICL performance, a demonstration compression technique to expedite inference, and an analysis framework for diagnosing ICL errors in GPT2-XL.
arXiv Detail & Related papers (2023-05-23T15:26:20Z)
- Abstractive Summarization of Spoken and Written Instructions with BERT [66.14755043607776]
We present the first application of the BERTSum model to conversational language.
We generate abstractive summaries of narrated instructional videos across a wide variety of topics.
We envision this being integrated as a feature in intelligent virtual assistants, enabling them to summarize both written and spoken instructional content upon request.
arXiv Detail & Related papers (2020-08-21T20:59:34Z)