Small Stickers, Big Meanings: A Multilingual Sticker Semantic Understanding Dataset with a Gamified Approach
- URL: http://arxiv.org/abs/2506.01668v1
- Date: Mon, 02 Jun 2025 13:38:45 GMT
- Title: Small Stickers, Big Meanings: A Multilingual Sticker Semantic Understanding Dataset with a Gamified Approach
- Authors: Heng Er Metilda Chee, Jiayin Wang, Zhiqiang Guo, Weizhi Ma, Min Zhang
- Abstract summary: First, we introduce Sticktionary, a gamified annotation framework designed to gather diverse, high-quality, and contextually resonant sticker queries. Second, we present StickerQueries, a multilingual sticker query dataset containing 1,115 English and 615 Chinese queries, annotated by over 60 contributors across 60+ hours. Third, we demonstrate that our approach significantly enhances query generation quality, retrieval accuracy, and semantic understanding in the sticker domain.
- Score: 21.279568613306573
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stickers, though small, are a highly condensed form of visual expression, ubiquitous across messaging platforms and embraced by diverse cultures, genders, and age groups. Despite their popularity, sticker retrieval remains an underexplored task due to the significant human effort and subjectivity involved in constructing high-quality sticker query datasets. Although large language models (LLMs) excel at general NLP tasks, they falter when confronted with the nuanced, intangible, and highly specific nature of sticker query generation. To address this challenge, we propose a threefold solution. First, we introduce Sticktionary, a gamified annotation framework designed to gather diverse, high-quality, and contextually resonant sticker queries. Second, we present StickerQueries, a multilingual sticker query dataset containing 1,115 English and 615 Chinese queries, annotated by over 60 contributors across 60+ hours. Lastly, through extensive quantitative and qualitative evaluation, we demonstrate that our approach significantly enhances query generation quality, retrieval accuracy, and semantic understanding in the sticker domain. To support future research, we publicly release our multilingual dataset along with two fine-tuned query generation models.
Related papers
- MERIT: Multilingual Semantic Retrieval with Interleaved Multi-Condition Query [55.486895951981566]
This paper introduces MERIT, the first multilingual dataset for interleaved multi-condition semantic retrieval.
arXiv Detail & Related papers (2025-06-03T17:59:14Z)
- U-Sticker: A Large-Scale Multi-Domain User Sticker Dataset for Retrieval and Personalization [20.082343227750282]
We introduce U-Sticker, a dataset that includes temporal and anonymous user IDs across conversations. The raw data was collected from a popular messaging platform, covering 67 conversations over 720 hours of crawling. The dataset captures rich temporal, multilingual, and cross-domain behaviors not previously available in other datasets.
arXiv Detail & Related papers (2025-02-26T12:50:58Z)
- PerSRV: Personalized Sticker Retrieval with Vision-Language Model [21.279568613306573]
We propose the Personalized Sticker Retrieval with Vision-Language Model framework, namely PerSRV, structured into offline calculation and online processing modules. For sticker-level semantic understanding, we fine-tune LLaVA-1.5-7B under supervision to generate human-like sticker semantics. Finally, we cluster style centroids based on users' historical interactions to model personal preferences.
arXiv Detail & Related papers (2024-10-29T07:13:47Z)
- CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark [68.21939124278065]
CVQA is a culturally diverse multilingual Visual Question Answering benchmark designed to cover a rich set of languages and cultures.
CVQA includes culturally-driven images and questions from across 30 countries on four continents, covering 31 languages with 13 scripts, providing a total of 10k questions.
We benchmark several Multimodal Large Language Models (MLLMs) on CVQA, and show that the dataset is challenging for the current state-of-the-art models.
arXiv Detail & Related papers (2024-06-10T01:59:00Z)
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering [58.92057773071854]
We introduce MTVQA, the first benchmark featuring high-quality human expert annotations across 9 diverse languages.
arXiv Detail & Related papers (2024-05-20T12:35:01Z)
- Impact of Stickers on Multimodal Chat Sentiment Analysis and Intent Recognition: A New Task, Dataset and Baseline [4.375392069380812]
We propose a new task: Multimodal chat Sentiment Analysis and Intent Recognition involving Stickers (MSAIRS).
We introduce a novel multimodal dataset containing Chinese chat records and stickers excerpted from several mainstream social media platforms.
Our dataset and code will be publicly available.
arXiv Detail & Related papers (2024-05-14T08:42:49Z)
- Sticker820K: Empowering Interactive Retrieval with Stickers [34.67442172774095]
We propose a large-scale Chinese sticker dataset, namely Sticker820K, which consists of 820k image-text pairs.
Each sticker has rich and high-quality textual annotations, including descriptions, optical characters, emotional labels, and style classifications.
For the text-to-image retrieval task, our StickerCLIP demonstrates strong superiority over CLIP, achieving an absolute gain of 66.0% in mean recall.
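Mean recall, the metric cited above, is commonly computed as the average of Recall@1, Recall@5, and Recall@10 over a gallery ranked by cosine similarity. The following is an illustrative sketch under that assumption, not code from the paper; the embeddings, function names, and the averaging convention are all hypothetical:

```python
# Illustrative sketch (not from the paper): mean recall for text-to-image
# retrieval. Assumes query i's ground-truth gallery item is at index i,
# and averages Recall@{1,5,10}, a common retrieval-benchmark convention.
import numpy as np

def recall_at_k(sim: np.ndarray, k: int) -> float:
    """Fraction of queries whose ground-truth item (index i for query i)
    appears among the top-k most similar gallery items."""
    # Rank gallery items for each query by descending similarity.
    topk = np.argsort(-sim, axis=1)[:, :k]
    hits = (topk == np.arange(sim.shape[0])[:, None]).any(axis=1)
    return float(hits.mean())

def mean_recall(text_emb: np.ndarray, image_emb: np.ndarray) -> float:
    # L2-normalize so the dot product equals cosine similarity.
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    sim = t @ v.T  # shape: (num_queries, num_gallery)
    return float(np.mean([recall_at_k(sim, k) for k in (1, 5, 10)]))

# Tiny synthetic check: identical embeddings give a perfect score.
rng = np.random.default_rng(0)
emb = rng.normal(size=(32, 64))
print(mean_recall(emb, emb))  # 1.0
```

An "absolute gain of 66.0%" then means the mean-recall value itself (not a relative ratio) is 0.66 higher for StickerCLIP than for the CLIP baseline.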
arXiv Detail & Related papers (2023-06-12T05:06:53Z)
- Selecting Stickers in Open-Domain Dialogue through Multitask Learning [51.67855506570727]
We propose a multitask learning method comprising three auxiliary tasks to enhance the understanding of dialogue history, emotion, and the semantic meaning of stickers.
Our model can better combine the multimodal information and achieve significantly higher accuracy over strong baselines.
arXiv Detail & Related papers (2022-09-16T03:45:22Z)
- AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples [51.048234591165155]
We present AM2iCo, Adversarial and Multilingual Meaning in Context.
It aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts.
Results reveal that current SotA pretrained encoders substantially lag behind human performance.
arXiv Detail & Related papers (2021-04-17T20:23:45Z)
- Learning to Respond with Your Favorite Stickers: A Framework of Unifying Multi-Modality and User Preference in Multi-Turn Dialog [67.91114640314004]
Stickers with vivid and engaging expressions are becoming increasingly popular in online messaging apps.
Some works are dedicated to automatically selecting a sticker response by matching the sticker image with previous utterances.
We propose to recommend an appropriate sticker to the user based on the multi-turn dialog context and the user's sticker usage history.
arXiv Detail & Related papers (2020-11-05T03:31:17Z)
- Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog [65.7021675527543]
Stickers with vivid and engaging expressions are becoming increasingly popular in online messaging apps.
Some works are dedicated to automatically selecting a sticker response by matching the text labels of stickers with previous utterances.
We propose to recommend an appropriate sticker to the user based on the multi-turn dialog context history, without any external labels.
arXiv Detail & Related papers (2020-03-10T13:10:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including this list) and is not responsible for any consequences of its use.