Distant Supervision for E-commerce Query Segmentation via Attention
Network
- URL: http://arxiv.org/abs/2011.04166v1
- Date: Mon, 9 Nov 2020 03:00:52 GMT
- Title: Distant Supervision for E-commerce Query Segmentation via Attention
Network
- Authors: Zhao Li, Donghui Ding, Pengcheng Zou, Yu Gong, Xi Chen, Ji Zhang,
Jianliang Gao, Youxi Wu and Yucong Duan
- Abstract summary: We propose a BiLSTM-CRF based model with an attention module to encode external features, so that external context information can be utilized naturally and effectively to help query segmentation.
Experiments on two datasets show the effectiveness of our approach compared with several kinds of baselines.
- Score: 22.946144345151605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The booming online e-commerce platforms demand highly accurate approaches to
segment queries that carry the product requirements of consumers. Recent works
have shown that the supervised methods, especially those based on deep
learning, are attractive for achieving better performance on the problem of
query segmentation. However, the lack of labeled data is still a big challenge
for training a deep segmentation network, and the problem of Out-of-Vocabulary
(OOV) also adversely impacts the performance of query segmentation. Different
from the query segmentation task in an open domain, the e-commerce scenario provides
external documents that are closely related to these queries. Thus, to deal
with the two challenges, we employ the idea of distant supervision and design a
novel method to find contexts in external documents and extract features from
these contexts. In this work, we propose a BiLSTM-CRF based model with an
attention module to encode external features, so that external context
information can be utilized naturally and effectively to help query
segmentation. Experiments on two datasets show the effectiveness of our
approach compared with several kinds of baselines.
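To make the described architecture concrete, below is a minimal sketch (not the authors' released code) of a BiLSTM tagger that attends over feature vectors extracted from external documents, assuming PyTorch. The tag set, dimensions, and the bilinear attention form are illustrative choices, and the CRF decoding layer of the paper is replaced by per-token argmax to keep the example short.

```python
# Illustrative sketch only: a BiLSTM query tagger with attention over
# external-context features, loosely following the paper's setup.
import torch
import torch.nn as nn


class AttentiveBiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden_dim=128,
                 ctx_dim=100, num_tags=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.bilstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # Bilinear score between each query position and each context feature.
        self.attn = nn.Linear(2 * hidden_dim, ctx_dim, bias=False)
        self.emit = nn.Linear(2 * hidden_dim + ctx_dim, num_tags)

    def forward(self, query_ids, ctx_feats):
        # query_ids: (B, Lq) token ids of the query
        # ctx_feats: (B, Lc, ctx_dim) features from external documents
        h, _ = self.bilstm(self.embed(query_ids))           # (B, Lq, 2H)
        scores = self.attn(h) @ ctx_feats.transpose(1, 2)   # (B, Lq, Lc)
        alpha = torch.softmax(scores, dim=-1)                # attention weights
        ctx = alpha @ ctx_feats                              # (B, Lq, ctx_dim)
        return self.emit(torch.cat([h, ctx], dim=-1))        # (B, Lq, num_tags)


# Toy usage: 2 queries of length 5, each with 3 external-context vectors.
model = AttentiveBiLSTMTagger(vocab_size=1000)
emissions = model(torch.randint(1, 1000, (2, 5)), torch.randn(2, 3, 100))
pred_tags = emissions.argmax(dim=-1)  # B/I-style segment-boundary tags
```

In the paper's full model a CRF layer would sit on top of the emission scores to decode the tag sequence jointly; any standard CRF implementation can be plugged in at that point.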
Related papers
- GQE: Generalized Query Expansion for Enhanced Text-Video Retrieval [56.610806615527885]
This paper introduces a novel data-centric approach, Generalized Query Expansion (GQE), to address the inherent information imbalance between text and video.
By adaptively segmenting videos into short clips and employing zero-shot captioning, GQE enriches the training dataset with comprehensive scene descriptions.
GQE achieves state-of-the-art performance on several benchmarks, including MSR-VTT, MSVD, LSMDC, and VATEX.
arXiv Detail & Related papers (2024-08-14T01:24:09Z) - Leveraging Inter-Chunk Interactions for Enhanced Retrieval in Large Language Model-Based Question Answering [12.60063463163226]
IIER captures the internal connections between document chunks by considering three types of interactions: structural, keyword, and semantic.
It identifies multiple seed nodes based on the target question and iteratively searches for relevant chunks to gather supporting evidence.
It refines the context and reasoning chain, aiding the large language model in reasoning and answer generation.
arXiv Detail & Related papers (2024-08-06T02:39:55Z) - QueryNER: Segmentation of E-commerce Queries [12.563241705572409]
We present a manually-annotated dataset and accompanying model for e-commerce query segmentation.
Our work instead focuses on the goal of dividing a query into meaningful chunks with broadly applicable types.
arXiv Detail & Related papers (2024-05-15T16:58:35Z) - LaSagnA: Language-based Segmentation Assistant for Complex Queries [39.620806493454616]
Large Language Models for Vision (vLLMs) generate detailed perceptual outcomes, including bounding boxes and masks.
In this study, we acknowledge that the main cause of these problems is the insufficient complexity of training queries.
We present three novel strategies to effectively handle the challenges arising from the direct integration of the proposed format.
arXiv Detail & Related papers (2024-04-12T14:40:45Z) - Open-Vocabulary Camouflaged Object Segmentation [66.94945066779988]
We introduce a new task, open-vocabulary camouflaged object segmentation (OVCOS).
We construct a large-scale complex scene dataset (OVCamo) containing 11,483 hand-selected images with fine annotations and corresponding object classes.
By integrating the guidance of class semantic knowledge and the supplement of visual structure cues from the edge and depth information, the proposed method can efficiently capture camouflaged objects.
arXiv Detail & Related papers (2023-11-19T06:00:39Z) - PPN: Parallel Pointer-based Network for Key Information Extraction with
Complex Layouts [29.73609439825548]
Key Information Extraction is a challenging task that aims to extract structured value semantic entities from documents.
Existing methods follow a two-stage pipeline strategy, which may lead to the error propagation problem.
We introduce Parallel Pointer-based Network (PPN), an end-to-end model that can be applied in zero-shot and few-shot scenarios.
arXiv Detail & Related papers (2023-07-20T03:29:09Z) - Open-vocabulary Panoptic Segmentation with Embedding Modulation [71.15502078615587]
Open-vocabulary image segmentation is attracting increasing attention due to its critical applications in the real world.
Traditional closed-vocabulary segmentation methods are not able to characterize novel objects, whereas several recent open-vocabulary attempts obtain unsatisfactory results.
We propose OPSNet, an omnipotent and data-efficient framework for open-vocabulary panoptic segmentation.
arXiv Detail & Related papers (2023-03-20T17:58:48Z) - Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product
Retrieval [152.3504607706575]
This research aims to conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories.
We first contribute the Product1M datasets, and define two real practical instance-level retrieval tasks.
We exploit an entity graph to train a more effective cross-modal model that adaptively incorporates key concept information from the multi-modal data.
arXiv Detail & Related papers (2022-06-17T15:40:45Z) - Aspect-Oriented Summarization through Query-Focused Extraction [23.62412515574206]
Real users' needs often align more closely with aspects (broad topics in a dataset the user is interested in) than with specific queries.
We benchmark extractive query-focused training schemes, and propose a contrastive augmentation approach to train the model.
We evaluate on two aspect-oriented datasets and find this approach yields focused summaries, better than those from a generic summarization system.
arXiv Detail & Related papers (2021-10-15T18:06:21Z) - BriNet: Towards Bridging the Intra-class and Inter-class Gaps in
One-Shot Segmentation [84.2925550033094]
Few-shot segmentation focuses on the generalization of models to segment unseen object instances with limited training samples.
We propose a framework, BriNet, to bridge the gaps between the extracted features of the query and support images.
Experimental results demonstrate the effectiveness of our framework, which outperforms other competitive methods.
arXiv Detail & Related papers (2020-08-14T07:45:50Z) - Query Focused Multi-Document Summarization with Distant Supervision [88.39032981994535]
Existing work relies heavily on retrieval-style methods for estimating the relevance between queries and text segments.
We propose a coarse-to-fine modeling framework which introduces separate modules for estimating whether segments are relevant to the query.
We demonstrate that our framework outperforms strong comparison systems on standard QFS benchmarks.
arXiv Detail & Related papers (2020-04-06T22:35:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.