EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task
Tasks for E-commerce
- URL: http://arxiv.org/abs/2308.06966v2
- Date: Mon, 28 Aug 2023 04:12:30 GMT
- Title: EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task
Tasks for E-commerce
- Authors: Yangning Li, Shirong Ma, Xiaobin Wang, Shen Huang, Chengyue Jiang,
Hai-Tao Zheng, Pengjun Xie, Fei Huang, Yong Jiang
- Abstract summary: We propose EcomInstruct, the first e-commerce instruction dataset, comprising a total of 2.5 million instruction examples.
EcomGPT outperforms ChatGPT in terms of cross-dataset/task generalization on E-commerce tasks.
- Score: 68.72104414369635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, instruction-following Large Language Models (LLMs), represented by
ChatGPT, have exhibited exceptional performance in general Natural Language
Processing (NLP) tasks. However, the unique characteristics of E-commerce data
pose significant challenges to general LLMs. An LLM tailored specifically for
E-commerce scenarios, possessing robust cross-dataset/task generalization
capabilities, is a pressing necessity. To address this issue, in this work we
propose EcomInstruct, the first e-commerce instruction dataset, with a total of
2.5 million instruction examples. EcomInstruct scales up the data size and task
diversity by constructing atomic tasks from basic E-commerce data types, such
as product information and user reviews. Atomic tasks are defined as intermediate
tasks implicitly involved in solving a final task, which we also call
Chain-of-Task tasks. We develop EcomGPT at different parameter scales by
training the backbone model BLOOMZ on EcomInstruct. Benefiting from the
fundamental semantic understanding capabilities acquired from the Chain-of-Task
tasks, EcomGPT exhibits excellent zero-shot generalization capabilities.
Extensive experiments and human evaluations demonstrate that EcomGPT
outperforms ChatGPT in terms of cross-dataset/task generalization on E-commerce
tasks.
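The abstract describes the recipe only at a high level. As a rough, hedged illustration (not the paper's actual data format or training code), the sketch below shows what building one atomic-task instruction example from basic product data and running a single causal-LM fine-tuning step on a BLOOMZ backbone could look like. The record schema, prompt template, loss masking, and the choice of the bigscience/bloomz-560m checkpoint are assumptions made for illustration.

```python
# Illustrative sketch only: the EcomInstruct record schema and prompt template
# below are assumptions, not details taken from the paper. The general recipe
# it mirrors: turn basic E-commerce data (product info, user reviews) into
# atomic "Chain-of-Task" instruction examples, then fine-tune a BLOOMZ backbone
# with a standard causal language-modeling objective.
from transformers import AutoModelForCausalLM, AutoTokenizer

# One hypothetical atomic-task record: extract a product attribute from a title.
record = {
    "instruction": "Extract the brand name from the product title.",
    "input": "Title: Acme Pro 20V Cordless Drill with 2 Batteries",
    "output": "Acme",
}

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")

# Concatenate instruction, input, and target into one training sequence.
prompt = f"{record['instruction']}\n{record['input']}\nAnswer: "
full_text = prompt + record["output"] + tokenizer.eos_token
enc = tokenizer(full_text, return_tensors="pt")

# Mask the prompt tokens so the loss is computed only on the response.
labels = enc["input_ids"].clone()
prompt_len = tokenizer(prompt, return_tensors="pt")["input_ids"].shape[1]
labels[:, :prompt_len] = -100

# One causal-LM fine-tuning step; real training would loop over the
# ~2.5M EcomInstruct examples with batching and an optimizer.
loss = model(**enc, labels=labels).loss
loss.backward()
print(f"training loss: {loss.item():.4f}")
```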
Related papers
- From Instance Training to Instruction Learning: Task Adapters Generation from Instructions [29.452006810725184]
This paper focuses on simulating human learning to address the shortcomings of instance training.
We introduce Task Adapters Generation from Instructions (TAGI), which automatically constructs the task-specific model.
We evaluate TAGI on the Super-Natural Instructions and P3 datasets.
arXiv Detail & Related papers (2024-06-18T08:14:28Z) - TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data [73.29220562541204]
We consider harnessing the power of large language models (LLMs) to solve this task.
We develop a TAT-LLM language model by fine-tuning LLaMA 2 with the training data generated automatically from existing expert-annotated datasets.
arXiv Detail & Related papers (2024-01-24T04:28:50Z) - Distribution Matching for Multi-Task Learning of Classification Tasks: a
Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks whose annotations overlap little or not at all.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z) - EcomGPT-CT: Continual Pre-training of E-commerce Large Language Models
with Semi-structured Data [67.8302955948861]
Large Language Models (LLMs) pre-trained on massive corpora have exhibited remarkable performance on various NLP tasks.
Applying these models to specific domains still poses significant challenges, such as lack of domain knowledge.
We focus on domain-specific continual pre-training of LLMs using E-commerce domain as an exemplar.
arXiv Detail & Related papers (2023-12-25T11:31:47Z) - PUMGPT: A Large Vision-Language Model for Product Understanding [18.70740237744492]
PumGPT is the first e-commerce specialized LVLM designed for multimodal product understanding tasks.
Our experiments show that PumGPT outperforms five other open-source LVLMs and GPT-4V in product understanding tasks.
arXiv Detail & Related papers (2023-08-18T14:01:37Z) - Towards Task Sampler Learning for Meta-Learning [37.02030832662183]
Meta-learning aims to learn general knowledge from diverse training tasks constructed from limited data, and then transfer it to new tasks.
It is commonly believed that increasing task diversity will enhance the generalization ability of meta-learning models.
This paper challenges this view through empirical and theoretical analysis.
arXiv Detail & Related papers (2023-07-18T01:53:18Z) - Zero-shot Item-based Recommendation via Multi-task Product Knowledge
Graph Pre-Training [106.85813323510783]
This paper presents a novel paradigm for the Zero-Shot Item-based Recommendation (ZSIR) task.
It pre-trains a model on a product knowledge graph (PKG) to refine the item features from PLMs.
We identify three challenges for pre-training on PKG: multi-type relations in PKG, semantic divergence between generic item information and relations, and domain discrepancy between PKG and the downstream ZSIR task.
arXiv Detail & Related papers (2023-05-12T17:38:24Z) - Learning Instance-Level Representation for Large-Scale Multi-Modal
Pretraining in E-commerce [35.73830796500975]
We propose an instance-centric multi-modal pretraining paradigm called ECLIP in this work.
To enable the model to focus on the desired product instance without reliance on expensive manual annotations, two specially configured pretext tasks are proposed.
ECLIP surpasses existing methods by a large margin on a broad range of downstream tasks, demonstrating the strong transferability to real-world E-commerce applications.
arXiv Detail & Related papers (2023-04-06T04:14:41Z) - Learning to Perform Complex Tasks through Compositional Fine-Tuning of
Language Models [20.173322408302134]
Compositional fine-tuning (CFT) is an approach based on explicitly decomposing a target task into its component tasks.
We show that CFT outperforms end-to-end learning even with equal amounts of data.
arXiv Detail & Related papers (2022-10-23T03:22:34Z) - Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z)
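As a minimal, hedged illustration of the task-prefix idea summarized for Task Compass above: prepend a short task identifier to each example so that one shared model can see which task each sequence comes from during multi-task pre-training. The prefix strings and toy tasks below are invented for illustration and are not the prefixes used in that paper.

```python
# Toy sketch of task-prefix guided multi-task pre-training data preparation.
# All tasks are flattened into one stream of prefixed (input, target) pairs
# that a single shared backbone can be pre-trained on.
from typing import List, Tuple

def add_task_prefix(task: str, text: str) -> str:
    """Prepend a bracketed task tag to the raw input text."""
    return f"[{task}] {text}"

# A hypothetical multi-task mixture: (task name, input text, target).
mixture: List[Tuple[str, str, str]] = [
    ("sentiment", "The battery dies within an hour.", "negative"),
    ("nli", "premise: A dog runs. hypothesis: An animal moves.", "entailment"),
    ("paraphrase", "s1: cheap phone case s2: budget phone cover", "paraphrase"),
]

for task, text, target in mixture:
    print(add_task_prefix(task, text), "->", target)
```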
This list is automatically generated from the titles and abstracts of the papers on this site.