eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data
- URL: http://arxiv.org/abs/2402.08831v2
- Date: Sat, 3 Aug 2024 23:29:26 GMT
- Title: eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data
- Authors: Bo Peng, Xinyi Ling, Ziru Chen, Huan Sun, Xia Ning
- Abstract summary: We construct ECInstruct, the first open-sourced, large-scale, and high-quality benchmark instruction dataset for e-commerce.
We develop eCeLLM, a series of e-commerce LLMs, by instruction-tuning general-purpose LLMs.
eCeLLM exhibits excellent generalizability to out-of-domain settings, including unseen products and unseen instructions.
- Score: 12.895762133464103
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite tremendous efforts on developing effective e-commerce models, conventional e-commerce models show limited success in generalist e-commerce modeling and suffer from unsatisfactory performance on new users and new products, a typical out-of-domain generalization challenge. Meanwhile, large language models (LLMs) demonstrate outstanding performance in generalist modeling and out-of-domain generalizability in many fields. To fully unleash their power for e-commerce, in this paper we construct ECInstruct, the first open-sourced, large-scale, and high-quality benchmark instruction dataset for e-commerce. Leveraging ECInstruct, we develop eCeLLM, a series of e-commerce LLMs, by instruction-tuning general-purpose LLMs. Our comprehensive experiments and evaluation demonstrate that eCeLLM models substantially outperform baseline models, including the most advanced GPT-4 and the state-of-the-art task-specific models, in in-domain evaluation. Moreover, eCeLLM exhibits excellent generalizability to out-of-domain settings, including unseen products and unseen instructions, highlighting its superiority as a generalist e-commerce model. Both the ECInstruct dataset and the eCeLLM models show great potential in empowering versatile and effective LLMs for e-commerce. ECInstruct and the eCeLLM models are publicly accessible through https://ninglab.github.io/eCeLLM.
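To make the recipe concrete, here is a minimal sketch of generic instruction tuning of the kind described above, using Hugging Face's trl library. The data file, field names, base model, and hyperparameters are illustrative assumptions, not details taken from the paper, and exact trl argument names vary across versions.

```python
# Minimal instruction-tuning sketch; all identifiers below are illustrative.
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

base = "meta-llama/Llama-2-7b-hf"  # stand-in for any general-purpose base LLM
model = AutoModelForCausalLM.from_pretrained(base)

# Assumed local export of instruction data with instruction/input/output fields.
data = load_dataset("json", data_files="ecinstruct_train.json")["train"]

def to_text(ex):
    # Flatten each record into a single prompt-plus-response training string.
    return {"text": (f"### Instruction:\n{ex['instruction']}\n"
                     f"### Input:\n{ex['input']}\n"
                     f"### Response:\n{ex['output']}")}

data = data.map(to_text)

trainer = SFTTrainer(
    model=model,
    train_dataset=data,
    args=SFTConfig(output_dir="sft-out",
                   dataset_text_field="text",
                   num_train_epochs=1),
)
trainer.train()
```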
Related papers
- Captions Speak Louder than Images (CASLIE): Generalizing Foundation Models for E-commerce from High-quality Multimodal Instruction Data [19.191477918391726]
We introduce MMECInstruct, the first-ever, large-scale, and high-quality multimodal instruction dataset for e-commerce.
We also develop CASLIE, a simple, lightweight, yet effective framework for integrating multimodal information for e-commerce.
arXiv Detail & Related papers (2024-10-22T18:11:43Z)
- Investigating LLM Applications in E-Commerce [17.854070801235217]
Large Language Models (LLMs) have revolutionized natural language processing across a variety of applications, especially in e-commerce.
This paper explores the efficacy of LLMs in the e-commerce domain, focusing on instruction-tuning an open-source LLM with public e-commerce datasets of varying sizes.
We also examine the effectiveness of the current niche industrial practice of applying very large LLMs, via in-context learning, to e-commerce-specific tasks (a sketch of this setup follows this entry).
arXiv Detail & Related papers (2024-08-23T00:57:37Z)
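For context, the in-context-learning setup mentioned above looks roughly like the following: a few labeled examples are placed in the prompt and a frozen LLM completes the new case, with no weight updates. The product titles and categories are invented for illustration.

```python
# Build a few-shot classification prompt; examples are invented for illustration.
few_shot = [
    ("Stainless steel chef knife, 8 inch", "Kitchen & Dining"),
    ("Wireless over-ear headphones with ANC", "Electronics"),
]
query = "Organic cotton baby onesie, 6-9 months"

prompt = "Classify each product title into a category.\n\n"
for title, label in few_shot:
    prompt += f"Title: {title}\nCategory: {label}\n\n"
prompt += f"Title: {query}\nCategory:"

# `prompt` is then sent to any instruction-following LLM; no fine-tuning occurs.
print(prompt)
```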
- Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts [54.529880848937104]
We develop a unified MLLM with the MoE architecture, named Uni-MoE, that can handle a wide array of modalities.
Specifically, it features modality-specific encoders with connectors for a unified multimodal representation.
We evaluate the instruction-tuned Uni-MoE on a comprehensive set of multimodal datasets (a generic MoE layer is sketched after this entry).
arXiv Detail & Related papers (2024-05-18T12:16:01Z)
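As a rough illustration of the mixture-of-experts building block that Uni-MoE scales, here is a generic top-1-routed MoE layer in PyTorch. It is a textbook sketch, not Uni-MoE's implementation, which additionally uses modality-specific encoders and connectors.

```python
# Generic top-1 mixture-of-experts layer (illustrative, not Uni-MoE's code).
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    def __init__(self, d_model: int, n_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # one score per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        gates = self.router(x).softmax(dim=-1)            # (tokens, n_experts)
        top_gate, top_idx = gates.max(dim=-1)             # top-1 routing
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            sel = top_idx == e                            # tokens routed to expert e
            if sel.any():
                out[sel] = top_gate[sel, None] * expert(x[sel])
        return out

layer = MoELayer(d_model=64)
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

Top-1 routing keeps per-token compute roughly constant while total parameter count grows with the number of experts, which is what makes MoE scaling attractive.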
- EcomGPT-CT: Continual Pre-training of E-commerce Large Language Models with Semi-structured Data [67.8302955948861]
Large Language Models (LLMs) pre-trained on massive corpora have exhibited remarkable performance on various NLP tasks.
Applying these models to specific domains still poses significant challenges, such as a lack of domain knowledge.
We focus on domain-specific continual pre-training of LLMs, using the e-commerce domain as an exemplar (a minimal continual pre-training loop is sketched after this entry).
arXiv Detail & Related papers (2023-12-25T11:31:47Z)
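The continual pre-training recipe above amounts to resuming next-token prediction on domain text. Below is a minimal sketch with Hugging Face transformers; the corpus file, base model, and hyperparameters are assumptions for illustration.

```python
# Domain-adaptive continual pre-training sketch (identifiers are illustrative).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "gpt2"  # stand-in for any pre-trained base LLM
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base)

# Assumed plain-text dump of an e-commerce corpus, one document per line.
corpus = load_dataset("text", data_files={"train": "ecommerce_corpus.txt"})["train"]
corpus = corpus.map(lambda ex: tok(ex["text"], truncation=True, max_length=512),
                    remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ct-ecom", num_train_epochs=1),
    train_dataset=corpus,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),  # causal LM loss
)
trainer.train()
```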
- MindLLM: Pre-training Lightweight Large Language Model from Scratch, Evaluations and Domain Applications [46.337078949637345]
We present MindLLM, a novel series of bilingual lightweight large language models, trained from scratch.
The paper gives a thorough account of the experience accrued during large-model development, covering every step of the process.
MindLLM consistently matches or surpasses larger open-source models on several public benchmarks.
arXiv Detail & Related papers (2023-10-24T12:22:34Z)
- Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning [79.53130089003986]
Large Language Models (LLMs) have become a feasible solution for handling tasks in various domains.
In this paper, we describe how to fine-tune an LLM that can be privately deployed for content moderation.
arXiv Detail & Related papers (2023-10-05T09:09:44Z)
- EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task Tasks for E-commerce [68.72104414369635]
We propose the first e-commerce instruction dataset, EcomInstruct, with a total of 2.5 million instruction examples.
EcomGPT outperforms ChatGPT in terms of cross-dataset/task generalization on e-commerce tasks.
arXiv Detail & Related papers (2023-08-14T06:49:53Z)
- LLaMA-E: Empowering E-commerce Authoring with Object-Interleaved Instruction Following [16.800545001782037]
This paper proposes LLaMA-E, a set of unified e-commerce authoring models that address the contextual preferences of customers, sellers, and platforms.
We design the instruction set derived from tasks of ads generation, query-enhanced product title rewriting, product classification, purchase intent speculation, and general e-commerce Q&A.
The proposed LLaMA-E models achieve state-of-the-art evaluation performance and exhibit advantages in zero-shot practical applications.
arXiv Detail & Related papers (2023-08-09T12:26:37Z)
- Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-commerce [35.73830796500975]
We propose an instance-centric multi-modal pretraining paradigm called ECLIP in this work.
To enable the model to focus on the desired product instance without reliance on expensive manual annotations, two specially configured pretext tasks are proposed.
ECLIP surpasses existing methods by a large margin on a broad range of downstream tasks, demonstrating strong transferability to real-world e-commerce applications.
arXiv Detail & Related papers (2023-04-06T04:14:41Z)
- METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals [151.3601429216877]
We present an efficient method of pretraining large-scale autoencoding language models using training signals generated by an auxiliary model.
We propose a recipe named "Model generated dEnoising TRaining Objective" (METRO); the core idea is sketched below.
The resultant models, METRO-LM, with up to 5.4 billion parameters, achieve new state-of-the-art results on the GLUE, SuperGLUE, and SQuAD benchmarks.
arXiv Detail & Related papers (2022-04-13T21:39:15Z)
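METRO's model-generated training signals are in the spirit of ELECTRA-style replaced-token detection: a small auxiliary generator fills in masked tokens, and the main model learns to tell original tokens from generated replacements. The sketch below shows one way such signals can be constructed; it illustrates the idea rather than METRO's exact objective.

```python
# Construct ELECTRA-style model-generated denoising signals (illustrative).
import torch

def make_denoising_batch(token_ids, generator, mask_id, mask_prob=0.15):
    """token_ids: (batch, seq) int tensor; generator: masked LM returning logits."""
    mask = torch.rand_like(token_ids, dtype=torch.float) < mask_prob
    corrupted = token_ids.masked_fill(mask, mask_id)      # hide some tokens
    with torch.no_grad():
        logits = generator(corrupted)                     # (batch, seq, vocab)
        sampled = torch.distributions.Categorical(logits=logits).sample()
    inputs = torch.where(mask, sampled, token_ids)        # generator fills the masks
    labels = (inputs != token_ids).long()                 # 1 = replaced, 0 = original
    return inputs, labels

# The main model is then trained with a token-level binary loss to predict
# `labels` from `inputs`, while the generator trains on the usual MLM loss.
```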
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.