HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent
- URL: http://arxiv.org/abs/2402.01018v1
- Date: Thu, 1 Feb 2024 21:10:44 GMT
- Title: HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent
- Authors: Weijie Xu, Zicheng Huang, Wenxiang Hu, Xi Fang, Rajesh Kumar Cherukuri, Naumaan Nayyar, Lorenzo Malandri, Srinivasan H. Sengamedu
- Abstract summary: We introduce HR-MultiWOZ, a fully labeled dataset of 550 conversations spanning 10 HR domains.
It is the first labeled open-sourced conversation dataset in the HR domain for NLP research.
It provides a detailed recipe for the data generation procedure along with data analysis and human evaluations.
- Score: 6.764665650605542
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in Large Language Models (LLMs) have been reshaping
Natural Language Processing (NLP) tasks in several domains. Their use in the
field of Human Resources (HR) still has room for expansion and could be
beneficial for several time-consuming tasks. Time-off submissions, medical
claims filing, and access requests are noteworthy examples, but they are by no
means the only ones. However, these developments must grapple with the pivotal
challenge of constructing a high-quality training dataset. On the one hand,
most conversation datasets solve problems for customers, not employees. On the
other hand, gathering conversations with HR could raise privacy concerns. To
address this, we introduce HR-MultiWOZ, a fully labeled dataset of 550
conversations spanning 10 HR domains, to evaluate LLM agents. Our work makes the
following contributions: (1) It is the first labeled open-source conversation
dataset in the HR domain for NLP research. (2) It provides a detailed recipe
for the data generation procedure along with data analysis and human
evaluations. The data generation pipeline is transferable and can be easily
adapted for labeled conversation data generation in other domains. (3) The
proposed data-collection pipeline is mostly based on LLMs with minimal human
involvement for annotation, which is time- and cost-efficient.
Related papers
- NewsInterview: a Dataset and a Playground to Evaluate LLMs' Ground Gap via Informational Interviews [65.35458530702442]
We focus on journalistic interviews, a domain rich in grounding communication and abundant in data.
We curate a dataset of 40,000 two-person informational interviews from NPR and CNN.
LLMs are significantly less likely than human interviewers to use acknowledgements and to pivot to higher-level questions.
arXiv Detail & Related papers (2024-11-21T01:37:38Z)
- Synthetic Data Generation with Large Language Models for Personalized Community Question Answering [47.300506002171275]
We build Sy-SE-PQA based on an existing dataset, SE-PQA, which consists of questions and answers posted on the popular StackExchange communities.
Our findings suggest that LLMs have high potential in generating data tailored to users' needs.
The synthetic data can replace human-written training data, even if the generated data may contain incorrect information.
arXiv Detail & Related papers (2024-10-29T16:19:08Z)
- Retrieval-Augmented Personalization for Multimodal Large Language Models [53.304699445700926]
We introduce the Retrieval Augmented Personalization (RAP) framework for MLLMs' personalization.
RAP allows real-time concept editing via updating the external database.
RAP-MLLMs can generalize to infinite visual concepts without additional finetuning.
arXiv Detail & Related papers (2024-10-17T09:10:26Z)
- HR-Agent: A Task-Oriented Dialogue (TOD) LLM Agent Tailored for HR Applications [10.383829270485247]
We present HR-Agent, an efficient, confidential, and HR-specific LLM-based task-oriented dialogue system tailored for automating repetitive HR processes.
Since conversation data is not sent to an LLM during inference, it preserves the confidentiality required in HR-related tasks.
arXiv Detail & Related papers (2024-10-15T03:51:08Z)
- Integrating Planning into Single-Turn Long-Form Text Generation [66.08871753377055]
We propose to use planning to generate long-form content.
Our main novelty lies in a single auxiliary task that does not require multiple rounds of prompting or planning.
Our experiments on two datasets from different domains demonstrate that LLMs fine-tuned with the auxiliary task generate higher-quality documents.
arXiv Detail & Related papers (2024-10-08T17:02:40Z)
- DataAgent: Evaluating Large Language Models' Ability to Answer Zero-Shot, Natural Language Queries [0.0]
We evaluate OpenAI's GPT-3.5 as a "Language Data Scientist" (LDS).
The model was tested on a diverse set of benchmark datasets to evaluate its performance across multiple standards.
arXiv Detail & Related papers (2024-03-29T22:59:34Z)
- Retrieval-Augmented Data Augmentation for Low-Resource Domain Tasks [66.87070857705994]
In low-resource settings, the amount of seed data samples to use for data augmentation is very small.
We propose a novel method that augments training data by incorporating a wealth of examples from other datasets.
This approach can ensure that the generated data is not only relevant but also more diverse than what could be achieved using the limited seed data alone.
arXiv Detail & Related papers (2024-02-21T02:45:46Z)
- STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models [56.27786433792638]
STAR is a data generation method that leverages Large Language Models (LLMs) to synthesize data instances.
We design fine-grained step-by-step instructions to obtain the initial data instances.
Our experiments show that the data generated by STAR significantly improve the performance of low-resource event extraction and relation extraction tasks.
arXiv Detail & Related papers (2023-05-24T12:15:19Z)
- AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators [98.11286353828525]
GPT-3.5 series models have demonstrated remarkable few-shot and zero-shot ability across various NLP tasks.
We propose AnnoLLM, which adopts a two-step approach, explain-then-annotate.
We build the first conversation-based information retrieval dataset employing AnnoLLM.
arXiv Detail & Related papers (2023-03-29T17:03:21Z)
- MK-SQuIT: Synthesizing Questions using Iterative Template-filling [0.0]
We create a framework for synthetically generating question/query pairs with as little human input as possible.
These datasets can be used to train machine translation systems to convert natural language questions into queries.
arXiv Detail & Related papers (2020-11-04T22:33:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.