Revisiting Markovian Generative Architectures for Efficient
Task-Oriented Dialog Systems
- URL: http://arxiv.org/abs/2204.06452v1
- Date: Wed, 13 Apr 2022 15:21:34 GMT
- Title: Revisiting Markovian Generative Architectures for Efficient
Task-Oriented Dialog Systems
- Authors: Hong Liu, Yucheng Cai, Zhijian Ou, Yi Huang, Junlan Feng
- Abstract summary: We propose to revisit Markovian Generative Architectures (MGA), which have been used in previous LSTM-based TOD systems.
Experiments on MultiWOZ2.1 show the efficiency advantages of the proposed Markovian PLM-based systems over their non-Markovian counterparts.
- Score: 22.249113574918034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, Transformer based pretrained language models (PLMs), such as GPT2
and T5, have been leveraged to build generative task-oriented dialog (TOD)
systems. A drawback of existing PLM-based models is their non-Markovian
architectures across turns, i.e., the whole history is used as the conditioning
input at each turn, which brings inefficiencies in memory, computation and
learning. In this paper, we propose to revisit Markovian Generative
Architectures (MGA), which have been used in previous LSTM-based TOD systems,
but not studied for PLM-based systems. Experiments on MultiWOZ2.1 show the
efficiency advantages of the proposed Markovian PLM-based systems over their
non-Markovian counterparts, in both supervised and semi-supervised settings.
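To make the contrast concrete, here is a minimal sketch of the two conditioning schemes; the prompt layout, special tokens, and function names are illustrative assumptions, not the paper's implementation.

```python
# Sketch only: a non-Markovian system re-encodes the whole dialog history at
# every turn, while a Markovian one conditions on a fixed-size recap
# (previous belief state + last system response) plus the current user turn.

def non_markovian_context(history, user_utt):
    """Conditioning input grows linearly with the number of turns."""
    turns = [f"<user> {u} <belief> {b} <response> {r}" for (u, b, r) in history]
    return " ".join(turns) + f" <user> {user_utt}"

def markovian_context(prev_belief, prev_response, user_utt):
    """Conditioning input stays roughly constant across turns."""
    return f"<belief> {prev_belief} <response> {prev_response} <user> {user_utt}"

# Toy dialog: after a few turns the non-Markovian input is already much longer.
history = [("book a cheap hotel", "hotel{price=cheap}", "Any area preference?"),
           ("in the centre", "hotel{price=cheap, area=centre}", "I found 3 options.")]
print(len(non_markovian_context(history, "book the first one").split()))
print(len(markovian_context("hotel{price=cheap, area=centre}",
                            "I found 3 options.", "book the first one").split()))
```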
Related papers
- RoboCerebra: A Large-scale Benchmark for Long-horizon Robotic Manipulation Evaluation [80.20970723577818]
We introduce RoboCerebra, a benchmark for evaluating high-level reasoning in long-horizon robotic manipulation.
The dataset is constructed via a top-down pipeline, where GPT generates task instructions and decomposes them into subtask sequences.
Compared to prior benchmarks, RoboCerebra features significantly longer action sequences and denser annotations.
arXiv Detail & Related papers (2025-06-07T06:15:49Z)
- KIT's Low-resource Speech Translation Systems for IWSLT2025: System Enhancement with Synthetic Data and Model Regularization [57.08591486199925]
This paper presents KIT's submissions to the IWSLT 2025 low-resource track.
We develop both cascaded and end-to-end (E2E) speech translation systems.
Building upon pre-trained models, we fine-tune our systems with different strategies to utilize resources efficiently.
arXiv Detail & Related papers (2025-05-26T08:38:02Z)
- Boost, Disentangle, and Customize: A Robust System2-to-System1 Pipeline for Code Generation [58.799397354312596]
Large language models (LLMs) have demonstrated remarkable capabilities in various domains, particularly in System 1 tasks.
Recent research on System2-to-System1 methods has surged, exploring System 2 reasoning knowledge via inference-time computation.
In this paper, we focus on code generation, which is a representative System 2 task, and identify two primary challenges.
arXiv Detail & Related papers (2025-02-18T03:20:50Z)
- Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design [59.00758127310582]
We propose a novel framework Read-ME that transforms pre-trained dense LLMs into smaller MoE models.
Our approach employs activation sparsity to extract experts.
Read-ME outperforms other popular open-source dense models of similar scales.
arXiv Detail & Related papers (2024-10-24T19:48:51Z)
- Design2Code: Benchmarking Multimodal Code Generation for Automated Front-End Engineering [74.99736967448423]
We construct Design2Code - the first real-world benchmark for this task.
We manually curate 484 diverse real-world webpages as test cases and develop a set of automatic evaluation metrics.
Our fine-grained break-down metrics indicate that models mostly lag in recalling visual elements from the input webpages and generating correct layout designs.
arXiv Detail & Related papers (2024-03-05T17:56:27Z)
- Large Language Models are Not Stable Recommender Systems [45.941176155464824]
We conduct exploratory research and find consistent patterns of positional bias in large language models (LLMs).
We propose a Bayesian probabilistic framework, STELLA (Stable LLM for Recommendation), which involves a two-stage pipeline.
Our framework can capitalize on existing pattern information to calibrate the instability of LLMs and enhance recommendation performance.
arXiv Detail & Related papers (2023-12-25T14:54:33Z)
- Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study [76.52997424694767]
We present an in-depth empirical study of keyphrase extraction and keyphrase generation using pre-trained language models.
We show that PLMs have competitive high-resource performance and state-of-the-art low-resource performance.
Further results show that in-domain BERT-like PLMs can be used to build strong and data-efficient keyphrase generation models.
arXiv Detail & Related papers (2022-12-20T13:20:21Z)
- Jointly Reinforced User Simulator and Task-oriented Dialog System with Simplified Generative Architecture [24.305558215176752]
Online reinforcement learning of a GPT-2 based dialog system (DS) and an end-to-end user simulator (US) has not previously been explored.
In this paper, we first propose Simplified Generative Architectures (SGA) for DS and US respectively, both based on GPT-2 but using shortened history.
Our DS with the proposed SGA, when trained with supervision only, achieves state-of-the-art performance on MultiWOZ2.1 and is more compute-efficient in both training and generation.
arXiv Detail & Related papers (2022-10-13T03:57:17Z)
- Integrate Lattice-Free MMI into End-to-End Speech Recognition [87.01137882072322]
In automatic speech recognition (ASR) research, discriminative criteria have achieved superior performance in DNN-HMM systems.
With this motivation, adopting discriminative criteria is a promising way to boost the performance of end-to-end (E2E) ASR systems.
Previous works have introduced the minimum Bayesian risk (MBR, one of the discriminative criteria) into E2E ASR systems.
In this work, novel algorithms are proposed to integrate another widely used discriminative criterion, lattice-free maximum mutual information (LF-MMI), into E2E ASR systems.
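For reference, the generic MMI criterion referred to here can be written as below; the notation (acoustic scale kappa, word sequences W) follows the standard textbook form rather than the paper's exact lattice-free formulation.

```latex
% Standard MMI objective, summed over training utterances u.
\mathcal{F}_{\mathrm{MMI}}(\theta)
  = \sum_{u} \log
    \frac{p_{\theta}(X_u \mid W_u)^{\kappa}\, P(W_u)}
         {\sum_{W} p_{\theta}(X_u \mid W)^{\kappa}\, P(W)}
```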
arXiv Detail & Related papers (2022-03-29T14:32:46Z)
- Variational Latent-State GPT for Semi-supervised Task-Oriented Dialog Systems [24.667353107453824]
The Variational Latent-State GPT model (VLS-GPT) is the first to combine the strengths of the two approaches.
We develop the strategy of sampling-then-forward-computation, which successfully overcomes the memory explosion issue of using GPT in variational learning.
VLS-GPT is shown to significantly outperform both supervised-only and semi-supervised baselines.
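A minimal sketch of the sampling-then-forward-computation idea, assuming a Hugging Face-style causal LM (e.g., GPT2LMHeadModel); this illustrates the general strategy, not the authors' code.

```python
# Sketch only: sample latent tokens without a gradient graph, then score the
# sampled sequence in a single gradient-enabled forward pass.
import torch

def sample_then_score(model, context_ids, max_latent_len=32):
    """Returns the summed log-probability of the sampled latent tokens,
    with gradients flowing only through the second (scoring) pass."""
    ctx_len = context_ids.size(1)

    # Stage 1: autoregressive sampling with no computation graph, so memory
    # does not grow with the length of the sampled sequence.
    with torch.no_grad():
        full_ids = model.generate(context_ids,
                                  max_new_tokens=max_latent_len,
                                  do_sample=True)

    # Stage 2: one ordinary forward pass over the already-sampled sequence.
    logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    token_log_probs = log_probs.gather(
        -1, full_ids[:, 1:].unsqueeze(-1)).squeeze(-1)

    # Keep only the positions corresponding to the sampled latent tokens.
    return token_log_probs[:, ctx_len - 1:].sum(-1)
```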
arXiv Detail & Related papers (2021-09-09T14:42:29Z)
- MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems [75.43457658815943]
We propose Minimalist Transfer Learning (MinTL) to simplify the system design process of task-oriented dialogue systems.
MinTL is a simple yet effective transfer learning framework, which allows us to plug-and-play pre-trained seq2seq models.
We instantiate our learning framework with two pre-trained backbones: T5 and BART, and evaluate them on MultiWOZ.
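As a rough illustration of that plug-and-play idea, the snippet below uses the generic Hugging Face seq2seq API with an off-the-shelf T5 checkpoint; MinTL's actual input format, special tokens, and fine-tuned weights are not reproduced here.

```python
# Illustrative only: an off-the-shelf seq2seq backbone generating a system turn
# from a flattened dialog context. A real TOD system would fine-tune this model.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

context = ("user: I need a cheap hotel in the centre. "
           "system: Any day in mind? user: Friday, please.")
inputs = tokenizer(context, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```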
arXiv Detail & Related papers (2020-09-25T02:19:13Z)
- The IMS-CUBoulder System for the SIGMORPHON 2020 Shared Task on Unsupervised Morphological Paradigm Completion [27.37360427124081]
We present the systems of the University of Stuttgart IMS and the University of Colorado Boulder for SIGMORPHON 2020 Task 2 on unsupervised morphological paradigm completion.
The task consists of generating the morphological paradigms of a set of lemmas, given only the lemmas themselves and unlabeled text.
Our pointer-generator system obtains the best score of all seven submitted systems on average over all languages, and outperforms the official baseline, which was best overall, on Bulgarian and Kannada.
arXiv Detail & Related papers (2020-05-25T21:23:52Z)
- Jointly Trained Transformers models for Spoken Language Translation [2.3886615435250302]
This work trains SLT systems with an ASR objective as an auxiliary loss, and the two networks are connected through neural hidden representations.
This architecture improves BLEU from 36.8 to 44.5.
All the experiments are reported on English-Portuguese speech translation task using How2 corpus.
arXiv Detail & Related papers (2020-04-25T11:28:39Z)
- Conversational Question Reformulation via Sequence-to-Sequence Architectures and Pretrained Language Models [56.268862325167575]
This paper presents an empirical study of conversational question reformulation (CQR) with sequence-to-sequence architectures and pretrained language models (PLMs).
We leverage PLMs to address the strong token-to-token independence assumption made in the common objective, maximum likelihood estimation, for the CQR task.
We evaluate fine-tuned PLMs on the recently-introduced CANARD dataset as an in-domain task and validate the models using data from the TREC 2019 CAsT Track as an out-domain task.
arXiv Detail & Related papers (2020-04-04T11:07:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.