Sequential Neural Networks for Noetic End-to-End Response Selection
- URL: http://arxiv.org/abs/2003.02126v1
- Date: Tue, 3 Mar 2020 04:36:33 GMT
- Title: Sequential Neural Networks for Noetic End-to-End Response Selection
- Authors: Qian Chen, Wen Wang
- Abstract summary: This paper presents our systems, which ranked first on both datasets in this challenge.
We investigate a sequential matching model based solely on the chain sequence for multi-turn response selection.
Our results demonstrate that the potential of sequential matching approaches has not yet been fully exploited for multi-turn response selection.
- Score: 4.996858281980058
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The noetic end-to-end response selection challenge, one track in the 7th
Dialog System Technology Challenges (DSTC7), aims to push the state of the art
of utterance classification for real-world goal-oriented dialog systems, in
which participants need to select the correct next utterance from a set of
candidates given the multi-turn context. This paper presents our systems, which
ranked first on both datasets in this challenge: one focused and small
(Advising), the other more diverse and large (Ubuntu). Previous
state-of-the-art models use hierarchy-based (utterance-level and token-level)
neural networks to explicitly model the interactions among different turns'
utterances for context modeling. In this paper, we investigate a sequential
matching model based solely on the chain sequence for multi-turn response
selection. Our results demonstrate that the potential of sequential matching
approaches has not yet been fully exploited for multi-turn response selection.
In addition to ranking first in the challenge, the proposed model outperforms
all previous models, including state-of-the-art hierarchy-based models, on two
large-scale public multi-turn response selection benchmark datasets.
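As a concrete illustration of the chain-sequence idea, here is a minimal PyTorch sketch that flattens all context turns into one token sequence, encodes the context and the candidate response with a shared BiLSTM, and scores the pair via cross-attention and pooling. All module names and sizes are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class ChainSequenceMatcher(nn.Module):
    """Sequential (non-hierarchical) matching sketch: the multi-turn context
    is flattened into one chain sequence instead of being modeled per turn."""

    def __init__(self, vocab_size, dim=300, hidden=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
        self.scorer = nn.Linear(4 * hidden, 1)

    def forward(self, context_ids, response_ids):
        # Encode the flattened context chain and the candidate response.
        c, _ = self.encoder(self.embed(context_ids))    # (B, Tc, 2H)
        r, _ = self.encoder(self.embed(response_ids))   # (B, Tr, 2H)
        # Cross-attention: align each response token with context tokens.
        att = torch.softmax(r @ c.transpose(1, 2), dim=-1)  # (B, Tr, Tc)
        aligned = att @ c                                   # (B, Tr, 2H)
        # Pool the enriched response and score the (context, response) pair.
        feats = torch.cat([r, aligned], dim=-1).max(dim=1).values  # (B, 4H)
        return self.scorer(feats).squeeze(-1)
```

Candidates would be ranked by this score; the key point is that turns are simply concatenated (optionally with a separator token) rather than encoded turn by turn.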
Related papers
- UniT: Unified Multimodal Chain-of-Thought Test-time Scaling [85.590774707406]
Unified models can handle both multimodal understanding and generation within a single architecture, yet they typically operate in a single pass without iteratively refining their outputs.
We introduce UniT, a framework for multimodal test-time scaling that enables a single unified model to reason, verify, and refine across multiple rounds.
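The multi-round loop can be sketched generically: one model drafts an answer, verifies it, and refines it until its own check passes. The `generate`, `verify`, and `refine` callables below are hypothetical stand-ins, not UniT's actual API.

```python
def test_time_scale(model, prompt, rounds=3):
    """Generic reason-verify-refine loop (hypothetical interface, not UniT's)."""
    draft = model.generate(prompt)                 # initial single-pass output
    for _ in range(rounds):
        critique = model.verify(prompt, draft)     # model checks its own draft
        if critique.get("ok"):
            break                                  # verified: stop refining
        draft = model.refine(prompt, draft, critique)  # revise using critique
    return draft
```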
arXiv Detail & Related papers (2026-02-12T18:59:49Z)
- The ISLab Solution to the Algonauts Challenge 2025: A Multimodal Deep Learning Approach to Brain Response Prediction [7.293664607999047]
We present a network-specific approach for predicting brain responses to complex multimodal movies.
We grouped the seven functional networks into four clusters and trained separate multi-subject, multi-layer perceptron (MLP) models for each.
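A minimal sketch of the network-specific setup: one MLP per cluster of functional networks, each mapping shared stimulus features to its own cluster's brain responses. The cluster names, feature size, and output sizes are illustrative assumptions.

```python
import torch.nn as nn

# One MLP per cluster (the paper groups seven functional networks into four
# clusters; names and dimensions here are made up for illustration).
clusters = {"visual": 400, "language": 300, "attention": 200, "default": 200}
models = {
    name: nn.Sequential(nn.Linear(768, 1024), nn.ReLU(), nn.Linear(1024, out_dim))
    for name, out_dim in clusters.items()
}

def predict_responses(stimulus_features):
    # Each cluster-specific model predicts only its own networks' responses.
    return {name: mlp(stimulus_features) for name, mlp in models.items()}
```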
arXiv Detail & Related papers (2025-07-25T10:21:06Z)
- Towards Fundamentally Scalable Model Selection: Asymptotically Fast Update and Selection [40.85209520973634]
An ideal model selection scheme should support two operations efficiently over a large pool of candidate models.
Previous solutions to model selection require high computational complexity for at least one of these two operations.
We present Standardized Embedder, an empirical realization of isolated model embedding.
arXiv Detail & Related papers (2024-06-11T17:57:49Z)
- A Two-Phase Recall-and-Select Framework for Fast Model Selection [13.385915962994806]
We propose a two-phase (coarse-recall and fine-selection) model selection framework.
It aims to enhance the efficiency of selecting a robust model by leveraging the models' training performances on benchmark datasets.
The proposed methodology is shown to select a high-performing model about 3x faster than conventional baseline methods.
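The two-phase pipeline can be sketched as below: a cheap coarse-recall pass ranks all candidates by a proxy score (e.g., recorded benchmark performance) and keeps a shortlist, and only the shortlist undergoes the expensive target-task evaluation. The function names are illustrative.

```python
def select_model(candidates, proxy_score, evaluate_on_target, k=5):
    """Two-phase selection sketch: cheap coarse recall, costly fine selection."""
    # Phase 1 (coarse recall): rank all candidates by a cheap proxy such as
    # their recorded performance on benchmark datasets; keep the top k.
    shortlist = sorted(candidates, key=proxy_score, reverse=True)[:k]
    # Phase 2 (fine selection): run the expensive evaluation (e.g. fine-tuning
    # on the target data) only for the short-listed models.
    return max(shortlist, key=evaluate_on_target)
```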
arXiv Detail & Related papers (2024-03-28T14:44:44Z)
- WLD-Reg: A Data-dependent Within-layer Diversity Regularizer [98.78384185493624]
Neural networks are composed of multiple layers arranged in a hierarchical structure and trained jointly with gradient-based optimization.
We propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage the diversity of the activations within the same layer.
We present an extensive empirical study confirming that the proposed approach enhances the performance of several state-of-the-art neural network models in multiple tasks.
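A minimal sketch of a within-layer diversity penalty, assuming we simply penalize pairwise cosine similarity between the units of one layer; this is a generic diversity regularizer for illustration, not the exact WLD-Reg formulation.

```python
import torch
import torch.nn.functional as F

def within_layer_diversity_penalty(acts):
    """acts: (batch, units) activations of one layer.
    Penalizes pairwise cosine similarity between units (illustrative only)."""
    unit_vecs = F.normalize(acts.t(), dim=1)        # (units, batch), unit norm
    sim = unit_vecs @ unit_vecs.t()                 # pairwise cosine similarity
    off_diag = sim - torch.diag_embed(torch.diagonal(sim))  # zero the diagonal
    return off_diag.pow(2).mean()   # add to the task loss with a small weight
```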
arXiv Detail & Related papers (2023-01-03T20:57:22Z)
- Sequential Ensembling for Semantic Segmentation [4.030520171276982]
We benchmark the popular ensembling approach of combining predictions of multiple, independently-trained, state-of-the-art models.
We propose a novel method inspired by boosting to sequentially ensemble networks that significantly outperforms the naive ensemble baseline.
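A boosting-flavored sketch of the idea: each new segmentation network is trained with pixels up-weighted where the current ensemble errs. The weighting scheme here is an illustrative assumption, not the paper's exact recipe.

```python
import torch

def train_step_next_member(models, new_model, images, labels, optimizer, loss_fn):
    """One sequential-ensembling step; loss_fn must use reduction='none'."""
    with torch.no_grad():
        # Current ensemble prediction: average the logits of earlier members.
        ens_logits = torch.stack([m(images) for m in models]).mean(dim=0)
        # Up-weight pixels the ensemble currently misclassifies.
        weights = 1.0 + (ens_logits.argmax(dim=1) != labels).float()
    per_pixel = loss_fn(new_model(images), labels)   # (B, H, W) pixel losses
    (weights * per_pixel).mean().backward()
    optimizer.step()
    optimizer.zero_grad()
```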
arXiv Detail & Related papers (2022-10-08T22:13:59Z)
- Continuous-Time and Multi-Level Graph Representation Learning for Origin-Destination Demand Prediction [52.0977259978343]
This paper proposes a Continuous-time and Multi-level dynamic graph representation learning method for Origin-Destination demand prediction (CMOD).
The state vectors keep historical transaction information and are continuously updated according to the most recent transactions.
Experiments are conducted on two real-world datasets from Beijing Subway and New York Taxi, and the results demonstrate the superiority of our model against the state-of-the-art approaches.
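The continuously updated state vectors can be sketched with an event-driven recurrent update: whenever a transaction happens, the affected station's state is refreshed from the transaction features and the elapsed time. This is a simplification; CMOD's actual update operates on a multi-level dynamic graph.

```python
import torch
import torch.nn as nn

class StationStates(nn.Module):
    """Keeps one state vector per station, updated per transaction (sketch)."""

    def __init__(self, num_stations, dim=64):
        super().__init__()
        self.register_buffer("states", torch.zeros(num_stations, dim))
        self.cell = nn.GRUCell(dim + 1, dim)   # transaction features + time gap

    def update(self, station, txn_feat, dt):
        # Fold the new transaction and the time since the previous one into
        # the station's state vector (detached to keep the sketch simple).
        inp = torch.cat([txn_feat, dt.view(1)]).unsqueeze(0)       # (1, dim+1)
        new_state = self.cell(inp, self.states[station:station + 1])
        self.states[station] = new_state[0].detach()
```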
arXiv Detail & Related papers (2022-06-30T03:37:50Z)
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z)
- A Multi-Size Neural Network with Attention Mechanism for Answer Selection [3.310455595316906]
An effective architecture, the multi-size neural network with attention mechanism (AM-MSNN), is introduced for the answer selection task.
It captures more levels of language granularity in parallel thanks to its filters of various sizes, compared with single-layer and multi-layer CNNs.
It extends the sentence representations with an attention mechanism, capturing more information for different types of questions.
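The multi-size idea can be sketched as parallel 1-D convolutions with different kernel widths over token embeddings, whose pooled feature maps are concatenated. Sizes and names are illustrative, and the attention part is omitted for brevity.

```python
import torch
import torch.nn as nn

class MultiSizeEncoder(nn.Module):
    """Parallel convolutions with several filter sizes (illustrative sketch)."""

    def __init__(self, dim=300, channels=100, sizes=(1, 2, 3, 5)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(dim, channels, k, padding=k // 2) for k in sizes
        )

    def forward(self, emb):                       # emb: (B, T, dim)
        x = emb.transpose(1, 2)                   # (B, dim, T)
        # Each branch captures a different granularity (word, bigram, ...).
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return torch.cat(pooled, dim=1)           # (B, channels * len(sizes))
```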
arXiv Detail & Related papers (2021-04-24T02:13:26Z)
- Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
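The joint training can be sketched as a weighted sum of the main response-selection loss and the four auxiliary losses named above; the model interface and weights below are illustrative stand-ins for the paper's objectives.

```python
def multi_task_loss(model, batch, aux_weight=0.3):
    """Main response-selection loss plus four self-supervised auxiliaries
    (hypothetical model interface; the weight is illustrative)."""
    main = model.response_selection_loss(batch)
    aux = (
        model.next_session_prediction_loss(batch)
        + model.utterance_restoration_loss(batch)
        + model.incoherence_detection_loss(batch)
        + model.consistency_discrimination_loss(batch)
    )
    return main + aux_weight * aux
```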
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
- SRQA: Synthetic Reader for Factoid Question Answering [21.28441702154528]
We introduce a new model called SRQA (Synthetic Reader for Factoid Question Answering).
This model enhances the question answering system in the multi-document scenario in three aspects.
We perform SRQA on the WebQA dataset, and experiments show that our model outperforms the state-of-the-art models.
arXiv Detail & Related papers (2020-09-02T13:16:24Z)
- Multiscale Deep Equilibrium Models [162.15362280927476]
We propose a new class of implicit networks, the multiscale deep equilibrium model (MDEQ).
An MDEQ directly solves for and backpropagates through the equilibrium points of multiple feature resolutions simultaneously.
We illustrate the effectiveness of this approach on two large-scale vision tasks: ImageNet classification and semantic segmentation on high-resolution images from the Cityscapes dataset.
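A deep equilibrium model computes the fixed point z* = f(z*, x) of a single layer rather than stacking layers. The naive iteration below conveys the forward pass only; real DEQs/MDEQs use faster root solvers and differentiate implicitly through the equilibrium.

```python
import torch

def deq_forward(f, x, z_init, max_iter=50, tol=1e-4):
    """Naive fixed-point solve of z = f(z, x) (illustration only)."""
    z = z_init
    for _ in range(max_iter):
        z_next = f(z, x)
        # Stop once the relative update is small: z is (near) an equilibrium.
        if torch.norm(z_next - z) / (torch.norm(z) + 1e-8) < tol:
            return z_next
        z = z_next
    return z
```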
arXiv Detail & Related papers (2020-06-15T18:07:44Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
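The layer-wise fusion can be sketched with a hard one-to-one neuron alignment (Hungarian matching) followed by averaging; this is a simplification of the paper's optimal-transport soft alignment.

```python
from scipy.optimize import linear_sum_assignment

def fuse_layer(w_a, w_b):
    """Align layer B's neurons to layer A's, then average the weights.
    w_a, w_b: (out_units, in_units) arrays for the same layer of two models.
    Hard matching stands in for the paper's optimal-transport alignment."""
    # Cost: squared distance between every pair of neurons' incoming weights.
    cost = ((w_a[:, None, :] - w_b[None, :, :]) ** 2).sum(axis=-1)
    rows, cols = linear_sum_assignment(cost)   # one-to-one neuron matching
    return 0.5 * (w_a[rows] + w_b[cols])       # fused layer weights
```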
arXiv Detail & Related papers (2019-10-12T22:07:15Z)