Empirical Study of Symmetrical Reasoning in Conversational Chatbots
- URL: http://arxiv.org/abs/2407.05734v1
- Date: Mon, 8 Jul 2024 08:38:43 GMT
- Title: Empirical Study of Symmetrical Reasoning in Conversational Chatbots
- Authors: Daniela N. Rim, Heeyoul Choi
- Abstract summary: This work explores the capability of conversational chatbots powered by large language models (LLMs) to understand predicate symmetry.
We assess the symmetrical reasoning of five chatbots: ChatGPT 4, Huggingface chat AI, Microsoft's Copilot AI, LLaMA through Perplexity, and Gemini Advanced.
Experiment results reveal varied performance among chatbots, with some approaching human-like reasoning capabilities.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work explores the capability of conversational chatbots powered by large language models (LLMs) to understand and characterize predicate symmetry, a cognitive linguistic function traditionally believed to be an inherent human trait. Leveraging in-context learning (ICL), a paradigm shift enabling chatbots to learn new tasks from prompts without re-training, we assess the symmetrical reasoning of five chatbots: ChatGPT 4, Huggingface chat AI, Microsoft's Copilot AI, LLaMA through Perplexity, and Gemini Advanced. Using the Symmetry Inference Sentence (SIS) dataset by Tanchip et al. (2020), we compare chatbot responses against human evaluations to gauge their understanding of predicate symmetry. Experiment results reveal varied performance among chatbots, with some approaching human-like reasoning capabilities. Gemini, for example, reaches a correlation of 0.85 with human scores, while providing a sound justification for each symmetry evaluation. This study underscores the potential and limitations of LLMs in mirroring complex cognitive processes such as symmetrical reasoning.
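The evaluation described above boils down to correlating per-sentence chatbot symmetry ratings with human ratings. A minimal sketch, assuming hypothetical rating data on a 1-to-5 scale (the actual SIS scoring scale and values are not given here):

```python
# Minimal sketch (hypothetical data): comparing chatbot symmetry ratings
# against human ratings with a Pearson correlation coefficient.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical symmetry ratings (1 = asymmetric ... 5 = symmetric) for a few
# SIS-style sentences, from human annotators and from one chatbot.
human_scores = [4.8, 1.2, 3.5, 4.9, 2.0]
chatbot_scores = [4.5, 1.5, 3.0, 4.7, 2.4]

print(round(pearson(human_scores, chatbot_scores), 3))
```

A correlation near 1 would indicate near-human agreement, in the spirit of the 0.85 figure reported for Gemini.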
Related papers
- Analyzing Large language models chatbots: An experimental approach using a probability test [0.0]
This study consists of qualitative empirical research, conducted through exploratory tests with two different Large Language Models (LLMs).
The methodological procedure involved exploratory tests based on prompts designed with a probability question.
The "Linda Problem", widely recognized in cognitive psychology, was used as a basis to create the tests, along with the development of a new problem specifically for this experiment, the "Mary Problem".
arXiv Detail & Related papers (2024-07-10T15:49:40Z) - LLM Roleplay: Simulating Human-Chatbot Interaction [52.03241266241294]
LLM-Roleplay is a goal-oriented, persona-based method to automatically generate diverse multi-turn dialogues simulating human-chatbot interaction.
We collect natural human-chatbot dialogues from different sociodemographic groups and conduct a human evaluation to compare real human-chatbot dialogues with our generated dialogues.
arXiv Detail & Related papers (2024-07-04T14:49:46Z) - Designing and Evaluating Multi-Chatbot Interface for Human-AI Communication: Preliminary Findings from a Persuasion Task [1.360607903399872]
This study examines the impact of multi-chatbot communication in a specific persuasion setting: promoting charitable donations.
We developed an online environment that enables multi-chatbot communication and conducted a pilot experiment.
We present our development process of the multi-chatbot interface and present preliminary findings from a pilot experiment.
arXiv Detail & Related papers (2024-06-28T04:33:41Z) - From Human-to-Human to Human-to-Bot Conversations in Software Engineering [3.1747517745997014]
We aim to understand the dynamics of conversations that occur during modern software development after the integration of AI and chatbots.
We compile existing conversation attributes with humans and NLU-based chatbots and adapt them to the context of software development.
We present similarities and differences between human-to-human and human-to-bot conversations.
We conclude that the conversation styles we observe with LLM-based chatbots cannot replace conversations with humans.
arXiv Detail & Related papers (2024-05-21T12:04:55Z) - Real-time Addressee Estimation: Deployment of a Deep-Learning Model on the iCub Robot [52.277579221741746]
Addressee Estimation is a skill essential for social robots to interact smoothly with humans.
Inspired by human perceptual skills, a deep-learning model for Addressee Estimation is designed, trained, and deployed on an iCub robot.
The study presents the procedure of such implementation and the performance of the model deployed in real-time human-robot interaction.
arXiv Detail & Related papers (2023-11-09T13:01:21Z) - Evaluator for Emotionally Consistent Chatbots [2.8348950186890467]
The most recent work only evaluates on the aspects of context coherence, language fluency, response diversity, or logical self-consistency between dialogues.
This work proposes training an evaluator to determine the emotional consistency of chatbots.
arXiv Detail & Related papers (2021-12-02T21:47:29Z) - CheerBots: Chatbots toward Empathy and Emotion using Reinforcement Learning [60.348822346249854]
This study presents a framework whereby several empathetic chatbots are based on understanding users' implied feelings and replying empathetically for multiple dialogue turns.
We call these chatbots CheerBots. CheerBots can be retrieval-based or generative-based and were finetuned by deep reinforcement learning.
To respond in an empathetic way, we develop a simulating agent, a Conceptual Human Model, which aids CheerBots during training by accounting for future changes in the user's emotional states in order to arouse sympathy.
arXiv Detail & Related papers (2021-10-08T07:44:47Z) - Put Chatbot into Its Interlocutor's Shoes: New Framework to Learn Chatbot Responding with Intention [55.77218465471519]
This paper proposes an innovative framework to train chatbots to possess human-like intentions.
Our framework includes a guiding robot and an interlocutor model that plays the role of a human.
We examined our framework using three experimental setups and evaluated the guiding robot with four different metrics to demonstrate its flexibility and performance advantages.
arXiv Detail & Related papers (2021-03-30T15:24:37Z) - Spot The Bot: A Robust and Efficient Framework for the Evaluation of Conversational Dialogue Systems [21.36935947626793]
Spot The Bot replaces human-bot conversations with conversations between bots.
Human judges only annotate for each entity in a conversation whether they think it is human or not.
Survival Analysis measures which bot can uphold human-like behavior the longest.
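The idea behind the survival measure above can be sketched in a few lines. This is a toy illustration with hypothetical data, not the paper's actual estimator: for each conversation, record the turn at which judges first flagged the entity as a bot (None if never flagged), then compute the fraction still judged human after each turn.

```python
# Toy sketch of a survival-style measure (hypothetical data): a bot that is
# flagged later "survives" longer, i.e. upholds human-like behavior longer.

def survival_curve(flag_turns, max_turn):
    """Fraction of conversations still judged 'human' after each turn.

    flag_turns: per-conversation turn at which the bot was first flagged,
    or None if it was never flagged within the conversation.
    """
    n = len(flag_turns)
    curve = []
    for t in range(1, max_turn + 1):
        alive = sum(1 for ft in flag_turns if ft is None or ft > t)
        curve.append(alive / n)
    return curve

# Hypothetical judgments for six conversations with one bot.
bot_a = [3, 5, None, 4, None, 2]
print(survival_curve(bot_a, max_turn=5))
```

Comparing such curves across bots gives a ranking by how long each sustains human-like behavior.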
arXiv Detail & Related papers (2020-10-05T16:37:52Z) - Investigation of Sentiment Controllable Chatbot [50.34061353512263]
In this paper, we investigate four models to scale or adjust the sentiment of the response.
The models are a persona-based model, reinforcement learning, a plug and play model, and CycleGAN.
We develop machine-evaluated metrics to estimate whether the responses are reasonable given the input.
arXiv Detail & Related papers (2020-07-11T16:04:30Z) - You Impress Me: Dialogue Generation via Mutual Persona Perception [62.89449096369027]
Research in cognitive science suggests that understanding is an essential signal for a high-quality chit-chat conversation.
Motivated by this, we propose P2 Bot, a transmitter-receiver based framework with the aim of explicitly modeling understanding.
arXiv Detail & Related papers (2020-04-11T12:51:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.