An Objective Laboratory Protocol for Evaluating Cognition of Non-Human
Systems Against Human Cognition
- URL: http://arxiv.org/abs/2102.08933v1
- Date: Wed, 17 Feb 2021 18:40:49 GMT
- Title: An Objective Laboratory Protocol for Evaluating Cognition of Non-Human
Systems Against Human Cognition
- Authors: David J. Jilk
- Abstract summary: The existence of a non-human system with cognitive capabilities comparable to those of humans might make once-philosophical questions of safety and ethics immediate and urgent.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper I describe and reduce to practice an objective protocol for
evaluating the cognitive capabilities of a non-human system against human
cognition in a laboratory environment. This is important because the existence
of a non-human system with cognitive capabilities comparable to those of humans
might make once-philosophical questions of safety and ethics immediate and
urgent. Past attempts to devise evaluation methods, such as the Turing Test and
many others, have not met this need; most of them either emphasize a single
aspect of human cognition or a single theory of intelligence, fail to capture
the human capacity for generality and novelty, or require success in the
physical world. The protocol is broadly Bayesian, in that its primary output is
a confidence statistic in relation to a claim. Further, it provides insight
into the areas where and to what extent a particular system falls short of
human cognition, which can help to drive further progress or precautions.
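The abstract describes the protocol only at a high level: a "broadly Bayesian" evaluation whose primary output is a confidence statistic in relation to a claim. As a hedged illustration of that idea (not the paper's actual procedure; the priors, likelihoods, and task names below are hypothetical), confidence in the claim "the system's cognition is human-comparable" can be updated after each task outcome:

```python
# Hypothetical sketch of a broadly Bayesian evaluation protocol:
# update confidence in the claim "the system's cognition is
# human-comparable" after each pass/fail task outcome.

def update_confidence(prior, passed, p_pass_if_human=0.9, p_pass_if_not=0.3):
    """One Bayesian update of P(claim) given a single task outcome."""
    if passed:
        likelihood_claim, likelihood_alt = p_pass_if_human, p_pass_if_not
    else:
        likelihood_claim, likelihood_alt = 1 - p_pass_if_human, 1 - p_pass_if_not
    numerator = likelihood_claim * prior
    return numerator / (numerator + likelihood_alt * (1 - prior))

# Hypothetical task battery results (task name, passed?)
results = [("analogy", True), ("novel-tool-use", True), ("planning", False)]

confidence = 0.5  # neutral prior on the claim
for task, passed in results:
    confidence = update_confidence(confidence, passed)
    print(f"after {task}: confidence = {confidence:.3f}")
```

A per-task trace like this also matches the abstract's second point: failed tasks show where, and roughly to what extent, a system falls short of human cognition.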
Related papers
- An Epistemic Human-Aware Task Planner which Anticipates Human Beliefs and Decisions [8.309981857034902]
The aim is to build a robot policy that accounts for uncontrollable human behaviors.
We propose a novel planning framework and build a solver based on AND-OR search.
Preliminary experiments in two domains, one novel and one adapted, demonstrate the effectiveness of the framework.
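The summary above names the solver technique only as "AND-OR search." As a generic, hedged sketch (the tree shape and node labels are illustrative, not the paper's formulation), AND-OR search alternates controllable robot choices (OR nodes, where one successful child suffices) with uncontrollable human behaviors (AND nodes, where every child must be handled):

```python
# Generic AND-OR search sketch: OR nodes are the robot's own choices
# (one solvable child suffices); AND nodes are uncontrollable human
# behaviors (every child must be solvable). Tree shape is illustrative.

def solve(node):
    """Return True if the node is solvable under AND-OR semantics."""
    kind, children = node[0], node[1]
    if kind == "leaf":
        return children  # bool: goal reached or not
    results = [solve(c) for c in children]
    return all(results) if kind == "and" else any(results)

# Robot chooses an action (OR); each action must handle every
# possible human reaction (AND).
tree = ("or", [
    ("and", [("leaf", True), ("leaf", False)]),  # action A fails for one reaction
    ("and", [("leaf", True), ("leaf", True)]),   # action B handles all reactions
])
print(solve(tree))  # True: action B works for every human behavior
```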
arXiv Detail & Related papers (2024-09-27T08:27:36Z)
- ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models [53.00812898384698]
We argue that human evaluation of generative large language models (LLMs) should be a multidisciplinary undertaking.
We highlight how cognitive biases can conflate fluent information and truthfulness, and how cognitive uncertainty affects the reliability of rating scores such as Likert.
We propose the ConSiDERS-The-Human evaluation framework consisting of 6 pillars -- Consistency, Scoring Criteria, Differentiating, User Experience, Responsible, and Scalability.
arXiv Detail & Related papers (2024-05-28T22:45:28Z)
- Real-time Addressee Estimation: Deployment of a Deep-Learning Model on the iCub Robot [52.277579221741746]
Addressee Estimation is a skill essential for social robots to interact smoothly with humans.
Inspired by human perceptual skills, a deep-learning model for Addressee Estimation is designed, trained, and deployed on an iCub robot.
The study presents the procedure of such implementation and the performance of the model deployed in real-time human-robot interaction.
arXiv Detail & Related papers (2023-11-09T13:01:21Z)
- COKE: A Cognitive Knowledge Graph for Machine Theory of Mind [87.14703659509502]
Theory of mind (ToM) refers to humans' ability to understand and infer the desires, beliefs, and intentions of others.
COKE is the first cognitive knowledge graph for machine theory of mind.
arXiv Detail & Related papers (2023-05-09T12:36:58Z)
- Human Uncertainty in Concept-Based AI Systems [37.82747673914624]
We study human uncertainty in the context of concept-based AI systems.
We show that training with uncertain concept labels may help mitigate weaknesses in concept-based systems.
arXiv Detail & Related papers (2023-03-22T19:17:57Z)
- Robust Robot Planning for Human-Robot Collaboration [11.609195090422514]
In human-robot collaboration, the objectives of the human are often unknown to the robot.
We propose an approach to automatically generate an uncertain human behavior (a policy) for each given objective function.
We also propose a robot planning algorithm that is robust to the above-mentioned uncertainties.
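The summary does not specify how the uncertain human policies are generated from an objective function. One common construction (an assumption here, not necessarily the paper's) is a Boltzmann-rational softmax over an objective's action values, where a temperature parameter controls how noisy the simulated human is; the action names and values below are hypothetical:

```python
import math

# Hedged sketch: derive an uncertain (stochastic) human policy from an
# objective via a softmax over action values. Actions and values are
# illustrative; the temperature controls how noisy the human is.

def softmax_policy(action_values, temperature=1.0):
    """Map action -> probability, Boltzmann-rational in the values."""
    scaled = {a: v / temperature for a, v in action_values.items()}
    m = max(scaled.values())                       # for numerical stability
    exps = {a: math.exp(s - m) for a, s in scaled.items()}
    z = sum(exps.values())
    return {a: e / z for a, e in exps.items()}

# Hypothetical objective: the human prefers handing over the left part.
values = {"hand_left": 2.0, "hand_right": 1.0, "wait": 0.0}
policy = softmax_policy(values, temperature=0.5)
# Lower temperature -> closer to deterministic argmax behavior.
```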
arXiv Detail & Related papers (2023-02-27T16:02:48Z)
- AGENT: A Benchmark for Core Psychological Reasoning [60.35621718321559]
Intuitive psychology is the ability to reason about hidden mental variables that drive observable actions.
Despite recent interest in machine agents that reason about other agents, it is not clear if such agents learn or hold the core psychology principles that drive human reasoning.
We present a benchmark consisting of procedurally generated 3D animations, AGENT, structured around four scenarios.
arXiv Detail & Related papers (2021-02-24T14:58:23Z)
- Inductive Biases for Deep Learning of Higher-Level Cognition [108.89281493851358]
A fascinating hypothesis is that human and animal intelligence could be explained by a few principles.
This work considers a larger list, focusing on those which concern mostly higher-level and sequential conscious processing.
The objective of clarifying these particular principles is that they could potentially help us build AI systems benefiting from humans' abilities.
arXiv Detail & Related papers (2020-11-30T18:29:25Z)
- Towards hybrid primary intersubjectivity: a neural robotics library for human science [4.232614032390374]
We study primary intersubjectivity as a second person perspective experience characterized by predictive engagement.
We propose an open-source methodology named neural robotics library (NRL) for experimental human-robot interaction.
We discuss some ways human-robot (hybrid) intersubjectivity can contribute to human science research.
arXiv Detail & Related papers (2020-06-29T11:35:46Z)
- You Impress Me: Dialogue Generation via Mutual Persona Perception [62.89449096369027]
The research in cognitive science suggests that understanding is an essential signal for a high-quality chit-chat conversation.
Motivated by this, we propose P2 Bot, a transmitter-receiver based framework with the aim of explicitly modeling understanding.
arXiv Detail & Related papers (2020-04-11T12:51:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.