Generative agents in the streets: Exploring the use of Large Language
Models (LLMs) in collecting urban perceptions
- URL: http://arxiv.org/abs/2312.13126v1
- Date: Wed, 20 Dec 2023 15:45:54 GMT
- Authors: Deepank Verma, Olaf Mumm, Vanessa Miriam Carlow
- Abstract summary: This study explores current advancements in Generative agents powered by large language models (LLMs).
The experiment employs Generative agents that interact with urban environments using street view images to plan their journey toward specific goals.
Since LLMs do not possess embodiment, nor have access to the visual realm, and lack a sense of motion or direction, we designed movement and visual modules that help agents gain an overall understanding of surroundings.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Evaluating the surroundings to gain understanding, frame perspectives, and
anticipate behavioral reactions is an inherent human trait. However, these
continuous encounters are diverse and complex, posing challenges to their study
and experimentation. Researchers have been able to isolate environmental
features and study their effect on human perception and behavior. However,
attempts to replicate and study human behaviors through proxies, such as
virtual mediums and interviews, have been inconsistent. Large
language models (LLMs) have recently been unveiled as capable of contextual
understanding and semantic reasoning. These models have been trained on large
amounts of text and have evolved to mimic believable human behavior. This study
explores the current advancements in Generative agents powered by LLMs with the
help of perceptual experiments. The experiment employs Generative agents to
interact with urban environments using street view images to plan their
journey toward specific goals. The agents are given virtual personalities,
which make them distinguishable. They are also provided a memory database to
store their thoughts and essential visual information and retrieve it when
needed to plan their movement. Since LLMs lack embodiment, access to the
visual realm, and any sense of motion or direction, we
designed movement and visual modules that help agents gain an overall
understanding of surroundings. The agents are further employed to rate the
surroundings they encounter based on their perceived sense of safety and
liveliness. As these agents store details in their memory, we query the
findings to get details regarding their thought processes. Overall, this study
experiments with current AI developments and their potential for simulating
human behavior in urban environments.
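The agent architecture described in the abstract (a virtual personality, a memory database for thoughts and visual details, and separate visual and movement modules that compensate for the LLM's lack of embodiment) can be sketched as a minimal loop. All class and method names below are illustrative assumptions, and the keyword heuristic in `rate_scene` merely stands in for the paper's actual LLM prompting:

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    """Memory database stand-in: store thoughts/observations, retrieve recent ones."""
    entries: list = field(default_factory=list)

    def store(self, entry: str) -> None:
        self.entries.append(entry)

    def retrieve(self, n: int = 50) -> list:
        return self.entries[-n:]

@dataclass
class GenerativeAgent:
    name: str
    personality: str  # virtual personality that makes agents distinguishable
    memory: Memory = field(default_factory=Memory)

    def perceive(self, street_view_caption: str) -> None:
        # Visual module stand-in: a text caption of the current street-view image,
        # stored so it can be retrieved later when planning movement.
        self.memory.store(f"saw: {street_view_caption}")

    def rate_scene(self, caption: str) -> dict:
        # Perceptual rating stand-in: a real agent would prompt an LLM with its
        # personality and retrieved memories; here a trivial keyword heuristic.
        safety = 1 if "well-lit" in caption else 0
        liveliness = 1 if "people" in caption else 0
        return {"safety": safety, "liveliness": liveliness}

    def plan_move(self, options: list) -> str:
        # Movement module stand-in: consult memory and prefer unvisited directions.
        visited = set(self.memory.retrieve())
        for option in options:
            if f"moved: {option}" not in visited:
                self.memory.store(f"moved: {option}")
                return option
        return options[0]
```

Because each observation and move is written to memory, the agent's "thought process" can later be queried by inspecting the stored entries, mirroring how the study retrieves findings from the agents' memory databases.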
Related papers
- Agent AI: Surveying the Horizons of Multimodal Interaction [83.18367129924997]
"Agent AI" is a class of interactive systems that can perceive visual stimuli, language inputs, and other environmentally-grounded data.
We envision a future where people can easily create any virtual reality or simulated scene and interact with agents embodied within the virtual environment.
arXiv Detail & Related papers (2024-01-07T19:11:18Z)
- Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations [62.48505112245388]
We take an in-depth look at the causal awareness of modern representations of agent interactions.
We show that recent representations are already partially resilient to perturbations of non-causal agents.
We propose a metric learning approach that regularizes latent representations with causal annotations.
arXiv Detail & Related papers (2023-12-07T18:57:03Z)
- Understanding Your Agent: Leveraging Large Language Models for Behavior Explanation [7.647395374489533]
We propose an approach to generate natural language explanations for an agent's behavior based only on observations of states and actions.
We show that our approach generates explanations as helpful as those produced by a human domain expert.
arXiv Detail & Related papers (2023-11-29T20:16:23Z)
- Embodied Agents for Efficient Exploration and Smart Scene Description [47.82947878753809]
We tackle a setting for visual navigation in which an autonomous agent needs to explore and map an unseen indoor environment.
We propose and evaluate an approach that combines recent advances in visual robotic exploration and image captioning.
Our approach can generate smart scene descriptions that maximize semantic knowledge of the environment and avoid repetitions.
arXiv Detail & Related papers (2023-01-17T19:28:01Z)
- I am Only Happy When There is Light: The Impact of Environmental Changes on Affective Facial Expressions Recognition [65.69256728493015]
We study the impact of different image conditions on the recognition of arousal from human facial expressions.
Our results show how the interpretation of human affective states can differ greatly in either the positive or negative direction.
arXiv Detail & Related papers (2022-10-28T16:28:26Z)
- MECCANO: A Multimodal Egocentric Dataset for Humans Behavior Understanding in the Industrial-like Domain [23.598727613908853]
We present MECCANO, a dataset of egocentric videos for studying human behavior understanding in industrial-like settings.
The multimodality is characterized by the presence of gaze signals, depth maps and RGB videos acquired simultaneously with a custom headset.
The dataset has been explicitly labeled for fundamental tasks in the context of human behavior understanding from a first person view.
arXiv Detail & Related papers (2022-09-19T00:52:42Z)
- What do navigation agents learn about their environment? [39.74076893981299]
We introduce the Interpretability System for Embodied agEnts (iSEE) for Point Goal and Object Goal navigation agents.
We use iSEE to probe the dynamic representations produced by these agents for the presence of information about the agent as well as the environment.
arXiv Detail & Related papers (2022-06-17T01:33:43Z)
- The Introspective Agent: Interdependence of Strategy, Physiology, and Sensing for Embodied Agents [51.94554095091305]
We argue for an introspective agent, which considers its own abilities in the context of its environment.
Just as in nature, we hope to reframe strategy as one tool, among many, to succeed in an environment.
arXiv Detail & Related papers (2022-01-02T20:14:01Z)
- Information is Power: Intrinsic Control via Information Capture [110.3143711650806]
We argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.
This objective induces an agent to both gather information about its environment, corresponding to reducing uncertainty, and to gain control over its environment, corresponding to reducing the unpredictability of future world states.
arXiv Detail & Related papers (2021-12-07T18:50:42Z)
- Imitating Interactive Intelligence [24.95842455898523]
We study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment.
To build agents that can robustly interact with humans, we would ideally train them while they interact with humans.
We use ideas from inverse reinforcement learning to reduce the disparities between human-human and agent-agent interactive behaviour.
arXiv Detail & Related papers (2020-12-10T13:55:47Z)
- Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning [24.163616087447874]
We introduce causal curiosity, a novel intrinsic reward.
We show that it allows our agents to learn optimal sequences of actions.
We also show that the knowledge of causal factor representations aids zero-shot learning for more complex tasks.
arXiv Detail & Related papers (2020-10-07T02:07:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.