Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey
- URL: http://arxiv.org/abs/2406.00252v3
- Date: Tue, 18 Jun 2024 04:22:39 GMT
- Title: Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey
- Authors: Bowen Jiang, Yangxinyu Xie, Xiaomeng Wang, Weijie J. Su, Camillo J. Taylor, Tanwi Mallick,
- Abstract summary: Rationality is the quality of being guided by reason, characterized by logical thinking and decision-making that align with evidence and logical rules.
Recent research attempts to leverage the strength of multiple agents working collaboratively with various types of data and tools for enhanced consistency and reliability.
This paper aims to understand whether multi-modal and multi-agent systems are advancing toward rationality by surveying the state-of-the-art works.
- Score: 24.11852705458977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rationality is the quality of being guided by reason, characterized by logical thinking and decision-making that align with evidence and logical rules. This quality is essential for effective problem-solving, as it ensures that solutions are well-founded and systematically derived. Despite the advancements of large language models (LLMs) in generating human-like text with remarkable accuracy, they present biases inherited from the training data, inconsistency across different contexts, and difficulty understanding complex scenarios involving multiple layers of context. Therefore, recent research attempts to leverage the strength of multiple agents working collaboratively with various types of data and tools for enhanced consistency and reliability. To that end, this paper aims to understand whether multi-modal and multi-agent systems are advancing toward rationality by surveying the state-of-the-art works, identifying advancements over single-agent and single-modal systems in terms of rationality, and discussing open problems and future directions. We maintain an open repository at https://github.com/bowen-upenn/MMMA_Rationality.
Related papers
- POGEMA: A Benchmark Platform for Cooperative Multi-Agent Navigation [76.67608003501479]
We introduce and specify an evaluation protocol defining a range of domain-related metrics computed on the basics of the primary evaluation indicators.
The results of such a comparison, which involves a variety of state-of-the-art MARL, search-based, and hybrid methods, are presented.
arXiv Detail & Related papers (2024-07-20T16:37:21Z) - Multi-step Inference over Unstructured Data [2.169874047093392]
High-stakes decision-making tasks in fields such as medical, legal and finance require a level of precision, comprehensiveness, and logical consistency.
We have developed a neuro-symbolic AI platform to tackle these problems.
The platform integrates fine-tuned LLMs for knowledge extraction and alignment with a robust symbolic reasoning engine.
arXiv Detail & Related papers (2024-06-26T00:00:45Z) - MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning [3.651416979200174]
MMCTAgent is a novel critical thinking agent framework designed to address the inherent limitations of current MLLMs in complex visual reasoning tasks.
Inspired by human cognitive processes and critical thinking, MMCTAgent iteratively analyzes multi-modal information, decomposes queries, plans strategies, and dynamically evolves its reasoning.
arXiv Detail & Related papers (2024-05-28T16:55:41Z) - Cantor: Inspiring Multimodal Chain-of-Thought of MLLM [83.6663322930814]
We argue that converging visual context acquisition and logical reasoning is pivotal for tackling visual reasoning tasks.
We propose an innovative multimodal CoT framework, termed Cantor, characterized by a perception-decision architecture.
Our experiments demonstrate the efficacy of the proposed framework, showing significant improvements in multimodal CoT performance.
arXiv Detail & Related papers (2024-04-24T17:59:48Z) - Large Multimodal Agents: A Survey [78.81459893884737]
Large language models (LLMs) have achieved superior performance in powering text-based AI agents.
There is an emerging research trend focused on extending these LLM-powered AI agents into the multimodal domain.
This review aims to provide valuable insights and guidelines for future research in this rapidly evolving field.
arXiv Detail & Related papers (2024-02-23T06:04:23Z) - A Survey on Context-Aware Multi-Agent Systems: Techniques, Challenges
and Future Directions [1.1458366773578277]
Research interest in autonomous agents is on the rise as an emerging topic.
The challenge lies in enabling these agents to learn, reason, and navigate uncertainties in dynamic environments.
Context awareness emerges as a pivotal element in fortifying multi-agent systems.
arXiv Detail & Related papers (2024-02-03T00:27:22Z) - Responsible Emergent Multi-Agent Behavior [2.9370710299422607]
State of the art in Responsible AI has ignored one crucial point: human problems are multi-agent problems.
From driving in traffic to negotiating economic policy, human problem-solving involves interaction and the interplay of the actions and motives of multiple individuals.
This dissertation develops the study of responsible emergent multi-agent behavior.
arXiv Detail & Related papers (2023-11-02T21:37:32Z) - On the Complexity of Multi-Agent Decision Making: From Learning in Games
to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z) - MMRNet: Improving Reliability for Multimodal Object Detection and
Segmentation for Bin Picking via Multimodal Redundancy [68.7563053122698]
We propose a reliable object detection and segmentation system with MultiModal Redundancy (MMRNet)
This is the first system that introduces the concept of multimodal redundancy to address sensor failure issues during deployment.
We present a new label-free multi-modal consistency (MC) score that utilizes the output from all modalities to measure the overall system output reliability and uncertainty.
arXiv Detail & Related papers (2022-10-19T19:15:07Z) - CausalCity: Complex Simulations with Agency for Causal Discovery and
Reasoning [68.74447489372037]
We present a high-fidelity simulation environment that is designed for developing algorithms for causal discovery and counterfactual reasoning.
A core component of our work is to introduce textitagency, such that it is simple to define and create complex scenarios.
We perform experiments with three state-of-the-art methods to create baselines and highlight the affordances of this environment.
arXiv Detail & Related papers (2021-06-25T00:21:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.