AI Olympics challenge with Evolutionary Soft Actor Critic
- URL: http://arxiv.org/abs/2409.01104v2
- Date: Mon, 28 Oct 2024 09:10:11 GMT
- Title: AI Olympics challenge with Evolutionary Soft Actor Critic
- Authors: Marco Calì, Alberto Sinigaglia, Niccolò Turcato, Ruggero Carli, Gian Antonio Susto
- Abstract summary: Our solution is based on a Model-free Deep Reinforcement Learning approach combined with an evolutionary strategy.
We will briefly describe the algorithms that have been used and then provide details of the approach.
- Score: 5.076263094490715
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the following report, we describe the solution we propose for the AI Olympics competition held at IROS 2024. Our solution is based on a Model-free Deep Reinforcement Learning approach combined with an evolutionary strategy. We will briefly describe the algorithms that have been used and then provide details of the approach.
Related papers
- $EvoAl^{2048}$ [2.5526759890882764]
We report on applying a model-driven optimisation to search for an interpretable and explainable policy.
This paper describes a solution to the GECCO'24 Interpretable Control Competition using the open-source software EvoAl.
arXiv Detail & Related papers (2024-08-15T21:06:18Z)
- Unleashing Artificial Cognition: Integrating Multiple AI Systems [2.402818676870194]
We present an innovative fusion of language models and query analysis techniques to unlock cognition in artificial intelligence.
The introduced open-source AI system seamlessly integrates a Chess engine with a language model, enabling it to predict moves and provide strategic explanations.
Our system holds promise for diverse applications, from medical diagnostics to financial forecasting.
arXiv Detail & Related papers (2024-08-09T07:36:30Z)
- Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209]
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks.
We show that the learned AI control system demonstrates robustness against adversarial tampering.
In a training / learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z)
- Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning [55.65738319966385]
We propose a novel online algorithm, iterative Nash policy optimization (INPO).
Unlike previous methods, INPO bypasses the need for estimating the expected win rate for individual responses.
With an LLaMA-3-8B-based SFT model, INPO achieves a 42.6% length-controlled win rate on AlpacaEval 2.0 and a 37.8% win rate on Arena-Hard.
arXiv Detail & Related papers (2024-06-30T08:00:34Z)
- OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI [73.75520820608232]
We introduce OlympicArena, which includes 11,163 bilingual problems across both text-only and interleaved text-image modalities.
These challenges encompass a wide range of disciplines spanning seven fields and 62 international Olympic competitions, rigorously examined for data leakage.
Our evaluations reveal that even advanced models like GPT-4o only achieve a 39.97% overall accuracy, illustrating current AI limitations in complex reasoning and multimodal integration.
arXiv Detail & Related papers (2024-06-18T16:20:53Z)
- ALYMPICS: LLM Agents Meet Game Theory -- Exploring Strategic Decision-Making with AI Agents [77.34720446306419]
Alympics is a systematic simulation framework utilizing Large Language Model (LLM) agents for game theory research.
Alympics creates a versatile platform for studying complex game theory problems.
arXiv Detail & Related papers (2023-11-06T16:03:46Z)
- IndigoVX: Where Human Intelligence Meets AI for Optimal Decision Making [0.0]
This paper defines a new approach for augmenting human intelligence with AI for optimal goal solving.
Our proposed AI, Indigo, is an acronym for Informed Numerical Decision-making through Iterative Goal-Oriented optimization.
We envisage this method being applied to games or business strategies, with the human providing strategic context and the AI offering optimal, data-driven moves.
arXiv Detail & Related papers (2023-07-21T11:54:53Z)
- A Solution to CVPR'2023 AQTC Challenge: Video Alignment for Multi-Step Inference [51.26551806938455]
Affordance-centric Question-driven Task Completion (AQTC) for Egocentric Assistant introduces a groundbreaking scenario.
We present a solution for enhancing video alignment to improve multi-step inference.
Our method secured 2nd place in the CVPR'2023 AQTC challenge.
arXiv Detail & Related papers (2023-06-26T04:19:33Z)
- Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition [20.922425732605756]
The BASALT challenge asks teams to compete to develop algorithms to solve tasks with hard-to-specify reward functions in Minecraft.
We describe the competition and provide an overview of the top solutions.
arXiv Detail & Related papers (2023-03-23T17:59:17Z)
- Retrospective on the 2021 BASALT Competition on Learning from Human Feedback [92.37243979045817]
The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks.
Rather than mandating the use of LfHF techniques, we described four tasks in natural language to be accomplished in the video game Minecraft.
Teams developed a diverse range of LfHF algorithms across a variety of possible human feedback types.
arXiv Detail & Related papers (2022-04-14T17:24:54Z)
- The First AI4TSP Competition: Learning to Solve Stochastic Routing Problems [10.388013100067266]
This paper reports on the first international competition on AI for the traveling salesman problem (TSP) at the International Joint Conference on Artificial Intelligence 2021 (IJCAI-21).
The competition asked the participants to develop algorithms to solve a time-dependent orienteering problem with stochastic weights and time windows (TD-OPSWTW).
The winning methods described in this work have advanced the state of the art in using AI for routing problems.
arXiv Detail & Related papers (2022-01-25T16:55:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.