Fugu-MT Paper Translation (Summary): Playing a 2D Game Indefinitely using NEAT and Reinforcement Learning

Paper summary: Playing a 2D Game Indefinitely using NEAT and Reinforcement Learning

arxiv url: http://arxiv.org/abs/2207.14140v1
Date: Thu, 28 Jul 2022 15:01:26 GMT
Status: Translation complete
Updated in system: 2022-07-29 12:05:19.812437
Title: Playing a 2D Game Indefinitely using NEAT and Reinforcement Learning
Title (reference translation): Indefinite 2D Game Play Using NEAT and Reinforcement Learning
Authors: Jerin Paul Selvan, Pravin S. Game
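Abstract summary: The performance of algorithms can be compared by using artificial agents that behave according to each algorithm in the environment they are placed in. The algorithms applied to the artificial agents are NeuroEvolution of Augmenting Topologies (NEAT) and Reinforcement Learning.
Reference score (independently calculated attention level): 0.0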
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: For over a decade now, robotics and the use of artificial agents have become a common thing. Testing the performance of new path finding or search space optimization algorithms has also become a challenge as they require simulation or an environment to test them. The creation of artificial environments with artificial agents is one of the methods employed to test such algorithms. Games have also become an environment to test them. The performance of the algorithms can be compared by using artificial agents that will behave according to the algorithm in the environment they are put in. The performance parameters can be, how quickly the agent is able to differentiate between rewarding actions and hostile actions. This can be tested by placing the agent in an environment with different types of hurdles and the goal of the agent is to reach the farthest by taking decisions on actions that will lead to avoiding all the obstacles. The environment chosen is a game called "Flappy Bird". The goal of the game is to make the bird fly through a set of pipes of random heights. The bird must go in between these pipes and must not hit the top, the bottom, or the pipes themselves. The actions that the bird can take are either to flap its wings or drop down with gravity. The algorithms that are enforced on the artificial agents are NeuroEvolution of Augmenting Topologies (NEAT) and Reinforcement Learning. The NEAT algorithm takes an "N" initial population of artificial agents. They follow genetic algorithms by considering an objective function, crossover, mutation, and augmenting topologies. Reinforcement learning, on the other hand, remembers the state, the action taken at that state, and the reward received for the action taken using a single agent and a Deep Q-learning Network. The performance of the NEAT algorithm improves as the initial population of the artificial agents is increased.
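As a rough illustration of the "remembers the state, the action taken at that state, and the reward received" part of the abstract, the following is a minimal Python sketch of an experience replay memory and the standard Q-learning target. It is not the authors' implementation: the class and function names, the buffer capacity, the discount factor gamma, and the example state tuples and reward values are all assumptions made for illustration, and the neural network that would supply the next-state Q-values is left out.

```python
import random
from collections import deque, namedtuple

# One remembered step: state, action taken, reward received, next state, terminal flag.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)

# The two actions named in the abstract (the integer encoding is an assumption).
FALL, FLAP = 0, 1


class ReplayMemory:
    """Fixed-size buffer of past transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def remember(self, state, action, reward, next_state, done):
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement, capped at the current buffer size.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


def q_target(transition, q_values_next, gamma=0.99):
    """Q-learning target: r if the episode ended, else r + gamma * max_a Q(s', a)."""
    if transition.done:
        return transition.reward
    return transition.reward + gamma * max(q_values_next)


if __name__ == "__main__":
    memory = ReplayMemory(capacity=1000)
    # Hypothetical state: (horizontal distance to next pipe, vertical offset to gap).
    memory.remember((0.8, 0.1), FLAP, 1.0, (0.7, 0.0), False)
    memory.remember((0.7, 0.0), FALL, -1.0, (0.6, -0.2), True)
    for t in memory.sample(2):
        # Placeholder Q-values for the next state; a trained network would supply these.
        print(t.action, q_target(t, q_values_next=[0.2, 0.5]))
```

In a full Deep Q-learning setup, the sampled transitions and their targets would be used to fit the network; that training step is omitted here because the abstract does not describe it.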

Related papers

REBEL: Reward Regularization-Based Approach for Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
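Mismatches between reward functions and human preferences can lead to catastrophic outcomes in the real world. Recent methods aim to mitigate this misalignment by learning reward functions from human preferences. This paper proposes a novel concept of reward regularization within the robotic RLHF framework.
Paper reference translation (metadata) (2023-12-22T04:56:37Z)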
AI planning in the imagination: High-level planning on learned abstract search spaces [68.75684174531962]
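We propose a new method, called PiZero, that enables an agent to plan in an abstract search space that the agent learns during training. We evaluate it in multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
Paper reference translation (metadata) (2023-08-16T22:47:16Z)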
Achieving mouse-level strategic evasion performance using real-time computational planning [59.60094442546867]
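Planning is a special ability in which the brain imagines and then enacts possible futures. Building on research into how animal ecology governs the value of spatial planning, we develop TLPPO, a more efficient, biologically inspired planning algorithm. We compare the performance of a real-time agent using TLPPO with the performance of live mice, both tasked with evading a robot predator.
Paper reference translation (metadata) (2022-11-04T18:34:36Z)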
A neural net architecture based on principles of neural plasticity and development evolves to effectively catch prey in a simulated environment [2.834895018689047]
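A major challenge for A-Life is to build agents whose behavior is deeply "life-like". This paper proposes an architecture and approach for constructing the networks that drive artificial agents, using processes similar to those that build animal brains. We believe this architecture may be useful for controlling small autonomous robots or drones because it allows a rapid response to changes in sensor inputs.
Paper reference translation (metadata) (2022-01-28T05:10:56Z)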
Mimicking Playstyle by Adapting Parameterized Behavior Trees in RTS Games [0.0]
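Behavior trees (BTs) have influenced the field of artificial intelligence (AI) in games. Their widespread use, however, has made the complexity of handcrafted BTs barely tractable and error-prone. A recent trend in the field focuses on the automatic creation of AI agents. This paper proposes a semi-automatic method for constructing AI agents that mimic and generalize human gameplay.
Paper reference translation (metadata) (2021-11-23T20:36:28Z)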
XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision Trees [55.9643422180256]
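This paper proposes a sensor-based learning navigation algorithm for computing collision-free trajectories for a robot in dense, dynamic environments. Our approach uses an expert policy based on deep reinforcement learning, trained using a sim2real paradigm. We highlight the benefits of the algorithm in simulated environments and in navigating a Clearpath Jackal robot among moving pedestrians.
Paper reference translation (metadata) (2021-04-22T01:33:10Z)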
Learning What To Do by Simulating the Past [76.86449554580291]
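We show that, by combining a learned feature encoder with a learned inverse model, an agent can simulate human actions backward in time to infer what it should do. The resulting algorithm can reproduce a specific skill in MuJoCo environments given a single state sampled from the optimal policy for that skill.
Paper reference translation (metadata) (2021-04-08T17:43:29Z)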
PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics [89.81550748680245]
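We introduce a new differentiable physics benchmark called PlasticineLab. In each task, the agent uses manipulators to deform plasticine into a desired configuration. We evaluate existing reinforcement learning (RL) methods and gradient-based methods on this benchmark.
Paper reference translation (metadata) (2021-04-07T17:59:23Z)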
Learning Human Rewards by Inferring Their Latent Intelligence Levels in Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data [18.750834997334664]
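We argue that humans are boundedly rational and have different intelligence levels when reasoning about the decision-making processes of others. We propose a novel multi-agent inverse reinforcement learning framework that infers the latent intelligence levels of humans during learning.
Paper reference translation (metadata) (2021-03-07T07:48:31Z)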
Accelerated Sim-to-Real Deep Reinforcement Learning: Learning Collision Avoidance from Human Player [5.960346570280513]
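This paper proposes a sensor-level, mapless collision avoidance algorithm for use on mobile robots. We propose an efficient learning strategy that lets the robot learn from both human experience data and self-exploration data. A game-format simulation framework is designed so that a human player can tele-operate the mobile robot to a goal.
Paper reference translation (metadata) (2021-02-21T23:27:34Z)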
Generating Human-Like Movement: A Comparison Between Two Approaches Based on Environmental Features [4.511923587827301]
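Two novel algorithms are presented for generating human-like trajectories based on environmental features. Human-likeness is tested by human experts who judge whether the final generated trajectories are realistic. We show that, despite generating trajectories closer to real ones according to predefined criteria, the feature-based A* algorithm is less time-efficient than the attraction-based A* algorithm.
Paper reference translation (metadata) (2020-12-11T16:45:32Z)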
Never Give Up: Learning Directed Exploration Strategies [63.19616370038824]
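We propose a reinforcement learning agent that solves hard exploration games by learning a range of directed exploratory policies. We construct an episodic-memory-based intrinsic reward using k-nearest neighbors over the agent's recent experience to train the exploratory policies. A self-supervised inverse dynamics model is used to train the embeddings of the nearest-neighbor lookup, biasing the novelty signal toward what the agent can control.
Paper reference translation (metadata) (2020-02-14T13:57:22Z)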

The related papers list is generated automatically from the titles and abstracts of papers on this site.
