Related papers: Trajectory Planning for UAV-Based Smart Farming Using Imitation-Based Triple Deep Q-Learning

URL: http://arxiv.org/abs/2512.18604v1
Date: Sun, 21 Dec 2025 05:30:19 GMT
Title: Trajectory Planning for UAV-Based Smart Farming Using Imitation-Based Triple Deep Q-Learning
Authors: Wencan Mao, Quanxi Zhou, Tomas Couso Coddou, Manabu Tsukada, Yunling Liu, Yusheng Ji
Abstract summary: We formulate the trajectory planning problem as a Markov decision process (MDP) and leverage multi-agent reinforcement learning (MARL) to solve it. We propose a novel imitation-based triple deep Q-network (ITDQN) algorithm, which employs an elite imitation mechanism to reduce exploration costs. Our proposed ITDQN outperforms DDQN by 4.43% in weed recognition rate and 6.94% in data collection rate.
Score: 5.160399918845654
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Unmanned aerial vehicles (UAVs) have emerged as a promising auxiliary platform for smart agriculture, capable of simultaneously performing weed detection, recognition, and data collection from wireless sensors. However, trajectory planning for UAV-based smart agriculture is challenging due to the high uncertainty of the environment, partial observations, and limited battery capacity of UAVs. To address these issues, we formulate the trajectory planning problem as a Markov decision process (MDP) and leverage multi-agent reinforcement learning (MARL) to solve it. Furthermore, we propose a novel imitation-based triple deep Q-network (ITDQN) algorithm, which employs an elite imitation mechanism to reduce exploration costs and utilizes a mediator Q-network over a double deep Q-network (DDQN) to accelerate and stabilize training and improve performance. Experimental results in both simulated and real-world environments demonstrate the effectiveness of our solution. Moreover, our proposed ITDQN outperforms DDQN by 4.43% in weed recognition rate and 6.94% in data collection rate.
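The abstract compares ITDQN against a double deep Q-network (DDQN) baseline. As a rough illustration of that baseline only, the sketch below computes standard Double DQN learning targets, where the online network selects the next action and the target network evaluates it. It does not reproduce the paper's mediator Q-network or elite imitation mechanism, whose details are not given here. The function name `ddqn_targets`, its parameters, and the plain-list Q-value representation are assumptions made for this example.

```python
def ddqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Standard Double DQN targets (hypothetical helper, not the paper's ITDQN).

    rewards:        list of per-transition rewards
    dones:          list of booleans, True if the episode ended on that transition
    q_online_next:  per-transition list of online-network Q-values for the next state
    q_target_next:  per-transition list of target-network Q-values for the next state
    gamma:          discount factor
    """
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        # Action selection uses the online network.
        best_action = max(range(len(q_on)), key=lambda a: q_on[a])
        # Action evaluation uses the target network; no bootstrap at terminal states.
        bootstrap = 0.0 if done else gamma * q_tg[best_action]
        targets.append(r + bootstrap)
    return targets


example = ddqn_targets(
    rewards=[1.0, 0.0],
    dones=[False, True],
    q_online_next=[[1.0, 2.0], [3.0, 0.0]],
    q_target_next=[[10.0, 20.0], [30.0, 40.0]],
    gamma=0.5,
)
```

Decoupling action selection from action evaluation is what distinguishes Double DQN from a plain DQN target, which takes the maximum over the target network's own Q-values.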

Related papers

Blockchain-Enabled Routing for Zero-Trust Low-Altitude Intelligent Networks [77.17664010626726]
We focus on routing with multiple UAV clusters in low-altitude intelligent networks (LAINs). To minimize the damage caused by potential threats, we present a zero-trust architecture built on software-defined perimeter and blockchain techniques. We show that the proposed framework reduces the average E2E delay by 59% and improves the TSR by 29% on average compared to benchmarks.
arXiv Detail & Related papers (2026-02-27T04:30:35Z)
Hierarchical Task Offloading and Trajectory Optimization in Low-Altitude Intelligent Networks Via Auction and Diffusion-based MARL [37.79695337425523]
Low-altitude intelligent networks (LAINs) can support mission-critical applications such as disaster response, environmental monitoring, and real-time sensing. These systems face key challenges, including energy-constrained UAVs, task arrivals, and heterogeneous computing resources. We propose an integrated air-ground collaborative network and formulate a time-dependent integer nonlinear programming problem that jointly optimizes UAV trajectory planning and task offloading decisions.
arXiv Detail & Related papers (2025-12-05T08:14:45Z)
AerialMind: Towards Referring Multi-Object Tracking in UAV Scenarios [64.51320327698231]
We introduce AerialMind, the first large-scale RMOT benchmark in UAV scenarios. We develop an innovative semi-automated collaborative agent-based labeling assistant framework. We also propose HawkEyeTrack, a novel method that collaboratively enhances vision-language representation learning.
arXiv Detail & Related papers (2025-11-26T04:44:27Z)
AirFed: Federated Graph-Enhanced Multi-Agent Reinforcement Learning for Multi-UAV Cooperative Mobile Edge Computing [21.4907371859268]
Multiple Unmanned Aerial Vehicles (UAVs) cooperative Mobile Edge Computing (MEC) systems face critical challenges in coordinating trajectory planning, task offloading, and resource allocation. Existing approaches suffer from limited scalability, slow convergence, and inefficient knowledge sharing among UAVs. This paper proposes AirFed, a novel federated graph-enhanced multi-agent reinforcement learning framework.
arXiv Detail & Related papers (2025-10-27T06:31:35Z)
Low-altitude UAV Friendly-Jamming for Satellite-Maritime Communications via Generative AI-enabled Deep Reinforcement Learning [72.23178920029957]
This paper presents a satellite-maritime communication system assisted by low-altitude unmanned aerial vehicle (UAV) friendly-jamming. We formulate a secure satellite-maritime communication multi-objective optimization problem (SSMCMOP). To solve this dynamic, long-term optimization problem, we reformulate it as a Markov decision process. We then propose a transformer-enhanced soft actor-critic (TransSAC) algorithm, which is a generative artificial intelligence-enabled deep reinforcement learning approach.
arXiv Detail & Related papers (2025-01-26T10:13:51Z)
Task Delay and Energy Consumption Minimization for Low-altitude MEC via Evolutionary Multi-objective Deep Reinforcement Learning [52.64813150003228]
The low-altitude economy (LAE), driven by unmanned aerial vehicles (UAVs) and other aircraft, has revolutionized fields such as transportation, agriculture, and environmental monitoring. In the upcoming sixth-generation (6G) era, UAV-assisted mobile edge computing (MEC) is particularly crucial in challenging environments such as mountainous or disaster-stricken areas. The task offloading problem is one of the key issues in UAV-assisted MEC, primarily addressing the trade-off between minimizing the task delay and the energy consumption of the UAV.
arXiv Detail & Related papers (2025-01-11T02:32:42Z)
Deep Reinforcement Learning for Task Offloading in UAV-Aided Smart Farm Networks [3.6118662460334527]
We introduce a Deep Q-Learning (DQL) approach to solve this multi-objective problem. We show that our proposed DQL-based method achieves comparable results when it comes to the UAVs' remaining battery levels and percentage of deadline violations.
arXiv Detail & Related papers (2022-09-15T15:29:57Z)
Trajectory Design for UAV-Based Internet-of-Things Data Collection: A Deep Reinforcement Learning Approach [93.67588414950656]
In this paper, we investigate an unmanned aerial vehicle (UAV)-assisted Internet-of-Things (IoT) system in a 3D environment. We present a TD3-based trajectory design for completion time minimization (TD3-TDCTM) algorithm. Our simulation results show the superiority of the proposed TD3-TDCTM algorithm over three conventional non-learning based baseline methods.
arXiv Detail & Related papers (2021-07-23T03:33:29Z)
UAV Path Planning for Wireless Data Harvesting: A Deep Reinforcement Learning Approach [18.266087952180733]
We propose a new end-to-end reinforcement learning approach to UAV-enabled data collection from Internet of Things (IoT) devices. An autonomous drone is tasked with gathering data from distributed sensor nodes subject to limited flying time and obstacle avoidance. We show that our proposed network architecture enables the agent to make movement decisions for a variety of scenario parameters.
arXiv Detail & Related papers (2020-07-01T15:14:16Z)
Simultaneous Navigation and Radio Mapping for Cellular-Connected UAV with Deep Reinforcement Learning [46.55077580093577]
How to achieve ubiquitous 3D communication coverage for UAVs in the sky is a new challenge. We propose a new coverage-aware navigation approach, which exploits the UAV's controllable mobility to design its navigation/trajectory. We propose a new framework called simultaneous navigation and radio mapping (SNARM), where the UAV's signal measurement is used to train the deep Q network.
arXiv Detail & Related papers (2020-03-17T08:16:14Z)
Data Freshness and Energy-Efficient UAV Navigation Optimization: A Deep Reinforcement Learning Approach [88.45509934702913]
We design a navigation policy for multiple unmanned aerial vehicles (UAVs) where mobile base stations (BSs) are deployed. We incorporate different contextual information such as energy and age of information (AoI) constraints to ensure the data freshness at the ground BS. By applying the proposed trained model, an effective real-time trajectory policy for the UAV-BSs captures the observable network states over time.
arXiv Detail & Related papers (2020-02-21T07:29:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.