Lightweight Reinforcement Learning for Real-Time Lunar Landing Control
1  University of Europe for Applied Sciences, 14469 Potsdam Campus, Germany
2  School of Management Sciences, Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi, 23460, Pakistan
Academic Editor: Lucia Billeci

Abstract:

Autonomous lunar probe landing presents a complex control challenge due to limited sensor feedback, delayed communications, and dynamic terrain conditions. In this work, we present a lightweight and optimized reinforcement learning solution using a Deep Q-Network (DQN) agent trained on the LunarLander-v3 environment from the Gymnasium library. Our aim is to develop a model capable of precise, resource-efficient landings under constrained simulation settings.

The agent interacts with an 8-dimensional state space and four discrete actions, learning through experience replay and an ε-greedy policy. We systematically evaluated the impact of neural-network architecture (Tiny, Base, Wide, Deep) and conducted extensive hyperparameter tuning via grid search across learning rates, discount factors, and soft update rates. The best-performing configuration (the 128-128 Wide architecture with learning rate 0.0005, discount factor 0.99, and soft update rate 0.01) achieved an average reward of 262.89 after 355.98 seconds of training.
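The components named above (experience replay, ε-greedy action selection, and soft target updates under the reported best hyperparameters) can be sketched as follows. This is an illustrative NumPy stand-in for the 128-128 Wide network and its training loop, not the authors' implementation; the random transitions are placeholders for real LunarLander-v3 rollouts.

```python
import random
from collections import deque

import numpy as np

# Reported best hyperparameters: learning rate 0.0005, discount 0.99,
# soft update rate 0.01. EPSILON here is an assumed exploration rate.
STATE_DIM, N_ACTIONS = 8, 4
LR, GAMMA, TAU, EPSILON = 5e-4, 0.99, 0.01, 0.1

rng = np.random.default_rng(0)

def init_net():
    """Weights for an 8 -> 128 -> 128 -> 4 fully connected network."""
    return [rng.normal(0.0, 0.1, shape) for shape in
            [(STATE_DIM, 128), (128, 128), (128, N_ACTIONS)]]

def q_values(net, state):
    """Forward pass with ReLU hidden activations."""
    h = np.maximum(state @ net[0], 0.0)
    h = np.maximum(h @ net[1], 0.0)
    return h @ net[2]

def epsilon_greedy(net, state, eps=EPSILON):
    """Explore with probability eps; otherwise act greedily on Q-values."""
    if rng.random() < eps:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(net, state)))

def soft_update(target, online, tau=TAU):
    """Polyak averaging of target weights toward the online network."""
    return [tau * w_o + (1.0 - tau) * w_t for w_t, w_o in zip(target, online)]

# Experience replay: store transitions, sample uncorrelated minibatches.
replay = deque(maxlen=100_000)
online_net, target_net = init_net(), init_net()

state = rng.normal(size=STATE_DIM)
for _ in range(64):
    action = epsilon_greedy(online_net, state)
    # Placeholder transition; a real agent would step LunarLander-v3 here.
    next_state, reward, done = rng.normal(size=STATE_DIM), rng.normal(), False
    replay.append((state, action, reward, next_state, done))
    state = next_state

batch = random.sample(list(replay), k=32)  # minibatch for a gradient step
target_net = soft_update(target_net, online_net)
print(len(batch), len(replay))  # 32 64
```

The soft update blends only a fraction τ = 0.01 of the online weights into the target network per step, which stabilizes the bootstrapped Q-targets.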

Final testing revealed that training to a reward threshold of 250 yields a 93% landing success rate, outperforming both under-trained and over-trained agents in efficiency and generalization. This result was validated across 100 test episodes, confirming consistent, high-accuracy autonomous landing behavior.
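The 100-episode validation above amounts to counting episodes whose total reward clears a success threshold. A minimal sketch, assuming a per-episode success cutoff of 200 (the standard "solved" score for LunarLander; the paper's exact criterion may differ) and stubbed episode rewards in place of real rollouts:

```python
import random

# Stubbed per-episode returns; a real evaluation would run the trained
# agent for 100 episodes and record each episode's total reward.
random.seed(1)
episode_rewards = [random.gauss(260, 60) for _ in range(100)]

# An episode counts as a successful landing when its return reaches 200
# (assumed threshold, not taken from the paper).
successes = sum(r >= 200 for r in episode_rewards)
success_rate = successes / len(episode_rewards)
print(f"{success_rate:.0%} landings succeeded over 100 test episodes")
```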

Our findings highlight the viability of deploying lightweight, well-tuned DQN agents for real-time lunar landing scenarios. The proposed approach serves as a scalable blueprint for future space robotics systems, bridging the gap between simulation and real-world feasibility. Future work will incorporate terrain complexity and uncertainty modeling to extend robustness in dynamic planetary environments.

Keywords: Reinforcement Learning; Deep Q-Networks; Autonomous Systems; Lunar Lander; Neural Networks; Gymnasium Environment; Space Robotics; Planetary Landing; AI-based Control Systems