BME Explores Autonomous Glider Control Using Reinforcement Learning

This user story showcases how Dr. Miklós E. Mincsovics, an Assistant Professor at BME, applied reinforcement learning in MATLAB to optimize glider dynamics. Overcoming practical hurdles, he trained agents that achieved an energy-efficient flight strategy and inspired students with real-world projects.

Imagine a glider soaring through the sky, its wings subtly adjusting to optimize flight. But what if the glider could learn to do this autonomously, navigating complex maneuvers and extending its flight distance without human intervention? Machine learning, in particular reinforcement learning, has opened up new possibilities for such advancements. As a mathematician fascinated by the theory underlying machine learning, I focused on exploring its practical applications, particularly in areas where control and optimization intersect with real-world dynamics.

As a lecturer at the Budapest University of Technology and Economics (BME), I am always on the lookout for complex, real-world problems that my students can explore in my differential equations course. Two years ago, I encountered Zhukovski’s two-dimensional glider equations, a model describing glider flight dynamics. The model became even more intriguing when I considered an extension: what if we could adjust the glider’s wing angle in real time? This raised a host of fascinating questions, such as how to control the wing to perform a loop, land at a precise location, or maximize flight distance. When Dr. Ákos Koppány Kiss suggested using reinforcement learning with MATLAB to tackle these challenges, I eagerly began exploring this approach.
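For readers who want to experiment, the sketch below shows a minimal MATLAB implementation of one common nondimensionalized form of such a glider model; the exact equations, the drag parameter R, and the wing-control term u are illustrative assumptions, not the project’s actual formulation.

    % Simulate an uncontrolled flight (u = 0) of the glider model defined
    % below, starting from the state [v; theta; x; y].
    R  = 0.1;               % assumed drag coefficient
    s0 = [1.2; 0; 0; 10];   % initial speed, path angle, and position
    [~, s] = ode45(@(t, s) gliderDynamics(t, s, R, 0), [0 50], s0);
    plot(s(:,3), s(:,4)); xlabel('x'); ylabel('y'); title('Glider trajectory');

    % One common nondimensionalized form of the two-dimensional glider model:
    % v is airspeed, theta the flight-path angle, and u a hypothetical control
    % standing in for the real-time wing adjustment described above.
    function dsdt = gliderDynamics(~, s, R, u)
        v = s(1); theta = s(2);
        dv     = -sin(theta) - R*v^2;           % gravity and drag along the path
        dtheta = ((1 + u)*v^2 - cos(theta))/v;  % lift (modulated by u) vs. gravity
        dx     = v*cos(theta);                  % horizontal motion
        dy     = v*sin(theta);                  % vertical motion
        dsdt   = [dv; dtheta; dx; dy];
    end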

Challenge 

The potential of reinforcement learning was clear, but how to apply it was far from straightforward. Despite its theoretical elegance, translating machine learning principles into practical solutions involves technical problems that textbooks rarely address.

First, I had limited hands-on experience using MATLAB toolboxes for reinforcement learning, so getting started was daunting. However, MATLAB’s user-friendly environment provided an accessible entry point and a solid foundation upon which to build. 
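To give a flavor of that entry point, here is a hedged sketch of the general Reinforcement Learning Toolbox workflow; glideStep and glideReset are hypothetical environment functions the reader would supply (for example, by integrating the dynamics sketched above), and all parameter values are placeholders rather than the project’s actual settings.

    % Minimal Reinforcement Learning Toolbox workflow (illustrative sketch).
    % Observation: the glider state [v; theta; x; y]; action: a scalar wing
    % control bounded to [-1, 1].
    obsInfo = rlNumericSpec([4 1]);
    actInfo = rlNumericSpec([1 1], 'LowerLimit', -1, 'UpperLimit', 1);

    % glideStep must return the next observation, a reward, an episode-done
    % flag, and logged signals; glideReset returns an initial observation.
    env = rlFunctionEnv(obsInfo, actInfo, @glideStep, @glideReset);

    % A default continuous-action agent; DDPG is one reasonable choice here.
    agent = rlDDPGAgent(obsInfo, actInfo);

    % Train with placeholder stopping criteria.
    trainOpts = rlTrainingOptions('MaxEpisodes', 500, ...
        'MaxStepsPerEpisode', 1000, ...
        'StopTrainingCriteria', 'AverageReward', 'StopTrainingValue', 100);
    trainStats = train(agent, env, trainOpts);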

Then came the need to formalize our goals numerically. Unlike simple classroom examples, “fly as far as possible” is too vague a directive for an AI agent. The objective had to be precise and quantifiable for the agent to adapt its actions accordingly. Defining this goal became a crucial part of the learning process. 

The final and perhaps most significant challenge was computational power. Training an agent is computationally intensive, especially when the glider’s learning process requires extended simulations for longer flights. Running these models on my laptop quickly proved inefficient. I had to rethink the problem to find a strategy that didn’t rely solely on computational brute force. 

Solution 

These challenges highlighted the need to gather insights about the model and deduce an optimal strategy rather than rely purely on brute force. Our first breakthrough was the discovery of a specific velocity-angle pairing that allowed the glider to fly in a stable, straight line without unnecessary adjustments. This insight enabled us to divide the task into subtasks.

The first subtask was guiding the glider into this steady state as quickly as possible, which provided a structured, quantifiable goal for the agent. After the glider achieved this state, it was allowed to continue flying in a steady state until it approached a near-zero altitude (y = 0). The final subtask involved teaching the agent to transfer the remaining kinetic energy into distance, which maximizes flight efficiency. 
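To make that steady state concrete: for the illustrative model sketched earlier (with u = 0), the velocity-angle pairing follows directly from setting both derivatives to zero; the project’s actual equilibrium depends on its exact equations.

    % Steady glide for the sketched model: dv/dt = 0 gives sin(theta) = -R*v^2,
    % and dtheta/dt = 0 gives v^2 = cos(theta), so tan(theta) = -R.
    R         = 0.1;                   % assumed drag coefficient
    thetaStar = -atan(R);              % steady flight-path angle (shallow descent)
    vStar     = sqrt(cos(thetaStar));  % matching steady airspeed
    fprintf('Steady glide: v* = %.3f, theta* = %.3f rad\n', vStar, thetaStar);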

After establishing quantifiable goals, we commenced training the agents. The core of this process was the design of a reward function aligned with each goal. This function provided feedback to the agent on its performance, much like the relationship between a coach and a competitor or a parent and a child. Feedback could range from simple binary signals (“good” for success, “bad” for failure) to detailed, step-by-step guidance. Experimenting to find the right balance and fine-tuning the process required significant time and patience, especially since a single training session could take hours to complete and success was not guaranteed.
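The sketch below contrasts the two ends of that feedback spectrum for the first subtask, using the steady-glide values from the previous snippet; both reward functions are hypothetical illustrations, not the ones used in the project.

    % Reuse the illustrative steady-glide target.
    R = 0.1; thetaStar = -atan(R); vStar = sqrt(cos(thetaStar));

    % Sparse feedback: a binary "good" signal only inside a tolerance band.
    sparseReward = @(v, theta) double(abs(v - vStar) < 0.05 && ...
                                      abs(theta - thetaStar) < 0.05);

    % Dense shaping: step-by-step guidance penalizing distance from the target.
    shapedReward = @(v, theta) -((v - vStar)^2 + (theta - thetaStar)^2);

Dense shaping tends to speed up early learning, while sparse signals avoid steering the agent toward a particular path; finding the right balance between the two is exactly the tuning work described above.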

Results 

By dividing the task into two sub-goals and training a specialized agent for each, we were able to overcome the limitations of our initial setup and achieve an optimal flight strategy in practice.

Each agent demonstrated strong performance in its specific task: one quickly reached a steady state, and the other effectively transferred the remaining energy into distance. Together, these agents provided an effective solution to the glider problem, carrying it from theoretical modeling to practical application.

As a next step, we hope to apply these results to the design of energy-efficient flight, where appropriate wing adjustments can reduce fuel consumption.

Beyond technical success, this project also created new teaching opportunities. Recently, I conducted a minicourse for mathematical engineering students in Rennes, where they began work on their own variations of the glider problem as a hands-on project. Students developed unique strategies and applied their insights to the problem, which underscored the value of this research. This project not only contributed to machine learning applications for control systems but also inspired the next generation of mathematicians and engineers to explore the intersection of theory and real-world problem-solving. 

Summary

Challenge: Applying reinforcement learning to optimize glider dynamics required overcoming limited hands-on expertise, working within computational constraints, and defining precise, measurable objectives for the learning agent.

Solution: MATLAB’s user-friendly interface provided a smooth entry point for implementing reinforcement learning, allowing the team to design structured reward functions and train agents effectively. Breaking the problem into subtasks let the agents learn both to achieve steady flight and to transfer the remaining energy into distance efficiently.

Results: 

  • Agents achieved steady flight and maximized energy transfer
  • Reinforcement learning enabled an optimal flight strategy
  • MATLAB facilitated efficient, resource-conscious training
  • Results highlighted the potential for energy-efficient flight design
  • New educational and research opportunities emerged

Learn more

  • Budapest University of Technology and Economics (BME)
    BME traces its roots to the Institutum Geometrico-Hydrotechnicum, established in 1782 as Europe’s first university-based engineering institute. BME focuses on training experts in technology, IT, natural sciences, economics, and management. Its mission extends beyond education to include scientific research, spanning fundamental and applied studies, technological innovation, and the practical application of findings.
    Visit BME page
  • MATLAB Campus-Wide License Page
    Unlock the full potential of your institution’s academic pursuits with the MATLAB Campus-Wide License, providing access to a comprehensive suite of tools for computational analysis, modeling, and data visualization.
    Visit the page
  • Blog: MATLAB AI Chat Playground: A New Era in Generative AI for MATLAB Users
    MATLAB AI Chat Playground is a groundbreaking tool for MATLAB users. In this interactive chat environment, users can experiment with generative AI, get code examples, and explore MATLAB functionalities.
    Read the blog
  • On-Demand Webinar: The Benefits of the MATLAB Campus-Wide License – Focusing on AI Applications
    Watch this video to get acquainted with all the tools and opportunities provided by the MATLAB Campus License, including artificial intelligence applications with MATLAB.
    Watch the webinar
