PROJECT DETAIL · 2025

Deep Reinforcement Learning for Airfoil Pitching Moment Control

Manuscript under review at Computers & Fluids on PPO-based deep reinforcement learning (DRL) control of a NACA0012 airfoil (Re_c = 3000, α = 10°) for quarter-chord pitching-moment trim using CFD-in-the-loop active flow control.

  • Current Focus
  • DRL
  • Active Flow Control
  • CFD
  • HPC

Problem Statement

Closed-loop aerodynamic trim in separated, unsteady flow is difficult because the pitching-moment dynamics are strongly coupled to vortex shedding, and the reward must encode both mean trim and low unsteadiness.

Why It Matters

This study demonstrates a reproducible DRL-CFD path for pitching-moment control using fluidic actuation, relevant to AFC-enabled control authority without conventional moving surfaces.

Methods / Numerical + ML Setup

  • Simulated a NACA0012 airfoil at Re_c = 3000 and α = 10° in PyFR using a high-order FR/DG discretization of the 2D compressible Navier–Stokes equations.
  • Used two AFC actuator families: wall-normal blowing/suction (including a zero-net-mass-flux, ZNMF, variant) and near-wall tangential Coanda-type blowing.
  • Trained PPO policies with TorchRL (and compatible Stable-Baselines3 workflows) using 156 velocity probes as feedback state.
  • Compared reward formulations based on moving-average C_m over windows of length T_a, T_p, and 2T_p, plus an added standard-deviation penalty term.
  • Executed training on 4 parallel CFD environments with GPU-backed runs and deterministic post-training policy evaluation.
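The moving-average reward comparison above can be sketched as follows; the window length (in control steps) and penalty weight are illustrative placeholders, not the manuscript's tuned values:

```python
from collections import deque

import numpy as np


def make_moving_average_reward(window_steps, std_weight=0.0):
    """Build a reward callable over a moving window of C_m samples.

    The mean term drives the windowed average pitching moment toward
    zero (trim); the optional standard-deviation term penalises
    instantaneous oscillations within the same window.
    """
    history = deque(maxlen=window_steps)

    def reward(cm_now):
        history.append(cm_now)
        cm = np.asarray(history)
        mean_term = -abs(cm.mean())            # reward mean trim
        std_term = -std_weight * cm.std()      # penalise unsteadiness
        return mean_term + std_term

    return reward
```

With `std_weight=0`, a policy that oscillates symmetrically around trim scores as well as a steady one; the penalty term breaks that tie, matching the role of the standard-deviation term described above.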

Key Results

  • Learned policies consistently drove the mean quarter-chord pitching moment toward trim (|⟨C_m⟩| ~ O(10⁻³)).
  • Long averaging windows achieved mean trim but allowed larger instantaneous C_m oscillations.
  • Adding a variance penalty improved instantaneous trim behavior while preserving control authority.
  • Selected policies also increased mean lift and reduced mean drag relative to baseline uncontrolled flow.
  • Coanda-type blowing achieved comparable trim with lower actuation mass-flux demand than wall-normal blowing/suction setups.
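The window-length trade-off noted above (mean trim achieved while instantaneous oscillations persist) can be illustrated by separating the two effects in a C_m time history; the synthetic signal below is illustrative only, not data from the study:

```python
import numpy as np


def trim_metrics(cm_series):
    """Summarise a C_m time history: magnitude of the mean
    (trim quality) and standard deviation (unsteadiness)."""
    cm = np.asarray(cm_series)
    return abs(cm.mean()), cm.std()


# A signal can trim on average yet still oscillate strongly:
t = np.linspace(0.0, 10.0, 1000)
cm = 0.001 + 0.05 * np.sin(2.0 * np.pi * t)  # near-zero mean, large swings
mean_mag, unsteadiness = trim_metrics(cm)
```

Here `mean_mag` is at the O(10⁻³) trim level while the standard deviation remains an order of magnitude larger, which is exactly the behaviour a mean-only reward cannot penalise.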

Toolchain

  • PyFR
  • TorchRL
  • Stable-Baselines3
  • PPO actor-critic networks
  • Python
  • Slurm
  • GPU HPC (NVIDIA V100)

Challenges and Lessons

  • Balancing sample efficiency against expensive CFD rollouts and long training horizons.
  • Designing rewards that enforce both mean trim and low instantaneous unsteadiness.
  • Starting from idealized dense velocity-probe sensing assumptions before moving to sparse, pressure-only observations.

Future Work

  • Shift from velocity-field probes to sparse pressure sensing with POMDP-oriented state design.
  • Extend to 3D and higher-Reynolds-number regimes for stronger physical realism.
  • Advance toward wind-tunnel-in-the-loop DRL and multi-objective rewards (trim, drag/lift, actuation cost).

Prasanna Thoguluva Rajendran · Ph.D. Student in Aerospace Engineering · University of Arizona, Tucson, AZ