Journal of Shanghai Jiao Tong University ›› 2025, Vol. 59 ›› Issue (3): 400-412. doi: 10.16183/j.cnki.jsjtu.2023.344

• New Type Power System and the Integrated Energy •

Online Steady-State Scheduling of New Power Systems Based on Hierarchical Reinforcement Learning

ZHAO Yingying1,2, QIU Yue3, ZHU Tianchen3, LI Fan1,2, SU Yun1,2, TAI Zhenying3, SUN Qingyun3, FAN Hang4

  1. State Grid Shanghai Municipal Electric Power Company, Shanghai 200125, China
    2. East China Electric Power Test and Research Institute Co., Ltd., Shanghai 200437, China
    3. Beihang University, Beijing 100191, China
    4. School of Economics and Management, North China Electric Power University, Beijing 100096, China
  • Received: 2023-07-24 Revised: 2023-09-26 Accepted: 2023-11-22 Online: 2025-03-28 Published: 2025-04-02

Abstract:

With the construction of new power systems, the stochasticity of a high proportion of renewable energy significantly increases uncertainty in power grid operation, posing severe challenges to safe, stable, and economically efficient operation. Data-driven artificial intelligence methods, such as deep reinforcement learning, are becoming increasingly important for grid regulation and decision support in the new power system. However, current online scheduling algorithms based on deep reinforcement learning still struggle to model the high-dimensional decision space and to optimize scheduling strategies, which leads to low search efficiency and slow convergence. To address this, a novel online steady-state scheduling method for the new power system is proposed based on hierarchical reinforcement learning, which reduces the decision space by adaptively selecting key nodes for adjustment. In addition, a state context-aware module based on gated recurrent units is introduced to model the high-dimensional environmental state, and an optimization model is constructed whose objectives are comprehensive operating costs, energy consumption, and over-limit conditions, subject to various operational constraints. The effectiveness of the proposed algorithm is validated through experiments on three standard test cases: IEEE-118, L2RPN-WCCI-2022, and SG-126.
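
As a rough illustration of how selecting key nodes can shrink the decision space, the following minimal Python sketch splits one scheduling step into two levels: a high-level step that picks a few nodes, and a low-level step that adjusts only those nodes. The function names, the per-node scores, and the top-k selection rule are assumptions made for illustration; the abstract does not specify how the key nodes are scored or chosen.

```python
from typing import List, Sequence


def select_key_nodes(scores: Sequence[float], k: int) -> List[int]:
    """High-level step (hypothetical): return indices of the k highest-scoring nodes.

    In a learned policy the scores would come from a network; here they are
    plain numbers supplied by the caller.
    """
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of nodes")
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def apply_adjustments(
    setpoints: Sequence[float],
    key_nodes: Sequence[int],
    deltas: Sequence[float],
) -> List[float]:
    """Low-level step (hypothetical): change set-points only at the selected nodes."""
    if len(key_nodes) != len(deltas):
        raise ValueError("one delta is required per selected node")
    updated = list(setpoints)
    for node, delta in zip(key_nodes, deltas):
        updated[node] += delta
    return updated


# Toy example with four nodes; all values are made up for illustration.
scores = [0.1, 0.9, 0.3, 0.7]           # assumed per-node importance scores
setpoints = [10.0, 20.0, 30.0, 40.0]    # assumed current set-points

key_nodes = select_key_nodes(scores, k=2)                        # [1, 3]
new_setpoints = apply_adjustments(setpoints, key_nodes, [1.5, -2.0])
print(key_nodes, new_setpoints)  # [1, 3] [10.0, 21.5, 30.0, 38.0]
```

In this sketch the low-level step outputs only k adjustment values instead of one per node, which is the sense in which the decision space is reduced.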

Key words: operation scheduling of power grid, reinforcement learning, hierarchical decision making, state representation

CLC Number: