New Power Systems and Integrated Energy

Coordinated Active Power-Frequency Control Based on Safe Deep Reinforcement Learning

  • 1. State Grid East China Branch, Shanghai 200002, China
    2. AINERGY, Santa Clara 95051, USA
    3. NARI Technology Development Co., Ltd., Nanjing 210024, China
ZHOU Yi (1982-), senior engineer; his research interests include power grid dispatching and power system automation.
ZHOU Liangcai, senior engineer. E-mail: liangcaizhou@163.com.

Received date: 2022-09-13

  Revised date: 2023-02-15

  Accepted date: 2023-02-24

  Online published: 2023-03-28

Funding

Science and Technology Project of East China Branch of State Grid Corporation of China (SGHD0000DKJS2100235)

Abstract

The continuous increase in renewable energy penetration poses a severe challenge to the frequency control of interconnected power grids. Because the conventional automatic generation control (AGC) strategy does not consider the power flow security constraints of the network, dispatchers traditionally make tentative generator power adjustments based on expert knowledge and experience, which is time-consuming. AGC optimization models based on optimal power flow suffer from long solution times and convergence problems owing to their non-convexity and large scale. Conventional deep reinforcement learning offers the advantage of "offline training, online end-to-end policy generation", but it cannot guarantee system security during action exploration. Therefore, a coordinated active power-frequency optimal control method based on safe deep reinforcement learning is proposed. First, the frequency control problem is modeled as a constrained Markov decision process, and the agent is designed with the relevant safety constraints added to the decision process. Then, the agent is trained and its performance improved on a real-system case of the East China Power Grid. Finally, the decisions of the agent are compared with the conventional AGC strategy. The results show that the proposed method can quickly generate active power-frequency control strategies under various operating conditions, ensures grid security during frequency restoration, and can assist dispatchers in online decision-making.
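As a purely illustrative aside (not part of the published paper), the constrained Markov decision process mentioned above is commonly handled by Lagrangian relaxation: the agent maximizes the expected frequency-restoration reward minus a multiplier-weighted security-violation cost, while the multiplier is raised by dual ascent whenever the constraint is violated. The Python sketch below demonstrates that general recipe on an assumed toy one-area system; the environment dynamics, the linear-Gaussian policy, and all numeric parameters are hypothetical assumptions and do not reproduce the authors' agent, which is designed and trained on an actual East China Power Grid case.

# Illustrative sketch only: a toy constrained MDP (CMDP) for frequency control,
# solved with a Lagrangian-relaxed REINFORCE update and dual ascent on the multiplier.
# This is NOT the paper's safe-DRL agent; all dynamics and parameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)
FLOW_LIMIT = 0.8                       # assumed line-flow security limit (p.u.)

def env_reset():
    # state = [frequency deviation (Hz), monitored line flow (p.u.)]
    return rng.uniform(-0.2, 0.2, size=2)

def env_step(state, dp):
    # dp: generator re-dispatch chosen by the agent (p.u.)
    df, flow = state
    df = df + dp - 0.1 * df            # crude proxy for frequency response to re-dispatch
    flow = flow + 0.5 * dp             # the same re-dispatch also loads the monitored line
    reward = -abs(df)                  # objective: drive the frequency deviation to zero
    cost = max(0.0, abs(flow) - FLOW_LIMIT)   # safety cost: amount of flow-limit violation
    return np.array([df, flow]), reward, cost

theta = np.zeros(2)                    # linear-Gaussian policy: action mean = theta . state
SIGMA, LR_THETA, LR_LAMBDA, HORIZON = 0.05, 1e-2, 1e-2, 20
lam = 0.0                              # Lagrange multiplier on the cumulative safety cost

for episode in range(2000):
    state = env_reset()
    grad_logp, rewards, costs = [], [], []
    for _ in range(HORIZON):
        mean = float(theta @ state)
        action = mean + SIGMA * rng.standard_normal()
        grad_logp.append((action - mean) / SIGMA**2 * state)   # d log pi / d theta
        state, r, c = env_step(state, action)
        rewards.append(r)
        costs.append(c)
    # scalarized CMDP return: reward minus lambda-weighted constraint violation
    G = sum(rewards) - lam * sum(costs)
    theta += LR_THETA * G * np.sum(grad_logp, axis=0)          # policy-gradient ascent
    lam = max(0.0, lam + LR_LAMBDA * sum(costs))               # dual ascent: tighten safety

print("learned policy weights:", theta, " multiplier:", lam)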

Cite this article

ZHOU Yi, ZHOU Liangcai, SHI Di, ZHAO Xiaoying, SHAN Xin. Coordinated active power-frequency control based on safe deep reinforcement learning[J]. Journal of Shanghai Jiao Tong University, 2024, 58(5): 682-692. DOI: 10.16183/j.cnki.jsjtu.2022.358
