Coordinated Active Power-Frequency Control Based on Safe Deep Reinforcement Learning

doi:10.16183/j.cnki.jsjtu.2022.358

Abstract

The continuous increase in renewable energy penetration poses a severe challenge to the frequency control of interconnected power grids. Because the conventional automatic generation control (AGC) strategy does not consider the power flow constraints of the network, the traditional approach is to adjust generator power by trial and error based on expert knowledge and experience, which is time-consuming. The optimal power flow-based AGC optimization model suffers from long solution times and convergence issues due to its non-convexity and large scale. Deep reinforcement learning has the advantage of “offline training and online end-to-end strategy formation”, but it cannot by itself ensure the security of artificial intelligence (AI) in power grid applications. A coordinated optimal control method for active power and frequency is therefore proposed based on safe deep reinforcement learning. First, the frequency control problem is modeled as a constrained Markov decision process, and an agent is designed that takes various safety constraints into account. Then, the agent is trained through continuous interaction with the grid, using the East China Power Grid as the example system. Finally, the performance of the agent is compared with that of the conventional AGC strategy. The results show that the proposed approach can quickly generate control strategies under various operating conditions and can assist dispatchers in making decisions online.

Key words: coordinated power and frequency control, artificial intelligence (AI), safe deep reinforcement learning, constrained Markov decision process, agent

CLC Number:

TM711

ZHOU Yi, ZHOU Liangcai, SHI Di, ZHAO Xiaoying, SHAN Xin. Coordinated Active Power-Frequency Control Based on Safe Deep Reinforcement Learning[J]. Journal of Shanghai Jiao Tong University, 2024, 58(5): 682-692.

No.	Total samples	Samples with transmission section power flow violations	Success rate/%	Average decision time/ms
1	20 000	0	100	15.178
2	20 000	0	100	16.290
3	20 000	68	99.66	19.703
4	20 000	86	99.57	16.842
5	20 000	16	99.92	17.633

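Agent	Samples with no section violation	Samples with 1 section violation	Samples with 2 section violations	Samples with 3 section violations
Baseline scenario	0	17 213	2 434	353
Agent 1	20 000	0	0	0
Agent 2	20 000	0	0	0
Agent 3	19 932	68	0	0
Agent 4	19 914	74	11	1
Agent 5	19 984	15	1	0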
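
The two tables above are consistent with each other. For each agent, the samples with one, two, or three violated sections add up to the violating-sample count in the first table, and the success rate appears to be the share of samples with no section violation. The following is a minimal Python sketch of that arithmetic, assuming that reading of the success rate and that the numbered rows of the first table correspond to Agents 1 to 5. The numbers are copied from the tables; the names `violations`, `breakdown`, and `success_rate` are illustrative and do not come from the paper.

```python
TOTAL_SAMPLES = 20_000

# Violating-sample count per agent, copied from the first table.
violations = {1: 0, 2: 0, 3: 68, 4: 86, 5: 16}

# Samples with 1, 2, and 3 violated sections per agent, copied from the second table.
breakdown = {
    1: (0, 0, 0),
    2: (0, 0, 0),
    3: (68, 0, 0),
    4: (74, 11, 1),
    5: (15, 1, 0),
}


def success_rate(total: int, violating: int) -> float:
    """Percentage of samples with no section violation (assumed definition)."""
    return 100.0 * (total - violating) / total


for agent, count in violations.items():
    # The per-severity breakdown should add up to the violating-sample count.
    assert sum(breakdown[agent]) == count
    print(f"Agent {agent}: success rate {success_rate(TOTAL_SAMPLES, count):.2f}%")
```

Under these assumptions, the computed rates round to the values listed in the success rate column of the first table.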