Journal of Shanghai Jiao Tong University ›› 2024, Vol. 58 ›› Issue (5): 682-692. doi: 10.16183/j.cnki.jsjtu.2022.358
Received: 2022-09-13
Revised: 2023-02-15
Accepted: 2023-02-24
Online: 2024-05-28
Published: 2024-06-17
Corresponding author: ZHOU Liangcai, Senior Engineer.
About the author: ZHOU Yi (b. 1982), Senior Engineer; research interests include power grid dispatching and power system automation.
ZHOU Yi1, ZHOU Liangcai1, SHI Di2, ZHAO Xiaoying2, SHAN Xin3
Abstract: The growing share of renewable energy poses a severe challenge to frequency control in interconnected power grids. Conventional automatic generation control (AGC) strategies do not account for power-flow security constraints, so traditional practice adjusts generator outputs by trial and error based on expert knowledge and experience, which is time-consuming; AGC optimization models based on optimal power flow, meanwhile, suffer from long solution times and convergence problems owing to their non-convexity and large scale. Conventional deep reinforcement learning offers the advantage of "offline training, online end-to-end policy generation" but cannot guarantee system security during action exploration. Therefore, a coordinated active power-frequency control method based on safe deep reinforcement learning is proposed. First, grid frequency control is modeled as a constrained Markov decision process (CMDP), and the agent is designed by adding the relevant safety constraints to the decision process. Then, the agent is trained and its performance is improved on a real-system case of the East China power grid. Finally, the agent's decisions are compared with those of the conventional AGC strategy. The results show that the proposed method can rapidly generate active power-frequency control strategies under a variety of operating conditions, guarantees grid security throughout the frequency-recovery process, and can assist dispatchers in online decision-making.
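The safety-constraint idea described in the abstract — checking a proposed control action against power-flow limits before applying it — can be sketched as a simple action-projection step. This is a toy illustration only, not the paper's model: the single-line sensitivity factor, the limits, and the function name are all hypothetical.

```python
# Toy sketch of a CMDP-style safety layer: before applying an RL agent's
# generator adjustment, project it so the post-action line flow stays
# within its limit. All numbers and names below are illustrative.

def project_action(delta_p, flow_sensitivity, flow_now, flow_limit):
    """Scale a proposed generator adjustment (MW) so that the resulting
    line flow (flow_now + sensitivity * delta_p) stays within flow_limit."""
    flow_after = flow_now + flow_sensitivity * delta_p
    if abs(flow_after) <= flow_limit:
        return delta_p  # action is already safe
    # Shrink the adjustment so the flow lands exactly on the limit boundary.
    allowed = (flow_limit if flow_after > 0 else -flow_limit) - flow_now
    return allowed / flow_sensitivity

# A proposed +100 MW adjustment would push the line from 180 MW to 230 MW,
# exceeding the 200 MW limit, so the safety layer trims it.
safe = project_action(delta_p=100.0, flow_sensitivity=0.5,
                      flow_now=180.0, flow_limit=200.0)
print(safe)  # 40.0
```

In a full safe-RL formulation the constraint is enforced during training (e.g., via a constrained policy update) rather than by post-hoc clipping, but the projection above conveys why constrained exploration keeps the grid secure while the policy learns.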
ZHOU Yi, ZHOU Liangcai, SHI Di, ZHAO Xiaoying, SHAN Xin. Coordinated Active Power-Frequency Control Based on Safe Deep Reinforcement Learning[J]. Journal of Shanghai Jiao Tong University, 2024, 58(5): 682-692.