Journal of Shanghai Jiao Tong University ›› 2024, Vol. 58 ›› Issue (5): 682-692. DOI: 10.16183/j.cnki.jsjtu.2022.358
ZHOU Yi1, ZHOU Liangcai1, SHI Di2, ZHAO Xiaoying2, SHAN Xin3
Corresponding author: ZHOU Liangcai, Senior Engineer; E-mail:
About the author: ZHOU Yi (1982-), Senior Engineer, whose research interests include power grid dispatching and power system automation.
Supported by:
Received: 2022-09-13
Revised: 2023-02-15
Accepted: 2023-02-24
Online: 2024-05-28
Published: 2024-06-17
Abstract: The increasing penetration of renewable energy poses a severe challenge to frequency control in interconnected power grids. Conventional automatic generation control (AGC) strategies do not consider the security constraints of grid power flow, so traditional practice adjusts generator outputs by trial and error based on expert knowledge and experience, which is time-consuming. AGC optimization models for interconnected grids based on optimal power flow are non-convex and large-scale, so they take a long time to solve and suffer from convergence problems. Conventional deep reinforcement learning offers the advantage of "offline training, online end-to-end policy generation", but it cannot guarantee system security during action exploration. Therefore, a coordinated active power-frequency optimization control method based on safe deep reinforcement learning is proposed. First, grid frequency control is modeled as a constrained Markov decision process, and the agent is designed by adding the relevant security constraints to the decision process. Then, the agent is trained and its performance improved on a practical case of the East China power grid. Finally, the agent's decisions are compared with those of the conventional AGC strategy. The results show that the proposed method can quickly generate active power-frequency control strategies under various operating conditions, guarantees grid security during frequency recovery, and can assist dispatchers in online decision making.
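The formulation outlined in the abstract can be illustrated compactly. The Python sketch below is a minimal, hypothetical example of the constrained-MDP idea: the agent redispatches generator active power to restore frequency (reward), while a separate cost channel measures line-flow security violations that are handled with a Lagrangian penalty. Every name and number here (FrequencyCMDPEnv, the two-unit linearized frequency response, the power-transfer factors, regulation prices, and the line limit) is an illustrative assumption, not taken from the paper, which trains a safe deep reinforcement learning agent on the actual East China grid model.

```python
import numpy as np


class FrequencyCMDPEnv:
    """Toy constrained MDP for coordinated active power-frequency control.

    State : [frequency deviation df (Hz), remaining power imbalance dP (p.u.)]
    Action: active-power adjustments [a1, a2] of two generators (p.u.)
    Reward: negative squared frequency deviation minus a regulation cost
    Cost  : overload of one monitored line above its limit (safety signal)
    """

    def __init__(self, line_limit=0.85, seed=0):
        self.rng = np.random.default_rng(seed)
        self.line_limit = line_limit
        self.shift = np.array([0.8, 0.1])    # power-transfer factors onto the monitored line
        self.price = np.array([0.2, 1.0])    # regulation price: unit 1 is cheaper than unit 2
        self.reset()

    def reset(self):
        self.dP = -self.rng.uniform(0.3, 0.6)    # generation deficit after a disturbance
        self.df = 0.1 * self.dP                  # linearized frequency response
        self.flow = self.rng.uniform(0.6, 0.75)  # pre-disturbance flow on the monitored line
        return np.array([self.df, self.dP])

    def step(self, action):
        action = np.asarray(action, dtype=float)
        self.dP += action.sum()                  # both units reduce the imbalance
        self.df = 0.1 * self.dP
        self.flow += float(self.shift @ action)  # redispatch shifts the line flow
        reward = -self.df ** 2 - 0.05 * float(self.price @ np.abs(action))
        cost = max(0.0, abs(self.flow) - self.line_limit)
        done = abs(self.df) < 1e-3
        return np.array([self.df, self.dP]), reward, cost, done


def run_episode(env, alpha, lam, horizon=10):
    """Policy: cover half of the remaining imbalance each step, assigning a
    share alpha to unit 1 and (1 - alpha) to unit 2. Returns the Lagrangian
    return (reward - lam * cost) and the accumulated constraint cost."""
    state = env.reset()
    total_r, total_c = 0.0, 0.0
    for _ in range(horizon):
        needed = -0.5 * state[1]
        state, r, c, done = env.step([alpha * needed, (1.0 - alpha) * needed])
        total_r += r
        total_c += c
        if done:
            break
    return total_r - lam * total_c, total_c


# Primal-dual search: pick the best dispatch share under the current penalty,
# then raise the multiplier while the line-flow constraint is still violated
# (zero-violation budget, so lambda only grows when violations persist).
env = FrequencyCMDPEnv()
lam, alphas = 0.0, np.linspace(0.0, 1.0, 11)
for it in range(6):
    scores = [np.mean([run_episode(env, a, lam)[0] for _ in range(30)]) for a in alphas]
    best = alphas[int(np.argmax(scores))]
    mean_cost = np.mean([run_episode(env, best, lam)[1] for _ in range(30)])
    lam += 10.0 * mean_cost   # dual ascent on the safety constraint
    print(f"iter {it}: unit-1 share={best:.1f}, mean violation={mean_cost:.3f}, lambda={lam:.1f}")
```

Keeping the security violation as a separate cost channel, rather than folding it into the reward, is the essence of the constrained formulation: without the penalty the cheaper, heavily coupled unit is preferred and overloads the line, and raising the multiplier steers the dispatch toward restoring frequency without violating line security.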
CLC number:
ZHOU Yi, ZHOU Liangcai, SHI Di, ZHAO Xiaoying, SHAN Xin. Coordinated Active Power-Frequency Control Based on Safe Deep Reinforcement Learning[J]. Journal of Shanghai Jiao Tong University, 2024, 58(5): 682-692.