Active Power Correction Control of Power Grid Based on Improved Twin Delayed Deep Deterministic Policy Gradient Algorithm

doi:10.19595/j.cnki.1000-6753.tces.221073


Abstract

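In novel power systems, source-load uncertainty increases the risk of line overload, and traditional active power safety correction methods cannot effectively balance calculation speed and correction effect. To address this, this paper proposes an active power safety correction control method for power grids based on an improved twin delayed deep deterministic policy gradient algorithm. First, subject to the static security constraints of the system, an active power safety correction control model is established with the objectives of minimizing the output adjustment of adjustable components and maximizing the overall operating security of the system. Second, a deep reinforcement learning framework for active power safety correction is constructed, defining a reward function that accounts for the objectives and constraints, observation states that reflect power system operation, adjustment actions that can change the system state, and an agent based on the improved twin delayed deep deterministic policy gradient algorithm. Finally, historical system overload scenarios that consider source-load uncertainty are constructed, and the agent is trained through continuous interaction with the deep reinforcement learning model to obtain good decision-making performance; in online application, the possible future values of source and load are taken into account to quickly obtain the optimal component adjustment scheme and eliminate line overloads. Case studies on the IEEE 39-bus and IEEE 118-bus systems show that the proposed method effectively eliminates line overloads and avoids repeated limit violations within a short time, and that it has clear advantages over traditional methods in calculation speed and correction effect.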


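Keywords: novel power system, active power security correction, deep reinforcement learning, improved twin delayed deep deterministic policy gradient, optimal adjustment scheme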

Abstract：

With the construction and development of the novel power system, the probability of line overload caused by component faults or source-load fluctuations has increased significantly. If the system is not corrected in a timely and effective manner, cascading faults may propagate faster and farther and lead to a blackout. Timely and effective safety correction measures that eliminate power flow limit violations are therefore of great significance for the safe operation of the system.
An active power safety correction control method is proposed based on the twin delayed deep deterministic policy gradient (TD3) algorithm. Firstly, an active power safety correction model is established. One objective is to minimize the sum of the absolute values of the adjustments of the adjustable components, and the other is to maximize the safety of the system.
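As a rough illustration only, the two objectives described above might be written as follows. The symbols ($\Delta P_i$ for the output adjustment of adjustable component $i$, $\mathcal{G}$ for the set of adjustable components, $\mathcal{L}$ for the set of lines, $P_l$ and $P_l^{\max}$ for the active power flow and limit of line $l$) and the choice of the maximum line load rate as the safety measure are assumptions made for this sketch; the abstract does not state the exact formulation used in the paper.

```latex
\begin{aligned}
\min\; f_1 &= \sum_{i \in \mathcal{G}} \left| \Delta P_i \right| \\
\min\; f_2 &= \max_{l \in \mathcal{L}} \frac{\left| P_l \right|}{P_l^{\max}} \\
\text{s.t.}\quad & \left| P_l \right| \le P_l^{\max}, \quad \forall\, l \in \mathcal{L}
\end{aligned}
```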
Secondly, a deep reinforcement learning framework for active power safety correction is established, as shown in Fig.A1. The state expresses the characteristics of the power system. The action is the output of the adjustable components. The reward function comprises the objective function and constraint conditions of the active power safety correction model. The agent adopts the TD3 algorithm.
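A reward that combines the objective and the constraints could look roughly like the sketch below. The function name `correction_reward`, the weights `adjust_weight` and `overload_penalty`, and the specific penalty form are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import Sequence


def correction_reward(
    adjustments_mw: Sequence[float],
    line_flows_mw: Sequence[float],
    line_limits_mw: Sequence[float],
    adjust_weight: float = 0.01,      # hypothetical weight on total adjustment
    overload_penalty: float = 10.0,   # hypothetical weight on constraint violation
) -> float:
    """Illustrative reward: penalize total adjustment and any line overload."""
    # Objective term: sum of absolute output adjustments of adjustable components.
    total_adjustment = sum(abs(a) for a in adjustments_mw)

    # Constraint term: load rate above 1.0 means the line exceeds its limit.
    load_rates = [abs(f) / lim for f, lim in zip(line_flows_mw, line_limits_mw)]
    overload = sum(max(0.0, rate - 1.0) for rate in load_rates)

    return -adjust_weight * total_adjustment - overload_penalty * overload
```

With these assumed weights, an action that leaves a line at a 120% load rate scores lower than one that leaves it at 90%, and among secure outcomes the smaller total adjustment scores higher.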
Fig.A1 Interaction process between agent and environment
Finally, active power safety correction control is carried out based on the improved TD3 algorithm. Historical overload scenarios are constructed to pre-train the active power safety correction model based on the improved TD3 algorithm. To account for the influence of source-load fluctuation on the correction results, the possible fluctuation of the source and load output is calculated for each operating condition. During online application, the predicted source and load values for the next 5 minutes plus the prediction error values are used as the new energy output and the load value at the current time, and are input into the actor network together with the states of the other system components. After sufficient pre-training, the improved TD3 algorithm can quickly make the optimal decision according to the system state.
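The online input described above can be sketched as follows. The function `build_observation`, its argument names, and the flat-list layout of the observation are assumptions for illustration; the paper's actual state vector is not given in the abstract.

```python
from typing import List, Sequence


def build_observation(
    forecast_mw: Sequence[float],
    prediction_error_mw: Sequence[float],
    other_states: Sequence[float],
) -> List[float]:
    """Illustrative actor input: forecast plus prediction error, then other states."""
    if len(forecast_mw) != len(prediction_error_mw):
        raise ValueError("forecast and prediction error must have the same length")

    # Treat forecast + error as the source/load values at the current time.
    adjusted = [f + e for f, e in zip(forecast_mw, prediction_error_mw)]

    # Append the states of the other system components unchanged.
    return adjusted + list(other_states)
```

For example, a 5-minute forecast of 100 MW and 50 MW with assumed errors of +5 MW and -2 MW would enter the observation as 105 MW and 48 MW.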
An operating state of the IEEE 39-bus system is used to verify the effectiveness and feasibility of the proposed method. In this state, line 23 is suddenly disconnected, which causes line 13 to overload. The correction result is shown in Fig.A2.

Fig.A2 Load rate of each line before and after correction
One hundred groups of source and load prediction error values are selected and added to the predicted values to evaluate the security of the system after correction by the proposed method. For comparison, correction results that do not consider changes in new energy output and load are also obtained. The results show that the proposed method, which considers source-load fluctuation, system uniformity, and the highest line load rate, gives the best correction effect. It can withstand more uncertainty and ensures that the system does not become overloaded again within a short time.
The Deep Deterministic Policy Gradient (DDPG), TD3, and improved TD3 deep reinforcement learning algorithms are trained and tested on the same historical overload scenarios. The results show that the proposed improved TD3 method outperforms the other two algorithms in training time, testing time, and calculation results.
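For context on how TD3 differs from DDPG, the sketch below computes the critic target of the standard TD3 algorithm, using target policy smoothing and the minimum of two target critics. It does not reproduce the paper's improvements, which the abstract does not detail, and it leaves out TD3's delayed actor updates. The function and parameter names are illustrative, and the actor and critics are passed in as plain callables.

```python
import random
from typing import Callable, List, Sequence


def td3_target(
    reward: float,
    done: bool,
    next_state: Sequence[float],
    actor_target: Callable[[Sequence[float]], Sequence[float]],
    critic1_target: Callable[[Sequence[float], Sequence[float]], float],
    critic2_target: Callable[[Sequence[float], Sequence[float]], float],
    gamma: float = 0.99,
    noise_std: float = 0.2,
    noise_clip: float = 0.5,
    action_low: float = -1.0,
    action_high: float = 1.0,
) -> float:
    """Standard TD3 critic target: r + gamma * (1 - done) * min(Q1', Q2')."""
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    next_action: List[float] = []
    for a in actor_target(next_state):
        noise = max(-noise_clip, min(noise_clip, random.gauss(0.0, noise_std)))
        next_action.append(max(action_low, min(action_high, a + noise)))

    # Clipped double Q-learning: take the smaller of the two target critics,
    # which counteracts the overestimation seen with a single critic (DDPG).
    q_min = min(
        critic1_target(next_state, next_action),
        critic2_target(next_state, next_action),
    )
    return reward + gamma * (1.0 - float(done)) * q_min
```

Taking the minimum of two critics is what distinguishes this target from the single-critic DDPG target.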
Compared with the traditional sensitivity method and optimization method, the proposed method has a shorter calculation time but a larger total adjustment amount. The optimization method has the smallest adjustment, but its calculation time is about 10 times that of the sensitivity method. With the proposed method, the number of adjusted components is small and the calculation time is short. The system uniformity after correction is the highest, but the total adjustment amount is slightly greater than that of the optimization method.
In conclusion, the calculation results of the active power safety correction model established by the proposed method are more consistent with the actual operation scenario of the power grid. In addition, compared with the traditional methods, the proposed method has certain advantages and is more suitable for the current novel power system.

Key words: novel power systems, active power security correction, deep reinforcement learning, improved twin delayed deep deterministic policy gradient, optimal adjustment scheme

Received: 2022-06-08

PACS:

TM732

Funding:

Science and Technology Project of State Grid Corporation of China (SGTYHT/17-JS-199)

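Corresponding author: Li Shaoyan, male, born in 1989, associate professor. Research interests: power system security defense and restoration control, and artificial intelligence technology and its applications in power systems. E-mail: shaoyan.li@ncepu.edu.cn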

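About the author: Gu Xueping, male, born in 1964, professor and doctoral supervisor. Research interests: power system security and stability assessment and control, power system security defense and restoration control, and artificial intelligence technology and its applications in power systems. E-mail: xpgu@ncepu.edu.cn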

Cite this article:

Gu Xueping, Liu Tong, Li Shaoyan, Wang Tieqiang, Yang Xiaodong. Active Power Correction Control of Power Grid Based on Improved Twin Delayed Deep Deterministic Policy Gradient Algorithm[J]. Transactions of China Electrotechnical Society, 2023, 38(8): 2162-2177.

Link to this article:

https://dgjsxb.ces-transaction.com/CN/10.19595/j.cnki.1000-6753.tces.221073 https://dgjsxb.ces-transaction.com/CN/Y2023/V38/I8/2162
