Automatic Generation Control Based on the Taylor Twin Delayed Deep Deterministic Policy Gradient Algorithm
Xi Lei1,2, Wang Wentao1, Quan Yue3, Liu Zhihong4, Ren Jianyu1
1. College of Electrical Engineering and New Energy, China Three Gorges University, Yichang 443002, China;
2. Hubei Provincial Key Laboratory for Operation and Control of Cascaded Hydropower Station, China Three Gorges University, Yichang 443002, China;
3. School of Electrical Engineering, Beijing Jiaotong University, Beijing 400030, China;
4. State Key Laboratory of Power Transmission Equipment Technology, Chongqing University, Chongqing 400030, China
The share of clean energy in modern power systems is steadily increasing. Because clean energy sources are highly random and intermittent, their large-scale integration introduces strong stochastic disturbances that degrade the control performance and frequency stability of power systems. This paper addresses that degradation from the perspective of Automatic Generation Control. The Automatic Generation Control methods used in engineering practice today are mainly centralized. Because a centralized controller prioritizes the performance of its own region, coordinated control across different areas is difficult to achieve. Communication delays and geographical distance further limit the coordination and consistency of centralized control.
New power systems built on a distributed model are divided into interconnected subsystems. Load Frequency Control regulates the output of each generator so that the areas of the grid operate in coordination. Reinforcement learning algorithms based on Markov decision processes are well suited to stochastic problems and to multi-area coordinated control, so they are increasingly applied to Automatic Generation Control to pursue optimal control performance and frequency stability in multi-area grids. However, reinforcement learning algorithms often overestimate action values, which can lead to larger frequency deviations in power systems.
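The overestimation of action values can be illustrated with a small numerical experiment. The sketch below is a toy example, not the controller studied in this paper: every action has a true value of zero, the value estimates carry zero-mean Gaussian noise, and the helper name `mean_greedy_value` and all parameter values are assumptions made for illustration. Taking the maximum over noisy estimates yields a positive average even though no action is actually better than the others. Evaluating the chosen action with the smaller of two independent estimates, in the spirit of the twin critics of TD3, removes that upward bias.

```python
import random


def mean_greedy_value(n_actions, noise_std, n_trials, seed=0, use_twin_min=False):
    """Average estimated value of the greedy action when every true value is 0.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # First noisy estimate of each action value (true value is 0).
        q1 = [rng.gauss(0.0, noise_std) for _ in range(n_actions)]
        best = max(range(n_actions), key=q1.__getitem__)
        if use_twin_min:
            # Second, independent estimate; keep the smaller of the two
            # values for the action selected by the first estimate.
            q2 = [rng.gauss(0.0, noise_std) for _ in range(n_actions)]
            total += min(q1[best], q2[best])
        else:
            total += q1[best]
    return total / n_trials


single = mean_greedy_value(n_actions=5, noise_std=1.0, n_trials=20000)
twin = mean_greedy_value(n_actions=5, noise_std=1.0, n_trials=20000, use_twin_min=True)
print(f"single estimate: {single:.3f}, twin minimum: {twin:.3f}, true value: 0.000")
```

The single-estimate average is clearly above the true value of zero, while the twin minimum is lower. The twin minimum can undershoot the true value, which is one reason further corrections to the value update are of interest.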
To address this issue in distributed power systems, we propose the Taylor twin delayed deep deterministic policy gradient (TaTD3) algorithm to obtain the optimal multi-area coordinated solution and thereby improve control performance and frequency stability in power systems with large-scale clean energy integration. The proposed algorithm updates the value network using a Taylor series expansion, which mitigates the overestimation of action values common in reinforcement learning. This improves the control accuracy of the algorithm and, in turn, the frequency stability of the power system. In addition, a prioritized experience replay strategy replaces random sampling of the training data. The strategy assigns lower priority to samples corrupted by random disturbances and noise, which tend to impair learning, and higher priority to samples with greater learning potential. This makes the optimization more accurate and reduces the impact of random disturbances on control performance.
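The replay priority described above can be sketched as follows. The abstract does not give the priority formula, so this sketch assumes a reducible-loss style rule, which the "ReLo" suffix of the algorithm name suggests: a sample's priority is the amount by which the loss under the online network exceeds the loss under the target network. The function names, the floor `eps`, and the example loss values are all hypothetical.

```python
import random


def reducible_loss_priorities(online_losses, target_losses, eps=1e-3):
    """Priority = loss the online network can still remove, floored at eps.

    Assumed rule: max(online_loss - target_loss, 0) + eps. A sample that is
    hard for both networks (for example, one dominated by noise) gets a small
    priority, as does a sample that has already been learned.
    """
    return [max(lo - lt, 0.0) + eps for lo, lt in zip(online_losses, target_losses)]


def sample_indices(priorities, batch_size, seed=0):
    """Draw replay indices with probability proportional to priority."""
    rng = random.Random(seed)
    return rng.choices(range(len(priorities)), weights=priorities, k=batch_size)


# Three stored transitions (illustrative loss values, not measured data):
#   0: high online loss, low target loss  -> learnable, high priority
#   1: high loss under both networks      -> likely noise, low priority
#   2: low loss under both networks       -> already learned, lowest priority
online_losses = [0.90, 0.90, 0.10]
target_losses = [0.10, 0.85, 0.10]

priorities = reducible_loss_priorities(online_losses, target_losses)
batch = sample_indices(priorities, batch_size=1000)
counts = [batch.count(i) for i in range(len(priorities))]
print("priorities:", priorities)
print("draw counts:", counts)
```

With these values most draws fall on transition 0, so training concentrates on the sample that still has something to teach. The floor `eps` keeps every transition reachable, a design choice made here to avoid starving samples whose priority would otherwise be zero.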
To validate the proposed algorithm, we built an improved IEEE standard two-area Load Frequency Control model and a three-area interconnected Load Frequency Control model that integrates wind, solar, hydro, thermal, and storage resources. Simulations were run with step disturbances, random square-wave disturbances, and other load variations. The control performance of the proposed TaTD3-ReLo algorithm was compared with that of the TaTD3, TD3, DDPG, and DQN algorithms under different operating conditions. The simulation results show that the TaTD3-ReLo algorithm is robust and learns effectively. Compared with the other reinforcement learning algorithms, it delivers better control performance and more stable frequency responses. It also enables effective coordination among distributed multi-area interconnected grids in the new power system model, addressing the decline in control performance and frequency stability caused by the large-scale integration of clean energy.
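The two disturbance types named above can be generated with a few lines of code. This is a minimal sketch: the function names, the per-unit (p.u.) scaling, the hold period, and the amplitudes are assumptions for illustration and are not the test settings of the paper.

```python
import random


def step_disturbance(n_steps, start, magnitude):
    """Load change that is 0 before index `start` and `magnitude` afterwards."""
    return [magnitude if k >= start else 0.0 for k in range(n_steps)]


def random_square_wave(n_steps, period, max_amplitude, seed=0):
    """Piecewise-constant load change.

    A new level is drawn uniformly from [-max_amplitude, max_amplitude]
    every `period` steps and held until the next draw.
    """
    rng = random.Random(seed)
    signal = []
    level = 0.0
    for k in range(n_steps):
        if k % period == 0:
            level = rng.uniform(-max_amplitude, max_amplitude)
        signal.append(level)
    return signal


# Illustrative load disturbances in p.u. (assumed values).
step = step_disturbance(n_steps=5, start=2, magnitude=0.01)
square = random_square_wave(n_steps=12, period=4, max_amplitude=0.05, seed=1)
print("step:  ", step)
print("square:", [round(x, 4) for x in square])
```

Either list can be fed sample by sample as the load change input of a Load Frequency Control simulation. Fixing the seed makes the random square wave reproducible, which allows different controllers to be compared on the same disturbance.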
Xi Lei, Wang Wentao, Quan Yue, Liu Zhihong, Ren Jianyu. Automatic Generation Control Based on the Taylor Twin Delayed Deep Deterministic Policy Gradient Algorithm. Transactions of China Electrotechnical Society, 0, (): 20241673-20241673.