In the context of the energy transition and the construction of new-type power systems, distribution networks are becoming a critical link in accommodating renewable energy. The large-scale integration of distributed photovoltaic systems and electric vehicles is transforming distribution networks from unidirectional radial structures into bidirectional interactive ones, which creates technical challenges such as node voltage violations, increased network losses, and three-phase voltage unbalance. Traditional mathematical programming methods have high computational complexity, which makes online decision-making difficult to implement. Deep reinforcement learning algorithms, in turn, are limited in their ability to coordinate active management devices that operate on multiple timescales. Operational adjustment strategies tailored to the characteristics of active distribution networks are therefore urgently needed to ensure safe, stable, and efficient operation.
To address these issues, a data-driven, two-stage “day-ahead and intra-day” dynamic optimal power flow dispatch method for three-phase unbalanced active distribution networks is proposed. The method combines the advantages of physical mechanism modeling and data-driven learning to achieve fine-grained coordinated control of multiple types of regulation resources. First, taking the constraints of active management and demand response into account, a dynamic optimal power flow model for three-phase unbalanced active distribution networks is established, with voltage regulation and loss reduction as its objectives. Then, on the hour-level long timescale of day-ahead dispatch, the tap positions of slow discrete regulating devices are determined using mixed-integer second-order cone programming with lift-and-project relaxation. On the minute-level short timescale of intra-day dispatch, the dynamic optimal power flow problem is formulated as a Markov decision process, and an improved proximal policy optimization algorithm based on expert knowledge and a hierarchical reward shaping mechanism is proposed for the online dispatch of fast continuous regulating devices.
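For readers unfamiliar with proximal policy optimization, the following minimal sketch shows the standard clipped surrogate objective that the intra-day stage builds on. It is not the improved algorithm proposed here: the expert-knowledge guidance and the hierarchical reward shaping are not reproduced, and the function name `ppo_clip_objective` and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate objective (a quantity to maximize).

    logp_new / logp_old: log-probabilities of the sampled actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    All names and the default clip range are illustrative assumptions.
    """
    terms = []
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Pessimistic bound: take the smaller of the two surrogate terms.
        terms.append(min(ratio * adv, clipped_ratio * adv))
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # Identical policies give ratio 1, so the objective is the mean advantage.
    print(ppo_clip_objective([-1.0, -2.0], [-1.0, -2.0], [0.5, 1.5]))
```

The clipping removes the incentive to move the policy far from the one that collected the data within a single update, which is one reason this family of algorithms is commonly chosen for online control tasks.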
Simulation analyses are conducted on the IEEE 33-bus and 123-bus three-phase radial distribution networks, and the following conclusions are drawn. (1) In the day-ahead optimization stage, the inter-phase and temporal coupling relationships are taken into account, and the tap positions of slow discrete regulating devices such as capacitor banks and on-load tap-changers are determined by mixed-integer second-order cone programming with lift-and-project relaxation. Both the fitting error and the relaxation deviation degree of the proposed method are less than 10⁻³, indicating improved computational accuracy compared with traditional methods. (2) In the intra-day optimization stage, the dynamic optimal power flow problem is transformed into a Markov decision process, and fast continuous regulating devices such as distributed photovoltaic systems, electric vehicles, energy storage systems, and static var compensators are controlled online by the improved proximal policy optimization algorithm. By comprehensively considering the constraints of active management and demand response, voltage regulation and loss reduction in active distribution networks are optimized jointly, while the three-phase voltage unbalance degree is kept within the national standard limit of 2%. (3) The improved proximal policy optimization algorithm introduces expert knowledge, which strengthens enforcement of the three-phase voltage unbalance constraint and effectively guides the agent to reduce the active power curtailment of distributed photovoltaic systems. It also adopts a hierarchical reward shaping mechanism that dynamically couples the immediate reward with long-term performance indicators, adapting to the multi-timescale dispatch requirements of the day-ahead and intra-day stages.
Overall, the proposed algorithm balances solution efficiency and accuracy and exhibits superior computational performance.
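As an illustration of the 2% limit mentioned above, the sketch below computes a three-phase voltage unbalance degree from phase voltage phasors. It assumes the unbalance degree is defined as the ratio of the negative-sequence to the positive-sequence voltage magnitude, which is the usual symmetrical-component definition. The function names and the sample phasors are hypothetical and are not taken from the case studies.

```python
import cmath
import math

# Fortescue operator a = 1∠120°.
A = cmath.exp(2j * math.pi / 3)


def phasor(magnitude, angle_deg):
    """Build a complex phasor from a magnitude and an angle in degrees."""
    return cmath.rect(magnitude, math.radians(angle_deg))


def voltage_unbalance_percent(va, vb, vc):
    """Negative-to-positive-sequence voltage ratio in percent.

    Assumes an a-b-c phase sequence; this definition is an assumption about
    how the unbalance degree is measured.
    """
    v_pos = (va + A * vb + A * A * vc) / 3  # positive-sequence component
    v_neg = (va + A * A * vb + A * vc) / 3  # negative-sequence component
    return 100.0 * abs(v_neg) / abs(v_pos)


if __name__ == "__main__":
    # Hypothetical per-unit voltages with a slight magnitude unbalance.
    va, vb, vc = phasor(1.00, 0), phasor(0.97, -120), phasor(1.02, 120)
    vuf = voltage_unbalance_percent(va, vb, vc)
    print(f"unbalance = {vuf:.2f}%, within 2% limit: {vuf <= 2.0}")
```

A check of this kind could serve as a constraint indicator in the reward or as a post-hoc validation of dispatch results; how the proposed method actually uses the unbalance degree is not specified beyond the description above.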