Double-Layers Stacking Estimation Model for Feeder Statistical Line Loss Rate Based on Tree-Based Ensemble Learning and MoE
Wang Shouxiang1,2, Zhang Bingjie1,2, Zhao Qianyu1,2, Guo Luyang1,2, Zhang Sheng1,2
1. Key Laboratory of Smart Grid of Ministry of Education, Tianjin University, Tianjin 300072, China; 2. Tianjin Key Laboratory of Power System Simulation and Control, Tianjin University, Tianjin 300072, China
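Abstract (translated from the Chinese original): The statistical line loss rate is an important indicator of the economic operation of a power system. However, factors such as abnormal collection of customers' electricity consumption data and interruptions in data transmission can leave the statistical line loss rate abnormal or missing, which seriously hinders lean line loss management and the economical, efficient operation of smart distribution networks. To address the problem of estimating a reasonable value of the feeder statistical line loss rate, this paper proposes a double-layer estimation model for the feeder statistical line loss rate based on ensemble trees and a Mixture of Experts (MoE). First, the maximal information coefficient is used to analyze more effectively the nonlinear relationship between the statistical line loss rate and its related features, and the robust K-Medoids clustering algorithm is used to divide the feeders into fine-grained groups. Then, within the Stacking ensemble learning framework, the feeder statistical line loss rate is estimated in two stages by a two-layer model consisting of base estimators and a meta estimator: a decision tree and various ensemble tree models serve as base estimation models that produce a preliminary estimate of the statistical line loss rate, and the outputs of the base estimation models are fed into the meta estimation model, MoE, for the final estimate. The root mean square error (RMSE) and mean absolute error (MAE) are used to measure how reasonable the estimated statistical line loss rate is. Case studies show that, compared with other models, the proposed double-layer estimation model achieves lower RMSE and MAE and estimates the feeder statistical line loss rate more accurately.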
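The feeder clustering step can be illustrated with a minimal K-Medoids sketch. This is a simplified alternating (assign, then update) variant written for illustration, not necessarily the exact algorithm used in the paper; the function names, the Euclidean distance, the choice of the first k points as initial medoids, and the toy two-dimensional feature vectors are all assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_medoids(points, k, max_iter=100):
    """Simplified alternating K-Medoids; returns (medoid indices, labels)."""
    # Assumption: the first k points are used as initial medoids.
    medoids = list(range(k))
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest medoid.
        labels = [
            min(range(k), key=lambda c: euclidean(p, points[medoids[c]]))
            for p in points
        ]
        # Update step: within each cluster, pick the member with the
        # smallest total distance to the other members.
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            best = min(
                members,
                key=lambda i: sum(euclidean(points[i], points[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, labels


# Hypothetical, already-normalized feature vectors for nine feeders.
points = [
    (1.0, 1.0), (5.0, 5.0), (9.0, 1.0),
    (1.2, 0.8), (5.1, 4.9), (9.2, 1.1),
    (0.9, 1.1), (4.8, 5.2), (8.9, 0.9),
]
medoids, labels = k_medoids(points, k=3)
print("medoid indices:", medoids)
print("cluster labels:", labels)
```

Because each medoid is an actual data point rather than a mean, a single extreme feeder pulls the cluster centre less than it would in K-Means, which is the usual reason the method is described as robust.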
Abstract: Reducing line loss is important for power grids to save energy and achieve carbon neutrality. The statistical line loss rate is an important indicator for the refined management of line loss in the power grid. However, abnormal collection of power consumption data, interruption of data transmission and other factors lead to abnormal or missing statistical line loss rates. At present, the ensemble learning framework has been applied to line loss estimation, but the models used for estimation are all machine learning models, and their estimation accuracy still needs to be improved. To improve the accuracy of statistical line loss rate estimation, a double-layer estimation model for the feeder statistical line loss rate based on tree-based ensemble learning and Mixture of Experts (MoE) is proposed. Firstly, the maximal information coefficient (MIC) is used to analyze the nonlinear relationship between the statistical line loss rate and its correlated features, so as to build a feature set for the statistical line loss rate. Secondly, the feature vector of each feeder is input to the robust K-Medoids clustering algorithm to divide the feeders into fine-grained types. Thirdly, using the Stacking ensemble learning framework, the feeder statistical line loss rate is estimated in two stages by a double-layer model consisting of base estimation and meta estimation. The decision tree, gradient boosting decision tree (GBDT), adaptive boosting (AdaBoost), eXtreme Gradient Boosting (XGBoost), random forest and extremely randomized trees (ExtraTree) are selected as base estimation models for preliminary estimation of the statistical line loss rate, and the outputs of the base estimation models are input into the meta estimation model, MoE, for the final estimation.
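The second (meta) layer can be sketched as a forward pass of a small Mixture of Experts over the base-model outputs. This is only an illustration under stated assumptions: the abstract does not specify the MoE architecture or its training, so the dense softmax gate, the linear experts, the hand-set weights and the six base-model estimates below are all hypothetical. In a typical Stacking setup the base estimates passed to the meta model would be out-of-fold predictions from the trained tree models.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Linear map w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def moe_predict(base_preds, experts, gate):
    """Combine base-model estimates for one feeder with a dense MoE.

    experts and gate are lists of (weights, bias) pairs, one per expert.
    Returns the final estimate and the gate weights.
    """
    gate_weights = softmax([linear(w, b, base_preds) for w, b in gate])
    expert_outputs = [linear(w, b, base_preds) for w, b in experts]
    estimate = sum(g * o for g, o in zip(gate_weights, expert_outputs))
    return estimate, gate_weights


# Hypothetical first-layer estimates of the line loss rate (%), in the order:
# decision tree, GBDT, AdaBoost, XGBoost, random forest, ExtraTree.
base_preds = [3.1, 3.4, 3.0, 3.3, 3.2, 3.5]

# Hypothetical experts: one averages all base models, one averages
# only GBDT and XGBoost.
experts = [
    ([1 / 6] * 6, 0.0),
    ([0.0, 0.5, 0.0, 0.5, 0.0, 0.0], 0.0),
]
# Hypothetical gate parameters (in practice these would be learned).
gate = [
    ([0.1] * 6, 0.0),
    ([0.0] * 6, 1.0),
]

estimate, gate_weights = moe_predict(base_preds, experts, gate)
print("gate weights:", gate_weights)
print("final estimate:", estimate)
```

Because the gate depends on the input, different feeders can lean on different experts, which is what distinguishes an MoE meta model from a fixed weighted average of the base models.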
A comprehensive set of experiments was conducted on a real-world feeder statistical line loss rate dataset. 1) The MIC values between the statistical line loss rate and the theoretical line loss rate, total line length, line power supply, line operation time, rated capacity of the distribution transformer, and operation time of the distribution transformer are 0.948, 0.81, 0.701, 0.672, 0.768 and 0.683, respectively, which demonstrates a high correlation between each feature and the statistical line loss rate. 2) The feature vectors are fed into the K-Medoids algorithm, which divides the feeders into three types. With clustering, the total RMSE and MAE of the statistical line loss rate estimated by the proposed model decrease by 5% and 7%, respectively. 3) Compared with other models, the error distribution of the proposed model is concentrated in the low-error region, and its median and mean values are closer to each other, which means the proposed model has better accuracy and stability. Compared with the best-performing of the other ensemble models, the proposed model reduces the RMSE for the three types of feeders by 4%, 2% and 5%, respectively, and the MAE by 10%, 3% and 9%, respectively. The following conclusions can be drawn from the simulation analysis: (1) The maximal information coefficient verifies the rationality of using the theoretical line loss rate and its related features for feeder clustering and statistical line loss rate estimation. (2) Compared with direct estimation, clustering the feeders with the K-Medoids algorithm improves the estimation accuracy of the statistical line loss rate. (3) Compared with existing ensemble estimation models, the proposed estimation model has lower RMSE and MAE, which means the statistical line loss rate it estimates is more reasonable.
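For reference, the two error metrics and a percentage reduction between two models can be computed as in the short sketch below. The sample values are invented for illustration and are not taken from the paper's dataset, and treating the reported percentages as relative reductions against the comparison model is an assumption.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline, proposed):
    """Fraction by which the proposed error is lower than the baseline error."""
    return (baseline - proposed) / baseline


# Hypothetical line loss rates (%) for three feeders.
y_true = [2.0, 3.0, 5.0]
y_baseline = [3.0, 2.0, 7.0]  # estimates from a comparison model
y_proposed = [2.5, 2.5, 6.0]  # estimates from the proposed model

rmse_cut = relative_reduction(rmse(y_true, y_baseline), rmse(y_true, y_proposed))
mae_cut = relative_reduction(mae(y_true, y_baseline), mae(y_true, y_proposed))
print(f"RMSE reduction: {rmse_cut:.0%}, MAE reduction: {mae_cut:.0%}")
```

RMSE penalizes large individual errors more heavily than MAE does, so reporting both gives a view of typical error and of sensitivity to outlying feeders.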
王守相, 张丙杰, 赵倩宇, 郭陆阳, 张晟. 基于集成树和MoE的馈线统计线损率双层估计模型[J]. 电工技术学报, 0, (): 112-112.
Wang Shouxiang, Zhang Bingjie, Zhao Qianyu, Guo Luyang, Zhang Sheng. Double-Layers Stacking Estimation Model for Feeder Statistical Line Loss Rate Based on Tree-Based Ensemble Learning and MoE. Transactions of China Electrotechnical Society, 0, (): 112-112.