Semantic Description Method of High-Speed Railway Contact Net Insulator Defects Based on Diffusion Model Detection
Chen Yong1,2, An Zhuoaobo1, Zhou Jianyu1
1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; 2. Engineering Research Center for Artificial Intelligence and Graphics & Image Processing, Lanzhou Jiaotong University, Lanzhou 730070, China
Abstract: The catenary insulator is a critical component of the traction power supply system for high-speed railways. It provides electrical insulation and also supports the catenary arm structure, so its operational safety bears directly on the stability of the entire high-speed railway system. However, the complex and dynamic railway environment introduces various interferences into insulator defect detection, resulting in low detection accuracy. Moreover, traditional detection methods generally identify only the presence of defects and provide no specific semantic description of them, which hampers the efficiency of fault diagnosis and maintenance. To address these challenges, this paper proposes a defect description method for insulators based on a diffusion model. The method improves on existing detection techniques so that the model both detects insulator defects more accurately and generates detailed textual descriptions of them. First, a large-kernel spatial selection feature extraction network is designed. Compared with traditional feature extraction networks, it captures insulator defect features through larger spatial convolution kernels, which strengthens feature extraction and allows potential defects to be identified accurately even against complex backgrounds. Second, a detection decoder with a fusion diffusion mechanism is proposed on the basis of the diffusion model. The decoder generates noise boxes and applies inverse Bayesian diffusion to them to recover predictions of the insulator's true bounding box, improving the model's resistance to background interference. 
This innovation allows the model to isolate background noise more effectively in complex environments, thereby improving the accuracy of defect detection. Finally, to address the limitations of traditional detection models in semantic description, an encoder and decoder based on a cross-attention mechanism are designed to achieve cross-modal mapping between images and text. Using a BLIP model driven by a text filtering mechanism, the model generates textual descriptions of the defects from the detection results. This capability gives maintenance personnel a more intuitive reference and improves the efficiency of fault handling. Experimental results validate the effectiveness of the method. The proposed insulator defect detection model achieves an mAP@0.5 of 93.04%, with AR and F1-score reaching 83.22% and 82.91%, respectively. For the generated descriptions, BLEU reaches 83.51%, CIDEr 1.94, ROUGE-L 81.59%, METEOR 51.50%, and SPICE 37.88%. The experimental results lead to the following conclusions: (1) Using a large-kernel spatial selection feature extraction network as the image encoder enhances the detection network's ability to focus on key features, thereby improving detection accuracy. (2) Because insulator defect detection is easily disturbed by complex backgrounds, a detection decoder with a fusion diffusion mechanism was designed. The decoder performs inverse Bayesian diffusion on the noise boxes it generates, recovering the prediction of the insulator's true bounding box. The resulting resistance to background interference reduces the loss of semantic information related to insulator defects and improves the accuracy of the predicted bounding boxes. (3) A cross-modal mapping module was designed to map the relationship between insulator image defect features and text features. 
The language modeling encoder outputs a textual description of the insulator defects, completing the detection task. The proposed model therefore offers higher detection accuracy and also generates accurate, detailed semantic descriptions of the defects, meeting the practical requirements of insulator defect detection and description.
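As an illustration of the noise-box idea described in the abstract, the following minimal Python sketch perturbs a ground-truth box with Gaussian noise using the standard DDPM forward process and then inverts that step when the noise is known. It is not the paper's decoder: the linear beta schedule, the normalized (cx, cy, w, h) box encoding, and the function names `linear_alpha_bars`, `add_box_noise`, and `recover_box` are assumptions made for illustration. In a trained detector the noise (or the clean box) would be estimated by a network rather than supplied.

```python
import math
import random


def linear_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars = []
    running = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_box_noise(box, t, alpha_bars, rng):
    """Forward step: x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * eps, per coordinate.

    Returns the noisy box and the noise that was drawn.
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in box]
    noisy = [math.sqrt(a) * v + math.sqrt(1.0 - a) * e for v, e in zip(box, eps)]
    return noisy, eps


def recover_box(noisy_box, eps_hat, t, alpha_bars):
    """Invert the forward step given a noise estimate eps_hat.

    With the true noise this returns the original box; a detector would
    substitute a learned estimate here.
    """
    a = alpha_bars[t]
    return [
        (x - math.sqrt(1.0 - a) * e) / math.sqrt(a)
        for x, e in zip(noisy_box, eps_hat)
    ]


# Hypothetical example: one insulator box in normalized (cx, cy, w, h) form.
rng = random.Random(0)
alpha_bars = linear_alpha_bars(1000)
true_box = [0.52, 0.47, 0.10, 0.35]
noisy_box, eps = add_box_noise(true_box, 500, alpha_bars, rng)
restored_box = recover_box(noisy_box, eps, 500, alpha_bars)
```

Because the sketch feeds the true noise back into `recover_box`, `restored_box` matches `true_box` up to floating-point error; the quality of a real detector depends entirely on how well the network estimates that noise from image features.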
陈永, 安卓奥博, 周建宇. 基于扩散模型检测的高铁接触网绝缘子缺陷语义描述方法[J]. 电工技术学报, 2025, 40(13): 4100-4111.
Chen Yong, An Zhuoaobo, Zhou Jianyu. Semantic Description Method of High-Speed Railway Contact Net Insulator Defects Based on Diffusion Model Detection. Transactions of China Electrotechnical Society, 2025, 40(13): 4100-4111.