Semantic Description Method of High-Speed Railway Contact Net Insulator Defects Based on Diffusion Model Detection

doi:10.19595/j.cnki.1000-6753.tces.241036


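Abstract: High-speed railway catenary insulators are key devices in the traction power supply system. They insulate the electrical components of the catenary and support the cantilever structure, so their safety is critical to high-speed railway operation. Insulator inspection is easily disturbed by complex environmental backgrounds, which lowers defect detection accuracy, and existing methods provide no semantic description of the defects. To address these problems, this paper proposes an insulator defect description method based on diffusion-model detection. First, a large-kernel spatial selection feature extraction network is constructed to strengthen the extraction of insulator defect features. Second, a detection decoder incorporating a diffusion mechanism is designed on the basis of the diffusion model; inverse Bayesian diffusion is applied to the noise boxes generated by the decoder to recover predictions of the insulator ground-truth boxes, improving the model's robustness to background interference. Finally, an encoder and a decoder with a cross-attention mechanism are designed to realize cross-modal mapping between images and text, and a bootstrapping language-image pre-training (BLIP) model driven by a text filtering mechanism outputs the textual description of insulator defects. Experimental results show that the proposed insulator defect detection model reaches a mean average precision (mAP_0.5) of 93.04%, which is 4.63% and 5.78% higher than DETR and Faster RCNN, respectively, with an F1-score of 82.91%. The average bilingual evaluation understudy (BLEU) score and the consensus-based image description evaluation (CIDEr) score reach 83.51% and 1.94, respectively. Compared with other methods, the proposed method achieves higher detection accuracy and more accurate defect semantic descriptions, and can meet the needs of defect detection for high-speed railway insulators.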
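The detection metrics quoted in the abstract can be made concrete with a small sketch. Under the usual convention, mAP_0.5 counts a predicted box as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5, and the F1-score is the harmonic mean of precision and recall. The `(x1, y1, x2, y2)` box format and the function names below are illustrative assumptions, not details taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction matches a ground-truth box when IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For instance, `is_true_positive((0, 0, 10, 10), (0, 0, 10, 8))` holds because the IoU is 0.8, whereas two boxes that overlap on only a third of their union would be rejected at the 0.5 threshold.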


Keywords: high-speed railway catenary, insulator defect detection, defect semantic description, diffusion model, cross-attention mechanism

Abstract: The catenary insulator is a critical component of the traction power supply system for high-speed railways. It not only insulates the electrical components of the catenary but also supports the catenary arm structure, so the operational safety of the insulator is directly related to the stability of the entire high-speed railway system. However, insulator defect detection is subject to interference from the complex and dynamic railway environment, which results in low detection accuracy. Moreover, traditional detection methods generally identify only the presence of defects and fail to provide specific semantic descriptions of them, which significantly hampers the efficiency of fault diagnosis and maintenance. To address these challenges, this paper proposes a defect description method for insulators based on a diffusion model. The method improves on existing detection technologies in several ways, enabling the model not only to detect insulator defects more accurately but also to generate detailed textual descriptions of these defects.
First, we design a large-kernel spatial selection feature extraction network. Compared with traditional feature extraction networks, it captures insulator defect features through larger spatial convolution kernels, which significantly strengthens feature extraction and lets the model identify potential insulator defects accurately even against complex backgrounds. Second, we propose a detection decoder that fuses a diffusion mechanism, built on the diffusion model. The decoder generates noise boxes and applies inverse Bayesian diffusion to restore predictions of the insulator's true bounding box, significantly improving the model's resistance to background interference and therefore the accuracy of defect detection. Finally, to address the inability of traditional detection models to produce semantic descriptions, we design an encoder and a decoder based on a cross-attention mechanism to achieve cross-modal mapping between images and text. Using the BLIP model driven by a text filtering mechanism, the model generates textual descriptions of the defects from the detection results. This functionality gives maintenance personnel a more intuitive reference and improves the efficiency of fault handling. Experimental results validate the method. The proposed insulator defect detection model achieves an mAP_0.5 of 93.04%, with AR and F1-score of 83.22% and 82.91%, respectively. For caption quality, BLEU reaches 83.51%, CIDEr 1.94, ROUGE-L 81.59%, METEOR 51.50%, and SPICE 37.88%.
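The idea of restoring a true bounding box from a noise box can be illustrated with a minimal sketch. It assumes a standard DDPM-style linear noise schedule and uses the mean of the Bayes posterior q(x_{t-1} | x_t, x_0) as the reverse update. The schedule values, the normalized `(cx, cy, w, h)` box format, all function names, and the "oracle" that supplies the true box in place of a learned decoder are assumptions made for illustration; the paper's actual decoder and sampler may differ.

```python
import math
import random


def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule and the cumulative products alpha_bar_t."""
    betas = [beta_start + (beta_end - beta_start) * t / (num_steps - 1)
             for t in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def add_noise(box, t, alpha_bars, rng):
    """Forward step q(x_t | x_0): blend the true box with Gaussian noise."""
    a = alpha_bars[t]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for v in box]


def posterior_mean(x_t, x0_hat, t, betas, alpha_bars):
    """Mean of q(x_{t-1} | x_t, x_0) from Bayes' rule, valid for t >= 1."""
    a_t, a_prev = alpha_bars[t], alpha_bars[t - 1]
    coef_x0 = math.sqrt(a_prev) * betas[t] / (1.0 - a_t)
    coef_xt = math.sqrt(1.0 - betas[t]) * (1.0 - a_prev) / (1.0 - a_t)
    return [coef_x0 * x0 + coef_xt * xt for x0, xt in zip(x0_hat, x_t)]


def denoise_with_oracle(gt_box, num_steps=1000, seed=0):
    """Run the reverse chain from pure noise, using the true box as x0_hat.

    A real detector would replace `gt_box` inside the loop with the box
    predicted by its learned decoder; the oracle is only a stand-in.
    """
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in gt_box]  # random "noise box"
    for t in range(num_steps - 1, 0, -1):
        x = posterior_mean(x, gt_box, t, betas, alpha_bars)
    return x
```

Calling `denoise_with_oracle([0.5, 0.4, 0.2, 0.3])` returns a box very close to the input, which shows how repeated posterior updates pull a random box toward the estimate of the true one. During training, `add_noise` would produce the corrupted boxes that the decoder learns to clean up.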
The experimental results lead to the following conclusions: (1) Using a large-kernel spatial selection feature extraction network as the image encoder enhances the insulator defect detection network's ability to focus on key features, thereby improving the model's detection accuracy. (2) To address the issue of insulator defect detection being easily disturbed by complex background environments, a detection decoder with a fusion diffusion mechanism was designed. This decoder performs inverse Bayesian diffusion on the noise boxes it generates, restoring the prediction of the insulator's true bounding box. The resulting resistance to background interference reduces the loss of semantic information related to insulator defects and enhances the accuracy of the predicted bounding boxes. (3) A cross-modal mapping module was designed to map the relationship between insulator image defect features and text features. The language modeling encoder outputs a textual description of the insulator defects, completing the detection task. Thus, the proposed model not only offers higher detection accuracy but also generates accurate and detailed semantic descriptions of the defects, meeting the actual needs for insulator defect detection and description.
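The cross-modal mapping between image and text features can be sketched as scaled dot-product cross-attention, in which each text-side query attends over image-side keys and values. The sketch below is a simplified assumption: it omits the learned projection matrices, multiple heads, and normalization layers that a real encoder or decoder would include, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(text_queries, image_keys, image_values):
    """Attend from each text query over all image tokens.

    Returns the attended outputs and the attention weights, one row
    per text query. Queries and keys must share the same dimension.
    """
    d = len(image_keys[0])
    outputs, all_weights = [], []
    for q in text_queries:
        # Similarity of this text query to every image token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in image_keys]
        weights = softmax(scores)
        # Weighted sum of image values gives the text-conditioned feature.
        out = [sum(w * v[j] for w, v in zip(weights, image_values))
               for j in range(len(image_values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights
```

With a query `[1.0, 0.0]` and image keys `[[1.0, 0.0], [0.0, 1.0]]`, the first image token receives the larger weight, so the output is dominated by its value vector. This is the mechanism by which a caption token can focus on the image region containing the defect.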

Key words: high-speed railway catenary; insulator defect detection; defect image caption; diffusion model; cross-attention

Received: 2024-06-17

PACS: TM755; TP389

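Funding: Supported by the National Natural Science Foundation of China (62462043, 61963023) and the Lanzhou Jiaotong University Key Research and Development Project (ZDYF2304).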

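Corresponding author: Chen Yong, male, born in 1979, professor and doctoral supervisor. His research focuses on rail transit anomaly detection. E-mail: edukeylab@126.com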

About the author: An Zhuoaobo, male, born in 1999, master's student. His research focuses on computer vision. E-mail: 123028557@qq.com

Cite this article:

Chen Yong, An Zhuoaobo, Zhou Jianyu. Semantic Description Method of High-Speed Railway Contact Net Insulator Defects Based on Diffusion Model Detection[J]. Transactions of China Electrotechnical Society, 2025, 40(13): 4100-4111.

Link to this article:

https://dgjsxb.ces-transaction.com/CN/10.19595/j.cnki.1000-6753.tces.241036
https://dgjsxb.ces-transaction.com/CN/Y2025/V40/I13/4100
