Homologous matching of recording channels in intelligent substations based on regular expression and Jaccard similarity coefficient
WANG Guannan, GUO Lijuan, PENG Shurong, CHEN Huixia, HUANG Haoyu
Abstract:
To address homologous matching of the dual sets of recording channels in intelligent substations rated 220 kV and above, this paper proposes a matching method based on regular expressions and the Jaccard similarity coefficient. First, because recording channels are named inconsistently, regular expressions are used to preprocess the channel name texts and unify their form of expression; in addition, jieba word segmentation and stop-word removal are applied to strip redundant information that the name texts may contain. Next, the Jaccard similarity coefficient is computed between recording channel name texts, and homologous channels are selected according to the similarity values. Finally, a simulation analysis is carried out on actual recording file data from the power grid. The results show that the proposed method can effectively achieve homologous matching of recording channels in intelligent substations.
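The pipeline summarized in the abstract (regular-expression normalization, tokenization with stop-word removal, then Jaccard scoring, where J(A, B) = |A ∩ B| / |A ∪ B| for token sets A and B) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the normalization rules, the stop-word list, the sample channel names, and the 0.8 threshold are all hypothetical, and a simple regex tokenizer stands in for jieba word segmentation so that the sketch needs only the standard library.

```python
import re

# Hypothetical normalization rules (pattern, replacement); the paper's actual
# regular expressions are not given in this abstract.
NORMALIZE_RULES = [
    (re.compile(r"[\s_\-/]+"), ""),  # drop whitespace and common separators
    (re.compile("Ⅰ"), "1"),          # map Roman-numeral glyphs to digits
    (re.compile("Ⅱ"), "2"),
]

# Stand-in tokenizer: ASCII alphanumeric runs or single CJK characters.
# The paper uses jieba segmentation instead; this is only an approximation.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+|[\u4e00-\u9fff]")

# Hypothetical stop-word list.
STOPWORDS = {"的"}


def normalize(name: str) -> str:
    """Apply the regex rules, then unify the case of ASCII letters."""
    for pattern, repl in NORMALIZE_RULES:
        name = pattern.sub(repl, name)
    return name.upper()


def tokenize(name: str) -> set:
    """Return the set of non-stop-word tokens of a normalized channel name."""
    return {t for t in TOKEN_RE.findall(normalize(name)) if t not in STOPWORDS}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b|; defined here as 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_channels(names_a, names_b, threshold=0.8):
    """For each name in names_a, keep its best-scoring partner in names_b
    if the score reaches the (assumed) threshold."""
    matches = []
    for name_a in names_a:
        tokens_a = tokenize(name_a)
        best = max(names_b, key=lambda n: jaccard(tokens_a, tokenize(n)))
        score = jaccard(tokens_a, tokenize(best))
        if score >= threshold:
            matches.append((name_a, best, score))
    return matches


if __name__ == "__main__":
    # Made-up channel names from two recorder sets, for illustration only.
    set_a = ["220kV 甲线_A相电流", "220kV 甲线_B相电流"]
    set_b = ["220KV甲线B相电流", "220KV甲线A相电流"]
    for a, b, s in match_channels(set_a, set_b):
        print(a, "<->", b, round(s, 2))
```

In this sketch each channel in the first set is paired greedily with its highest-scoring candidate; a real deployment would likely also need to handle ties and enforce one-to-one pairing, which the abstract does not describe.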
Keywords:
homologous matching of recording channels; text matching; regular expression; Jaccard similarity coefficient
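Foundation: Science and Technology Project of State Grid Jiangxi Electric Power Co., Ltd. (52182022000A); Key Project of the Hunan Provincial Department of Education (20A021); General Program of the National Natural Science Foundation of China (52177069)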
DOI: 10.19585/j.zjdl.202401003