Small target area extraction and semantic recognition method of electrical secondary drawings based on deep learning
CHU Xueru, CHEN Zhong, WU Congying, LI Tiecheng, FENG Teng, LIU Qingquan
Abstract:
Image text recognition and deep learning are increasingly applied to engineering drawing recognition. Semantic recognition of electrical secondary drawings, with terminal block drawings taken as the example here, faces two problems: small target detection and complex text backgrounds. To address small target detection, a double-layer model is proposed for extracting small target areas that carry electrically relevant information. The upper-layer model extracts singly connected small target areas of terminal blocks using adaptive thresholding and contour detection; the lower-layer model extracts the terminal block table and connection-line text sub-areas using a double-layer target detection network. To address complex text backgrounds, two text position detection algorithms are proposed: one for the terminal block table area, based on cell extraction and Sobel edge detection, and one for the connection-line text area, based on horizontal and vertical projection segmentation and direction rotation. The proposed method was tested for semantic extraction on 30 annotated drawings. The average F1 score on the test set is 91.25%, and the mean of the test-set average intersection over union (IoU) is 82.61%, which verifies the effectiveness and robustness of the proposed method.
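The horizontal and vertical projection segmentation named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`runs`, `segment_by_projection`, `rotate90`), the binary list-of-lists image format, and the use of a 90-degree rotation to bring vertical connection-line text into a horizontal reading direction are all assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed format: binary image, 1 = ink, 0 = background


def runs(profile: List[int], min_count: int = 1) -> List[Tuple[int, int]]:
    """Return half-open [start, end) ranges where profile >= min_count."""
    spans = []
    start = None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i                      # a run of ink begins
        elif value < min_count and start is not None:
            spans.append((start, i))       # the run ends at a blank gap
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_by_projection(img: Image) -> List[Tuple[int, int, int, int]]:
    """Split a binary image into boxes (top, bottom, left, right), half-open."""
    boxes = []
    row_profile = [sum(row) for row in img]              # horizontal projection
    for top, bottom in runs(row_profile):
        band = img[top:bottom]                           # one text line
        col_profile = [sum(col) for col in zip(*band)]   # vertical projection
        for left, right in runs(col_profile):
            boxes.append((top, bottom, left, right))
    return boxes


def rotate90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise (hypothetical helper)."""
    return [list(row) for row in zip(*img[::-1])]


demo = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
]
boxes = segment_by_projection(demo)
print(boxes)
```

Blank rows separate text lines and blank columns separate characters or words within a line, so the approach works best on clean, axis-aligned text; rotated text would first need to be turned upright, for example with `rotate90`.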
Keywords:
region segmentation; text detection; small target detection; YOLOv5; PaddleOCR
Foundation: Science and Technology Project of State Grid Corporation of China Headquarters (SGHEDK00JYJS2200012)
DOI: 10.19585/j.zjdl.202308001