Real-time Semantic Segmentation Method Based on Multi-path Feature Extraction

Computer Science, 2022, Vol. 49, Issue (7): 120-126. doi: 10.11896/jsjkx.210500157

Computer Graphics & Multimedia

CHENG Cheng, JIANG Ai-lian

College of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China

Received: 2021-05-24  Revised: 2021-09-07  Online: 2022-07-15  Published: 2022-07-12
Corresponding author: JIANG Ai-lian (ailianjiang@126.com)
About author: (2462074653@qq.com)
CHENG Cheng, born in 1996, postgraduate, is a member of China Computer Federation. His main research interests include deep learning and semantic segmentation.
JIANG Ai-lian, born in 1969, Ph.D., associate professor, is a member of China Computer Federation. Her main research interests include big data analysis and processing, feature selection, artificial intelligence and computer vision.
Supported by: Scientific Research Funding Project for Returned Overseas Scholars in Shanxi Province (2017-051).

Abstract

Abstract: Deep learning has greatly improved the accuracy of image semantic segmentation, but because of their speed and memory requirements, deep networks cannot be deployed directly on embedded devices for real-time segmentation. To address the complex network structure and heavy computational cost of semantic segmentation models, a real-time semantic segmentation algorithm based on multi-path feature extraction combined with edge detection is proposed. The model uses the Sobel, Scharr and Laplacian operators to extract contour information from the image. The algorithm has three paths: a spatial path that extracts the spatial position information of the image, a semantic path that extracts its high-level semantic information, and an edge detection path that extracts its representative texture features. The Ghost lightweight module is used to reduce the number of model parameters and increase segmentation speed. Experimental results on the CamVid dataset at 480×360 pixels show that all three edge detection operators improve the segmentation accuracy of the model, with the 3×3 Sobel operator giving the largest gain: the segmentation accuracy reaches 42.9% at an image processing speed of 349 frames/s on the CamVid test set. The proposed algorithm performs well in both segmentation accuracy and segmentation speed, and strikes a good balance between real-time performance and accuracy.
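
The three edge detection operators named in the abstract can be illustrated with a small sketch. It is not the authors' edge detection path: only the 3×3 Sobel size comes from the abstract, while the 3×3 Scharr kernel, the 4-neighbour Laplacian kernel, the border handling and all function names (`filter2d`, `gradient_magnitude`, `edge_maps`) are assumptions made for illustration. Plain Python lists are used so the example runs without external libraries.

```python
import math

# Standard 3x3 kernels; the vertical-gradient kernels are their transposes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SCHARR_X = [[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]]
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]  # 4-neighbour form (assumed)


def transpose(kernel):
    return [list(row) for row in zip(*kernel)]


def filter2d(image, kernel):
    """Cross-correlate a 2-D image with a 3x3 kernel, replicating border pixels."""
    h, w = len(image), len(image[0])

    def pixel(y, x):
        # Clamp coordinates so that border pixels are replicated outward.
        return image[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [
        [
            sum(kernel[dy][dx] * pixel(y + dy - 1, x + dx - 1)
                for dy in range(3) for dx in range(3))
            for x in range(w)
        ]
        for y in range(h)
    ]


def gradient_magnitude(image, kernel_x):
    """Combine horizontal and vertical responses into one edge-strength map."""
    gx = filter2d(image, kernel_x)
    gy = filter2d(image, transpose(kernel_x))
    return [[math.hypot(a, b) for a, b in zip(rx, ry)] for rx, ry in zip(gx, gy)]


def edge_maps(image):
    """Return one edge-strength map per operator."""
    lap = filter2d(image, LAPLACIAN)
    return {
        "sobel": gradient_magnitude(image, SOBEL_X),
        "scharr": gradient_magnitude(image, SCHARR_X),
        "laplacian": [[abs(v) for v in row] for row in lap],
    }


# Toy 6x8 image with a vertical step edge between columns 3 and 4.
toy = [[0.0] * 4 + [1.0] * 4 for _ in range(6)]
maps = edge_maps(toy)
print(maps["sobel"][0])
```

On this toy image all three maps are zero except in the two columns next to the step, where the response is 4 for Sobel, 16 for Scharr and 1 for the Laplacian. In a segmentation network such maps would typically be supplied to the edge path as extra input channels, but how the paper does this is not stated in the abstract.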

Key words: Deep learning, Edge detection, Feature fusion, Multi-feature extraction, Real-time semantic segmentation
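
The abstract credits the Ghost lightweight module with reducing the number of model parameters. The rough parameter count below follows the general Ghost module design (a primary convolution producing a few intrinsic feature maps, followed by cheap depthwise operations producing the remaining "ghost" maps). The channel sizes, the ratio of 2 and the 3×3 cheap kernel are illustrative assumptions, not values taken from this paper, and biases and normalization layers are ignored.

```python
import math


def standard_conv_params(c_in, c_out, k):
    """Weights of an ordinary k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_module_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weights of a Ghost-style module with the same input/output widths.

    ratio and cheap_k are hypothetical settings chosen for illustration.
    """
    intrinsic = math.ceil(c_out / ratio)   # maps from the primary convolution
    ghost = intrinsic * (ratio - 1)        # maps from the cheap operations
    primary = c_in * intrinsic * k * k
    cheap = ghost * cheap_k * cheap_k      # depthwise: one small filter per ghost map
    return primary + cheap


std = standard_conv_params(64, 128, 3)
gho = ghost_module_params(64, 128, 3)
print(std, gho, round(std / gho, 2))
```

For a 64-to-128-channel 3×3 layer this gives 73728 weights for the standard convolution against 37440 for the Ghost-style layer, a reduction of roughly the chosen ratio.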

CLC Number: TP391

CHENG Cheng, JIANG Ai-lian. Real-time Semantic Segmentation Method Based on Multi-path Feature Extraction[J]. Computer Science, 2022, 49(7): 120-126. https://doi.org/10.11896/jsjkx.210500157

