A Method for Detecting Edge Lines of Traveling Lanes of Urban Roads Based on Grid Classification and Vertical-horizontal Attention
Abstract: Detecting the edge lines of traveling lanes (lane lines) is a fundamental module of driver-assistance safety systems. In urban road scenes, lane lines suffer from missing features caused by tire wear, occlusion between vehicles, and complex, changing illumination. To address these problems, a lane line detection method based on grid classification and vertical-horizontal attention is proposed. A global feature map is extracted from the road image and divided into grid cells, and the probability that a lane line is present in each cell is calculated. By recasting lane line detection as classification over grid positions, the feature points of each lane line are located. A backbone network is built from Ghost modules. Exploiting the shape characteristics of lane lines, a vertical-horizontal attention (VHA) mechanism is introduced that enhances lane line texture features and fuses position information to recover missing detail features. The lane line feature points are then fitted with a cubic polynomial to correct the detection results. For comparison experiments on the TuSimple and CULane datasets, the VHA module is also embedded in ResNet18, ResNet34, and DarkNet53.
The results show that, on the TuSimple dataset, embedding the VHA module improves the accuracy of each of these backbones by about 0.1 percentage points, and the proposed Ghost-VHA model reaches an accuracy of 95.96%. On the CULane dataset, embedding the VHA module improves the accuracy by about 0.65 percentage points, and the F1 score of Ghost-VHA is 72.84%, which is 0.54 percentage points higher than that of the other models compared. Among the nine urban scenarios analyzed, the "ground sign interference" scenario yields an F1 score of 85.7%, the highest outside the normal scenario. Ghost-VHA also runs in real time: it processes a 288 px × 800 px image in only 4.5 ms on the TuSimple and CULane datasets while maintaining good accuracy. The model performs best with 300 grid columns on the CULane dataset and with 50 grid columns on the TuSimple dataset.
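The grid classification and cubic fitting steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that each row of the grid carries one score per grid column plus a final "no lane" class, and that the polynomial models x as a function of the normalized row position. The function names `decode_lane`, `fit_cubic`, and `eval_cubic` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_lane(row_scores, row_ys, image_width):
    """Pick one grid column per row and return lane feature points (x, y).

    Assumption: row_scores[r] holds num_cols + 1 scores, where the last
    entry is a "no lane in this row" class.
    """
    points = []
    for scores, y in zip(row_scores, row_ys):
        probs = softmax(scores)
        best = max(range(len(probs)), key=probs.__getitem__)
        num_cols = len(probs) - 1
        if best == num_cols:  # background class: no lane point in this row
            continue
        cell_width = image_width / num_cols
        points.append(((best + 0.5) * cell_width, y))  # center of the cell
    return points


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_cubic(points, image_height):
    """Least-squares fit of x = c0 + c1*t + c2*t^2 + c3*t^3, t = y / height."""
    if len(points) < 4:
        raise ValueError("a cubic fit needs at least four feature points")
    ts = [y / image_height for _, y in points]
    xs = [x for x, _ in points]
    # Normal equations of the least-squares problem.
    a = [[sum(t ** (i + j) for t in ts) for j in range(4)] for i in range(4)]
    b = [sum(x * t ** i for x, t in zip(xs, ts)) for i in range(4)]
    return solve_linear(a, b)


def eval_cubic(coeffs, y, image_height):
    """Evaluate the fitted curve at image row y."""
    t = y / image_height
    return sum(c * t ** k for k, c in enumerate(coeffs))
```

Normalizing the row coordinate to the range 0 to 1 before fitting keeps the normal equations well conditioned. Evaluating `eval_cubic` at every row then gives a smoothed lane position, including rows where the classifier reported no lane.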
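Table 1. Architecture of Ghost-VHA main branch

| Input (H × W × C)/px | Operation | Output (C) | VHA | Stride |
| --- | --- | --- | --- | --- |
| 288×800×3 | Conv2d 3×3 | 16 | - | 2 |
| 144×400×16 | G-bneck | 16 | - | 1 |
| 144×400×16 | G-bneck | 24 | - | 2 |
| 72×200×24 | G-bneck | 24 | - | 1 |
| 72×200×24 | G-bneck | 40 | 1 | 2 |
| 36×100×40 | G-bneck | 40 | 1 | 1 |
| 36×100×40 | G-bneck | 80 | - | 2 |
| 18×50×80 | G-bneck | 80 | - | 1 |
| 18×50×80 | G-bneck | 80 | - | 1 |
| 18×50×80 | G-bneck | 80 | - | 1 |
| 18×50×80 | G-bneck | 112 | 1 | 1 |
| 18×50×112 | G-bneck | 112 | 1 | 1 |
| 18×50×112 | G-bneck | 160 | 1 | 2 |
| 9×25×160 | G-bneck | 160 | - | 1 |
| 9×25×160 | G-bneck | 160 | 1 | 1 |
| 9×25×160 | G-bneck | 160 | - | 1 |
| 9×25×160 | G-bneck | 160 | 1 | 1 |
| 9×25×160 | Conv2d 1×1 | 960 | - | 1 |
| 9×25×960 | AvgPool 7×7 | - | - | - |
| 1×1×1 800 | Conv2d 1×1 | 1 800 | - | 1 |
| 1×1×1 800 | FC | 1 000 | - | - |

Note: G-bneck is the Ghost bottleneck; "Output" is the number of output channels; "VHA" indicates whether the VHA module is used (1 = used).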
Table 2. Dataset introduction
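
| Dataset | Total images | Training set | Validation set | Test set | Resolution/px | Lanes | Scene |
| --- | --- | --- | --- | --- | --- | --- | --- |
| TuSimple | 6 408 | 3 268 | 358 | 2 782 | 1280×720 | ≤ 5 | Highway |
| CULane | 133 235 | 88 880 | 9 675 | 34 680 | 1640×590 | ≤ 4 | Urban and highway |
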
Table 3. The impact of different backbones on model performance

| ID | Backbone | CULane test set F1/% | CULane time per frame/ms | TuSimple test set accuracy/% | TuSimple time per frame/ms |
| --- | --- | --- | --- | --- | --- |
| A | ResNet18 | 69.42 | 3.69 | 95.67 | 3.82 |
| B | ResNet18-VHA | 70.21 | 4.30 | 95.81 | 5.20 |
| C | ResNet34 | 71.44 | 5.22 | 95.76 | 5.66 |
| D | ResNet34-VHA | 72.09 | 5.50 | 95.84 | 6.87 |
| E | Darknet53 | 71.45 | 6.17 | 95.72 | 6.81 |
| F | Darknet53-VHA | 72.12 | 7.59 | 95.88 | 7.95 |
| G | Ghost-VHA | 72.84 | 4.50 | 95.96 | 5.31 |

Table 4. Comparison of F1 and running time of different networks on the CULane test set
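
| Scenario | Res50-Seg | SCNN | PINet(4H) | Res-34-SAD | Res-101-SAD | ResNet34-ultra | Ghost-VHA |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Normal | 87.4 | 90.6 | 90.3 | 89.9 | 90.7 | 90.7 | 90.4 |
| Crowded | 64.1 | 69.7 | 72.3 | 68.5 | 70.0 | 70.2 | 71.3 |
| Night | 60.6 | 66.1 | 67.3 | 64.6 | 66.3 | 66.7 | 67.7 |
| No lane line | 38.1 | 43.4 | 49.8 | 42.2 | 43.5 | 44.4 | 42.4 |
| Shadow | 60.7 | 66.9 | 68.4 | 67.7 | 67 | 69.3 | 74.8 |
| Ground arrow | 79.0 | 84.1 | 83.7 | 83.8 | 84.4 | 85.7 | 85.7 |
| Dazzle light | 54.1 | 58.5 | 66.3 | 59.9 | 59.9 | 59.5 | 61.8 |
| Curve | 59.8 | 64.4 | 65.6 | 66.0 | 65.7 | 69.5 | 61.8 |
| Crossroad | 2 505 | 1 990 | 1 427 | 1 960 | 2 052 | 2 037 | 1 574 |
| Full dataset | 66.7 | 71.6 | 72.3 | 70.7 | 71.8 | 72.3 | 72.84 |
| Time per frame t/ms | - | 133.5 | 40 | 50.5 | 171.2 | 5.7 | 4.5 |
| Multiple | - | 1.3× | 4.3× | 3.4× | 1× | 30× | 38× |
| FPS | - | 7.5 | 25 | 19.8 | 5.8 | 175.4 | 222 |

Note: Values in the "Crossroad" row are false positive (FP) counts; lower is better. Scenario rows other than "Crossroad" report F1/%. "-" marks a value not given in the source table.
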
Table 5. Comparison of accuracy and running time of different networks on the TuSimple test set
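
| Method | Accuracy/% | Running time t/ms | Multiple |
| --- | --- | --- | --- |
| ResNet18-Seg | 92.69 | 25.3 | 5.3× |
| ResNet34-Seg | 92.84 | 50.5 | 2.6× |
| LaneNet | 96.38 | 19.0 | 7.0× |
| PINet(4H) | 96.75 | 40.0 | 3.3× |
| SCNN | 96.53 | 133.5 | 1.0× |
| ENet-SAD | 96.64 | 13.4 | 10.0× |
| ResNet34-ultra | 95.56 | 12.7 | 10.6× |
| Ghost-VHA | 95.96 | 5.3 | 25× |
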
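The "Multiple" and "FPS" rows of Table 4 appear to follow directly from the per-frame processing time. The sketch below recomputes them under two assumptions: that FPS is 1000 divided by the time in milliseconds, and that the multiple is the speed-up relative to the slowest method in the table. The pairing of each time value with a method is also an assumption, because the time row lists one value fewer than there are methods.

```python
def fps(ms_per_frame):
    """Frames per second for a given per-frame processing time in ms."""
    return 1000.0 / ms_per_frame


def speedup(baseline_ms, ms_per_frame):
    """How many times faster a method is than the baseline."""
    return baseline_ms / ms_per_frame


# Per-frame times from Table 4 (ms). The method-to-value pairing is assumed.
table4_ms = {
    "SCNN": 133.5,
    "PINet(4H)": 40.0,
    "Res-34-SAD": 50.5,
    "Res-101-SAD": 171.2,
    "ResNet34-ultra": 5.7,
    "Ghost-VHA": 4.5,
}

# Assumption: the slowest method serves as the 1x baseline.
baseline = max(table4_ms.values())

for name, ms in table4_ms.items():
    print(f"{name}: {speedup(baseline, ms):.1f}x, {fps(ms):.1f} FPS")
```

For Ghost-VHA at 4.5 ms per frame this gives roughly 222 FPS and a 38-fold speed-up over the 171.2 ms baseline, which agrees with the table.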