A Small-scale Pedestrian Detection Method Based on Fused Residual Networks and Feature Pyramids

ZHANG Yang^{1, 2}, ZHANG Shuaifeng^{1, 2}, LIU Weiming^{3}

1. School of Transportation, Fujian University of Technology, Fuzhou 350118, China
2. Intelligent Transportation System Research Center, Fujian University of Technology, Fuzhou 350118, China
3. School of Civil Engineering & Transportation, South China University of Technology, Guangzhou 510641, China

doi: 10.3963/j.jssn.1674-4861.2023.03.012

Funding:
- National Natural Science Foundation of China, No. 61976055
- Natural Science Foundation of Fujian Province, No. 2023J01946

Corresponding author: ZHANG Yang (b. 1983), Ph.D., associate professor. Research interests: intelligent transportation information processing and traffic image processing. E-mail: zhang_yang1983@163.com

CLC number: U495

Publication history
- Received: 2022-07-31
- Published online: 2023-09-16

Abstract: Small-scale pedestrian detection suffers from overfitting, feature misalignment, and neglect of multi-scale features. To address these problems, a small-scale pedestrian detection method that fuses residual networks and feature pyramids (FRN-FP) is proposed. Because the original residual network relies too heavily on the training set and overfits when detecting small-scale pedestrians, residual blocks with a dropout layer replace the standard residual blocks in the residual network structure; the regularization effect of the dropout layer is also used to reduce the complexity of the computation. A feature selection module and a feature alignment module are embedded in the lateral connections of the feature pyramid network to strengthen and align the important pedestrian features in the input image. This improves the ability to learn multi-scale pedestrian features, compensates for the feature misalignment and neglect of multi-scale features in the feature pyramid network, and raises the detection accuracy for small-scale pedestrians. The model is trained, tested, and validated on the Caltech Pedestrian dataset. Experimental results show a detection accuracy for small-scale pedestrians (AP_S) of 73.6% and an AP₅₀ of 95.6%, where AP is the average precision and AP₅₀ is the average precision when the intersection over union is greater than 0.5. With the same 50-layer residual network and feature pyramid network, the improved model raises AP by 17.2 percentage points, AP₅₀ by 7.8 percentage points, and the detection accuracy for small-scale pedestrians by 21.6 percentage points. With the same 101-layer residual network and feature pyramid network, it raises AP by 24.5 percentage points, AP₅₀ by 8.2 percentage points, and the detection accuracy for small-scale pedestrians by 32.3 percentage points. Compared with the RefineDet512 and GHM800 algorithms, AP is higher by 20.8 and 17.7 percentage points, AP₅₀ by 5.5 and 3.6 percentage points, and the detection accuracy for small-scale pedestrians by 26.8 and 20.6 percentage points, respectively. These results indicate that the proposed model outperforms the classical detection algorithms it was compared with and effectively improves the detection accuracy for small-scale pedestrians.

Keywords:
- traffic safety
- small-scale pedestrian detection
- multi-scale feature fusion
- residual network
- feature pyramid network
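
The abstract describes replacing standard residual blocks with residual blocks that contain a dropout layer. The idea can be sketched in pure Python as follows, under stated assumptions: the position of the dropout layer inside the block (here, between the two weight layers), the drop probability `p`, and the use of small dense matrices in place of convolutions are illustrative choices, not details confirmed by the paper, and all function names are hypothetical.

```python
import random


def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def linear(v, weights):
    """Multiply vector v by a square weight matrix (a stand-in for a conv layer)."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def dropout(v, p, training, rng):
    """Inverted dropout: while training, zero each element with probability p
    and rescale the survivors by 1 / (1 - p); at inference, pass v through."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(v)
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in v]


def residual_block_with_dropout(x, w1, w2, p=0.5, training=True, rng=None):
    """Compute relu(x + W2 * dropout(relu(W1 * x))).

    Assumption: dropout sits between the two weight layers of the residual
    branch; the identity shortcut itself is never dropped.
    """
    rng = rng or random.Random(0)
    h = relu(linear(x, w1))
    h = dropout(h, p, training, rng)
    h = linear(h, w2)
    return relu([a + b for a, b in zip(x, h)])


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x = [1.0, -2.0, 3.0]
# Inference mode: dropout is disabled, so the block is deterministic.
print(residual_block_with_dropout(x, identity, identity, training=False))
```

Because the shortcut path bypasses the dropout layer, the input signal always reaches the output even when parts of the residual branch are zeroed during training.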

Figure 1. Structure of the improved residual block

Figure 2. Structure of the improved feature pyramid

Figure 3. Structure of the feature selection module

Figure 4. Structure of the feature alignment module

Figure 5. Structure of the FRN-FP method

Figure 6. Loss function curve

Figure 7. Ablation experiment results

Figure 8. Detection results compared with classical detection algorithms

Table 1. Ablation experiment results on the Caltech Pedestrian dataset (all metrics in %)

Method	Backbone	AP	AP₅₀	AP₇₅	AP_S
ResNet-FPN	ResNet-50-FPN	52.6	87.0	58.2	42.5
ResNet-FPN	ResNet-101-FPN	52.8	87.4	58.5	41.3
IResNet-FPN	IResNet-50-FPN	57.2	89.3	66.8	48.1
IResNet-FPN	IResNet-101-FPN	57.9	91.0	67.0	48.6
ResNet-IFPN	ResNet-50-IFPN	58.8	91.4	68.1	50.0
ResNet-IFPN	ResNet-101-IFPN	59.4	92.0	69.0	50.9
FRN-FP	IResNet-50-IFPN	69.8	94.8	82.7	64.1
FRN-FP	IResNet-101-IFPN	77.3	95.6	89.0	73.6
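
The ablation rows above can be read as "one module at a time" versus "both modules together". The short sketch below recomputes the AP gains over the ResNet-FPN baseline from the table values; the dictionary keys and the function name are hypothetical, and the comparison of the combined gain with the sum of the single-module gains is plain arithmetic on the table, not an analysis taken from the paper.

```python
# AP values (%) copied from Table 1, keyed by backbone depth.
# "iresnet" = IResNet-FPN, "ifpn" = ResNet-IFPN, "both" = FRN-FP.
ABLATION_AP = {
    50: {"baseline": 52.6, "iresnet": 57.2, "ifpn": 58.8, "both": 69.8},
    101: {"baseline": 52.8, "iresnet": 57.9, "ifpn": 59.4, "both": 77.3},
}


def ablation_gains(row):
    """Return (single-module gains, combined gain, sum of single gains),
    all in percentage points over the baseline AP."""
    base = row["baseline"]
    single = {k: round(row[k] - base, 1) for k in ("iresnet", "ifpn")}
    combined = round(row["both"] - base, 1)
    return single, combined, round(sum(single.values()), 1)


for depth, row in ABLATION_AP.items():
    single, combined, summed = ablation_gains(row)
    print(f"{depth}-layer: single={single}, combined=+{combined}, sum of singles=+{summed}")
```

At both depths the combined AP gain is larger than the sum of the two single-module gains, which is consistent with the two modifications complementing each other.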


Table 2. Comparison of the proposed method with classical detection algorithms (all metrics in %)

Method	Backbone	AP	AP₅₀	AP₇₅	AP_S
RefineDet512	ResNet-101	56.5	90.1	64.5	46.8
GHM800	ResNet-101	59.6	92.0	70.0	53.0
FRN-FP (ours)	IResNet-50-IFPN	69.8	94.8	82.7	64.1
FRN-FP (ours)	IResNet-101-IFPN	77.3	95.6	89.0	73.6
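
The improvements quoted in the abstract for this comparison match the differences between table rows, so they are best read as percentage points rather than relative percentages. The sketch below recomputes both readings for the 101-layer FRN-FP model. The row key `FRN-FP-101` and the function name are hypothetical, and the relative figures are derived here for illustration and do not appear in the paper.

```python
# Metric values (%) copied from Table 2; tuple order follows METRICS.
METRICS = ("AP", "AP50", "AP75", "AP_S")
TABLE_2 = {
    "RefineDet512": (56.5, 90.1, 64.5, 46.8),
    "GHM800": (59.6, 92.0, 70.0, 53.0),
    "FRN-FP-101": (77.3, 95.6, 89.0, 73.6),
}


def compare(table, baseline, improved):
    """Per metric, return (gain in percentage points, relative gain in %)."""
    result = {}
    for name, old, new in zip(METRICS, table[baseline], table[improved]):
        points = round(new - old, 1)
        relative = round(100.0 * (new - old) / old, 1)
        result[name] = (points, relative)
    return result


for baseline in ("RefineDet512", "GHM800"):
    for metric, (points, relative) in compare(TABLE_2, baseline, "FRN-FP-101").items():
        print(f"{baseline:13s} {metric:5s} +{points:.1f} points ({relative:.1f}% relative)")
```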

