
Identification of Safety Risk in Freeway and Impact Factors Based on an Interpretable Machine Learning Framework

DU Jian, YANG Haiyi, LI Yang, GUO Miao, QI Hang, WEI Jinqiang, MA Hao, HU Dandan, LI Zhiyu

Citation: DU Jian, YANG Haiyi, LI Yang, GUO Miao, QI Hang, WEI Jinqiang, MA Hao, HU Dandan, LI Zhiyu. Identification of Safety Risk in Freeway and Impact Factors Based on an Interpretable Machine Learning Framework[J]. Journal of Transport Information and Safety, 2023, 41(5): 24-34. doi: 10.3963/j.jssn.1674-4861.2023.05.003


doi: 10.3963/j.jssn.1674-4861.2023.05.003
Funding: National Key R&D Program of China (2019YFB1600500)

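Details
    About the author:

    DU Jian (b. 1971), Ph.D., senior engineer. Research interests: transportation engineering, transportation informatization. E-mail: dujian@cmhk.com

    Corresponding author:

    LI Yang (b. 1979), Ph.D., senior engineer. Research interests: traffic safety management, traffic facility management. E-mail: yang_li009@163.com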

  • CLC number: U491


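  • Abstract: Because traffic crashes are low-probability random events, it is difficult to analyze traffic safety across the full spatio-temporal domain, and crash records alone cannot support proactive strategies for preventing and controlling traffic safety risk before crashes occur. To identify safety risk and its underlying causes under the interference of confounding factors, a traffic order index (TOI) is computed from aggressive driving behavior data and the coefficient of variation of speed and used as a crash surrogate measure, and the K-means clustering algorithm divides the TOI into three traffic safety risk levels. On this basis, the Catboost algorithm is used to model the relationship between the traffic safety risk level and factors such as traffic flow characteristics, weather conditions, and road conditions, and Gini-based feature importance is used to determine the risk factors for freeway traffic safety. Partial dependence plots are then used to analyze how traffic safety risk depends on each risk factor and to obtain the marginal effect of each factor on traffic safety risk. The results show that: ① the accuracy, precision, and recall of the Catboost algorithm in identifying risk levels are 85.95%, 88.56%, and 86.75%, respectively, indicating that the traffic order index is strongly correlated with the external risk factors; ② traffic volume and the congestion index have a large influence on risk identification and are nonlinearly related to the traffic safety risk level; when traffic volume exceeds 450 veh/h or the congestion index exceeds 1.5, traffic safety risk increases markedly, by 16.9% and 29.5%, respectively; ③ traffic safety risk is highest when 1 to 2 traffic signs are installed within a continuous 1 km of road, where the probability of a segment being identified as high risk is 38.1%; ramp entrances and exits and tunnel interiors have the highest traffic safety risk; ④ crosswind has a small effect on freeway traffic safety risk: when the wind force increases from level 0 to level 5, traffic safety risk rises by 4.99%.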
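The clustering step in the abstract can be illustrated with a minimal sketch. It assumes the TOI has already been computed for each observation (the TOI formula itself is not reproduced here), and it uses a plain one-dimensional Lloyd's algorithm with a deterministic initialisation. The function name `kmeans_1d` and the sample values are hypothetical, chosen only for illustration.

```python
from statistics import mean

def kmeans_1d(values, k=3, max_iter=100):
    """Lloyd's algorithm on scalar data; returns (ascending centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: pick k points spread evenly over the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [mean(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    centroids.sort()
    labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
    return centroids, labels

# Made-up TOI values, NOT data from the study.
toi = [0.10, 0.12, 0.15, 0.11, 0.48, 0.52, 0.50, 0.47, 0.90, 0.95, 0.88, 0.93]
centroids, labels = kmeans_1d(toi, k=3)
print(centroids, labels)
```

Labels 0, 1, and 2 are ordered by ascending centroid. Which end of that ordering corresponds to high risk depends on how the TOI is defined, so the mapping to risk levels is left open in this sketch.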

     

  • Figure 1. Schematic diagram of the road section

    Figure 2. Framework of results analysis

    Figure 3. Confusion matrix for the Catboost model

    Figure 4. Feature importance scores in the Catboost model

    Figure 5. Variable correlation matrix

    Figure 6. Partial dependence plots of traffic flow characteristics

    Figure 7. Partial dependence plots of road conditions

    Figure 8. Partial dependence plots of weather conditions
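The partial dependence plots in Figures 6 to 8 rest on a simple averaging idea: fix one feature at a grid value for every observation, predict, and average. The sketch below shows that computation under stated assumptions. `toy_model` is a hypothetical stand-in for a fitted classifier, and its probabilities, the feature names, and the sample rows are all invented for illustration rather than taken from the study.

```python
def partial_dependence(model, rows, feature, grid):
    """Mean model output when `feature` is forced to each grid value in turn."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the original row is untouched
            modified[feature] = value  # overwrite only the feature of interest
            total += model(modified)
        curve.append(total / len(rows))
    return curve

def toy_model(row):
    """Hypothetical high-risk probability; the numbers are invented."""
    p = 0.10
    if row["congestion_index"] > 1.5:
        p += 0.30
    if row["wind_level"] >= 3:
        p += 0.05
    return p

# Made-up observations, NOT data from the study.
rows = [
    {"congestion_index": 1.0, "wind_level": 0},
    {"congestion_index": 1.2, "wind_level": 3},
    {"congestion_index": 2.0, "wind_level": 1},
    {"congestion_index": 0.9, "wind_level": 4},
]
pd_curve = partial_dependence(toy_model, rows, "congestion_index", [1.0, 1.5, 2.0])
print(pd_curve)
```

The difference between two points on the curve is the marginal effect of moving the feature between those values, averaged over the observed distribution of the other features.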

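    Table 1. Data types and descriptions

    | Data category | Main field | Description |
    |---|---|---|
    | Basic data | Time period | Time interval from hour i to hour i + 1, i = 0, 1, 2, …, 23 |
    | Aggressive driving behavior | Event type | See Eq. (2); event types include sharp acceleration, sharp deceleration, sharp left turn, sharp right turn, sharp … |
    | Aggressive driving behavior | Event coordinates | Longitude, latitude |
    | Traffic flow | Congestion index | See Eq. (1) |
    | Traffic flow | Average operating speed | Mean operating speed of vehicles passing within 10 min |
    | Traffic flow | Volume | Total number of vehicles passing within 10 min |
    | Environment | Weather condition | Sunny, overcast, cloudy, rain, fog |
    | Environment | Wind force level | Levels 0 to 5 |
    | Road | Segment type | Tunnel, bridge segment, ordinary segment, ramp entrance/exit |
    | Road | Number of ramp entrances/exits | |
    | Road | Horizontal curve type | Curved segment, straight segment |
    | Road | Number of signs | Number of traffic signs within a 1 000 m segment, including guide, indication, warning, and prohibition signs |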

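    Table 2. Descriptive statistics of continuous variables

    | Category | Variable | Maximum | Minimum | Mean | Standard deviation |
    |---|---|---|---|---|---|
    | Traffic flow characteristics | Volume/(veh/10 min) | 384 | 1 | 90.96 | 55.58 |
    | Traffic flow characteristics | Congestion index | 33.85 | 0.76 | 1.08 | 0.44 |
    | Road conditions | Number of signs | 17 | 0 | 3.22 | 4.13 |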

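    Table 3. Descriptive statistics of categorical variables

    | Category | Variable | Description | Code | Frequency | Share/% |
    |---|---|---|---|---|---|
    | Road conditions | Segment type | Ordinary segment | 0 | 160 408 | 56.97 |
    | Road conditions | Segment type | Interchange segment | 1 | 79 240 | 28.14 |
    | Road conditions | Segment type | Tunnel segment | 2 | 41 931 | 14.89 |
    | Road conditions | Number of ramp entrances/exits | No entrance/exit | 0 | 204 173 | 72.51 |
    | Road conditions | Number of ramp entrances/exits | 1 | 1 | 41 750 | 14.83 |
    | Road conditions | Number of ramp entrances/exits | 2 | 2 | 35 656 | 12.66 |
    | Road conditions | Curved segment or not | | 0 | 79 019 | 28.06 |
    | Road conditions | Curved segment or not | | 1 | 202 560 | 71.94 |
    | Weather conditions | Weather status | | 0 | 98 783 | 35.08 |
    | Weather conditions | Weather status | Cloudy | 1 | 98 404 | 34.95 |
    | Weather conditions | Weather status | | 2 | 66 662 | 23.67 |
    | Weather conditions | Weather status | | 3 | 16 450 | 5.84 |
    | Weather conditions | Weather status | | 4 | 1 280 | 0.46 |
    | Weather conditions | Wind force level | Calm | 0 | 77 173 | 27.41 |
    | Weather conditions | Wind force level | Level 1 | 1 | 123 034 | 43.69 |
    | Weather conditions | Wind force level | Level 2 | 2 | 66 728 | 23.70 |
    | Weather conditions | Wind force level | Level 3 | 3 | 13 569 | 4.82 |
    | Weather conditions | Wind force level | Level 4 | 4 | 795 | 0.28 |
    | Weather conditions | Wind force level | Level 5 | 5 | 280 | 0.10 |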

    Table 4. Example of confusion matrix

    | Actual value | Predicted positive | Predicted negative |
    |---|---|---|
    | Actual positive | TP | FN |
    | Actual negative | FP | TN |
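The accuracy, precision, and recall reported in the abstract follow from the four cells of a confusion matrix like the one in Table 4. A minimal sketch of the standard binary definitions is given below. The function name and the counts are hypothetical, and the study's three-level problem would require per-class or averaged versions of these quantities, which are not shown here.

```python
def classification_metrics(tp, fn, fp, tn):
    """Standard binary metrics from the four confusion-matrix cells."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are correct
        "precision": tp / (tp + fp),     # share of predicted positives that are real
        "recall": tp / (tp + fn),        # share of real positives that are found
    }

# Made-up counts, NOT results from the study.
metrics = classification_metrics(tp=80, fn=20, fp=10, tn=90)
print(metrics)
```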

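    Table 5. Optional parameters and final tuning results of the Catboost algorithm

    | Parameter | Candidate values | Tuned value | Meaning |
    |---|---|---|---|
    | loss function | {RMSE, Logloss, MAE} | RMSE | Loss function type |
    | iterations | {500, 600, 700, …, 1 000} | 600 | Maximum number of trees |
    | learning rate | {0.01, 0.02, 0.03, …, 0.05} | 0.04 | Learning rate |
    | bagging temperature | {0.1, 0.2, 0.3, …, 1} | 0.5 | Bayesian bagging intensity |
    | depth | {1, 2, 3, …, 10} | 7 | Maximum tree depth |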
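To give a sense of the size of the search space in Table 5, the sketch below enumerates every combination of candidate values. Two things are assumptions: that each elided sequence continues with the uniform step suggested by its first terms, and that the search was an exhaustive grid (the tuning procedure is not stated here). The dictionary keys mirror CatBoost parameter names, but the sketch never calls the library or trains a model.

```python
from itertools import product

# Candidate values from Table 5; uniform steps are ASSUMED for the elided ranges.
search_space = {
    "loss_function": ["RMSE", "Logloss", "MAE"],
    "iterations": list(range(500, 1001, 100)),
    "learning_rate": [round(0.01 * i, 2) for i in range(1, 6)],
    "bagging_temperature": [round(0.1 * i, 1) for i in range(1, 11)],
    "depth": list(range(1, 11)),
}

def iter_configs(space):
    """Yield one dict per point of the Cartesian product of candidate values."""
    names = list(space)
    for combo in product(*(space[name] for name in names)):
        yield dict(zip(names, combo))

configs = list(iter_configs(search_space))

# Tuned values as reported in Table 5.
tuned = {
    "loss_function": "RMSE",
    "iterations": 600,
    "learning_rate": 0.04,
    "bagging_temperature": 0.5,
    "depth": 7,
}
print(len(configs), tuned in configs)
```

Under these assumptions the grid is large enough that each candidate would normally be scored with cross-validation or a held-out set, and a random or staged search may be a cheaper alternative.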
Publication history
  • Received: 2022-05-30
  • Published online: 2024-01-18
