Jinli Liu

Document Type


Date of Award



College of Science, Engineering, and Technology (COSET)

Degree Name

MS in Transportation Planning & Management

Committee Chairperson

Yi Qi

Committee Co-Chairperson

Mehdi Azimi,

Committee Member 1

Fengxiang Qiao


Crash Severity Prediction, Large Truck Crash, Machine Learning Methods


According to the Texas Department of Transportation’s Texas Motor Vehicle Crash Statistics, Texas has had the highest number of severe crashes involving large trucks in the US. As defined by the US Department of Transportation, a large truck is any vehicle with a gross vehicle weight rating greater than 10,000 pounds. Generally, it requires more time and much more space for large trucks to accelerating, slowing down, and stopping. Also, there will be large blind spots when large trucks make wide turns. Therefore, if an unexpected traffic situation comes upon, It would be more difficult for large trucks to take evasive actions than regular vehicles to avoid a collision. Due to their large size and heavy weight, large truck crashes often result in huge economic and social costs. Predicting the severity level of a reported large truck crash with unknown severity or of the severity of crashes that may be expected to occur sometime in the future is useful. It can help to prevent the crash from happening or help rescue teams and hospitals provide proper medical care as fast as possible. To identify the appropriate modeling approaches for predicting the severity of large truck crash, in this research, four representative classification tree-based ML models (e.g., Extreme Gradient Boosting tree (XGBoost), Adaptive Boosting tree(AdaBoost), Random Forest (RF), Gradient Boost Decision Tree (GBDT)), two non-tree-based ML models (e.g., the Support Vector Machines (SVM), k-Nearest Neighbors (kNN)), and LR model were selected. The results indicate that the GBDT model performs best among all of seven models.