Date of Award
College of Science, Engineering, and Technology (COSET)
Ph.D. in Environmental Toxicology
Committee Member 1
Committee Member 2
Qing Li, Lei Yu
Air Pollution, Forecasting, Frequent Pattern Data Mining, Machine Learning, Temporal Characteristics
Ground-level ozone and atmospheric fine particles (PM2.5) are recognized as critical air pollutants and major contributors to the toxicity of anthropogenic air pollution in urban areas. To limit their adverse impacts on public health and ecosystems, a practical and effective way to accurately predict upcoming pollutant concentration levels is needed. Numerous studies have addressed the forecasting of ground-level ozone and PM2.5, mainly through time-series and neural network analysis. Machine learning has also been adopted for analysis and forecasting in existing research, but it is associated with limitations that are not easily overcome. (1) Most existing forecasting models depend heavily on time-series inputs and do not consider the factors that influence the air pollutants. Although such models may provide relatively accurate predictions, they neglect the real-world complexity behind air pollution levels. (2) Existing forecasting models focus mainly on short-term estimation, and some of them must use previous predictions as part of the input, which increases system complexity and decreases computational efficiency and accuracy. (3) Accurate forecasting of hourly air pollution levels over an entire year is seldom achieved. The objective of this research is to propose a systematic methodology for forecasting long-term hourly air pollution concentration levels from historical data while accounting for the factors that influence concentration. To achieve this goal, a series of methods is introduced to analyze historical air pollution concentrations using temporal characteristics and frequent pattern data mining algorithms. The association rules between air pollution concentration levels and the influencing factors are revealed.
A systematic air pollution level forecasting approach based on supervised machine learning algorithms, capable of predicting hourly values over an entire year, is proposed and evaluated. To quantify and validate the results, a case study was conducted in the Houston region using ten years of historical environmental, meteorological, and transportation-related data. The results of this research are as follows: (1) the complex correlations between the influencing factors and air pollution concentration levels are quantified and presented; (2) the association rules between the dependent and independent parameters are calculated; (3) a pool of supervised machine learning algorithms is created and evaluated; and (4) an accurate long-term hourly air pollution level forecasting procedure based on machine learning is proposed. Compared with existing models, the proposed methodology improves on computational complexity while maintaining high accuracy, and it could be readily applied to similar regions for forecasting the concentration levels of various types of air pollutants.
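As a rough illustration of what an association rule between influencing factors and a pollutant level measures, the sketch below computes the standard support, confidence, and lift of a single rule over a handful of hourly records. It is not the dissertation's implementation: the function name rule_metrics, the item labels (temp=high, traffic=heavy, ozone=high), and the five records are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    transactions: list of sets of discretized items, one set per hour.
    """
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    n = len(transactions)
    # Count the hours containing the antecedent, the consequent, and both.
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_ac = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical hourly records (illustrative only, not data from the study).
hours = [
    frozenset({"temp=high", "traffic=heavy", "ozone=high"}),
    frozenset({"temp=high", "traffic=heavy", "ozone=high"}),
    frozenset({"temp=high", "traffic=light", "ozone=moderate"}),
    frozenset({"temp=low", "traffic=heavy", "ozone=moderate"}),
    frozenset({"temp=low", "traffic=light", "ozone=low"}),
]

support, confidence, lift = rule_metrics(
    hours, {"temp=high", "traffic=heavy"}, {"ozone=high"}
)
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 indicates that the consequent occurs more often alongside the antecedent than it does overall. A frequent pattern mining algorithm would search for such rules across many item combinations instead of scoring one hand-picked rule.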
Copyright © for this work is retained by the author. Any documents and information presented are protected by copyright under US Copyright laws and are the property of the author. All Rights Reserved. For permission to use this content please contact the author or the Graduate School at Texas Southern University (firstname.lastname@example.org).
Du, Jianbang, "The Temporal and Frequent Pattern Mining Analysis and Machine Learning Forecasting on Mobile Sourced Urban Air Pollutants" (2021). Dissertations (2016-Present). 28.