How does AI predict ozone based on environmental features?

In our previous study (Betancourt et al. 2021), we published a benchmark dataset on ozone metrics extracted from the TOAR database along with a machine learning task. We want to tackle the challenge with machine learning to predict the ozone metrics based upon geospatial features. As a baseline, we predicted, for example, the average ozone with different methods, among them a random forest and a shallow neural network.

Although many studies, like ours, reach their goal by just making the predictions – we asked more questions. How exactly do the machine learning algorithms predict average ozone? We analyzed our machine learning models’ functioning to understand their predictions and found out even more than we expected.

By focusing on inaccurate predictions and explaining why these predictions fail, we (i) identified underrepresented samples, (ii) flagged unexpected inaccurate predictions, and (iii) pointed out training samples irrelevant for predictions. We suggested locations for building new measurement stations based on the underrepresented samples. We also showed which training samples do not substantially contribute to the model performance. We can even drop these samples without performance loss! Our study demonstrates the application of explainable machine learning beyond simply explaining the trained model.

On this map, we show non-influential training stations in white, untrustworthy predictions on the test set in blue, and underrepresented test stations in plum. The marker indicates the model. The neural network’s marker is ×, the random forest’s marker is +, and where those two symbols overlap, we get something like an asterisk ∗. We indicate regions where the models recommend building new red, blue, and violet stations. The neural network recommends building new stations in red-colored areas; the random forest recommends building new stations in the blue-colored areas. Violet represents the intersection of regions recommended by both models.

Stadtler et al., Explainable Machine Learning Reveals Capabilities, Redundancy, and Limitations of a Geospatial Air Quality Benchmark Dataset, Machine Learning and Knowledge Extraction. 2022; 4(1):150-171, https://doi.org/10.3390/make4010008