Importance of meteorological and spatial patterns for the interpolation of air quality data using deep neural networks

Authors: Clara Betancourt and Najmeh Kaffashzadeh

For the interpolation of air quality data, the relationship between air quality and geographical and meteorological data is crucial. Here, unsupervised methods are applied to find patterns in meta-data derived at air quality measurement stations and to explain how this meta-data is connected to local air quality.

K-means clustering was applied to meta-data at each station and in its close surroundings (population density, stable night lights, elevation and NOx column) to group the stations into clusters with similar meta-data. Each cluster contains stations with distinct meta-data characteristics: Cluster 1 comprises city centers with a very high population density, Cluster 2 mostly urban areas, Cluster 3 rural areas, and Clusters 4 and 5 remote stations at low and high elevation, respectively. Figure 1 shows the cluster assignments of European air quality stations, together with the heuristic station classification by TOAR-1 (Schultz et al., 2017) for comparison. The clusters also have characteristic air quality metrics.

Figure 1: Cluster assignments in Europe
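
The clustering step described above can be illustrated with a minimal sketch. It assumes that the four meta-data variables are standardized before clustering, and it uses a plain Lloyd's-algorithm K-means with farthest-point initialization. Neither choice is stated in this text, so both are assumptions. The function names `standardize`, `farthest_point_init` and `kmeans` are hypothetical, and the station values are made up for illustration only.

```python
import numpy as np


def standardize(features):
    """Scale each column to zero mean and unit variance (assumed preprocessing)."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (features - mean) / std


def farthest_point_init(features, n_clusters):
    """Deterministic start: repeatedly pick the point farthest from all chosen centroids."""
    centroids = [features[0]]
    for _ in range(n_clusters - 1):
        chosen = np.array(centroids)
        # Distance of every station to its nearest already chosen centroid.
        dist = np.linalg.norm(features[:, None, :] - chosen[None, :, :], axis=2).min(axis=1)
        centroids.append(features[dist.argmax()])
    return np.array(centroids)


def kmeans(features, n_clusters, n_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_point_init(features, n_clusters)
    for _ in range(n_iter):
        # Assign every station to its nearest centroid (Euclidean distance).
        dist = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster became empty.
        new_centroids = np.array([
            features[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Made-up stations (not real data). Columns: population density,
# stable night lights, elevation, NOx column.
stations = np.array([
    [5000.0, 60.0,   50.0, 8.0],   # urban-like
    [4500.0, 55.0,   80.0, 7.0],   # urban-like
    [ 200.0, 10.0,  150.0, 2.0],   # rural-like
    [ 150.0,  8.0,  200.0, 2.5],   # rural-like
    [   5.0,  0.0, 2000.0, 0.5],   # remote, high elevation
    [  10.0,  1.0, 2300.0, 0.4],   # remote, high elevation
])

labels, centroids = kmeans(standardize(stations), n_clusters=3)
print(labels)
```

Standardizing first keeps variables with large numeric ranges, such as elevation in metres, from dominating the Euclidean distance. Whether and how the variables were scaled in the actual analysis is not specified here.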

Figure 2 shows the mean number of exceedances of 50 ppb per year for the different clusters. City centers typically have fewer exceedance days, because ozone forms downwind of its precursor sources and is destroyed locally by NOx. Rural areas (Cluster 3) thus have a higher ozone burden, while remote areas (Clusters 4 and 5) have a lower burden because they are far from pollutant sources.

Figure 2: Mean 2009-2018 nvgt060 ozone metric in different clusters
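
The per-cluster averaging behind such a comparison can be sketched as follows. The sketch assumes that the annual number of exceedance days has already been computed for every station, that counts are averaged over years for each station first, and that the station means are then averaged within each cluster. The order of averaging, the function name `mean_exceedances_per_cluster` and all numbers are assumptions for illustration, not measurements.

```python
from collections import defaultdict
from statistics import mean


def mean_exceedances_per_cluster(cluster_labels, annual_exceedance_days):
    """Average the per-station exceedance counts within each cluster.

    cluster_labels: one cluster id per station.
    annual_exceedance_days: one list per station with the number of
        exceedance days in each year of the averaging period.
    """
    per_cluster = defaultdict(list)
    for label, yearly_counts in zip(cluster_labels, annual_exceedance_days):
        # First average over the years of a single station ...
        per_cluster[label].append(mean(yearly_counts))
    # ... then average over all stations that share a cluster.
    return {label: mean(values) for label, values in sorted(per_cluster.items())}


# Made-up example: five stations with three years each (not real data).
cluster_labels = [1, 1, 3, 3, 5]
annual_exceedance_days = [
    [2, 4, 3],
    [1, 3, 2],
    [20, 24, 22],
    [18, 20, 16],
    [6, 8, 7],
]

cluster_means = mean_exceedances_per_cluster(cluster_labels, annual_exceedance_days)
print(cluster_means)
```

Averaging per station first gives every station the same weight in its cluster, even if some stations report fewer years than others.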

Finding links between air quality data and meta-data is helpful for spatial and temporal interpolation: meta-data is easy to access and available in global gridded form, while air quality monitoring is costly and thus only done at point locations. In the future, we will extend the clustering routines to process maps of meteorological data in order to find weather regimes that are relevant for air quality. This requires linking the clustering algorithm to a neural network for multi-channel image recognition.

Reference: Schultz, M. G. et al. Tropospheric Ozone Assessment Report: Database and Metrics Data of Global Surface Ozone Observations. Elem Sci Anth 5, 58 (2017).