IntelliAQ https://intelliaq.eu Air Quality forecasting with machine learning. Thu, 15 Feb 2024 09:38:28 +0000 en-GB

Final Report of the IntelliAQ Project https://intelliaq.eu/2024/02/07/final-report-of-the-intelliaq-projekt/ Wed, 07 Feb 2024 13:26:25 +0000 https://intelliaq.eu/?p=1944 The IntelliAQ grant project was active from October 2018 to September 2023 and aimed to explore novel deep learning methods for the analysis of air quality data. An important part of the project was the development of a suitable data infrastructure to make global air quality data more easily accessible for machine learning tasks. For this, we teamed up with the international Tropospheric Ozone Assessment Report (TOAR) initiative. Five successful years ended with the following results: intelliaq.eu

Paper “Graph Machine Learning for Improved Imputation of Missing Tropospheric Ozone Data” was published https://intelliaq.eu/2023/09/06/paper-graph-machine-learning-for-improved-imputation-of-missing-tropospheric-ozone-data-was-published/ Wed, 06 Sep 2023 12:04:52 +0000 https://intelliaq.eu/?p=1908 Gaps in the measurement series of atmospheric pollutants can impede the reliable assessment of their impacts and trends. We propose a new method for missing data imputation of the air pollutant tropospheric ozone by using the graph machine learning algorithm “correct and smooth”. This algorithm uses auxiliary data that characterize the measurement location and, in addition, ozone observations at neighboring sites to improve the imputations of simple statistical and machine learning models. We apply our method to data for the year 2011 from 278 stations of the German Environment Agency (Umweltbundesamt – UBA) monitoring network. The preliminary version of these data exhibits three gap patterns: shorter gaps in the range of hours, longer gaps of up to several months in length, and gaps occurring at multiple stations at once. For short gaps of up to 5 h, linear interpolation is most accurate. Longer gaps at single stations are most effectively imputed by a random forest in combination with the correct and smooth algorithm. For longer gaps at multiple stations, the correct and smooth algorithm improved the random forest despite a lack of data in the neighborhood of the missing values. We therefore suggest a hybrid of linear interpolation and graph machine learning for the imputation of tropospheric ozone time series.

https://pubs.acs.org/doi/full/10.1021/acs.est.3c05104
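
The short-gap part of the hybrid described above can be illustrated with a minimal sketch. It is not the paper's code: the function name `fill_short_gaps`, the use of `None` for missing values, and the assumption of equally spaced hourly samples are all illustrative choices. Only the 5 h threshold comes from the abstract. Gaps longer than the threshold are left untouched, since the paper handles them with a random forest and the correct and smooth algorithm.

```python
from typing import List, Optional


def fill_short_gaps(values: List[Optional[float]], max_gap: int = 5) -> List[Optional[float]]:
    """Linearly interpolate runs of missing values no longer than max_gap.

    Assumes equally spaced (e.g. hourly) samples; None marks a missing value.
    Longer gaps and gaps touching the series edges are left as None.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i  # first missing index of this gap
        while i < n and filled[i] is None:
            i += 1
        end = i  # first index after the gap
        gap_len = end - start
        # Interpolate only if the gap is short and has a valid neighbour on both sides.
        if start > 0 and end < n and gap_len <= max_gap:
            left, right = filled[start - 1], filled[end]
            step = (right - left) / (gap_len + 1)
            for k in range(gap_len):
                filled[start + k] = left + step * (k + 1)
    return filled


# Hypothetical hourly ozone values: a 2 h gap followed by a 6 h gap.
hourly = [10.0, None, None, 16.0] + [None] * 6 + [30.0]
result = fill_short_gaps(hourly)
```

In this example the 2 h gap is filled and the 6 h gap stays missing, so a separate model could take over for the longer gap.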

Clara Betancourt has successfully defended her dissertation within the IntelliAQ project. https://intelliaq.eu/2023/09/06/clara-betancourt-has-successfully-defended-her-dissertation-within-the-intelliaq-project/ Wed, 06 Sep 2023 11:47:49 +0000 https://intelliaq.eu/?p=1902 Clara Betancourt has completed her dissertation within the IntelliAQ project and successfully defended her thesis at the University of Bonn. Her thesis “Mapping and Interpolation of Tropospheric Ozone Data with Machine Learning Methods” develops spatio-temporal mapping and interpolation methods based on machine learning, using ozone data as the example application. The machine learning models are trained on a large number of ozone measurements available in the Tropospheric Ozone Assessment Report (TOAR) database.

The synthesis of this work is that an interplay of physically sound data selection, uncertainty quantification, and explainability in machine learning can produce trustworthy environmental data products. Another finding is that the accuracy of the data products in a specific region is mainly dependent on good coverage with ozone measurements in that region. Therefore, this work contributes not only to the gapless quantification of ozone concentrations but also to trustworthy machine learning in the environmental sciences.

Lukas Hubert Leufen has successfully defended his dissertation within the IntelliAQ project. https://intelliaq.eu/2023/07/19/lukas-hubert-leufer-has-successfully-defended-his-dissertation-within-the-intelliaq-project/ Wed, 19 Jul 2023 12:18:52 +0000 https://intelliaq.eu/?p=1891 We are proud to announce that Lukas Hubert Leufen has completed his dissertation within the IntelliAQ project and successfully defended his thesis at the University of Bonn. His thesis “Time Filter Assisted Deep Learning to Predict Air pollution” builds on a time series filtering approach to split up long-term and short-term variations and uses several deep learning networks to accurately predict ground-level ozone air pollution. The neural networks have been trained on large amounts of data from air quality monitoring stations distributed across Central Europe, climatological statistics on air pollutants and meteorological data from numerical weather models. The deep learning models have been integrated into a well-defined workflow for training and validation called MLAir, which ensures the reproducibility of the findings. Results substantiate that the combination of sophisticated DL architectures and time series filtering enables accurate ozone predictions, which are superior to state-of-the-art numerical modelling results.

Conference IGARSS 2023 (16–21 July 2023) https://intelliaq.eu/2023/04/12/conference-igarss-2023-16-21-july-2023/ Wed, 12 Apr 2023 11:54:30 +0000 https://intelliaq.eu/?p=1876 We are very proud to present our abstracts:

  • END-TO-END PROCESS ORCHESTRATION OF EARTH OBSERVATION DATA WORKFLOWS WITH APACHE AIRFLOW ON HIGH PERFORMANCE COMPUTING
  • MULTIMODAL SELF-SUPERVISED LEARNING FOR BOOSTING CROP CLASSIFICATION USING SENTINEL2 AND PLANETSCOPE

at IGARSS 2023 in Pasadena, California.

More information will follow at https://2023.ieeeigarss.org/index.php

IntelliAQ workshop on Machine Learning for Air Quality 2023 https://intelliaq.eu/2023/01/13/intelliaq-workshop-on-machine-learning-for-air-quality-2023/ Fri, 13 Jan 2023 09:43:00 +0000 https://intelliaq.eu/?p=1868 IntelliAQ is sponsoring two workshops in Cologne in early March 2023. The “IntelliAQ workshop on machine learning for air quality” (https://indico3-jsc.fz-juelich.de/event/68/) aims to bring together researchers from the air quality and machine learning communities for discussion of recent research progress and future priorities. Machine learning (ML) is rapidly gaining momentum as a new toolbox for analysing atmospheric data. While there are now several workshops, fora and conferences to discuss ML applications in the weather and climate domain, discussions on ML applications for air quality remain fragmented. The ERC project IntelliAQ has explored several modern ML concepts for air quality research and we would like to engage in a discussion with the international community about the potential and limitations of ML in this field. The second workshop, immediately following the first one, is a “Tropospheric Ozone Assessment Report (TOAR-II) science workshop” (https://indico3-jsc.fz-juelich.de/event/69/). IntelliAQ supports the development of the TOAR data infrastructure and draws on data and scientific insights from this global initiative.

Looking Back at Workshop “Transformers for Environmental Science” https://intelliaq.eu/2022/10/21/looking-back-at-workshop-transformers-for-environmental-science/ Fri, 21 Oct 2022 08:28:14 +0000 https://intelliaq.eu/?p=1858 JSC, the Otto von Guericke University of Magdeburg, and the Technical University of Munich jointly organised a workshop on “Transformers for Environmental Science” on September 22 and 23, 2022. The workshop was co-sponsored by the ERC grant IntelliAQ and brought together about 40 participants in Magdeburg and up to 20 additional online participants, who discussed the potential of this new AI technology for environmental applications. The program included lectures on recent advances in transformer architectures and transfer learning as well as on prototype developments focusing largely on atmospheric research and remote sensing. Keynote presentations were given by Peter Düben (ECMWF), Pedram Hassanzadeh (Rice University, Houston), Duncan Watson-Parris (Oxford University), Lucas Beyer (Google Brain) and Jonathan Godwin (Google DeepMind). A poster session and panel discussion provided opportunities for an exchange of ideas. Large transformer models require huge amounts of data and constitute an attractive application for accelerated supercomputers such as JUWELS Booster. Within the atmorep compute time project, first steps towards training such a model for atmospheric research are being taken.

Paper on Ozone Forecast with Deep Learning published https://intelliaq.eu/2022/07/01/paper-on-ozone-forecast-with-deep-learning-published/ Fri, 01 Jul 2022 11:00:48 +0000 https://intelliaq.eu/?p=1842 Lukas H. Leufen, Felix Kleinert (both FZ Jülich and University of Bonn) and Martin G. Schultz (FZ Jülich) have published the results of their study “Exploring decomposition of temporal patterns to facilitate learning of neural networks for ground-level daily maximum 8-hour average ozone prediction” in the journal Environmental Data Science. The study shows how the accuracy of deep neural networks for forecasting ground-level ozone can be improved by splitting long-term and short-term weather patterns. The article is available at https://www.doi.org/10.1017/eds.2022.9.
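
The idea of splitting a time series into long-term and short-term components can be sketched as follows. This post does not describe the filter used in the study, so the centred moving average here is a simple stand-in, and the function name `split_long_short` and the `window` parameter are hypothetical. The sketch shows only the decomposition, in which the two components add back up to the original series.

```python
from typing import List, Tuple


def split_long_short(series: List[float], window: int = 5) -> Tuple[List[float], List[float]]:
    """Split a series into a smooth long-term part and a short-term residual.

    The long-term part is a centred moving average (window shrinks at the
    edges); the short-term part is whatever remains, so long + short == series.
    This is a stand-in filter, not the one used in the published study.
    """
    half = window // 2
    long_term = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        segment = series[lo:hi]
        long_term.append(sum(segment) / len(segment))
    short_term = [x - smooth for x, smooth in zip(series, long_term)]
    return long_term, short_term


# Hypothetical input values, for illustration only.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
long_term, short_term = split_long_short(series, window=3)
```

Both components could then be passed to a neural network as separate input channels, which is the general pattern the study title suggests.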

Preprint “Representing chemical history in ozone time-series predictions – a model experiment study building on the MLAir (v1.5) deep learning framework” available https://intelliaq.eu/2022/05/20/preprint-representing-chemical-history-in-ozone-time-series-predictions-a-model-experiment-study-building-on-the-mlair-v1-5-deep-learning-framework-available/ Fri, 20 May 2022 06:52:00 +0000 https://intelliaq.eu/?p=1829 Felix Kleinert et al. submitted their article “Representing chemical history in ozone time-series predictions – a model experiment study building on the MLAir (v1.5) deep learning framework” to the journal Geoscientific Model Development. It is now available as a preprint for public discussion and review at https://gmd.copernicus.org/preprints/gmd-2022-122/ until 6 July 2022.

Canonical Analysis Workflows – reproducibility and reusability on air quality data https://intelliaq.eu/2022/03/24/canonical-analysis-workflows-reproducibility-and-reusability-on-air-quality-data/ Thu, 24 Mar 2022 08:42:26 +0000 https://intelliaq.eu/?p=1816 Scientific research suffers from a reproducibility crisis: a Nature survey from 2016 (https://www.nature.com/articles/d41586-019-00067-3) revealed that more than 70% of researchers had tried and failed to reproduce another scientist's experiments.

Since then, a lot of effort has gone into making workflows more reusable and thereby making results reproducible. With our TOAR-II (Tropospheric Ozone Assessment Report phase II) database infrastructure we are supporting this approach. It has been lifted to a new level of FAIRness (https://www.go-fair.org/fair-principles/) by integrating more of the FAIR principles through redesigning the database and related services. In addition, new concepts were developed to achieve reproducibility and reusability via standardized workflows and objects.

Canonical workflows consist of automated workflows or workflow fragments that can be reused in different contexts. The development of reusable workflows and software for scientific data analysis depends on reusable data, which must be described appropriately and standardized to ensure reliable and meaningful analysis results.

We therefore developed a concept that focuses on two indispensable and inseparable prerequisites for workflow sharing: data harmonization and documentation.

In our concept paper, we show that the data harmonization needed to establish online data analysis services goes much deeper than the obvious issues of common data formats, variable names, and measurement units. We also explore how the generation of FAIR Digital Objects (FDO) and Research Objects (RO), together with automatically generated documentation, may support Canonical Analysis Workflows for air quality and related data. We are convinced that our experiences from the TOAR database will show that data harmonization alongside documentation constitutes a big step towards realizing the potential of canonical workflows.

Integrating FDOs/ROs into the TOAR data ingestion workflow. For the TOAR database, an RO-Crate is used as the Research Object. Each new data ingestion is registered as an FDO package within the RO. As new FDOs are created over time, the RO is updated with a pointer to the latest FDO and its associated data, while earlier FDOs remain accessible. Each FDO thus acts as a kind of snapshot, which makes the successive states of the database traceable.
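
The bookkeeping described above (a Research Object that keeps every registered FDO and points to the newest one) can be sketched with a small in-memory data structure. This is only an illustration of the idea. It does not use the RO-Crate format or any TOAR service, and the class names `FdoRecord` and `ResearchObject`, their fields, and the identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class FdoRecord:
    """Hypothetical stand-in for one FDO package (one data ingestion)."""
    identifier: str
    description: str


@dataclass
class ResearchObject:
    """Hypothetical stand-in for the RO that tracks all FDOs over time."""
    name: str
    fdos: List[FdoRecord] = field(default_factory=list)

    def register(self, fdo: FdoRecord) -> None:
        # Append only: earlier FDOs are never overwritten, so they stay accessible.
        self.fdos.append(fdo)

    @property
    def latest(self) -> Optional[FdoRecord]:
        # The "pointer to the latest FDO" is simply the most recent entry.
        return self.fdos[-1] if self.fdos else None


ro = ResearchObject("example-database")
ro.register(FdoRecord("fdo-001", "first ingestion"))
ro.register(FdoRecord("fdo-002", "second ingestion"))
```

Because entries are only appended, the list doubles as a history of database states: `ro.latest` gives the current snapshot and `ro.fdos` gives all earlier ones.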

Schröder et al., Enabling Canonical Analysis Workflows – Documented data harmonization on global air quality data, Data Intelligence Journal, 2022, in press.
