Preprint “Global, high-resolution mapping of tropospheric ozone” available

The preprint of the article “Global, high-resolution mapping of tropospheric ozone – explainable machine learning and impact of uncertainties” by Clara Betancourt et al. is now available. It has been accepted for public discussion by the journal Geoscientific Model Development.
The paper presents a data-driven ozone mapping workflow generating a transparent and reliable product. The global distribution of tropospheric ozone from sparse, irregularly placed measurement stations is mapped to a high-resolution regular grid using machine learning methods.

TOAR won second prize of Open Data Impact Award 2021

The Stifterverband awarded the project “TOAR App, Tropospheric Ozone Assessment Report database” the second prize of the Open Data Impact Award 2021. “The award highlights the potential of Open Research Data for innovation and society.
The TOAR database is one of the largest collections of global near-surface ozone measurements. To further promote the replication of this data, Clara Betancourt, Jianing Sun and Sabine Schröder from Forschungszentrum Jülich are developing a smartphone app that will allow farmers to quantify ozone-related damage to their crops.”

IntelliAQ at Jülich’s Lecture Evening

Scarlet Stadtler will present developments in the IntelliAQ project at Jülich’s End-of-Year Lecture Evening on 18 Nov. 2021. The focus of the presentation will be on her interdisciplinary approach to developing AI techniques that find patterns in climate and air quality data.


As an important milestone on the way to the second Tropospheric Ozone Assessment Report (TOAR), a new TOAR data infrastructure has been developed with a focus on FAIR data. The first data have been uploaded to the TOAR-II database, and more data will be added continuously. The REST API provides rich functionality for searching and accessing metadata and data, which are always delivered together. Read access is open to everyone under CC BY 4.0. The database and its metadata are well documented, as are the related services. The TOAR data infrastructure will be assessed with respect to its FAIRness and sustainability through the Core Trust Seal (CTS). The CTS application was submitted on September 29, 2021.

Paper on “Context aware benchmarking and tuning of a TByte-scale air quality database and web service” published

The performance of one of the world’s largest databases of near-surface air quality measurements and its services has been benchmarked and tuned with good results. The work focused specifically on the on-demand processing of several air quality metrics directly from the database. The paper, published online by Earth Science Informatics, describes the exploration and benchmarking of in-database approaches for statistical processing, which resulted in performance enhancements of up to 32%.

High interest in “Can deep learning beat numerical weather prediction?”

We are delighted to see that, less than three months after its publication in Philosophical Transactions of the Royal Society, our paper “Can deep learning beat numerical weather prediction?” has reached more than 8000 downloads. In addition, HPCwire (one of the leading portals on supercomputing) has covered our paper in a report that discusses in depth the main points raised in our article.

New method to produce high-resolution maps of ground-level ozone burden

A novel data fusion approach that combines the global observations from the Tropospheric Ozone Assessment Report database hosted by JSC with output from several numerical chemistry-transport models has been developed under the lead of the University of North Carolina, USA. The research was published online on March 8 by the journal Environmental Science & Technology. The new method allows for the production of annual high-resolution maps of ground-level ozone burden, which can be combined with population density to assess the health risk from ozone air pollution. The findings from this study were used by the Global Burden of Disease 2019 (GBD2019) study, which estimated that about 365,000 people around the world died in 2019 from exposure to ozone pollution. The research used the largest compilation of ozone observations ever produced as well as estimates from nine global atmospheric models. Through data fusion, the research team was able to combine these different sources of information, making use of the advantages of each.

Paper on “Can deep learning beat numerical weather prediction?” published

Today, the Open Access article by Martin Schultz has been published in the Philosophical Transactions of the Royal Society A, theme issue “Machine learning for weather and climate modelling”. The paper discusses the question of whether it is possible to completely replace the current numerical weather models and data assimilation systems with deep learning approaches.

AQ-Bench preprint available for public discussion

The preprint of the article “AQ-Bench: A Benchmark Dataset for Machine Learning on Global Air Quality Metrics” by Clara Betancourt et al. is now in the public discussion phase. The manuscript has been accepted for public discussion by the inter-journal special issue “Benchmark datasets and machine learning algorithms for Earth system science data” of the journals Earth System Science Data and Geoscientific Model Development.
The AQ-Bench dataset contains air quality data and metadata from more than 5500 air quality observation stations all over the world. It offers a low-threshold entry point to machine learning on a real-world environmental dataset. The dataset itself is publicly available. To start machine learning on the AQ-Bench dataset directly in your browser, visit the code repository and launch the Binder!