To evaluate model performance, we base our analysis on the full joint distribution of forecasts and observations (Murphy and Winkler, 1987). Factorising the joint distribution into a conditional and a marginal distribution gives access to more detailed information about it [Figure 10].
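With m denoting the model forecast and o the observation (the notation used in the conditional quantile plots), this factorisation, known as the calibration-refinement factorisation in Murphy and Winkler (1987), can be written as:

```latex
p(m, o) = p(o \mid m)\, p(m)
```

Here p(o|m) describes how the observations are distributed given a particular forecast value, and p(m) describes how often each forecast value is issued.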

Moreover, we evaluate our model against two simple reference models: persistence and an observation-based climatology. A new model adds genuine value only if it outperforms both of them. To analyse this, we use skill scores, which express a score such as the mean squared error (MSE) of the model relative to the same score for a reference model.
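As an illustration, an MSE-based skill score is commonly defined as 1 - MSE_model / MSE_reference, so that positive values indicate an improvement over the reference and 1 corresponds to a perfect forecast. The following is a minimal sketch under that assumption, not the exact procedure used here. The function names, the `lead` parameter, and the in-sample climatological mean are hypothetical simplifications chosen for the example.

```python
import numpy as np


def mse(forecast, observed):
    """Mean squared error between two equally long series."""
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.mean((forecast - observed) ** 2))


def mse_skill_score(forecast, reference, observed):
    """Return 1 - MSE_forecast / MSE_reference.

    Positive: forecast beats the reference. Zero: no improvement.
    One: perfect forecast.
    """
    mse_ref = mse(reference, observed)
    if mse_ref == 0.0:
        raise ValueError("reference forecast is perfect; skill score undefined")
    return 1.0 - mse(forecast, observed) / mse_ref


def skill_against_references(model, observed, lead=1):
    """Skill of `model` against persistence and climatology (assumed setup).

    Persistence: the observation `lead` steps earlier is used as the forecast.
    Climatology: the mean of the observed series (in-sample, for simplicity).
    The first `lead` time steps are dropped so that all series align.
    """
    model = np.asarray(model, dtype=float)
    observed = np.asarray(observed, dtype=float)
    obs = observed[lead:]
    mod = model[lead:]
    persistence = observed[:-lead]
    climatology = np.full_like(obs, observed.mean())
    return {
        "persistence": mse_skill_score(mod, persistence, obs),
        "climatology": mse_skill_score(mod, climatology, obs),
    }
```

A model would count as adding value in the sense described above only if both returned skill scores are positive.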

Figure 11: Conditional quantile plots for lead times of up to four days ((a) to (d)). Conditional quantiles (0.10 and 0.90, 0.25 and 0.75, and 0.50) of the conditional distribution p(o|m) are shown as lines in different styles. The reference line indicates a hypothetical perfect forecast. The marginal distribution of the forecasts, p(m), is shown as a histogram on a logarithmic scale.
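The quantities shown in such a plot can be estimated by binning the forecasts and computing empirical quantiles of the observations within each bin. The sketch below is one possible way to do this, not necessarily how the figure was produced. The function name, the bin edges, and the `min_count` threshold are assumptions made for illustration.

```python
import numpy as np


def conditional_quantiles(forecast, observed, bin_edges,
                          probs=(0.10, 0.25, 0.50, 0.75, 0.90),
                          min_count=5):
    """Estimate p(m) as bin counts and quantiles of p(o|m) per forecast bin.

    Returns (counts, quantiles): counts has one entry per bin, quantiles has
    shape (n_bins, len(probs)). Bins with fewer than `min_count` forecasts
    are left as NaN because their quantile estimates would be unreliable.
    """
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    probs = np.asarray(probs, dtype=float)
    n_bins = len(bin_edges) - 1

    # np.digitize returns 1..n_bins for values inside the edges; shift to
    # 0-based. Values outside [first edge, last edge) match no bin below.
    idx = np.digitize(forecast, bin_edges) - 1

    counts = np.zeros(n_bins, dtype=int)
    quantiles = np.full((n_bins, probs.size), np.nan)
    for b in range(n_bins):
        in_bin = idx == b
        counts[b] = in_bin.sum()
        if counts[b] >= min_count:
            quantiles[b] = np.quantile(observed[in_bin], probs)
    return counts, quantiles
```

Plotting each column of `quantiles` against the bin centres gives the quantile lines, and `counts` on a logarithmic axis gives the marginal histogram. For a perfect forecast, all quantile lines would collapse onto the diagonal reference line.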
