NCEP has been running this new version of the GEFS in what is known as reforecast mode, which means running it retrospectively to produce forecasts for the past two years. This has allowed us to evaluate its performance for the past two cool seasons (October to March).
The GEFS is an ensemble modeling system, meaning it produces multiple forecasts (known as members), but I find it helpful to begin an evaluation effort by focusing on a single member (in this case the Control) to understand the capabilities and limitations of the model before examining how well it predicts probabilities. Below is one measure of model performance, known as the equitable threat score (the higher the better), for 12-36 hour (i.e., day 1) precipitation forecasts of the largest 25% of precipitation events at mountain SNOTEL stations in the western U.S. Although there is considerable spatial variability, the highest equitable threat scores are generally found in the Cascades and Sierras, with lower values at sites in the interior. This implies a general decrease in model accuracy as one moves into the interior. If we use some simple statistical techniques to try to account for unresolved terrain effects (the same ones used for our NAEFS ensemble products on weather.utah.edu), there is a significant improvement in model performance (right image) at nearly all sites, although one can still see the general decline as one moves inland. If you live in the mountains of Colorado and think your forecasts suck, you're right!
Equitable threat scores for Day 1, 24-h precipitation forecasts at SNOTEL sites.
Equitable threat scores for daily precipitation forecasts with increasing lead time (forecast day) at SNOTEL sites.
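For anyone curious about what goes into an equitable threat score, here is a minimal Python sketch of the standard textbook definition. It is not the code used to produce the figures above. The function names are hypothetical, and I'm assuming an "event" means precipitation at or above a single fixed threshold applied to both the forecast and the observation, which may differ from how the upper-quartile events were actually defined in this evaluation.

```python
def contingency_counts(forecast, observed, threshold):
    """Count hits, misses, false alarms, and correct negatives.

    Assumption: an event occurs when the value is >= threshold, and the
    same threshold is applied to both forecasts and observations.
    """
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        forecast_event = f >= threshold
        observed_event = o >= threshold
        if forecast_event and observed_event:
            hits += 1
        elif observed_event:
            misses += 1
        elif forecast_event:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def equitable_threat_score(hits, misses, false_alarms, correct_negatives):
    """Equitable threat score: threat score adjusted for hits expected by chance."""
    total = hits + misses + false_alarms + correct_negatives
    if total == 0:
        raise ValueError("need at least one forecast/observation pair")
    # Hits expected from random forecasts with the same event frequencies.
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denominator = hits + misses + false_alarms - hits_random
    if denominator == 0:
        return 0.0  # no events forecast or observed, so no skill to measure
    return (hits - hits_random) / denominator


# Made-up counts purely for illustration, not results from this evaluation.
example_ets = equitable_threat_score(
    hits=10, misses=5, false_alarms=5, correct_negatives=80
)
```

A perfect set of forecasts scores 1, and forecasts no better than chance score 0. Because both misses and false alarms sit in the denominator, the score is penalized for missing an event as well as for crying wolf.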
I suspect that the ratio of false alarms to hits is nearly as bad for the GFS and the Euro, simply because at day 7, there's a lot of uncertainty in the forecasts. If your modus operandi is to look at one of these models and go with it, you are going to get burned a lot. Long-range forecasts are about probabilities rather than certainties, and it's better to understand the odds than to hope for a single forecast to come through.
Speaking of odds, we're starting to look at how well the full GEFS ensemble does at generating probabilities. We should know soon if it is a card shark or an irrational gambler.
Special thanks to Wyndam Lewis, Trevor Alcott, and Jon Rutz for their contributions to this post.
Friends of the Marriott Library Lecture
Instead of skiing Sunday afternoon, catch me at the Marriott Library and learn about the Greatest Snow on Earth.