Thursday, February 21, 2013

How Good Are Snow Forecasts Anyway?

This has been a frustrating winter for me as a forecaster.  We've had some weird storms, and others that just haven't panned out as I thought.  Thus, I thought we would take a look today at how weather forecasts have improved over the past couple of decades and why the Intermountain West continues to be a black hole for weather forecasters.

To do this, we will be examining the accuracy of forecasts produced by the Hydrometeorological Prediction Center (HPC), which provides precipitation forecasts and outlooks for the entire continental United States.

The metric that they use to evaluate the accuracy of their forecasts is known as the equitable threat score.  Without getting into the gory details, the bottom line is the higher the equitable threat score, the more accurate the forecast.
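For those who do want a peek at the gory details, here is a minimal sketch of the standard textbook definition of the equitable threat score (also called the Gilbert skill score).  It assumes each forecast/observation pair has already been sorted into a 2x2 table for a single threshold (e.g., did an inch of precipitation fall in 24 hours, yes or no).  The function name and the sample counts are made up for illustration, and this is not necessarily how the HPC pools its statistics.

```python
import math


def equitable_threat_score(hits, misses, false_alarms, correct_negatives):
    """Equitable threat score from a 2x2 contingency table.

    hits:              event forecast and observed
    misses:            event observed but not forecast
    false_alarms:      event forecast but not observed
    correct_negatives: event neither forecast nor observed
    """
    total = hits + misses + false_alarms + correct_negatives
    if total <= 0:
        raise ValueError("contingency table is empty")

    # Hits expected from random chance, given how often the event
    # was observed and how often it was forecast.
    hits_random = (hits + misses) * (hits + false_alarms) / total

    denominator = hits + misses + false_alarms - hits_random
    if denominator == 0:
        # Score is undefined (e.g., the event was forecast and observed every time).
        return math.nan

    return (hits - hits_random) / denominator


# Hypothetical counts, for illustration only.
score = equitable_threat_score(hits=30, misses=40, false_alarms=30, correct_negatives=900)
print(f"ETS = {score:.2f}")
```

A perfect forecast scores 1, and a forecast with no more hits than random chance would produce scores 0, which is why a higher score means a more accurate forecast.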

Forecasts produced by computer models (i.e., the NAM and GFS) and by the HPC forecasters show considerable progress going back to the beginning of this record in 1993 (and certainly much progress before that).  For example, for storms producing an inch of precipitation (snow-water equivalent) in 24 hours, the NAM and GFS equitable threat scores have climbed from about 0.15 in 1993 to about 0.25 in 2006, with some evidence of a plateau thereafter.

The forecasts produced by the HPC forecasters have also improved over time, from about 0.22 in 1993 to about 0.32 during 2008–2012.  Note that the gap between the model and human forecasts hasn't closed much, which suggests that when given better model forecasts, human forecasters continue to find ways to exploit and improve those forecasts.

So, forecasts are better today than they were 10 or 20 years ago, and there have been some major forecast victories (e.g., Hurricane Sandy) that could not have been achieved with the tools and models of 20 years ago.

However, those statistics are for the entire United States.  While it is likely that forecasts have improved everywhere since the early 1990s, they haven't improved uniformly.  If we look at the equitable threat scores for precipitation forecasts from 16 April 2011–15 April 2012 by region, we find they are highest along the U.S. west coast and in the northeastern U.S., and lowest in the western U.S. interior and southeastern U.S.  This holds from modest (0.5") through large (2.0") 24-hour accumulation thresholds, although there is more scatter in the highest bin due to the small number of events.

Equitable threat scores for HPC and calibrated ensemble modeling system forecasts from 16 April 2011–15 April 2012 by region.  Courtesy Keith Brill, HPC.

These variations partially reflect differences in the climatology of precipitation systems in these regions.  For example, over the western U.S., precipitation systems tend to be big and broad over the Pacific Coast, but splintered and broken up over the interior, and this is one factor that makes forecasting there very difficult.  Over the southeast, difficult-to-pinpoint thunderstorms contribute to lower accuracy.

Based on these statistics, the most reliable snow forecasts are likely to be for the Cascades and Sierra, followed by the Northeast, and then the interior (e.g., Utah, Wyoming, Montana, Colorado).  Wasatch powderhounds who feel battered and bruised by forecasts have every right to complain to their brethren to the west.  However, while we don't know precisely when or where a storm will produce, we don't have to worry as much about snow quality.  

Of course, there are some caveats to this analysis.  It would be good to examine statistics strictly for winter.  Snow forecasts in the Sierra and Cascades also depend on the snow level forecast, and this can be a challenge in some storms.  Finally, there is the issue of converting from snow-water equivalent to snowfall amount, which contributes to errors in the forecast of snow amount.  Nevertheless, even if these factors are considered, I think we will find that the interior west remains the most difficult ski region in which to forecast snow.
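To see why the conversion from snow-water equivalent to snowfall matters, here is a rough sketch that multiplies a water-equivalent forecast by an assumed snow-to-liquid ratio.  The function name and the ratios below are illustrative assumptions rather than values from the HPC statistics; the point is simply that the same water forecast can imply very different snowfall depending on the ratio.

```python
def snowfall_from_swe(swe_inches, snow_to_liquid_ratio):
    """Estimate snowfall depth (inches) from snow-water equivalent (inches).

    snow_to_liquid_ratio is inches of snow per inch of water,
    e.g. 10 for the common 10-to-1 rule of thumb.
    """
    if swe_inches < 0:
        raise ValueError("snow-water equivalent cannot be negative")
    if snow_to_liquid_ratio <= 0:
        raise ValueError("snow-to-liquid ratio must be positive")
    return swe_inches * snow_to_liquid_ratio


# Same hypothetical water forecast, a range of assumed ratios.
swe = 1.0
for ratio in (8, 10, 15, 20):
    depth = snowfall_from_swe(swe, ratio)
    print(f"{swe:.1f} in. of water at {ratio}:1 gives {depth:.0f} in. of snow")
```

So even a perfect precipitation forecast can still miss on snowfall if the assumed ratio is off.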


  1. So, what does a 0.1 point improvement in the threat score mean to a layman like me? Is this significant?

    1. Yes, it is a significant gain in forecast accuracy. However, whether or not that gain yields real value depends on the user. Sometimes gains in accuracy don't yield much value until they reach a threshold whereby the number of "correct" forecasts is high enough and the number of false alarms is low enough that the end user can make decisions that help their business or day-to-day planning. I think that we have crossed that threshold for most major winter storms in the east and in the Pacific States (although there is still room for improvement, especially extending the lead time for warnings and major storms), but that we have yet to cross that threshold in Utah.

      Hopefully this helps.