Friday, December 20, 2019

Limitations of PurpleAir Observations

British statistician George Box once wrote "all models are wrong, some are useful."  When it comes to observations, I like to paraphrase that to "all observations are bad, some are useful."  This statement reflects the fact that all observations contain errors and uncertainty, but they can still be useful.

PurpleAir uses low-cost laser particle counters to estimate PM2.5 concentrations. The sensors can be purchased and operated by anyone, with the data publicly available online. Many groups and individuals have installed PurpleAir sensors across the Salt Lake Valley, northern Utah, and other parts of the nation (and even the world). In the Salt Lake Valley, there is a remarkably high density of stations. Below is a PurpleAir map of PM2.5 concentrations from 7:09 AM MST this morning. With such a high density of stations, one can see some of the spatial variability in pollution, including the relatively low values of PM2.5 on the east bench compared to the central and northwest valley.

It should be noted, however, that while useful for examining the spatial patterns of pollution, PurpleAir sensors can have low absolute accuracy. This means that while one can see in the map above that the east side has relatively clean air compared to the central and northwest valley, the actual PM2.5 concentrations may be off. Kelly et al. (2017) compared the performance of PurpleAir sensors with research-grade instruments and, while they found good correlation, they also found that the sensors overestimated particulate matter concentrations during cold air pools, or what many Utahns refer to as inversions. More recently, Tryner et al. (2020) also found that PurpleAir sensors overestimated PM2.5 concentrations in the field.

Data from PurpleAir sensors is now being used on local news broadcasts and by the National Weather Service. However, the limitations of these observations need to be recognized. Last night, the National Weather Service tweeted that air quality was in the red across much of the Wasatch Front, Tooele Valley, and Cache Valley. In their tweet, they included maps with PurpleAir observations.

However, data from Utah Division of Air Quality sensors, as well as sensors operated by the University of Utah, showed PM2.5 concentrations to be much lower.  At Hawthorne Elementary, hourly PM2.5 concentrations peaked at 37 ug/m3, on the low end of unhealthy for sensitive groups.
Source: DAQ
Elsewhere last night, DAQ sensors in Davis County peaked at 30 ug/m3, Tooele County at 39 ug/m3 (although there was a spike to 54 at 11 AM), and Weber County at 23 ug/m3. These observations are consistent with air quality in the moderate or unhealthy for sensitive groups categories, depending on location. PM2.5 concentrations were highest in Cache County, but even there DAQ sensors peaked at 47 ug/m3, still in the unhealthy for sensitive groups category.
Thus, DAQ sensors did not indicate PM2.5 concentrations as high as those indicated by PurpleAir, and the National Weather Service tweet stating that air quality was in the red was not consistent with DAQ observations.
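
As a rough illustration of how the DAQ peaks quoted above map to color categories, here is a minimal Python sketch. It assumes the EPA PM2.5 breakpoints that were in use when this post was written (they were revised in 2024), and it applies them to hourly values even though the AQI is formally defined for 24-hour averages. The function name `pm25_category` and the `daq_peaks` dictionary are purely illustrative.

```python
# Upper bounds (ug/m3) of each PM2.5 AQI category.
# ASSUMPTION: EPA breakpoints in effect in 2019 (revised in 2024).
PM25_BREAKPOINTS = [
    (12.0, "good (green)"),
    (35.4, "moderate (yellow)"),
    (55.4, "unhealthy for sensitive groups (orange)"),
    (150.4, "unhealthy (red)"),
    (250.4, "very unhealthy (purple)"),
]


def pm25_category(conc):
    """Return the AQI category name for a PM2.5 concentration in ug/m3."""
    for upper, name in PM25_BREAKPOINTS:
        if conc <= upper:
            return name
    return "hazardous (maroon)"


# Peak hourly DAQ values quoted in this post (ug/m3).
daq_peaks = {
    "Hawthorne Elementary": 37,
    "Davis County": 30,
    "Tooele County": 39,
    "Weber County": 23,
    "Cache County": 47,
}

for site, peak in daq_peaks.items():
    print(f"{site}: {peak} ug/m3 -> {pm25_category(peak)}")
```

Under these assumed breakpoints, a reading would need to exceed roughly 55 ug/m3 to fall in the red category, which none of the DAQ peaks listed here did.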

Finally, we can examine observations collected by the University of Utah on TRAX trains and at various sites in the valley. Below is a map for the period from 6:29-7:29 AM this morning. Again, values are lower than indicated by the PurpleAir sensors (compare with the first graphic in this post).

I think that the PurpleAir network is wonderful in the sense that it helps us to identify the spatial patterns of pollution, but it is important that the sensors' tendency to overestimate PM2.5 concentrations be recognized by those communicating with the public.


  1. It's worth pointing out that observations on the PurpleAir map that are circled in black denote *indoor* sensors, which tend to be cleaner. There are also a lot more indoor sensors in the north/east part of the valley than in other areas.

    If only outdoor sensors are displayed, the pollution distribution is less extreme. Though, I do agree that the elevated areas of the valley are cleaner than the lowlands.

    1. Thanks. I should have noted that, although my comment about the clean east bench was supported by the outdoor sensors near Foothill and I-215, where several have readings ≤10.

  2. There is a conversion factor called AQandU available in the legend on the PurpleAir map, created by Kerry Kelly specifically for PurpleAir sensor data for the Salt Lake Valley during the wintertime.

    1. Thanks for pointing that out. I assume you mean this site: If so, I don't think what is being used for television broadcasts and by many people is the conversion factor data.

  3. I've appreciated PurpleAir (PA) ever since discovering them during the 2017 fires in northern California. I agree with your observations, but they're missing a couple of very significant workarounds that I wish PA, or at least their marketing department, would point out. (Generally I highly dislike marketing department strategies, but PA's seems to be too humble?)

    Do not use the default settings the one first sees in the lower left control panel. 1) Instead of using "none" as a conversion factor, I use the LRAPA standard. 2) Uncheck "Inside Sensors"; unchecked should be the default, except for the rare person who's only looking at the map for personal data. 3) Ten minute averaging gives an OK, overall picture of wide areas, but "Show Realtime" does, too, plus it shows up-to-the-minute data that I've found extremely useful for my local (neighborhood) area.

    It's important to note that even if not every one of these instruments is 100% accurate, I've concluded they're close to 100% consistent when compared to each other, once one mentally filters out the very few instruments that are failing. Most failures are immediately obvious, and easily discounted when one can see the errors as compared to other, obviously functional instruments.

    PA's data filtering on their map is too weak when it comes to discounting the obvious failures mentioned above, but again, it seems to lean toward their customers' point of view rather than the more public point of view. E.g. when one of the two lasers is obviously failing, the data from the failing laser is still included in the air quality averages. Also, when the whole instrument is not functioning, a gray dot for that instrument is shown on the map... I just saw one that's been appearing on the map for over a hundred days!

    But even the gray dots can provide useful data in some cases, e.g. when power goes out in an area with a few or more instruments, one can see just by looking at that area of the map not only where power was lost, but when it was lost.

    Finally, I've personally benefited immensely from PA maps during fire season. With the settings I recommended above, and in my area that now has dozens if not a hundred (in a wider area) instruments, I can SEE smoke travelling minute by minute across the city and the county. The qualitative (even if not a specific-standard, quantitative) accuracy of the PA map is astounding! Alongside real-time wind direction maps (like Windy), I've successfully predicted when it's safe to spend time outside, and for approximately how long.

    Granted, typical observers will not (at least immediately) know how to interpret PA and Windy to make their own assessments as I do, practically routinely. But the potential for mass consumption of raw but useful air quality data that's already available is, I have to say again, astounding!

    (BTW, clicking on the "Preview" button just brought up a blank block, making me fear that my slow, 20 minutes of writing just disappeared into space. Fortunately it was all still there when I clicked on the back arrow to return to the previous web page. Got mad at myself for not saving my work to the clipboard before clicking that Preview button. FYI, right after I clicked the Preview button I saw an error message (that looked like it came from Blogger), but it popped up then disappeared too quickly to read. (I am signed into Blogger.) Now we'll see if the Publish button works...)