Nardi, J., Feldman, N., Poppick, A., Baker, A., & Hammerling, D. (2018). Statistical Analysis of Compressed Climate Data (No. NCAR/TN-547+STR). doi:10.5065/D6HQ3XQJ
[Note: this Technical Note was updated on 2020-08-17 per the authors' request to correct an error. See the description of the change on page two of the document's front matter.] The data storage burden resulting from large climate model experiments only continues to grow. Lossy data compression methods are required to alleviate this burden, but lossy methods introduce the possibility that key climate variable fields could be altered to the point of affecting scientific conclusions. It is therefore important to develop a detailed understanding of how compressed climate model output differs from the original for different compression algorithms and compression rates. In this work, we evaluate the effects of two leading compression algorithms, sz and zfp, on daily average and monthly maximum temperature data, and daily average precipitation rate data, from a historical run of CESM1 CAM5.2. While both algorithms show promising fidelity with the original model output, detectable artifacts are introduced even at relatively low error tolerances. Examples for temperature data include biases in temperature gradient fields, temporal autocorrelation, and seasonal cycles; precipitation data show, for example, biases in the number of rainy days. We highlight the need for evaluation methods that are sensitive to errors at different spatiotemporal scales and specific to the particular climate variable of interest.