NCAR RDA Dataset d583147

A Statistical Analysis of Lossily Compressed CESM-LENS Data

d583147 | DOI: 10.5065/5SQY-ZF23

Abstract:

The data storage burden resulting from CESM simulations continues to grow. Lossy data compression can alleviate this burden, provided that key climate variables are not altered to the point of affecting scientific conclusions. This dataset was generated to evaluate the effects of two leading lossy compression algorithms, SZ and ZFP, on daily output from the CESM-LENS dataset. It contains daily data for the variables TS (surface temperature) and PRECT (precipitation rate) from the historical forcing period (1920-2005) for CESM-LENS ensemble member 30. The data have been compressed and reconstructed with SZ 1.4.13 and ZFP 0.5.3 at several absolute error tolerances. Compression errors can be determined by comparing the reconstructed files to the original CESM-LENS time-series data, and statistical methods can then evaluate those errors at different spatiotemporal scales. While both compression algorithms show promising fidelity to the original output, detectable artifacts are introduced even at relatively tight error tolerances.
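The comparison described in the abstract can be illustrated with a minimal sketch. It is not the method used to produce or analyze this dataset. The function name `compression_error_summary`, the choice of metrics, and the sample values are all assumptions made for illustration. In practice the values would be read from the original and reconstructed NetCDF files, and the tolerance would be the one used when compressing.

```python
import math
from typing import Sequence


def compression_error_summary(original: Sequence[float],
                              reconstructed: Sequence[float],
                              abs_tolerance: float) -> dict:
    """Summarize pointwise errors between original and reconstructed values.

    Hypothetical helper: the metrics below are common choices, not
    necessarily the ones used by the dataset authors.
    """
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    if len(original) == 0:
        raise ValueError("inputs must be non-empty")

    # Signed error at each point: reconstructed minus original.
    errors = [r - o for o, r in zip(original, reconstructed)]
    max_abs = max(abs(e) for e in errors)

    return {
        "max_abs_error": max_abs,
        # Mean signed error; a nonzero value indicates a systematic bias.
        "mean_error": sum(errors) / len(errors),
        "rmse": math.sqrt(sum(e * e for e in errors) / len(errors)),
        # An absolute error bound is respected if no point exceeds it.
        "within_tolerance": max_abs <= abs_tolerance,
    }


# Synthetic stand-in values in kelvin, NOT taken from the dataset.
ts_original = [288.0, 290.5, 275.25, 301.0]
ts_reconstructed = [288.5, 290.0, 275.25, 301.5]

summary = compression_error_summary(ts_original, ts_reconstructed,
                                    abs_tolerance=0.5)
print(summary)
```

A pointwise summary like this only checks the error bound and overall bias. Detecting the subtler artifacts mentioned in the abstract would require applying such statistics separately over space, time, and different aggregation scales.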

Variables:

Precipitation Rate

Surface Temperature

Data Types:

Model Simulation

Data Contributors:

UCAR/NCAR/CISL

Computational & Information Systems Laboratory, National Center for Atmospheric Research, University Corporation for Atmospheric Research

Total Volume:

98.18 GB

Data Formats:

HDF5/NetCDF4

Data License:

This work is licensed under a Creative Commons Attribution 4.0 International License.
