Analyze Dataset
An example of how to use the analyzer on a dataset.
Imports
For this example we will use only xarray and analyze_dataset from enstools-compression.
[1]:
import xarray
from enstools.compression.analyzer.analyzer import analyze_dataset
WARNING: eccodes c-library not found, grib file support not available!
[2]:
dataset_name = "air_temperature"
dataset = xarray.tutorial.open_dataset(dataset_name)
dataset
[2]:
<xarray.Dataset>
Dimensions: (lat: 25, time: 2920, lon: 53)
Coordinates:
* lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
* lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
* time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
air (time, lat, lon) float32 ...
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
Analyze dataset using default constraints
Use analyze_dataset to obtain the compression specification that guarantees the quality constraints while maximising the compression ratio. If the argument constrains is not provided, the default constraints are used, which are "correlation_I:5,ssim_I:2".
Note:
correlation_I is computed as -log10(1 - pearson_correlation), i.e. the number of nines of the correlation.
correlation_I:5 == correlation:0.99999
Similarly, ssim_I is computed as -log10(1 - ssim).
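The index defined in the note can be reproduced by hand. The following is a minimal sketch using only the standard library; the helper names nines_index and pearson_correlation and the sample values are illustrative assumptions, not part of enstools-compression, and the library's own implementation may differ in detail.

```python
import math


def nines_index(score):
    """Return -log10(1 - score), the 'number of nines' of a score below 1."""
    if score >= 1.0:
        # A perfect score has an unbounded number of nines.
        return math.inf
    return -math.log10(1.0 - score)


def pearson_correlation(xs, ys):
    """Plain Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values standing in for original and decompressed data.
original = [270.0, 271.5, 273.0, 274.5, 276.0]
reconstructed = [270.01, 271.49, 273.02, 274.5, 275.98]

correlation = pearson_correlation(original, reconstructed)
print(nines_index(correlation))

# A correlation of 0.99999 corresponds to an index of about 5.
print(nines_index(0.99999))
```

The same transformation applies to ssim_I, with the SSIM value in place of the correlation.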
[3]:
encoding, metrics = analyze_dataset(dataset=dataset)
WARNING: Only absolute and norm2 modes properly tested
The function returns two dictionaries, one containing the best encoding and another containing the resulting metrics.
[4]:
encoding
[4]:
{'air': 'lossy,sz,pw_rel,0.000549'}
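Each value in the encoding dictionary is a comma-separated specification string. As a rough sketch, it can be split into its fields for inspection; the helper split_encoding and the field names used below (kind, compressor, mode, parameter) are an assumed reading of the string, not names taken from the library.

```python
def split_encoding(specification):
    """Split a comma-separated specification string into its fields."""
    return [field.strip() for field in specification.split(",")]


# The result shown above, copied here so the sketch is self-contained.
encoding = {"air": "lossy,sz,pw_rel,0.000549"}

for variable, specification in encoding.items():
    # Assumed field order: kind, compressor, mode, numeric parameter.
    kind, compressor, mode, parameter = split_encoding(specification)
    print(variable, kind, compressor, mode, float(parameter))
```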
[5]:
metrics
[5]:
{'air': {'correlation_I': 5.064244624947207,
'ssim_I': 3.856431344802704,
'compression_ratio': 7.868889904065783}}
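To confirm that the reported metrics meet the requested constraints, the constraint string can be parsed and compared against the metrics dictionary. This is a small sketch; parse_constraints and satisfies are hypothetical helpers written for this example, and they assume that every constraint is a lower bound of the form name:value.

```python
def parse_constraints(text):
    """Turn 'name:value,name:value' into a dict of float thresholds."""
    thresholds = {}
    for item in text.split(","):
        name, value = item.split(":")
        thresholds[name.strip()] = float(value)
    return thresholds


def satisfies(values, thresholds):
    """True if every constrained metric reaches its threshold."""
    return all(values[name] >= minimum for name, minimum in thresholds.items())


# Metrics copied from the output above.
metrics = {
    "air": {
        "correlation_I": 5.064244624947207,
        "ssim_I": 3.856431344802704,
        "compression_ratio": 7.868889904065783,
    }
}

thresholds = parse_constraints("correlation_I:5,ssim_I:2")
for variable, values in metrics.items():
    print(variable, satisfies(values, thresholds))
```

Here both indices are above their thresholds, so the check passes for air.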
Analyze dataset using custom constraints
To specify different constraints, pass them through the constrains argument:
[6]:
encoding, metrics = analyze_dataset(dataset=dataset, constrains="correlation_I:3,ssim_I:1")
WARNING: Only absolute and norm2 modes properly tested