Configuring ironArray

ironArray exposes many configuration parameters for creating and operating on arrays. Setting the same compression and storage properties every time you create an array quickly becomes tedious, especially when your datasets have stable properties. To avoid this, ironArray lets you set default properties either in the global configuration or within a context.

Global configuration

If your script always uses the same configuration parameters, you can set them as global defaults during script initialization:

[1]:
import iarray as ia

ia.set_config_defaults(codec=ia.Codec.ZSTD, clevel=1, btune=False)
[1]:
Config(codec=<Codec.ZSTD: 5>, clevel=1, favor=<Favor.BALANCE: 0>, filters=[<Filter.SHUFFLE: 1>], fp_mantissa_bits=0, use_dict=False, nthreads=32, eval_method=<Eval.AUTO: 1>, seed=1, random_gen=<RandomGen.MERSENNE_TWISTER: 0>, btune=False, dtype=<class 'numpy.float64'>, split_mode=<SplitMode.AUTO_SPLIT: 3>, chunks=None, blocks=None, urlpath=None, mode='w-', contiguous=None)

You can verify that the new default properties are now set: the default compression codec has changed to ZSTD, and the default compression level has changed to 1.

[2]:
cfg = ia.Config()
print(cfg)
Config(codec=<Codec.ZSTD: 5>, clevel=1, favor=<Favor.BALANCE: 0>, filters=[<Filter.SHUFFLE: 1>], fp_mantissa_bits=0, use_dict=False, nthreads=32, eval_method=<Eval.AUTO: 1>, seed=1, random_gen=<RandomGen.MERSENNE_TWISTER: 0>, btune=False, dtype=<class 'numpy.float64'>, split_mode=<SplitMode.AUTO_SPLIT: 3>, chunks=None, blocks=None, urlpath=None, mode='w-', contiguous=None)

These are the defaults for every ironArray function called in your script, except for copy, load, save and open, which follow different configuration rules (see the documentation for details).

Contextual configuration

Sometimes you want different configuration profiles for different kinds of arrays. In this case, you can use ia.config as a context manager: the custom settings you pass to it apply only to the arrays created inside its with block. This is an example of contextual configuration:

[3]:
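shape = [1000, 1000]
# a1 is created inside the context, so it uses LZ4 with clevel=9
with ia.config(clevel=9, codec=ia.Codec.LZ4):
    a1 = ia.linspace(shape, -1, 0)
# a2 is created outside the context, so it uses the global defaults
a2 = ia.linspace(shape, -1, 0)
print(f"a1 cratio: {a1.cratio:.4f}")
print(f"a2 cratio: {a2.cratio:.4f}")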
a1 cratio: 5.8390
a2 cratio: 9.5427

Here a1 and a2 have different compression ratios because they were created with different codecs and compression levels. a1 uses the LZ4 codec with compression level 9, as set by the context, whereas a2 uses ZSTD with compression level 1, the global defaults that we set in the previous example.
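The interplay between global defaults and a scoped override can be illustrated with a small standard-library sketch. This is not how ironArray is implemented; Settings, set_defaults, current and scoped are hypothetical names invented for this illustration. It only shows the general pattern: overrides are layered on top of the defaults while a with block is active, and the previous settings come back on exit.

```python
from contextlib import contextmanager
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Settings:
    """Hypothetical stand-in for a configuration object (not ia.Config)."""
    codec: str = "LZ4"
    clevel: int = 5


# Stack of active settings; the bottom entry plays the role of the global defaults.
_stack = [Settings()]


def set_defaults(**overrides):
    """Update the global defaults, keeping fields that are not overridden."""
    _stack[0] = replace(_stack[0], **overrides)
    return _stack[0]


def current():
    """Return the settings in effect right now (the innermost context wins)."""
    return _stack[-1]


@contextmanager
def scoped(**overrides):
    """Layer overrides on top of the current settings for one with block."""
    _stack.append(replace(current(), **overrides))
    try:
        yield current()
    finally:
        _stack.pop()  # restore the previous settings, even if the block raised


set_defaults(codec="ZSTD", clevel=1)
with scoped(codec="LZ4", clevel=9) as inner:
    inside = current()   # settings seen by code running inside the block
outside = current()      # settings seen after the block has exited

print(inside)   # Settings(codec='LZ4', clevel=9)
print(outside)  # Settings(codec='ZSTD', clevel=1)
```

In this sketch, a field that is not overridden inside the block keeps the value of the enclosing scope, so scoped(clevel=3) alone would still report the ZSTD codec from the global defaults.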

Conclusion

Use global and contextual configurations to define the configuration profiles that you apply most often to your arrays.