Slicing Datasets and Creating Views#

In this benchmark we will use the data from the Slicing Datasets and Creating Views tutorial to measure ironArray’s slicing performance and compare it with Zarr, HDF5 and TileDB, other chunked and compressed array formats. In the in-memory scenario, we will also compare against NumPy.

Opening the Dataset#

We will first open the dataset and then make a copy of it using the optimized chunk and block shapes obtained in the corresponding tutorial.

[1]:
import zarr
import h5py
import hdf5plugin as h5plugin
import numpy as np
import iarray as ia
from numcodecs import blosc, Blosc
import random
import os
import tiledb

ia.set_config_defaults(favor=ia.Favor.SPEED)
# Activate multithreading for Blosc in Zarr
blosc.set_nthreads(ia.get_config_defaults().nthreads)

%load_ext memprofiler
[2]:
%%time
ia_urlpath = "../tutorials/precip1.iarr"
precip_ia = ia.open(ia_urlpath)
print(precip_ia.info)
type   : IArray
shape  : (720, 721, 1440)
chunks : (128, 128, 256)
blocks : (16, 32, 64)
cratio : 14.62

CPU times: user 156 µs, sys: 273 µs, total: 429 µs
Wall time: 397 µs
[3]:
# Chunks span the full extent of dimensions 0 and 2, so a slice along
# dimension 1 only needs to decompress a single chunk
chunks = (precip_ia.shape[0], 8, precip_ia.shape[2])
blocks = (64, 4, 128)

precip_ia_op = precip_ia.copy(chunks=chunks, blocks=blocks)
print(precip_ia_op.info)
type   : IArray
shape  : (720, 721, 1440)
chunks : (720, 8, 1440)
blocks : (64, 4, 128)
cratio : 9.28
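
Why does this chunk shape favor slicing along the second dimension? Here is a rough back-of-the-envelope sketch (plain Python arithmetic, not part of the original benchmark) that counts how many chunks each kind of slice has to touch:

import math

shape = (720, 721, 1440)
chunks_op = (720, 8, 1440)

# Number of chunks along each axis: [1, 91, 1]
n_chunks = [math.ceil(s / c) for s, c in zip(shape, chunks_op)]

# Fixing one index along dimension d still requires visiting every chunk
# along the remaining axes
touched = {d: math.prod(n for i, n in enumerate(n_chunks) if i != d)
           for d in range(len(shape))}
print(touched)  # {0: 91, 1: 1, 2: 91}

A slice along dimension 1 decompresses a single chunk, whereas slices along dimensions 0 and 2 have to visit all 91 of them; the timings below reflect this.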

We will slice the array at 10 random indices along each of its dimensions:

[4]:
# 10 distinct random indices, bounded by the smallest dimension so that they
# are valid along every axis
ind = random.sample(range(min(precip_ia.shape)), 10)

In-memory performance#

Let’s start by measuring performance when data is in memory.

ironArray#

[5]:
%%mprof_run 1.iarray::dim0

for i in ind:
    ia_0 = precip_ia_op[i, :, :].data
memprofiler: used 13.03 MiB RAM (peak of 15.11 MiB) in 0.2010 s, total RAM usage 649.09 MiB
[6]:
%%mprof_run 1.iarray::dim1

for i in ind:
    ia_1 = precip_ia_op[:, i, :].data
memprofiler: used 0.07 MiB RAM (peak of 13.24 MiB) in 0.0471 s, total RAM usage 649.18 MiB
[7]:
%%mprof_run 1.iarray::dim2

for i in ind:
    ia_2 = precip_ia_op[:, :, i].data
memprofiler: used 0.03 MiB RAM (peak of 3.20 MiB) in 0.2046 s, total RAM usage 649.47 MiB

Zarr#

[8]:
# Blosc compressor with LZ4 (level 5) and byte shuffle
compressor = Blosc(cname='lz4', clevel=5, shuffle=Blosc.SHUFFLE)
# Materialize the data as a NumPy array and build an in-memory Zarr array
# with the same optimized chunk shape
precip = precip_ia.data
precip_zarr = zarr.array(precip, chunks=chunks, compressor=compressor)
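
If you want to sanity-check the Zarr compression against the cratio values that ironArray reports above, the zarr v2 API exposes the raw and stored byte counts (an aside, not part of the benchmark):

# Storage ratio of the in-memory Zarr array (zarr v2 API)
print(precip_zarr.nbytes / precip_zarr.nbytes_stored)
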
[9]:
%%mprof_run 2.zarr::dim0

for i in ind:
    zarr_0 = precip_zarr[i, :, :]
memprofiler: used 4.69 MiB RAM (peak of 4.69 MiB) in 0.6793 s, total RAM usage 3864.16 MiB
[10]:
%%mprof_run 2.zarr::dim1

for i in ind:
    zarr_1 = precip_zarr[:, i, :]
memprofiler: used 3.24 MiB RAM (peak of 3.24 MiB) in 0.0282 s, total RAM usage 3867.42 MiB
[11]:
%%mprof_run 2.zarr::dim2

for i in ind:
    zarr_2 = precip_zarr[:, :, i]
memprofiler: used 2.02 MiB RAM (peak of 2.02 MiB) in 0.6705 s, total RAM usage 3869.45 MiB

HDF5#

[12]:
h5_urlpath = "precip1.hdf5"

if not os.path.exists(h5_urlpath):
    with h5py.File(h5_urlpath, "w") as f:
        h5_precip = f.create_dataset("h5_precip", precip_ia.shape, dtype=precip_ia.dtype,
                                     chunks=chunks, **h5plugin.Blosc())
        precip_ia.copyto(h5_precip)


# The 'core' driver with backing_store=False loads the whole file into memory,
# so this comparison is also in-memory
h5_file = h5py.File(h5_urlpath, "r", driver='core', backing_store=False)
precip_h5 = h5_file['h5_precip']
[13]:
%%mprof_run 3.hdf5::dim0
for i in ind:
    h5_0 = precip_h5[i, :, :]
memprofiler: used 48.47 MiB RAM (peak of 48.47 MiB) in 6.2221 s, total RAM usage 4290.18 MiB
[14]:
%%mprof_run 3.hdf5::dim1
for i in ind:
    h5_1 = precip_h5[:, i, :]
memprofiler: used 0.00 MiB RAM (peak of 0.00 MiB) in 0.0930 s, total RAM usage 4290.20 MiB
[15]:
%%mprof_run 3.hdf5::dim2
for i in ind:
    h5_2 = precip_h5[:, :, i]
memprofiler: used 0.01 MiB RAM (peak of 0.01 MiB) in 5.9200 s, total RAM usage 4290.22 MiB

TileDB#

[16]:
%%time
shape = precip_ia_op.shape
# Tile extents match the optimized chunk shape used above
adom = tiledb.Domain(
    tiledb.Dim(name="rows", domain=(0, shape[0] - 1), dtype=np.int32, tile=chunks[0]),
    tiledb.Dim(name="cols", domain=(0, shape[1] - 1), dtype=np.int32, tile=chunks[1]),
    tiledb.Dim(name="else", domain=(0, shape[2] - 1), dtype=np.int32, tile=chunks[2]),
)

# Byte shuffle + LZ4 (level 5), mirroring the settings used for the Zarr array
filters = tiledb.FilterList([tiledb.ByteShuffleFilter(), tiledb.LZ4Filter(5)])
schema = tiledb.ArraySchema(
    domain=adom, sparse=False, attrs=[tiledb.Attr(name="a", dtype=np.float32, filters=filters)]
)

# Create the (empty) array in TileDB's in-memory filesystem ("mem://")
mem_tiledb_urlpath = "mem://mem_precip_tiledb"

tiledb.DenseArray.create(mem_tiledb_urlpath, schema)
with tiledb.DenseArray(mem_tiledb_urlpath, mode="w") as precip_tile:
    precip_tile[:] = precip_ia_op.data
CPU times: user 34.5 s, sys: 6.13 s, total: 40.7 s
Wall time: 2.38 s
[17]:
%%mprof_run 4.tiledb::dim0
with tiledb.DenseArray(mem_tiledb_urlpath, mode="r") as precip_tile:
    for i in ind:
        tiledb_0 = precip_tile[i, :, :]
memprofiler: used 300.14 MiB RAM (peak of 3520.30 MiB) in 6.2238 s, total RAM usage 5390.23 MiB
[18]:
%%mprof_run 4.tiledb::dim1
with tiledb.DenseArray(mem_tiledb_urlpath, mode="r") as precip_tile:
    for i in ind:
        tiledb_1 = precip_tile[:, i, :]
memprofiler: used -179.11 MiB RAM (peak of 0.00 MiB) in 0.1446 s, total RAM usage 5211.13 MiB
[19]:
%%mprof_run 4.tiledb::dim2
with tiledb.DenseArray(mem_tiledb_urlpath, mode="r") as precip_tile:
    for i in ind:
        tiledb_2 = precip_tile[:, :, i]
memprofiler: used 196.96 MiB RAM (peak of 3435.93 MiB) in 6.3202 s, total RAM usage 5408.10 MiB

NumPy#

[20]:
precip_np = precip_ia_op.data

NumPy slicing returns views, so we force a copy to compare apples with apples (see the short view-vs-copy check after these cells):

[21]:
%%mprof_run 5.numpy::dim0

for i in ind:
    np_0 = precip_np[i, :, :].copy()
memprofiler: used 0.00 MiB RAM (peak of 0.00 MiB) in 0.0078 s, total RAM usage 8258.30 MiB
[22]:
%%mprof_run 5.numpy::dim1

for i in ind:
    np_1 = precip_np[:, i, :].copy()
memprofiler: used 0.02 MiB RAM (peak of 0.02 MiB) in 0.0086 s, total RAM usage 8258.33 MiB
[23]:
%%mprof_run 5.numpy::dim2

for i in ind:
    np_2 = precip_np[:, :, i].copy()
memprofiler: used 0.00 MiB RAM (peak of 0.00 MiB) in 0.0319 s, total RAM usage 8258.33 MiB
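
As a quick illustration of why the explicit .copy() is needed (a check that is not part of the benchmark itself), basic NumPy indexing returns a view that shares memory with the parent array:

# Basic slicing yields a view; .copy() materializes a new buffer
row_view = precip_np[0, :, :]
row_copy = precip_np[0, :, :].copy()
print(np.shares_memory(precip_np, row_view))  # True
print(np.shares_memory(precip_np, row_copy))  # False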

Results#

Let’s make sure that all the libraries return the same results (TileDB reads come back as a dict keyed by attribute name, hence the ["a"] lookups):

[24]:
np.testing.assert_almost_equal(ia_0, zarr_0)
np.testing.assert_almost_equal(ia_1, zarr_1)
np.testing.assert_almost_equal(ia_2, zarr_2)
np.testing.assert_almost_equal(ia_0, h5_0)
np.testing.assert_almost_equal(ia_1, h5_1)
np.testing.assert_almost_equal(ia_2, h5_2)
np.testing.assert_almost_equal(ia_0, np_0)
np.testing.assert_almost_equal(ia_1, np_1)
np.testing.assert_almost_equal(ia_2, np_2)
np.testing.assert_almost_equal(ia_0, tiledb_0["a"])
np.testing.assert_almost_equal(ia_1, tiledb_1["a"])
np.testing.assert_almost_equal(ia_2, tiledb_2["a"])

And now, the execution times:

[25]:
%mprof_barplot -t "Slicing Performance (with an optimized dimension)" --variable time  .*zarr::.* .*hdf5::.* .*iarray::.* .*numpy::.* .*tiledb::.*