Matrices, Blosc2 and PyTorch

· 7 min read

Luke Shaw, Product Manager at ironArray SLU
Francesc Alted, CEO at ironArray SLU

One of the core functions of any numerical computing library is linear algebra, which is ubiquitous in scientific and industrial applications. Much image processing can be reduced to matrix-matrix or matrix-vector operations; and it is well-known that the majority of the computational effort expended in evaluating neural networks is due to batched matrix multiplications.
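To make the batched case concrete, here is a minimal PyTorch sketch (shapes and names are illustrative) showing how a whole stack of independent matrix products is dispatched in a single call:

```python
import torch

# Batched matrix multiplication, as used throughout neural networks:
# 32 independent (64 x 128) @ (128 x 256) products in one call.
A = torch.randn(32, 64, 128)
B = torch.randn(32, 128, 256)
C = torch.matmul(A, B)  # equivalently A @ B, or torch.bmm(A, B)
print(C.shape)  # torch.Size([32, 64, 256])
```

Batching lets the backend fuse the products into one efficient kernel launch instead of looping in Python.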

At the same time, the data which provide the operands for these transformations must be appropriately handled by the library in question - being able to rapidly perform the floating-point operations (FLOPs) internal to the matrix multiplication is of little use if the data cannot be fed to the compute engine (and then whisked away after computation) with sufficient speed and without overburdening memory.

In this space, PyTorch has proven to be one of the most popular libraries, backed by high-performance compiled C++ code, optional GPU acceleration, and an extensive collection of efficient functions for matrix multiplication, creation and management. It is also one of the libraries most compliant with the Python array API standard.

Blosc2 and Linear Algebra

However, PyTorch does not offer an interface for on-disk data. This means that when working with large datasets that do not fit in memory, data must be fetched in batches into memory, computed with, and then saved back to disk using a secondary library such as h5py. That library also handles compression, so as to reduce storage space (and increase the speed with which data moves to and from disk).
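The two-library workflow looks roughly like the following sketch, where the file name, dataset layout and batch size are all illustrative choices rather than anything prescribed by either library:

```python
import h5py
import numpy as np
import torch

# Hypothetical setup: a large float32 matrix stored (compressed) on disk
# with h5py. The chunk shape and gzip compression are example choices.
with h5py.File("data.h5", "w") as f:
    f.create_dataset(
        "x",
        data=np.random.rand(4096, 256).astype(np.float32),
        chunks=(512, 256),
        compression="gzip",
    )

w = torch.randn(256, 64)  # weights for an in-memory matrix product
batch_size = 512          # found by ad-hoc experimentation in practice
results = []
with h5py.File("data.h5", "r") as f:
    dset = f["x"]
    for i in range(0, dset.shape[0], batch_size):
        # Fetch + decompress a batch, hand it to the compute engine.
        batch = torch.from_numpy(dset[i : i + batch_size])
        results.append(batch @ w)
out = torch.cat(results)
print(out.shape)  # torch.Size([4096, 64])
```

Note that the user is responsible for picking a batch size that balances I/O, memory use and compute; neither library chooses it automatically.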

Blosc2 is an integrated solution which provides highly efficient storage via compression and marries it to a powerful compute engine. One can easily write and read compressed data to disk with a succinct syntax, with decompression and computation handled efficiently by the compute engine. In addition, the library automatically selects optimal chunking parameters for the data, without any of the ad-hoc experimentation required to find 'good' batch sizes.