A lightweight, flexible and fast toolkit for managing your data
Unleash the full power of your hardware to snappily process your extremely large multidimensional datasets in memory or on disk.
We believe that Big Data handling can be kinder to the environment through more intelligent use of energy, so we are building ironArray so that computations on large datasets make more effective use of modern, cost-effective multi-core CPUs and fast local storage.
- High performance matrix and vector computations
- Automatic data compression and decompression
- Contiguous or sparse storage
- Tunable performance optimizations that leverage your specific CPU's caches, memory, and disks
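To see why compression pays off for numeric data, here is a minimal standard-library sketch of the idea behind Blosc-style compression: a byte "shuffle" filter groups the corresponding bytes of neighboring values together, which makes slowly varying numeric data far more compressible. This is an illustration only, using `zlib` as a stand-in codec; it is not the ironArray or Blosc2 API, which use much faster codecs.

```python
import struct
import zlib

def shuffle(data: bytes, typesize: int) -> bytes:
    """Transpose a buffer of fixed-size items into typesize byte planes."""
    n = len(data) // typesize
    return bytes(data[i * typesize + b] for b in range(typesize) for i in range(n))

# A slowly varying 64-bit integer series, like many real datasets.
raw = b"".join(struct.pack("<q", 1_000_000 + i) for i in range(10_000))

plain = zlib.compress(raw, 6)             # compress the raw bytes directly
shuffled = zlib.compress(shuffle(raw, 8), 6)  # shuffle first, then compress

print(len(raw), len(plain), len(shuffled))  # the shuffled buffer compresses much better
```

After shuffling, each byte plane is nearly constant or slowly cycling, so the codec sees long runs instead of interleaved records; this is the kind of filtering Blosc2 applies automatically.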
The data container is based on the Blosc2 library and format and leverages the Caterva library for fast manipulation of multidimensional data.
Open data format
ironArray uses the simple, extensible, well-documented, and open-source format from the Blosc2 library.
Why it works
ironArray organizes your data into chunks that fit into your CPU's cache, then uses standard map, reduce, filter, and collect algorithms to perform calculations on large arrays directly in the high-speed CPU cache.
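The chunked-evaluation idea can be sketched in plain Python (this is not the ironArray API): stream a large array through cache-sized chunks, applying the map and reduce steps to each chunk while it is hot in cache, instead of materializing full-size intermediate arrays.

```python
from array import array

CHUNK_ITEMS = 32_768  # ~256 KiB of float64 per chunk, sized to fit in a typical L2 cache

def chunked_sum_of_squares(data: array, chunk_items: int = CHUNK_ITEMS) -> float:
    """Reduce a large array chunk by chunk, keeping the working set cache-sized."""
    total = 0.0
    for start in range(0, len(data), chunk_items):
        chunk = data[start:start + chunk_items]   # one cache-sized chunk
        total += sum(x * x for x in chunk)        # map (square) + reduce (sum) on hot data
    return total

values = array("d", (float(i) for i in range(100_000)))
print(chunked_sum_of_squares(values))
```

The result is identical to a whole-array computation, but at no point does an intermediate array of squares exist; only one chunk is live at a time.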
ironArray allows you to express your performance and storage preferences
Based on your preferences, it will tune your configuration using state-of-the-art machine learning algorithms, then create new arrays and perform calculations with those optimizations applied.
For example, you can set a preference for high compression ratios, or for high-speed computation, or for some balance between the two.
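As a purely hypothetical sketch of what expressing such a preference could look like (the names `Favor` and `Config` are illustrative, not ironArray's real API), the idea is a single knob that a tuner maps onto concrete settings such as codec effort:

```python
from dataclasses import dataclass
from enum import Enum

class Favor(Enum):
    CRATIO = "compression ratio"   # smallest storage footprint
    SPEED = "speed"                # fastest computation
    BALANCE = "balance"            # trade-off between the two

@dataclass
class Config:
    favor: Favor = Favor.BALANCE
    clevel: int = 5  # codec effort level derived from the preference

    @classmethod
    def from_favor(cls, favor: Favor) -> "Config":
        # A real tuner would derive this from benchmarks or a learned model;
        # the mapping here is hard-coded for illustration.
        clevel = {Favor.CRATIO: 9, Favor.SPEED: 1, Favor.BALANCE: 5}[favor]
        return cls(favor=favor, clevel=clevel)

print(Config.from_favor(Favor.CRATIO))
```

The user states intent once ("favor compression ratio"); every derived setting follows from it, which is the shape of the preference-driven tuning described above.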