
3 posts tagged with "big-data"


Cat2Cloud is Here: Stop Moving Data, Start Using It!

· 4 min read
Francesc Alted
CEO, ironArray SLU
Luke Shaw
Product Manager, ironArray SLU
David Ibáñez
CTO, ironArray SLU

For too long, working with large datasets has meant one thing: waiting. Waiting for data to download. Waiting for it to load into memory. Waiting for computations to finish. Today, we're excited to announce that the waiting is over.

After two years of intensive development and six months of successful beta testing, we are thrilled to launch Cat2Cloud! It's a groundbreaking platform that lets you store, visualize, and compute on massive datasets directly in the cloud, without ever needing to download the data first.

Imagine interacting with terabytes of data as if they were on your local machine. That's the power of Cat2Cloud.

Figure: Cat2Cloud block diagram

The Magic Behind Cat2Cloud

At its core, Cat2Cloud is powered by the legendary compression, chunking and computation technologies of Blosc2 and our own Caterva2 web server engine. This foundation allows for unprecedented efficiency. By leveraging Blosc2, Cat2Cloud not only minimizes your storage footprint but also accelerates data transmission and enables lightning-fast computations on datasets that are too large to fit in memory.

This means that nearly all the powerful features and APIs from Caterva2's documentation and Python-Blosc2 are at your fingertips within the Cat2Cloud ecosystem.
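To make the role of chunking a little more concrete, here is a minimal sketch of why storing data as independently compressed chunks allows a small slice to be read without decompressing (or transferring) everything. It uses only Python's standard `zlib` and `struct` modules as stand-ins; it is not Blosc2's format or Cat2Cloud's API, and `CHUNK_ITEMS`, `compress_chunks` and `read_slice` are hypothetical names chosen for illustration.

```python
import struct
import zlib

CHUNK_ITEMS = 1_000  # hypothetical chunk size, in values per chunk


def compress_chunks(values, chunk_items=CHUNK_ITEMS):
    """Split a list of floats into chunks and compress each one independently."""
    chunks = []
    for start in range(0, len(values), chunk_items):
        block = values[start:start + chunk_items]
        raw = struct.pack(f"<{len(block)}d", *block)
        chunks.append(zlib.compress(raw))
    return chunks


def read_slice(chunks, start, stop, chunk_items=CHUNK_ITEMS):
    """Return values[start:stop], decompressing only the overlapping chunks.

    Assumes 0 <= start < stop <= total number of values.
    """
    out = []
    first, last = start // chunk_items, (stop - 1) // chunk_items
    for idx in range(first, last + 1):
        raw = zlib.decompress(chunks[idx])
        block = struct.unpack(f"<{len(raw) // 8}d", raw)
        lo = max(start - idx * chunk_items, 0)
        hi = min(stop - idx * chunk_items, len(block))
        out.extend(block[lo:hi])
    return out


values = [float(i % 10) for i in range(10_000)]
chunks = compress_chunks(values)
stored = sum(len(c) for c in chunks)      # compressed footprint in bytes
window = read_slice(chunks, 2_500, 2_510)  # touches a single chunk
```

Only one of the ten chunks is decompressed to serve `window`, which is the property that makes remote, partial access practical for large datasets.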

Figure: Caterva2 block diagram

But Cat2Cloud is more than just a backend. We've built a rich, user-friendly experience on top, including native support for Jupyter notebooks right in the web interface, so you can go from data to insight in a single place.

What Can You Do with Cat2Cloud?

Cat2Cloud is designed to supercharge your data workflows. Here's how it will change the way you work:

Computing Expressions in Blosc2

· 7 min read
Oumaima Ech Chdig
Intern, ironArray SLU

What are expressions?

The forthcoming version of Blosc2 will bring a powerful tool for performing mathematical operations on pre-compressed arrays, that is, on arrays whose data has been reduced in size using compression techniques. This functionality provides a flexible and efficient way to perform a wide range of operations, such as addition, subtraction, multiplication and other mathematical functions, directly on compressed arrays. This approach saves time and resources, especially when working with large data sets.

An example of expression computation in Blosc2 might be:

import blosc2
import numpy as np

# Compression parameters: assumed values, not shown in the original excerpt
cparams = {"clevel": 5, "codec": blosc2.Codec.LZ4}

# Three large float64 arrays (about 960 MB each when uncompressed)
dtype = np.float64
shape = [30_000, 4_000]
size = shape[0] * shape[1]
a = np.linspace(0, 10, num=size, dtype=dtype).reshape(shape)
b = np.linspace(0, 10, num=size, dtype=dtype).reshape(shape)
c = np.linspace(0, 10, num=size, dtype=dtype).reshape(shape)

# Convert numpy arrays to Blosc2 arrays
a1 = blosc2.asarray(a, cparams=cparams)
b1 = blosc2.asarray(b, cparams=cparams)
c1 = blosc2.asarray(c, cparams=cparams)

# Perform the mathematical operation
expr = a1 + b1 * c1  # LazyExpr expression
expr += 2  # expressions can be modified
output = expr.compute(cparams=cparams)  # compute! (output is compressed too)

Compressed arrays (a1, b1, c1) are created from existing NumPy arrays (a, b, c) using Blosc2, and mathematical operations are then written on these compressed arrays with ordinary algebraic syntax. The expressions are lazy: building one does not evaluate anything, it only records what has to be computed. The result is produced when .compute() is called, and the output is compressed as well.
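The idea of a lazy expression over compressed chunks can be sketched in a few lines of plain Python. This is only a toy model under stated assumptions: it uses `zlib` in place of Blosc2, and the `Lazy` class, `pack`, `unpack` and `to_chunks` are hypothetical helpers, not part of the Blosc2 API or a description of how `LazyExpr` is implemented.

```python
import struct
import zlib


def pack(block):
    """Compress one chunk of floats."""
    return zlib.compress(struct.pack(f"<{len(block)}d", *block))


def unpack(chunk):
    """Decompress one chunk back into a list of floats."""
    raw = zlib.decompress(chunk)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


class Lazy:
    """Hypothetical lazy node: knows how to build chunk i, computes nothing yet."""

    def __init__(self, chunk_fn, nchunks):
        self.chunk_fn = chunk_fn  # maps a chunk index to a list of floats
        self.nchunks = nchunks

    @classmethod
    def from_chunks(cls, chunks):
        return cls(lambda i: unpack(chunks[i]), len(chunks))

    def _binary(self, other, op):
        if isinstance(other, Lazy):
            fn = lambda i: [op(x, y) for x, y in zip(self.chunk_fn(i), other.chunk_fn(i))]
        else:  # scalar operand
            fn = lambda i: [op(x, other) for x in self.chunk_fn(i)]
        return Lazy(fn, self.nchunks)

    def __add__(self, other):
        return self._binary(other, lambda x, y: x + y)

    def __mul__(self, other):
        return self._binary(other, lambda x, y: x * y)

    def compute(self):
        """Evaluate chunk by chunk; the output is compressed as well."""
        return [pack(self.chunk_fn(i)) for i in range(self.nchunks)]


def to_chunks(values, size=4):
    """Split values into compressed chunks of `size` items each."""
    return [pack(values[i:i + size]) for i in range(0, len(values), size)]


data = [float(i) for i in range(10)]
a1 = Lazy.from_chunks(to_chunks(data))
b1 = Lazy.from_chunks(to_chunks(data))
c1 = Lazy.from_chunks(to_chunks(data))

expr = a1 + b1 * c1  # nothing is decompressed or computed here
expr = expr + 2      # still just a recipe
output = expr.compute()  # work happens now, one chunk at a time
result = [x for chunk in output for x in unpack(chunk)]
```

Because each output chunk is built from the matching input chunks only, the whole uncompressed operands never need to sit in memory at once, which is the point of evaluating lazily on compressed data.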

How it works

Unlocking Big Data Potential with Blosc Compression

· 3 min read
Francesc Alted
CEO, ironArray SLU

Dear valued community,

Two years ago, ironArray embarked on an ambitious journey with the launch of our groundbreaking ironArray product, designed to revolutionize computations with compressed data. While our aspirations were high, we faced challenges in gaining traction and failed to meet our sales targets.

However, every setback is an opportunity for growth and transformation. Today, we are thrilled to announce a strategic shift in our business focus towards consulting services, leveraging the power of compression for big data, specifically through the acclaimed Blosc compressor.