Client class#

A client connects to a remote Caterva2 server (the subscriber), whose roots can be subscribed to. It is the main entry point for using the Caterva2 API.

class caterva2.Client(urlbase, auth=None, timeout=5)#

Methods

append(remotepath, data)

Appends data to the remote location.

concatenate(srcs, dst, axis)

Concatenates the files in srcs along axis and stores the result in a new location dst.

copy(src, dst)

Copies a dataset or directory to a new location.

get_info(path)

Retrieves information about a specified dataset.

get_slice(path[, key, as_blosc2, field])

Gets a slice of a File or Dataset.

move(src, dst)

Moves a dataset or directory to a new location.

remove(path)

Removes a dataset or the contents of a directory from a remote repository.

stack(srcs, dst, axis)

Stacks the files in srcs along a new axis and stores the result in a new location dst.

unfold(remotepath)

Unfolds a dataset in the remote repository.

Special Methods:

__init__(urlbase[, auth, timeout])

Creates a client for the server at urlbase.

get(path)

Returns an object for the given path.

get_roots()

Retrieves the list of available roots.

get_list(path)

Lists datasets in a specified path.

subscribe(root)

Subscribes to a specified root.

fetch(path[, slice_])

Retrieves the entire content (or a specified slice) of a dataset.

get_chunk(path, nchunk)

Retrieves a specified compressed chunk from a file.

download(dataset[, localpath])

Downloads a dataset to local storage.

upload(localpath, dataset)

Uploads a local dataset to a remote repository.

adduser(newuser[, password, superuser])

Adds a user to the subscriber.

deluser(user)

Deletes a user from the subscriber.

listusers([username])

Lists the users in the subscriber.

lazyexpr(name, expression[, operands, compute])

Creates a lazy expression dataset in personal space.

Constructor#

__init__(urlbase, auth=None, timeout=5)#

Creates a client for the server at urlbase.

Parameters:
  • urlbase (str, optional) – Base URL of the subscriber to query. Defaults to caterva2.sub_urlbase_default.

  • auth (tuple, BasicAuth, optional) – Credentials for authenticated access, for example a (username, password) tuple.

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client("https://cat2.cloud/demo")
>>> auth_client = cat2.Client("https://cat2.cloud/demo", ("joedoe@example.com", "foobar"))

Getting roots, files, datasets, subscribing…#

get(path)#

Returns an object for the given path.

Parameters:

path (Path) – Path to the root, file or dataset.

Returns:

Object – Object representing the root, file or dataset.

Return type:

Root, File, Dataset

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://demo.caterva2.net')
>>> root = client.get('example')
>>> root.name
'example'
>>> file = client.get('example/README.md')
>>> file.name
'README.md'
>>> ds = client.get('example/ds-1d.b2nd')
>>> ds[:10]
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

get_roots()#

Retrieves the list of available roots.

Returns:

Dictionary mapping available root names to their details:

  • name: the root name.

  • http: the HTTP endpoint.

  • subscribed: whether it is subscribed or not.

Return type:

dict

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://demo.caterva2.net')
>>> roots_dict = client.get_roots()
>>> sorted(roots_dict.keys())
['@public', 'b2tests', 'example', 'h5example', 'h5lung_j2k', 'h5numbers_j2k']
>>> client.subscribe('b2tests')
'Ok'
>>> roots_dict['b2tests']
{'name': 'b2tests', 'http': 'localhost:8014', 'subscribed': True}
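
The returned dictionary can be processed like any other Python mapping. The sketch below does not contact a server: it uses a hand-written `roots_dict` shaped like the output above (the second entry and its values are made up for illustration) and splits the root names by subscription state.

```python
# Hypothetical sample shaped like the get_roots() output shown above.
# With a live server this would be: roots_dict = client.get_roots()
roots_dict = {
    "b2tests": {"name": "b2tests", "http": "localhost:8014", "subscribed": True},
    "example": {"name": "example", "http": "localhost:8012", "subscribed": False},
}

# Names of roots that are already subscribed.
subscribed = sorted(name for name, details in roots_dict.items() if details["subscribed"])
# Names of roots that still need a client.subscribe(name) call.
pending = sorted(set(roots_dict) - set(subscribed))

print(subscribed)
print(pending)
```

Each name in `pending` could then be passed to `subscribe()`.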

get_list(path)#

Lists datasets in a specified path.

Parameters:

path (str) – Path to a root, directory or dataset.

Returns:

List of dataset names as strings, relative to the specified path.

Return type:

list

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://demo.caterva2.net')
>>> client.subscribe('example')
'Ok'
>>> client.get_list('example')[:3]
['README.md', 'dir1/ds-2d.b2nd', 'dir1/ds-3d.b2nd']

subscribe(root)#

Subscribes to a specified root.

Parameters:

root (str) – Name of the root to subscribe to.

Returns:

Server response as a string.

Return type:

str

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://demo.caterva2.net')
>>> root_name = 'h5numbers_j2k'
>>> client.subscribe(root_name)
'Ok'
>>> client.get_roots()[root_name]
{'name': 'h5numbers_j2k', 'http': 'localhost:8011', 'subscribed': True}

Fetch / download / upload datasets#

fetch(path, slice_=None)#

Retrieves the entire content (or a specified slice) of a dataset.

Parameters:
  • path (str) – Path to the dataset.

  • slice_ (int, slice, tuple of ints and slices, or None) – Specifies the slice to fetch. If None, the whole dataset is fetched.

Returns:

The requested slice of the dataset as a NumPy array.

Return type:

numpy.ndarray

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://demo.caterva2.net')
>>> client.fetch('example/ds-2d-fields.b2nd', (slice(0, 2), slice(0, 2)))
array([[(0.0000000e+00, 1.       ), (5.0002502e-05, 1.00005  )],
       [(1.0000500e-02, 1.0100005), (1.0050503e-02, 1.0100505)]],
      dtype=[('a', '<f4'), ('b', '<f8')])
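
The accepted forms of `slice_` can be tried out locally. This is a minimal sketch that assumes `slice_` follows NumPy indexing semantics (the result is documented as a NumPy array); the array `data` is a local stand-in for a remote dataset, and no server is contacted.

```python
import numpy as np

# Local stand-in for a remote 2-D dataset.
data = np.arange(20).reshape(4, 5)

as_int = data[1]                      # int: a single row
as_slice = data[slice(1, 3)]          # slice: rows 1 and 2
as_tuple = data[(1, slice(0, 3))]     # tuple of ints and slices: part of row 1
whole = data[:]                       # analogous to slice_=None: everything

print(as_int.shape, as_slice.shape, as_tuple.shape, whole.shape)
```

Under that assumption, the same objects passed as `slice_` would select the corresponding regions of the remote dataset.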

get_chunk(path, nchunk)#

Retrieves a specified compressed chunk from a file.

Parameters:
  • path (str) – Path of the dataset.

  • nchunk (int) – ID of the unidimensional chunk to retrieve.

Returns:

The compressed chunk data.

Return type:

bytes

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://demo.caterva2.net')
>>> client.subscribe('example')
'Ok'
>>> info_schunk = client.get_info('example/ds-2d-fields.b2nd')['schunk']
>>> info_schunk['nchunks']
1
>>> info_schunk['cratio']
6.453000645300064
>>> chunk = client.get_chunk('example/ds-2d-fields.b2nd', 0)
>>> info_schunk['chunksize'] / len(chunk)
6.453000645300064
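
The last two lines of the example check that the compression ratio equals the uncompressed chunk size divided by the size of the compressed chunk. The same arithmetic can be sketched offline; here `zlib` from the standard library stands in for Blosc2, and the input bytes are made up, so the resulting ratio is only illustrative.

```python
import zlib

# Stand-in for an uncompressed chunk: 80,000 highly repetitive bytes.
raw = bytes(range(16)) * 5000
# zlib replaces Blosc2 here, for illustration only.
compressed = zlib.compress(raw)

chunksize = len(raw)
cratio = chunksize / len(compressed)   # uncompressed size / compressed size
print(f"chunksize={chunksize}, compressed={len(compressed)}, cratio={cratio:.1f}")
```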

download(dataset, localpath=None)#

Downloads a dataset to local storage.

Note: If the dataset is a regular file and Blosc2 is installed, it will be downloaded and decompressed. Otherwise, it will remain compressed in its .b2 format.

Parameters:
  • dataset (Path) – Path to the dataset.

  • localpath (Path, optional) – Local path to save the downloaded dataset. Defaults to the current working directory if not specified.

Returns:

The path to the downloaded file.

Return type:

Path

Examples

>>> import caterva2 as cat2
>>> path = 'example/ds-2d-fields.b2nd'
>>> client = cat2.Client('https://demo.caterva2.net')
>>> client.download(path)
PosixPath('example/ds-2d-fields.b2nd')

upload(localpath, dataset)#

Uploads a local dataset to a remote repository.

Note: If localpath is a regular file without a .b2nd, .b2frame or .b2 extension, it will be automatically compressed with Blosc2 on the server, adding a .b2 extension internally.

Parameters:
  • localpath (Path) – Path to the local dataset.

  • dataset (Path) – Remote path to upload the dataset to.

Returns:

Path of the uploaded file on the server.

Return type:

Path

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To upload a file you need to be authenticated as an already registered user
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> newpath = f'@personal/dir{np.random.randint(0, 100)}/ds-4d.b2nd'
>>> uploaded_path = client.upload('root-example/dir2/ds-4d.b2nd', newpath)
>>> str(uploaded_path) == newpath
True
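
The extension rule in the note above can be expressed as a small local check. `stored_name` is a hypothetical helper, not part of caterva2, and it assumes that the `.b2` extension is appended to the full file name; the second file name is likewise made up.

```python
from pathlib import PurePosixPath

# Extensions that, per the note above, are not compressed again on upload.
BLOSC2_SUFFIXES = {".b2nd", ".b2frame", ".b2"}

def stored_name(localpath):
    """Hypothetical helper: guess the file name used internally on the server."""
    path = PurePosixPath(localpath)
    if path.suffix in BLOSC2_SUFFIXES:
        return path.name
    # Assumption: the .b2 extension is appended to the existing name.
    return path.name + ".b2"

print(stored_name("root-example/dir2/ds-4d.b2nd"))
print(stored_name("root-example/README.md"))
```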

User management#

adduser(newuser, password=None, superuser=False)#

Adds a user to the subscriber.

Parameters:
  • newuser (str) – Username of the user to add.

  • password (str, optional) – Password for the user to add.

  • superuser (bool, optional) – Indicates if the user is a superuser.

Returns:

An explanatory message about the operation’s success or failure.

Return type:

str

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To add a user you need to be a superuser
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> username = f'user{np.random.randint(0, 100)}@example.com'
>>> message = client.adduser(username, 'foo')
>>> f"User added: username='{username}' password='foo' superuser=False" == message
True

deluser(user)#

Deletes a user from the subscriber.

Parameters:

user (str) – Username of the user to delete.

Returns:

An explanatory message about the operation’s success or failure.

Return type:

str

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To delete a user you need to be a superuser
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> username = f'user{np.random.randint(0, 100)}@example.com'
>>> _ = client.adduser(username, 'foo')
>>> message = client.deluser(username)
>>> message == f"User deleted: {username}"
True

listusers(username=None)#

Lists the users in the subscriber.

Parameters:

username (str, optional) – Username of the specific user to list.

Returns:

A list of user dictionaries in the subscriber.

Return type:

list of dict

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To list the users you need to be a superuser
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> users = client.listusers()
>>> sorted(users[0].keys())
['email', 'hashed_password', 'id', 'is_active', 'is_superuser', 'is_verified']
>>> username = f'user{np.random.randint(0, 100)}@example.com'
>>> _ = client.adduser(username, 'foo')
>>> updated_users = client.listusers()
>>> len(users) + 1 == len(updated_users)
True
>>> user_info = client.listusers(username)
>>> user_info[0]['is_superuser']
False
>>> superuser_info = client.listusers('superuser@example.com')
>>> superuser_info[0]['is_superuser']
True

Evaluating expressions#

lazyexpr(name, expression, operands=None, compute=False)#

Creates a lazy expression dataset in personal space.

A dataset with the specified name will be created, or overwritten if it already exists.

Parameters:
  • name (str) – Name of the dataset to be created (without extension).

  • expression (str) – Expression to be evaluated, which must yield a lazy expression.

  • operands (dict, optional) – Mapping of variables in the expression to their corresponding dataset paths.

  • compute (bool, optional) – If False, create the lazy expression without computing anything. If True, compute the lazy expression on creation and save the full result. Defaults to False.

Returns:

Path of the created dataset.

Return type:

Path

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To create a lazyexpr you need to be a registered user
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> src_path = f'@personal/dir{np.random.randint(0, 100)}/ds-4d.b2nd'
>>> path = client.upload('root-example/dir1/ds-2d.b2nd', src_path)
>>> client.lazyexpr('example-expr', 'a + a', {'a': path})
PurePosixPath('@personal/example-expr.b2nd')
>>> 'example-expr.b2nd' in client.get_list('@personal')
True
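
Every variable in `expression` needs a matching key in `operands`. The sketch below is a hypothetical pre-check, not part of caterva2: it parses the expression with Python's `ast` module and reports variable names that have no operand. It assumes the expression is valid Python syntax and treats called names (such as `sin` in `sin(a)`) as functions rather than operands. The dataset path is made up.

```python
import ast

def missing_operands(expression, operands):
    """Hypothetical pre-check: variables used in expression but absent from operands."""
    tree = ast.parse(expression, mode="eval")
    # Name nodes in call position, e.g. sin in sin(a), are functions.
    called = {id(node.func) for node in ast.walk(tree) if isinstance(node, ast.Call)}
    names = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and id(node) not in called
    }
    return sorted(names - set(operands))

operands = {"a": "@personal/dir1/ds-2d.b2nd"}   # made-up dataset path
print(missing_operands("a + a", operands))
print(missing_operands("sin(a) + b", operands))
```

An empty list means every variable is covered, so the expression and mapping are at least consistent before they are sent to the server.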

Utility methods#

append(remotepath, data)#

Appends data to the remote location.

Parameters:
  • remotepath (Path) – Remote path of the dataset to enlarge.

  • data (blosc2.NDArray, np.ndarray, sequence) – The data to append.

Returns:

The new shape of the dataset.

Return type:

tuple

Examples

>>> import caterva2 as cat2
>>> import blosc2
>>> # To append data you need to be authenticated as an already registered user
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> path = '@personal/ds-1d.b2nd'
>>> client.copy('@public/examples/ds-1d.b2nd', path)
PurePosixPath('@personal/ds-1d.b2nd')
>>> ndarray = blosc2.arange(0, 10)
>>> client.append(path, ndarray)
(1010,)

concatenate(srcs, dst, axis)#

Concatenates the files in srcs along axis and stores the result in a new location dst.

Parameters:
  • srcs (list of Paths) – Source files to be concatenated.

  • dst (Path) – The destination path for the file.

  • axis (int) – Axis along which to concatenate.

Returns:

The new path of the concatenated file.

Return type:

Path

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # For concatenating a file you need to be a registered user
>>> client = cat2.Client("https://cat2.cloud/demo", ("joedoe@example.com", "foobar"))
>>> root = client.get('@personal')
>>> root.upload('root-example/dir2/ds-4d.b2nd', "a.b2nd")
<Dataset: @personal/a.b2nd>
>>> root.upload('root-example/dir2/ds-4d.b2nd', "b.b2nd")
<Dataset: @personal/b.b2nd>
>>> client.concatenate(['@personal/a.b2nd', '@personal/b.b2nd'], '@personal/c.b2nd', axis=0)
PurePosixPath('@personal/c.b2nd')

copy(src, dst)#

Copies a dataset or directory to a new location.

Parameters:
  • src (Path) – Source path of the dataset or directory.

  • dst (Path) – Destination path for the dataset or directory.

Returns:

New path of the copied dataset or directory.

Return type:

Path

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To copy a file you need to be a registered user
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> src_path = f'@personal/dir{np.random.randint(0, 100)}/ds-4d.b2nd'
>>> uploaded_path = client.upload('root-example/dir2/ds-4d.b2nd', src_path)
>>> copy_path = f'@personal/dir{np.random.randint(0, 100)}/ds-4d-copy.b2nd'
>>> copied_path = client.copy(src_path, copy_path)
>>> str(copied_path) == copy_path
True
>>> datasets = client.get_list('@personal')
>>> src_path.replace('@personal/', '') in datasets
True
>>> copy_path.replace('@personal/', '') in datasets
True
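
The example strips the root prefix with `str.replace` because `get_list` returns names relative to the listed path. A stricter alternative, sketched here with made-up paths and a hand-written listing, is `PurePosixPath.relative_to` from the standard library, which raises `ValueError` when the path is not under the given root.

```python
from pathlib import PurePosixPath

# Hypothetical values: a remote path and a listing shaped like get_list('@personal').
src_path = "@personal/dir7/ds-4d.b2nd"
datasets = ["dir7/ds-4d.b2nd", "dir42/ds-4d-copy.b2nd"]

# Drop the root component to obtain the name as it appears in the listing.
relative = PurePosixPath(src_path).relative_to("@personal")
print(str(relative) in datasets)
```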

get_info(path)#

Retrieves information about a specified dataset.

Parameters:

path (str) – Path to the dataset.

Returns:

Dictionary of dataset properties, mapping property names to their values.

Return type:

dict

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://demo.caterva2.net')
>>> client.subscribe('example')
'Ok'
>>> path = 'example/ds-2d-fields.b2nd'
>>> info = client.get_info(path)
>>> info.keys()
dict_keys(['shape', 'chunks', 'blocks', 'dtype', 'schunk', 'mtime'])
>>> info['shape']
[100, 200]
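
The `shape` and `chunks` entries can be combined to estimate how a dataset is partitioned. This is a sketch over a hand-written dictionary: the shape matches the example above, while the `chunks` value is invented and does not describe the real dataset.

```python
import math

# Hypothetical excerpt of a get_info() result; the chunks value is made up.
info = {"shape": [100, 200], "chunks": [30, 200]}

# Total number of items in the dataset.
nitems = math.prod(info["shape"])
# Chunks needed along each dimension (the last one may be partially filled).
chunk_grid = [math.ceil(s / c) for s, c in zip(info["shape"], info["chunks"])]
nchunks = math.prod(chunk_grid)

print(nitems, chunk_grid, nchunks)
```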

get_slice(path, key=None, as_blosc2=True, field=None)#

Gets a slice of a File or Dataset.

Parameters:
  • path (str) – Path to the file or dataset.

  • key (int, slice, sequence of slices or str) – The slice to retrieve. If a single slice is provided, it is applied to the first dimension. If a sequence of slices is provided, each slice is applied to the corresponding dimension. If a str is provided, it is interpreted as a filter.

  • as_blosc2 (bool) – If True (default), the result will be returned as a Blosc2 object (either a SChunk or NDArray). If False, it will be returned as a NumPy array (equivalent to self[key]).

  • field (str) – Shortcut to access a field in a structured array. If provided, key is ignored.

Returns:

A new Blosc2 object containing the requested slice, or a NumPy array if as_blosc2 is False.

Return type:

NDArray or SChunk or numpy.ndarray

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://demo.caterva2.net')
>>> client.get_slice('example/ds-2d-fields.b2nd', (slice(0, 2), slice(0, 2)))[:]
array([[(0.0000000e+00, 1.       ), (5.0002502e-05, 1.00005  )],
       [(1.0000500e-02, 1.0100005), (1.0050503e-02, 1.0100505)]],
      dtype=[('a', '<f4'), ('b', '<f8')])
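
The difference between a single slice, a sequence of slices and a field can be seen on a local structured array. The sketch assumes that `key` and `field` behave like NumPy indexing on an array with the same field names as the example above; the array contents are made up and no server is contacted.

```python
import numpy as np

# Local structured array with the field names and dtypes from the example above.
dtype = np.dtype([("a", "<f4"), ("b", "<f8")])
data = np.zeros((4, 5), dtype=dtype)
data["b"] = 1.0

first_rows = data[slice(0, 2)]               # single slice: first dimension only
corner = data[(slice(0, 2), slice(0, 2))]    # one slice per dimension
field_b = data["b"]                          # field access: all values of field 'b'

print(first_rows.shape, corner.shape, field_b.shape)
```

Selecting a field yields an array of that field's base type instead of the structured dtype.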

move(src, dst)#

Moves a dataset or directory to a new location.

Parameters:
  • src (Path) – Source path of the dataset or directory.

  • dst (Path) – The destination path for the dataset or directory.

Returns:

New path of the moved dataset or directory.

Return type:

Path

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To move a file you need to be a registered user
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> path = f'@personal/dir{np.random.randint(0, 100)}/ds-4d.b2nd'
>>> uploaded_path = client.upload('root-example/dir2/ds-4d.b2nd', path)
>>> newpath = f'@personal/dir{np.random.randint(0, 100)}/ds-4d-moved.b2nd'
>>> moved_path = client.move(path, newpath)
>>> str(moved_path) == newpath
True
>>> path.replace('@personal/', '') in client.get_list('@personal')
False

remove(path)#

Removes a dataset or the contents of a directory from a remote repository.

Note: When a directory is removed, only its contents are deleted; the directory itself remains. This behavior allows for future uploads to the same directory. It is subject to change in future versions.

Parameters:

path (Path) – Path of the dataset or directory to remove.

Returns:

The path that was removed.

Return type:

Path

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To remove a file you need to be a registered user
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> path = f'@personal/dir{np.random.randint(0, 100)}/ds-4d.b2nd'
>>> uploaded_path = client.upload('root-example/dir2/ds-4d.b2nd', path)
>>> removed_path = client.remove(path)
>>> str(removed_path) == path
True

stack(srcs, dst, axis)#

Stacks the files in srcs along a new axis and stores the result in a new location dst.

Parameters:
  • srcs (list of Paths) – Source files, accessible by the client, to be stacked.

  • dst (Path) – The destination path for the file.

  • axis (int) – Axis along which to stack.

Returns:

The new path of the stacked file.

Return type:

Path

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # For stacking a file you need to be a registered user
>>> client = cat2.Client("https://cat2.cloud/demo", ("joedoe@example.com", "foobar"))
>>> root = client.get('@personal')
>>> root.upload('root-example/dir2/ds-4d.b2nd', "a.b2nd")
<Dataset: @personal/a.b2nd>
>>> root.upload('root-example/dir2/ds-4d.b2nd', "b.b2nd")
<Dataset: @personal/b.b2nd>
>>> client.stack(['@personal/a.b2nd', '@personal/b.b2nd'], '@personal/c.b2nd', axis=0)
PurePosixPath('@personal/c.b2nd')
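
The difference between `stack` and `concatenate` is easiest to see in the resulting shapes. The sketch below uses the NumPy functions of the same names on small local arrays, assuming the remote operations follow the same shape rules; it does not call the server.

```python
import numpy as np

# Two local arrays standing in for the remote source files.
a = np.zeros((2, 3))
b = np.ones((2, 3))

joined = np.concatenate([a, b], axis=0)   # existing axis 0 grows
stacked = np.stack([a, b], axis=0)        # a new leading axis is created

print(joined.shape, stacked.shape)
```

Concatenation keeps the number of dimensions, whereas stacking adds one dimension whose length is the number of sources.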

unfold(remotepath)#

Unfolds a dataset in the remote repository.

The container is always unfolded into a directory with the same name as the container, but without the extension.

Parameters:

remotepath (Path) – Path of the dataset to unfold.

Returns:

The path of the unfolded dataset.

Return type:

Path

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To unfold a file you need to be a registered user
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> client.unfold('@personal/dir/data.h5')
PurePosixPath('@personal/dir/data')
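
The naming rule (same name as the container, without the extension) can be reproduced locally to predict where the unfolded content will land. `unfolded_dir` is a hypothetical helper built on the standard library, not a caterva2 function.

```python
from pathlib import PurePosixPath

def unfolded_dir(remotepath):
    """Hypothetical helper: directory expected after unfolding remotepath."""
    # with_suffix("") drops only the last extension, e.g. '.h5'.
    return PurePosixPath(remotepath).with_suffix("")

print(unfolded_dir("@personal/dir/data.h5"))
```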