Client class#

A client represents a connection to a remote repository. It is the main entry point for using the Caterva2 API.

class caterva2.Client(urlbase, auth=None, timeout=5)#

Methods

append(remotepath, data)

Appends data to the remote location.

close()

Closes the httpx.Client instance associated with the Caterva2 Client.

copy(src, dst)

Copies a dataset or directory to a new location.

get_info(path)

Retrieves information about a specified dataset.

get_slice(path[, key, as_blosc2, field])

Get a slice of a File/Dataset.

load_from_url(urlpath, dataset)

Loads a remote dataset to a remote repository.

move(src, dst)

Moves a dataset or directory to a new location.

remove(path)

Removes a dataset or the contents of a directory from a remote repository.

unfold(remotepath)

Unfolds a dataset in the remote repository.

Special Methods:

__init__(urlbase[, auth, timeout])

Creates a client for the server at urlbase.

get(path)

Returns an object for the given path or object.

get_roots()

Retrieves the list of available roots.

get_list(path)

Lists datasets in a specified path.

fetch(path[, slice_])

Retrieves the entire content (or a specified slice) of a dataset.

get_chunk(path, nchunk)

Retrieves a specified compressed chunk from a file.

download(dataset[, localpath])

Downloads a dataset to local storage.

upload(local_dset, remotepath[, compute])

Uploads a local dataset to a remote repository.

adduser(newuser[, password, superuser])

Adds a user to the server.

deluser(user)

Deletes a user from the server.

listusers([username])

Lists the users in the server.

Constructor#

__init__(urlbase, auth=None, timeout=5)#

Creates a client for the server at urlbase.

Parameters:
  • urlbase (str) – Base URL of the server to query.

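  • auth (tuple, BasicAuth, optional) – Credentials for authentication, for example a (username, password) tuple.

  • timeout (optional) – Timeout for requests to the server. Defaults to 5.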

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client("https://cat2.cloud/demo")
>>> auth_client = cat2.Client("https://cat2.cloud/demo", ("joedoe@example.com", "foobar"))

Getting roots, files, datasets#

get(path)#

Returns an object for the given path or object.

Parameters:

path (Path | Dataset | File | Root) – Either the desired object itself, or the path to a root, file or dataset.

Returns:

Object – Object representing the root, file or dataset.

Return type:

Root, File, Dataset

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://cat2.cloud/demo')
>>> root = client.get('example')
>>> root.name
'example'
>>> file = client.get('example/README.md')
>>> file.name
'README.md'
>>> ds = client.get('example/ds-1d.b2nd')
>>> ds[:10]
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

get_roots()#

Retrieves the list of available roots.

Returns:

Dictionary mapping each available root name to a dictionary of its details:

  • name: the root name.

Return type:

dict

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://cat2.cloud/demo')
>>> roots_dict = client.get_roots()
>>> sorted(roots_dict.keys())
['@public', 'b2tests', 'example', 'h5example', 'h5lung_j2k', 'h5numbers_j2k']
>>> roots_dict['b2tests']
{'name': 'b2tests'}

get_list(path)#

Lists datasets in a specified path.

Parameters:

path (str) – Path to a root, directory or dataset.

Returns:

List of dataset names as strings, relative to the specified path.

Return type:

list
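The two listing calls compose naturally. The following is a minimal sketch, assuming only what is documented above: get_roots() returns a dict keyed by root name and get_list(root) returns paths relative to that root. The helper list_all_datasets and the class _StubClient are hypothetical names introduced here; the stub is an offline stand-in so the snippet runs without a server.

```python
def list_all_datasets(client):
    """Map every visible root name to the dataset paths it contains."""
    return {name: client.get_list(name) for name in sorted(client.get_roots())}


class _StubClient:
    """Hypothetical offline stand-in mimicking the two documented calls."""

    _tree = {"example": ["README.md", "dir1/ds-2d.b2nd"], "b2tests": []}

    def get_roots(self):
        return {name: {"name": name} for name in self._tree}

    def get_list(self, path):
        return list(self._tree[path])


listing = list_all_datasets(_StubClient())
print(listing)
```

With a real server, a caterva2.Client instance would be passed in place of the stub.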

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://cat2.cloud/demo')
>>> client.get_list('example')[:3]
['README.md', 'dir1/ds-2d.b2nd', 'dir1/ds-3d.b2nd']

Fetch / download / upload datasets#

fetch(path, slice_=None)#

Retrieves the entire content (or a specified slice) of a dataset.

Parameters:
  • path (str | Dataset) – Path or reference to the dataset.

  • slice_ (int, slice, tuple of ints and slices, or None) – Specifies the slice to fetch. If None, the whole dataset is fetched.

Returns:

The requested slice of the dataset as a NumPy array.

Return type:

numpy.ndarray
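The tuple form of slice_ can be tedious to write by hand. The sketch below uses a hypothetical helper, s_ (modelled on numpy.s_ and not part of Caterva2), so that ordinary indexing syntax builds the same int, slice and tuple-of-slices objects that slice_ accepts.

```python
class _KeyMaker:
    """Return whatever is written between the square brackets, unchanged."""

    def __getitem__(self, key):
        return key


s_ = _KeyMaker()  # hypothetical helper, same idea as numpy.s_

key_2d = s_[0:2, 0:2]   # tuple of slices, one per dimension
key_row = s_[5]         # plain int
key_range = s_[10:20]   # single slice

print(key_2d)
```

A call such as client.fetch(path, s_[0:2, 0:2]) would then pass the same key as the explicit (slice(0, 2), slice(0, 2)).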

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://cat2.cloud/demo')
>>> client.fetch('example/ds-2d-fields.b2nd', (slice(0, 2), slice(0, 2)))
array([[(0.0000000e+00, 1.       ), (5.0002502e-05, 1.00005  )],
       [(1.0000500e-02, 1.0100005), (1.0050503e-02, 1.0100505)]],
      dtype=[('a', '<f4'), ('b', '<f8')])

get_chunk(path, nchunk)#

Retrieves a specified compressed chunk from a file.

Parameters:
  • path (str | Dataset) – Path of the dataset or a Dataset instance.

  • nchunk (int) – ID of the unidimensional chunk to retrieve.

Returns:

The compressed chunk data.

Return type:

bytes

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://cat2.cloud/demo')
>>> info_schunk = client.get_info('example/ds-2d-fields.b2nd')['schunk']
>>> info_schunk['nchunks']
1
>>> info_schunk['cratio']
6.453000645300064
>>> chunk = client.get_chunk('example/ds-2d-fields.b2nd', 0)
>>> info_schunk['chunksize'] / len(chunk)
6.453000645300064

download(dataset, localpath=None)#

Downloads a dataset to local storage.

Note: If the dataset is a regular file and Blosc2 is installed, it will be downloaded and decompressed. Otherwise, it will remain compressed in its .b2 format.

Parameters:
  • dataset (Path) – Path to the dataset.

  • localpath (Path, optional) – Local path to save the downloaded dataset. Defaults to the current working directory if not specified.

Returns:

The path to the downloaded file on local disk.

Return type:

Path

Examples

>>> import caterva2 as cat2
>>> path = 'example/ds-2d-fields.b2nd'
>>> client = cat2.Client('https://cat2.cloud/demo')
>>> client.download(path)
PosixPath('example/ds-2d-fields.b2nd')

upload(local_dset, remotepath, compute=None)#

Uploads a local dataset to a remote repository.

Note: If local_dset is a regular file without a .b2nd, .b2frame or .b2 extension, it will be automatically compressed with Blosc2 on the server, adding a .b2 extension internally.

Parameters:
  • local_dset (Path | in-memory object) – Path to the local dataset or an in-memory object (convertible to blosc2.SChunk).

  • remotepath (Path) – Remote path to upload the dataset to.

  • compute (None | bool) – For LazyArray objects, boolean flag indicating whether to compute the result eagerly or not.

Returns:

Object – Object representing the file or dataset.

Return type:

File, Dataset
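The naming rule in the note above can be written down as a small function. This is only an illustration of the rule as stated, not the server's actual logic, which may differ; stored_name and BLOSC2_SUFFIXES are hypothetical names introduced here.

```python
from pathlib import PurePosixPath

# Extensions the note lists as already being Blosc2 containers.
BLOSC2_SUFFIXES = {".b2nd", ".b2frame", ".b2"}


def stored_name(path):
    """Return the name a file would have internally under the rule in the note."""
    p = PurePosixPath(path)
    if p.suffix in BLOSC2_SUFFIXES:
        return p  # already a Blosc2 file: kept as-is
    return p.with_name(p.name + ".b2")  # regular file: compressed, '.b2' appended


print(stored_name("@personal/README.md"))
print(stored_name("@personal/ds-4d.b2nd"))
```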

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To upload a file you need to be authenticated as an already registered user
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> newpath = f'@personal/dir{np.random.randint(0, 100)}/ds-4d.b2nd'
>>> uploaded_path = client.upload('root-example/dir2/ds-4d.b2nd', newpath)
>>> str(uploaded_path) == newpath
True

User management#

adduser(newuser, password=None, superuser=False)#

Adds a user to the server.

Parameters:
  • newuser (str) – Username of the user to add.

  • password (str, optional) – Password for the user to add.

  • superuser (bool, optional) – Indicates if the user is a superuser.

Returns:

An explanatory message about the operation’s success or failure.

Return type:

str

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To add a user you need to be a superuser
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> username = f'user{np.random.randint(0, 100)}@example.com'
>>> message = client.adduser(username, 'foo')
>>> f"User added: username='{username}' password='foo' superuser=False" == message
True

deluser(user)#

Deletes a user from the server.

Parameters:

user (str) – Username of the user to delete.

Returns:

An explanatory message about the operation’s success or failure.

Return type:

str
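Since adduser and deluser are natural counterparts, short-lived accounts (for tests, say) can be wrapped so that cleanup always happens. The following is a sketch, not part of Caterva2: temporary_user and _StubClient are hypothetical names, and the stub stands in for a client authenticated as a superuser.

```python
from contextlib import contextmanager


@contextmanager
def temporary_user(client, username, password):
    """Add a user for the duration of a `with` block, then delete it."""
    client.adduser(username, password)
    try:
        yield username
    finally:
        # Runs even if the body of the `with` block raises.
        client.deluser(username)


class _StubClient:
    """Hypothetical offline stand-in for a superuser caterva2.Client."""

    def __init__(self):
        self.users = set()

    def adduser(self, newuser, password=None, superuser=False):
        self.users.add(newuser)
        return f"User added: username='{newuser}'"

    def deluser(self, user):
        self.users.discard(user)
        return f"User deleted: {user}"


stub = _StubClient()
with temporary_user(stub, "tmp@example.com", "foo") as name:
    present_inside = name in stub.users
present_after = "tmp@example.com" in stub.users
print(present_inside, present_after)
```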

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To delete a user you need to be a superuser
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> username = f'user{np.random.randint(0, 100)}@example.com'
>>> _ = client.adduser(username, 'foo')
>>> message = client.deluser(username)
>>> message == f"User deleted: {username}"
True

listusers(username=None)#

Lists the users in the server.

Parameters:

username (str, optional) – Username of the specific user to list.

Returns:

A list of user dictionaries in the server.

Return type:

list of dict

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To list the users you need to be a superuser
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> users = client.listusers()
>>> sorted(users[0].keys())
['email', 'hashed_password', 'id', 'is_active', 'is_superuser', 'is_verified']
>>> username = f'user{np.random.randint(0, 100)}@example.com'
>>> _ = client.adduser(username, 'foo')
>>> updated_users = client.listusers()
>>> len(users) + 1 == len(updated_users)
True
>>> user_info = client.listusers(username)
>>> user_info[0]['is_superuser']
False
>>> superuser_info = client.listusers('superuser@example.com')
>>> superuser_info[0]['is_superuser']
True

Evaluating expressions#

Utility methods#

append(remotepath, data)#

Appends data to the remote location.

Parameters:
  • remotepath (Path) – Remote path of the dataset to enlarge.

  • data (blosc2.NDArray, np.ndarray, sequence) – The data to append.

Returns:

out – Object representing the modified dataset.

Return type:

Dataset

Examples

>>> import blosc2
>>> import caterva2 as cat2
>>> # To append data you need to be authenticated as an already registered user
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> path = '@personal/ds-1d.b2nd'
>>> client.copy('@public/examples/ds-1d.b2nd', path)
PurePosixPath('@personal/ds-1d.b2nd')
>>> ndarray = blosc2.arange(0, 10)
>>> client.append(path, ndarray)
(1010,)

close()#

Closes the httpx.Client instance associated with the Caterva2 Client.

Return type:

None
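One way to make sure close() is always called is the standard library's contextlib.closing, which works with any object exposing a close() method. The sketch below uses a hypothetical _StubClient so that it runs offline; whether Client also supports the `with` statement directly is not stated here, so this does not rely on it.

```python
from contextlib import closing


class _StubClient:
    """Hypothetical offline stand-in exposing get_roots() and close()."""

    def __init__(self):
        self.closed = False

    def get_roots(self):
        return {"example": {"name": "example"}}

    def close(self):
        self.closed = True


stub = _StubClient()
with closing(stub) as client:
    root_names = sorted(client.get_roots())
# close() has been called here, even if the block had raised.
print(root_names, stub.closed)
```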

copy(src, dst)#

Copies a dataset or directory to a new location.

Parameters:
  • src (Path | File instance) – Path of the source dataset or directory, or the File instance itself.

  • dst (Path) – Destination path for the dataset or directory.

Returns:

Object – Reference to copied object in copy location.

Return type:

Dataset, File
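To confirm that a copy landed where expected, the destination can be looked up in the listing of its root. A minimal sketch, assuming paths have the form 'root/relative/path' and that get_list(root) returns paths relative to the root; dataset_exists and _StubClient are hypothetical names, and the stub is only there for offline execution.

```python
def dataset_exists(client, path):
    """Check whether `path` ('root/relative/path') is listed under its root."""
    root, _, relpath = path.partition("/")
    return relpath in client.get_list(root)


class _StubClient:
    """Hypothetical offline stand-in for the documented get_list call."""

    def get_list(self, path):
        return ["dir1/ds-4d.b2nd", "dir2/ds-4d-copy.b2nd"]


stub = _StubClient()
found = dataset_exists(stub, "@personal/dir2/ds-4d-copy.b2nd")
missing = dataset_exists(stub, "@personal/dir3/other.b2nd")
print(found, missing)
```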

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To copy a file you need to be a registered user
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> src_path = f'@personal/dir{np.random.randint(0, 100)}/ds-4d.b2nd'
>>> uploaded_path = client.upload('root-example/dir2/ds-4d.b2nd', src_path)
>>> copy_path = f'@personal/dir{np.random.randint(0, 100)}/ds-4d-copy.b2nd'
>>> copied_path = client.copy(src_path, copy_path)
>>> str(copied_path) == copy_path
True
>>> datasets = client.get_list('@personal')
>>> src_path.replace('@personal/', '') in datasets
True
>>> copy_path.replace('@personal/', '') in datasets
True

get_info(path)#

Retrieves information about a specified dataset.

Parameters:

path (str | Dataset | File) – Path to the dataset.

Returns:

Dictionary of dataset properties, mapping property names to their values.

Return type:

dict
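The returned dictionary is plain data, so derived quantities are easy to compute locally. As a small sketch, the hypothetical helper element_count below derives the number of elements from the 'shape' entry; the sample shape is the one reported for example/ds-2d-fields.b2nd in this reference.

```python
import math


def element_count(info):
    """Number of elements in the dataset described by `info`."""
    return math.prod(info["shape"])


info = {"shape": [100, 200]}  # only the key used here; real results carry more keys
print(element_count(info))  # 20000
```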

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://cat2.cloud/demo')
>>> path = 'example/ds-2d-fields.b2nd'
>>> info = client.get_info(path)
>>> info.keys()
dict_keys(['shape', 'chunks', 'blocks', 'dtype', 'schunk', 'mtime'])
>>> info['shape']
[100, 200]

get_slice(path, key=None, as_blosc2=True, field=None)#

Get a slice of a File/Dataset.

Parameters:
  • path (str, Dataset, File) – Desired object to slice.

  • key (int, slice, sequence of slices or str) – The slice to retrieve. If a single slice is provided, it will be applied to the first dimension. If a sequence of slices is provided, each slice will be applied to the corresponding dimension. If a str is provided, it is interpreted as a filter.

  • as_blosc2 (bool) – If True (default), the result will be returned as a Blosc2 object (either a SChunk or NDArray). If False, it will be returned as a NumPy array (equivalent to self[key]).

  • field (str) – Shortcut to access a field in a structured array. If provided, key is ignored.

Returns:

The requested slice, as a new Blosc2 object or, if as_blosc2 is False, as a NumPy array.

Return type:

NDArray or SChunk or numpy.ndarray

Examples

>>> import caterva2 as cat2
>>> client = cat2.Client('https://cat2.cloud/demo')
>>> client.get_slice('example/ds-2d-fields.b2nd', (slice(0, 2), slice(0, 2)))[:]
array([[(0.0000000e+00, 1.       ), (5.0002502e-05, 1.00005  )],
       [(1.0000500e-02, 1.0100005), (1.0050503e-02, 1.0100505)]],
      dtype=[('a', '<f4'), ('b', '<f8')])

load_from_url(urlpath, dataset)#

Loads a remote dataset to a remote repository.

Parameters:
  • urlpath (Path) – URL of the remote third-party dataset.

  • dataset (Path) – Remote path to place the dataset into.

Returns:

Object – Object representing the file or dataset.

Return type:

File, Dataset

move(src, dst)#

Moves a dataset or directory to a new location.

Parameters:
  • src (Path | File instance) – Path of the source dataset or directory, or the File instance itself.

  • dst (Path) – The destination path for the dataset or directory.

Returns:

Object – Reference to object in new location.

Return type:

Dataset, File

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To move a file you need to be a registered user
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> path = f'@personal/dir{np.random.randint(0, 100)}/ds-4d.b2nd'
>>> uploaded_path = client.upload('root-example/dir2/ds-4d.b2nd', path)
>>> newpath = f'@personal/dir{np.random.randint(0, 100)}/ds-4d-moved.b2nd'
>>> moved_path = client.move(path, newpath)
>>> str(moved_path) == newpath
True
>>> path.replace('@personal/', '') in client.get_list('@personal')
False

remove(path)#

Removes a dataset or the contents of a directory from a remote repository.

Note: When a directory is removed, only its contents are deleted; the directory itself remains. This behavior allows for future uploads to the same directory. It is subject to change in future versions.

Parameters:

path (Path | File instance) – Path of the dataset (or dataset itself) or directory to remove.

Returns:

The path that was removed.

Return type:

Path

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To remove a file you need to be a registered user
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> path = f'@personal/dir{np.random.randint(0, 100)}/ds-4d.b2nd'
>>> uploaded_path = client.upload('root-example/dir2/ds-4d.b2nd', path)
>>> removed_path = client.remove(path)
>>> removed_path == path
True

unfold(remotepath)#

Unfolds a dataset in the remote repository.

The container is always unfolded into a directory with the same name as the container, but without the extension.

Parameters:

remotepath (Path | File) – Path of the dataset to unfold.

Returns:

out – Root of the unfolded dataset.

Return type:

str
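The target directory follows directly from the rule above: same name as the container, minus the extension. A short sketch of that rule using only the standard library; unfolded_dir is a hypothetical helper for predicting the location, not a Caterva2 function.

```python
from pathlib import PurePosixPath


def unfolded_dir(remotepath):
    """Directory a container would be unfolded into, per the rule above."""
    return PurePosixPath(remotepath).with_suffix("")


print(unfolded_dir("@personal/dir/data.h5"))
```

Only the last extension is dropped, so a name with several dots keeps everything before the final one.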

Examples

>>> import caterva2 as cat2
>>> import numpy as np
>>> # To unfold a file you need to be a registered user
>>> client = cat2.Client('https://cat2.cloud/demo', ("joedoe@example.com", "foobar"))
>>> client.unfold('@personal/dir/data.h5')
PurePosixPath('@personal/dir/data')