malva.utils module

exception malva.utils.FormatError(message)

Bases: Exception

Exception raised for errors in the input format.

__init__(message)

malva.utils.check_cell_string(cell='r1[2:27]')

Validates and parses the ‘cell’ string parameter to ensure it follows the expected format and extracts the read group and index range.

Parameters:

cell (str) – A string specifying the read group and index range in the format ‘r1[start:end]’ or ‘r2[start:end]’. Default is ‘r1[2:27]’.

Returns:

A tuple containing the read group (str) and the start (int) and end (int) indices parsed from the ‘cell’ string.

Return type:

tuple

Raises:

FormatError – If the ‘cell’ string does not match the expected format.
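The parsing described above can be sketched with a regular expression. This is an illustration only, not malva's implementation: the function name parse_cell_string, the pattern, the flat (read_group, start, end) return shape, and the local FormatError stand-in are all assumptions.

```python
import re


class FormatError(Exception):
    """Local stand-in for malva.utils.FormatError (assumption)."""


# Hypothetical pattern: read group r1 or r2, then [start:end] with integer bounds.
_CELL_PATTERN = re.compile(r"^(r[12])\[(\d+):(\d+)\]$")


def parse_cell_string(cell="r1[2:27]"):
    """Return (read_group, start, end) parsed from a string like 'r1[2:27]'."""
    match = _CELL_PATTERN.match(cell)
    if match is None:
        raise FormatError(f"Invalid cell string: {cell!r}")
    read_group, start, end = match.groups()
    return read_group, int(start), int(end)


print(parse_cell_string("r1[2:27]"))
```

Whether the real function returns a flat tuple or nests the indices is not stated in this page, so check the source before relying on the shape.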

malva.utils.conditional_track(sequence, description=None, silent=False)

malva.utils.safety_check_eval(s, danger='();.')

malva.utils.get_module_path()

malva.utils.save_pickle(obj, file_path)

Save an object to a pickle file.

Parameters:

obj (any) – The object to be saved.
file_path (str) – The path to the pickle file.

Returns:

None

malva.utils.load_pickle(file_path)

Load an object from a pickle file.

Parameters:

file_path (str) – The path to the pickle file.

Returns:

The loaded object.

Return type:

any
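A save/load round trip can be sketched with the standard pickle module. The two helpers below are minimal stand-ins written for this example, not the malva functions themselves; the file name example.pkl and the sample dictionary are arbitrary.

```python
import os
import pickle
import tempfile


def save_pickle(obj, file_path):
    """Stand-in: serialize obj to file_path in binary mode."""
    with open(file_path, "wb") as handle:
        pickle.dump(obj, handle)


def load_pickle(file_path):
    """Stand-in: read one pickled object back from file_path."""
    with open(file_path, "rb") as handle:
        return pickle.load(handle)


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.pkl")
    original = {"tile_id": [1, 2, 3], "name": "demo"}
    save_pickle(original, path)
    restored = load_pickle(path)

print(restored == original)
```

Unpickling can execute arbitrary code, so only load pickle files from sources you trust.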

malva.utils.check_file_exists(f, except_when=None)

Check whether the file exists.

Parameters:

f (str) – Path to the input file.
except_when (bool) – Throw exception when file exists (or not). Default: None

Raises:

FileNotFoundError – If the file does not exist.

Return type:

bool

malva.utils.check_directory_exists(path, except_when=None)

Check if a file exists, or if its parent directory exists.

Parameters:

path (str) – Path to the file or directory.
except_when (bool) – Throw exception when file exists (or not). Default: None

Returns:

True if the parent directory exists or if the file exists, False otherwise.

Return type:

bool

malva.utils.check_adata_structure(f)

Check the validity of the input Open-ST h5 object.

Parameters:

f (str) – Path to the input Open-ST h5 object.

Raises:

KeyError – If required properties are not found in the file.

malva.utils.load_properties_from_adata(f, properties=['obsm/spatial'], backed=False)

Load specified properties from an AnnData file (h5py format).

Parameters:

f (str) – Path to the AnnData h5py file.
properties (list, optional) – List of property paths to load from the file.
backed (bool, optional) – If True, data will not be read into memory.

Returns:

A dictionary containing the loaded properties.

For each property path specified in the ‘properties’ list:
- The dictionary key is the property path.
- The value is the corresponding parsed property data.

Return type:

dict

Notes

This function loads specified properties from an AnnData h5py file.
The ‘properties’ list should consist of property paths within the file.
Returns a dictionary where keys are property paths and values are the loaded data.

malva.utils.check_obs_unique(adata, obs_key='tile_id')

Check if the values in a specified observation key in an AnnData object are unique.

Parameters:

adata (AnnData) – AnnData object to check for unique observations.
obs_key (str, optional) – The name of the observation key to check for uniqueness. Defaults to “tile_id”.

Returns:

True if the specified observation key has unique values, False otherwise.

Return type:

bool

Raises:

ValueError – If the specified observation key exists in the AnnData object but is not unique.

malva.utils.copytree2(source, dest)

Recursively copy the contents of a source directory to a destination directory.

Parameters:

source (str) – The source directory to be copied.
dest (str) – The destination directory where the contents will be copied to.

Returns:

The path to the destination directory where the contents were copied.

Return type:

str

Notes

This function creates the destination directory and its parent directories if they do not exist.
It checks if the source and destination directories already exist and have the same size. If so, it skips copying.
If the source and destination directories differ in size or do not exist, it performs a recursive copy.
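The skip-if-same-size behaviour described in the notes can be sketched with os.walk and shutil.copytree. This is a hedged approximation: the helper names directory_size and copytree_if_changed are hypothetical, and comparing total byte counts is one plausible reading of "have the same size", not a confirmed detail of copytree2.

```python
import os
import shutil


def directory_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def copytree_if_changed(source, dest):
    """Copy source into dest unless dest exists with the same total size."""
    if os.path.isdir(dest) and directory_size(source) == directory_size(dest):
        return dest  # same total size: assume the copy is already in place
    # copytree creates dest and any missing parents; dirs_exist_ok needs Python 3.8+.
    shutil.copytree(source, dest, dirs_exist_ok=True)
    return dest
```

A size comparison is cheap but cannot detect files that changed without changing length; a checksum comparison would be stricter at a higher cost.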

malva.utils.get_package_path()

Get the absolute path of the directory containing the current Python package.

Returns:

Absolute path of the directory containing the current Python package.

Return type:

str

malva.utils.get_absolute_package_path(relative_path)

Get the absolute path by concatenating the package path and the relative path.

Parameters:

relative_path (str) – Relative path from the package directory.

Returns:

Absolute path.

Return type:

str

malva.utils.h5_to_dict(adata)

Recursively converts an h5py.Group object and its nested datasets into a nested dictionary structure.

Parameters:

adata (h5py.Group) – An h5py Group object to be converted.

Returns:

A nested dictionary representing the structure of the h5py Group object. Leaf nodes contain strings representing the type and shape (if applicable) of the datasets. Non-leaf nodes contain nested dictionaries representing their child groups and datasets.

Return type:

dict

Notes

Leaf nodes in the resulting dictionary contain strings formatted as "{type}_{shape}". If the dataset has no shape attribute (e.g., a scalar dataset), shape will be None. Example: "<class 'numpy.ndarray'>_(10,)"
Non-leaf nodes in the resulting dictionary contain nested dictionaries representing their child groups and datasets.
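The recursion and the "{type}_{shape}" leaf format can be illustrated without h5py by walking any mapping-like tree. Everything here is an assumption made for the example: describe_tree and FakeDataset are hypothetical names, plain dicts stand in for h5py groups, and the real function may derive the type from the dataset's contents rather than from the node object.

```python
def describe_tree(node):
    """Recursively summarize a mapping-like tree as nested dicts of strings."""
    summary = {}
    for key, value in node.items():
        if hasattr(value, "items"):
            # Non-leaf: recurse into the child group.
            summary[key] = describe_tree(value)
        else:
            # Leaf: record "{type}_{shape}"; shape is None when absent.
            summary[key] = f"{type(value)}_{getattr(value, 'shape', None)}"
    return summary


class FakeDataset:
    """Hypothetical stand-in for a dataset exposing a shape attribute."""

    def __init__(self, shape):
        self.shape = shape


tree = {"obsm": {"spatial": FakeDataset((10, 2))}, "n_obs": 10}
result = describe_tree(tree)
print(result)
```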

malva.utils.write_key_to_h5(adata, key, data, delete_before=False)

malva.utils.binary_search(arr, low, high, x)

malva.utils.group_intervals(arr, min_interval)

malva.utils.defragment_hdf5_file(input_file, output_file, dataset_name, chunk_size=None, compression=None)

Defragment an HDF5 file by copying the dataset to a new file with optimized chunks and compression.

Parameters:

input_file (str) – The path to the original HDF5 file.
output_file (str) – The path to the new optimized HDF5 file.
dataset_name (str) – The name of the dataset to be defragmented.
chunk_size (tuple, optional) – The chunk size to be used for the new dataset. Defaults to (1000,).
compression (str, optional) – The compression method to be used for the new dataset. Defaults to None.

Returns:

None

malva.utils.download_url_to_file(url, dst, progress=True)

Download the object at the given URL to a local path. Thanks to torch & cellpose.

Parameters:

url (string) – URL of the object to download
dst (string) – Full path where object will be saved, e.g. /tmp/temporary_file
progress (bool, optional) – Whether to display a progress bar on stderr. Default: True

malva.utils.get_reference_cache(reference)

Get the path to a cached reference file, downloading it if necessary.

Parameters:

reference (str) – Name of the reference to retrieve; must be in EXISTING_REFERENCES.

Returns:

Path to the cached reference file.

Return type:

str

malva.utils.convert_to_bytes(max_mem)

Convert a memory size string to its equivalent in bytes.

Parameters:

max_mem (str) – A string representing a memory size, e.g. ‘100M’, ‘2G’, ‘500K’. Supports the units ‘K’, ‘M’, ‘G’, ‘T’ (case-insensitive). The ‘B’ suffix for bytes is optional. If no unit is specified, the input is assumed to be in bytes.

Returns:

The equivalent size in bytes.

Return type:

int

Raises:

ValueError – If the input string format is invalid.

Examples

>>> convert_to_bytes('100M')
104857600
>>> convert_to_bytes('2G')
2147483648
>>> convert_to_bytes('500K')
512000
>>> convert_to_bytes('1024')
1024
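The conversion rules above (binary multiples, optional ‘B’ suffix, case-insensitive units) can be sketched as follows. The name convert_to_bytes_sketch, the regular expression, and the restriction to whole-number values are assumptions for this example; the actual implementation may accept other inputs.

```python
import re

# Binary multipliers implied by the documented examples (100M -> 104857600).
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# Assumed grammar: digits, optional unit letter, optional trailing 'B'.
_SIZE_PATTERN = re.compile(r"\s*(\d+)\s*([KMGT]?)B?\s*", flags=re.IGNORECASE)


def convert_to_bytes_sketch(max_mem):
    """Convert strings such as '100M', '2GB' or '1024' to a byte count."""
    match = _SIZE_PATTERN.fullmatch(str(max_mem))
    if match is None:
        raise ValueError(f"Invalid memory size: {max_mem!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.upper()]


print(convert_to_bytes_sketch("100M"))
```

Using fullmatch rejects trailing garbage such as ‘10X’ instead of silently ignoring it.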