malva.utils module

exception malva.utils.FormatError(message)[source]

Bases: Exception

Exception raised for errors in the input format.

__init__(message)[source]
malva.utils.check_cell_string(cell='r1[2:27]')[source]

Validates and parses the ‘cell’ string parameter to ensure it follows the expected format and extracts the read group and index range.

Parameters:
  • cell (str) – A string specifying the read group and index range, in the format ‘r1[start:end]’ or ‘r2[start:end]’. Default is ‘r1[2:27]’.

Returns:

A tuple containing the read group (str) and the start (int) and end (int) indices parsed from the ‘cell’ string.

Return type:

tuple

Raises:

FormatError – If the ‘cell’ string does not match the expected format.
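
The validation described above can be sketched with a regular expression. This is a minimal illustration, not the library's implementation: the name parse_cell_string, the pattern, and the flat (read group, start, end) return shape are assumptions, and FormatError is redefined locally so the snippet runs on its own.

```python
import re


class FormatError(Exception):
    """Local stand-in for malva.utils.FormatError (keeps the sketch self-contained)."""


# Assumed pattern: read group r1 or r2, followed by [start:end] with integer indices.
_CELL_PATTERN = re.compile(r"^(r[12])\[(\d+):(\d+)\]$")


def parse_cell_string(cell="r1[2:27]"):
    """Return (read_group, start, end), or raise FormatError on a malformed string."""
    match = _CELL_PATTERN.match(cell)
    if match is None:
        raise FormatError(
            f"'{cell}' does not match 'r1[start:end]' or 'r2[start:end]'"
        )
    read_group, start, end = match.groups()
    return read_group, int(start), int(end)
```

Under these assumptions, a string such as 'r3[2:27]' or 'r1[2-27]' is rejected with FormatError rather than silently truncated.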

malva.utils.conditional_track(sequence, description=None, silent=False)[source]
malva.utils.safety_check_eval(s, danger='();.')[source]
malva.utils.get_module_path()[source]
malva.utils.save_pickle(obj, file_path)[source]

Save an object to a pickle file.

Parameters:
  • obj (any) – The object to be saved.

  • file_path (str) – The path to the pickle file.

Returns:

None

malva.utils.load_pickle(file_path)[source]

Load an object from a pickle file.

Parameters:

file_path (str) – The path to the pickle file.

Returns:

The loaded object.

Return type:

any
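
A save/load round trip of this kind can be sketched with the standard pickle module. The helper names below (save_pickle_sketch, load_pickle_sketch) are hypothetical stand-ins written for illustration; whether malva.utils passes any extra options to pickle is not stated here.

```python
import os
import pickle
import tempfile


def save_pickle_sketch(obj, file_path):
    """Serialize obj to file_path in binary mode."""
    with open(file_path, "wb") as handle:
        pickle.dump(obj, handle)


def load_pickle_sketch(file_path):
    """Deserialize and return the object stored at file_path."""
    with open(file_path, "rb") as handle:
        return pickle.load(handle)


# Round trip through a temporary directory that is removed afterwards.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.pkl")
    original = {"tile_id": [1, 2, 3], "name": "demo"}
    save_pickle_sketch(original, path)
    restored = load_pickle_sketch(path)
```

Unpickling can execute arbitrary code, so pickle files should only be loaded from trusted sources.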

malva.utils.check_file_exists(f, except_when=None)[source]

Check whether the file exists.

Parameters:
  • f (str) – Path to the input file.

  • except_when (bool) – Throw exception when file exists (or not). Default: None

Raises:

FileNotFoundError – If the file does not exist.

Return type:

bool

malva.utils.check_directory_exists(path, except_when=None)[source]

Check if a file exists, or if its parent directory exists.

Parameters:
  • path (str) – Path to the file or directory.

  • except_when (bool) – Throw exception when file exists (or not). Default: None

Returns:

True if the parent directory exists or if the file exists, False otherwise.

Return type:

bool

malva.utils.check_adata_structure(f)[source]

Check the validity of the input Open-ST h5 object.

Parameters:

f (str) – Path to the input Open-ST h5 object.

Raises:

KeyError – If required properties are not found in the file.

malva.utils.load_properties_from_adata(f, properties=['obsm/spatial'], backed=False)[source]

Load specified properties from an AnnData file (h5py format).

Parameters:
  • f (str) – Path to the AnnData h5py file.

  • properties (list, optional) – List of property paths to load from the file.

  • backed (bool, optional) – If True, data will not be read into memory.

Returns:

A dictionary containing the loaded properties.
  • For each property path specified in the ‘properties’ list:
    • The dictionary key is the property path.

    • The value is the corresponding parsed property data.

Return type:

dict

Notes

  • This function loads specified properties from an AnnData h5py file.

  • The ‘properties’ list should consist of property paths within the file.

  • Returns a dictionary where keys are property paths and values are the loaded data.

malva.utils.check_obs_unique(adata, obs_key='tile_id')[source]

Check if the values in a specified observation key in an AnnData object are unique.

Parameters:
  • adata (AnnData) – AnnData object to check for unique observations.

  • obs_key (str, optional) – The name of the observation key to check for uniqueness. Defaults to “tile_id”.

Returns:

True if the specified observation key has unique values, False otherwise.

Return type:

bool

Raises:

ValueError – If the specified observation key exists in the AnnData object but is not unique.

malva.utils.copytree2(source, dest)[source]

Recursively copy the contents of a source directory to a destination directory.

Parameters:
  • source (str) – The source directory to be copied.

  • dest (str) – The destination directory where the contents will be copied to.

Returns:

The path to the destination directory where the contents were copied.

Return type:

str

Notes

  • This function creates the destination directory and its parent directories if they do not exist.

  • It checks if the source and destination directories already exist and have the same size. If so, it skips copying.

  • If the source and destination directories differ in size or do not exist, it performs a recursive copy.
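
The size-based skip described in these notes could look roughly like the following. This is a sketch under assumptions: the names directory_size and copytree_if_changed are hypothetical, "size" is taken to mean the summed size of all regular files, and shutil.copytree with dirs_exist_ok (Python 3.8+) stands in for whatever copy routine the library actually uses.

```python
import os
import shutil


def directory_size(path):
    """Total size in bytes of all regular files under path (0 if path is missing)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def copytree_if_changed(source, dest):
    """Copy source into dest unless dest already exists with the same total size."""
    if os.path.isdir(dest) and directory_size(source) == directory_size(dest):
        return dest  # assumed skip condition: both trees have the same size
    # copytree creates dest and any missing parent directories;
    # dirs_exist_ok=True allows copying into an already existing tree.
    shutil.copytree(source, dest, dirs_exist_ok=True)
    return dest
```

Comparing sizes is cheap but approximate: two trees with different contents of equal total size would be treated as identical, so a checksum comparison would be needed where that matters.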

malva.utils.get_package_path()[source]

Get the absolute path of the directory containing the current Python package.

Returns:

Absolute path of the directory containing the current Python package.

Return type:

str

malva.utils.get_absolute_package_path(relative_path)[source]

Get the absolute path by concatenating the package path and the relative path.

Parameters:

relative_path (str) – Relative path from the package directory.

Returns:

Absolute path.

Return type:

str

malva.utils.h5_to_dict(adata)[source]

Recursively converts an h5py.Group object and its nested datasets into a nested dictionary structure.

Parameters:

adata (h5py.Group) – An h5py Group object to be converted.

Returns:

A nested dictionary representing the structure of the h5py Group object.

Leaf nodes contain strings representing the type and shape (if applicable) of the datasets. Non-leaf nodes contain nested dictionaries representing their child groups and datasets.

Return type:

dict

Notes

  • Leaf nodes in the resulting dictionary contain strings formatted as “{type}_{shape}”. If the dataset has no shape attribute (e.g., scalar dataset), shape will be None. Example: “<class ‘numpy.ndarray’>_(10,)”

  • Non-leaf nodes in the resulting dictionary contain nested dictionaries representing their child groups and datasets.
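
The recursive summary can be illustrated without an HDF5 file by applying the same traversal to a plain nested dict. Everything here is a stand-in chosen for illustration: describe_tree and FakeDataset are hypothetical names, the nested dict plays the role of the h5py.Group, and how the library actually derives the type and shape of each dataset is an assumption.

```python
from collections.abc import Mapping


class FakeDataset:
    """Hypothetical stand-in for an array-like dataset exposing a shape attribute."""

    def __init__(self, shape):
        self.shape = shape


def describe_tree(node):
    """Map a nested mapping to a dict whose leaves are '{type}_{shape}' strings."""
    summary = {}
    for key, value in node.items():
        if isinstance(value, Mapping):
            # Non-leaf node: recurse into the child group.
            summary[key] = describe_tree(value)
        else:
            # Leaf node: record the type and the shape (None if there is no shape).
            summary[key] = f"{type(value)}_{getattr(value, 'shape', None)}"
    return summary


tree = {"obsm": {"spatial": FakeDataset((10, 2))}, "title": "demo"}
result = describe_tree(tree)
```

In this sketch the string leaf has no shape attribute, so its summary ends in "_None", matching the scalar case described in the notes.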

malva.utils.write_key_to_h5(adata, key, data, delete_before=False)[source]
malva.utils.group_intervals(arr, min_interval)[source]
malva.utils.defragment_hdf5_file(input_file, output_file, dataset_name, chunk_size=None, compression=None)[source]

Defragment an HDF5 file by copying the dataset to a new file with optimized chunks and compression.

Parameters:
  • input_file (str) – The path to the original HDF5 file.

  • output_file (str) – The path to the new optimized HDF5 file.

  • dataset_name (str) – The name of the dataset to be defragmented.

  • chunk_size (tuple, optional) – The chunk size to be used for the new dataset. Defaults to (1000,).

  • compression (str, optional) – The compression method to be used for the new dataset. Defaults to None.

Returns:

None

malva.utils.download_url_to_file(url, dst, progress=True)[source]

Download object at the given URL to a local path.

Thanks to torch & cellpose

Parameters:
  • url (string) – URL of the object to download

  • dst (string) – Full path where object will be saved, e.g. /tmp/temporary_file

  • progress (bool, optional) – Whether to display a progress bar on stderr. Default: True

malva.utils.get_reference_cache(reference)[source]

Get the path to a cached reference file, downloading it if necessary.

Parameters:

reference (str) – Name of the reference to retrieve; must be in EXISTING_REFERENCES.

Returns:

Path to the cached reference file.

Return type:

str

malva.utils.convert_to_bytes(max_mem)[source]

Convert a memory size string to its equivalent in bytes.

Parameters:

max_mem (str) – A string representing a memory size, e.g., ‘100M’, ‘2G’, ‘500K’. Supports the units ‘K’, ‘M’, ‘G’, ‘T’ (case-insensitive). The ‘B’ suffix for bytes is optional. If no unit is specified, the input is assumed to be in bytes.

Returns:

The equivalent size in bytes.

Return type:

int

Raises:

ValueError – If the input string format is invalid.

Examples

>>> convert_to_bytes('100M')
104857600
>>> convert_to_bytes('2G')
2147483648
>>> convert_to_bytes('500K')
512000
>>> convert_to_bytes('1024')
1024
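
The documented examples imply binary multiples (1K = 1024 bytes). A minimal sketch of a parser with that behavior follows; the name to_bytes, the regular expression, and the restriction to whole numbers are assumptions made for illustration and may differ from the library's own parsing.

```python
import re

# Binary multipliers, consistent with the documented examples (100M -> 104857600).
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# Whole number, optional unit letter, optional 'B' suffix; case-insensitive.
_SIZE_PATTERN = re.compile(r"^\s*(\d+)\s*([KMGT]?)B?\s*$", re.IGNORECASE)


def to_bytes(max_mem):
    """Convert a size string such as '100M', '2GB' or '1024' to a number of bytes."""
    match = _SIZE_PATTERN.match(max_mem)
    if match is None:
        raise ValueError(f"Invalid memory size: {max_mem!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.upper()]
```

Under these assumptions, to_bytes('2gb') and to_bytes('2G') give the same result, and a string with an unknown unit such as '10X' raises ValueError.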