index

Build a single Malva Index (k-mer index) from single-cell or spatial transcriptomic sequencing reads.

In a single Malva Index, each k-mer is colored by (that is, labeled with) every cell in which it appears.
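To make the idea of "coloring" concrete, here is a minimal conceptual sketch in Python. It is not Malva's actual data structure or API: the function name `build_colored_index`, the toy barcodes and reads, and the choice to start non-overlapping windows at position 0 are all assumptions made for illustration.

```python
from collections import defaultdict


def build_colored_index(reads_by_cell, k):
    """Map each k-mer to the set of cell barcodes ("colors") it occurs in.

    Hypothetical helper for illustration only; not part of Malva.
    """
    index = defaultdict(set)
    for cell, reads in reads_by_cell.items():
        for read in reads:
            # Non-overlapping windows starting at position 0 (an assumption).
            for start in range(0, len(read) - k + 1, k):
                index[read[start:start + k]].add(cell)
    return dict(index)


# Toy data: two made-up cell barcodes, k = 4 for readability.
reads_by_cell = {
    "AAAC": ["ACGTTTGA"],
    "GGGT": ["ACGTCCCC", "TTGAACGT"],
}
index = build_colored_index(reads_by_cell, k=4)
for kmer in sorted(index):
    print(kmer, sorted(index[kmer]))
```

A k-mer seen in both cells carries both colors, while a k-mer seen in one cell carries only that cell's color.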

usage: malva index [-h] --reads-in READS_IN READS_IN --spatial-bc-in SPATIAL_BC_IN --index-out INDEX_OUT [--flavor FLAVOR] [--kmer-length KMER_LENGTH] [--bulk-id BULK_ID] [--chunksize CHUNKSIZE]
                   [--overlapping] [--merge-chunks] [--threads THREADS]
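As a sketch of how the usage line translates into a concrete call, the Python snippet below assembles an argument list from the documented flags and prints it without running anything. The helper `malva_index_command` and all file and directory names are hypothetical placeholders.

```python
import shlex


def malva_index_command(reads_r1, reads_r2, spatial_bc, index_out,
                        flavor="openst", kmer_length=24, threads=1):
    """Assemble the argv list for `malva index` from the documented flags.

    Hypothetical helper; defaults mirror the defaults listed on this page.
    """
    return [
        "malva", "index",
        "--reads-in", reads_r1, reads_r2,   # the flag takes two values (R1, R2)
        "--spatial-bc-in", spatial_bc,
        "--index-out", index_out,
        "--flavor", flavor,
        "--kmer-length", str(kmer_length),
        "--threads", str(threads),
    ]


# Placeholder paths; replace them with real files before running.
cmd = malva_index_command("sample_R1.fastq", "sample_R2.fastq",
                          "barcodes.tsv", "malva_out", threads=4)
print(shlex.join(cmd))
# To execute (requires malva on PATH): subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems when paths contain spaces.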

Named Arguments

--reads-in

Pair of FASTQ files (paired-end, R1/R2 structure) containing the transcriptomic information, the UMI, and the cell (spatial) barcode.

--spatial-bc-in

Tabular file containing the columns BC, X, Y. BC: the cell (spatial) barcode sequence; X: x spatial coordinate (any units); Y: y spatial coordinate (any units).
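The following Python sketch writes a small table with the three documented columns. The text above does not specify the delimiter or whether a header row is expected, so the tab separator, the header line, and the helper name `write_barcode_table` are assumptions; the barcodes and coordinates are made up.

```python
import csv
import io


def write_barcode_table(rows, handle, delimiter="\t"):
    """Write (BC, X, Y) rows to a file-like object.

    Assumes a tab-separated layout with a header row; check the format
    Malva actually expects before using this.
    """
    writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["BC", "X", "Y"])
    for barcode, x, y in rows:
        writer.writerow([barcode, x, y])


# Made-up barcodes and coordinates, written to memory for demonstration.
buf = io.StringIO()
write_barcode_table(
    [("ACGTACGTACGT", 10.5, 20.0), ("TTGACCATGGCA", 11.0, 20.0)],
    buf,
)
print(buf.getvalue())
```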

--index-out

Directory where the Malva index (and its metadata) will be written. If the directory already exists, it must not contain a file called malva_index.h5; otherwise, an exception is thrown.
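Because an existing malva_index.h5 makes indexing fail, it can be useful to check the output directory before starting a long run. The Python sketch below does this check independently of Malva; the helper name `index_dir_is_free` is hypothetical, and only the file name comes from the description above.

```python
import tempfile
from pathlib import Path

INDEX_FILENAME = "malva_index.h5"  # file name taken from the description above


def index_dir_is_free(index_out):
    """Return True if index_out does not already hold a malva_index.h5 file."""
    return not (Path(index_out) / INDEX_FILENAME).exists()


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    print(index_dir_is_free(tmp))         # empty directory
    (Path(tmp) / INDEX_FILENAME).touch()
    print(index_dir_is_free(tmp))         # index file now present
```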

--flavor

Spatial transcriptomics technology.

Each flavor is a default configuration for reading the paired FASTQ (or BAM) files. Other configurations can be provided as a properly formatted .yaml file; see the documentation.

Supported flavors: 'openst', 'stereo_seq', 'slide_seq', 'visium', 'seq_scope_v1', 'sc_10x_v1', 'sc_10x_v3', 'bulk', or a path to a .yaml file.

Default: 'openst'

--kmer-length

Length (in nucleotides) of the indexed k-mers. K-mers are non-overlapping unless --overlapping is set.

Default: 24

--bulk-id

When the flavor is 'bulk', all reads are assigned this ID. This also applies to Smart-seq and other well-based technologies.

Default: 1

--chunksize

Number of consecutive reads accumulated in RAM before being written to the index. Consider reducing this number to lower RAM usage (indexing might be slower).

Default: 100000000

--overlapping

By default, the index stores non-overlapping k-mers. With this option, overlapping k-mers are indexed, which increases sensitivity to mutations at query time but also increases the index size and the time needed to build it.

Default: False
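The size trade-off between the two modes can be seen with a small Python sketch. The function `kmers` is a hypothetical helper, and starting the non-overlapping windows at position 0 is an assumption about how a read is tiled, not a statement about Malva's implementation.

```python
def kmers(seq, k, overlapping=False):
    """Return the k-mers of seq, either tiled (step k) or sliding (step 1)."""
    step = 1 if overlapping else k
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, step)]


read = "ACGT" * 25  # toy 100-nt read
k = 24              # default --kmer-length

tiled = kmers(read, k)
sliding = kmers(read, k, overlapping=True)
print(len(tiled), len(sliding))
```

For a 100-nt read and k = 24, tiling yields 4 k-mers while sliding yields 77. This is why overlapping indexing costs more time and space, and why it leaves more intact k-mers around a mutated position.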

--merge-chunks

When the chunk size is smaller than the total number of reads, the index file contains several separate chunks. With this option, the chunks are merged into a single one, which reduces index size and improves query speed at the cost of some additional processing time.

Default: False
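As a rough guide to when --merge-chunks matters, the Python sketch below estimates how many chunks a run produces. It assumes that --chunksize counts reads, as the description above implies; the helper name `n_chunks` is hypothetical.

```python
def n_chunks(total_reads, chunksize=100_000_000):
    """Estimate the number of chunks written for total_reads reads.

    Assumes chunksize is measured in reads (ceiling division).
    """
    return (total_reads + chunksize - 1) // chunksize


print(n_chunks(50_000_000))                # fits in one chunk
print(n_chunks(250_000_000))               # default chunk size
print(n_chunks(250_000_000, 50_000_000))   # smaller chunks, less RAM
```

With the default chunk size, 250 million reads give 3 chunks. Lowering the chunk size to 50 million gives 5, which is the situation where merging is most likely to pay off.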

--threads

Number of threads used for parallel processing.

Default: 1