index
Build a single Malva Index (k-mer index) from single-cell or spatial transcriptomic sequencing reads.
In a Malva Index, each k-mer is colored by (that is, labeled with) every cell in which it appears.
usage: malva index [-h] --reads-in READS_IN READS_IN --spatial-bc-in SPATIAL_BC_IN --index-out INDEX_OUT [--flavor FLAVOR] [--kmer-length KMER_LENGTH] [--bulk-id BULK_ID] [--chunksize CHUNKSIZE]
[--overlapping] [--merge-chunks] [--threads THREADS]
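As a minimal sketch of how the usage line above translates into a concrete call, the following Python snippet assembles the argument list for `malva index`. The helper `build_index_command` and all file names are hypothetical and not part of Malva; only the flags are taken from the usage line. The helper also checks the documented precondition that the output directory must not already contain `malva_index.h5`.

```python
import shlex
from pathlib import Path


def build_index_command(r1, r2, barcodes, index_out, flavor="openst",
                        kmer_length=24, threads=1,
                        overlapping=False, merge_chunks=False):
    """Assemble a `malva index` argument list (hypothetical helper)."""
    out = Path(index_out)
    # Documented precondition: the output directory must not already
    # contain a file named malva_index.h5.
    if (out / "malva_index.h5").exists():
        raise FileExistsError(f"{out} already contains malva_index.h5")
    cmd = [
        "malva", "index",
        "--reads-in", str(r1), str(r2),      # R1 and R2 FASTQ files
        "--spatial-bc-in", str(barcodes),    # table with columns BC, X, Y
        "--index-out", str(out),
        "--flavor", flavor,
        "--kmer-length", str(kmer_length),
        "--threads", str(threads),
    ]
    # Boolean switches are appended only when requested.
    if overlapping:
        cmd.append("--overlapping")
    if merge_chunks:
        cmd.append("--merge-chunks")
    return cmd


# Hypothetical input and output paths, for illustration only.
cmd = build_index_command("sample_R1.fastq", "sample_R2.fastq",
                          "barcodes.txt", "malva_out",
                          threads=4, merge_chunks=True)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where `malva` is installed.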
Named Arguments
- --reads-in
- Pair of paired-end FASTQ files (R1 and R2) containing the transcript sequence,
the UMI, and the cell (spatial) barcode.
- --spatial-bc-in
- Tabular file containing the columns BC, X, and Y:
BC: the cell (spatial) barcode sequence
X: x spatial coordinate (any units)
Y: y spatial coordinate (any units)
- --index-out
- Directory where the Malva index (and its metadata) will be written.
If the directory already exists, it must not contain a file named malva_index.h5; otherwise, an exception is raised.
- --flavor
- Sequencing technology (spatial, single-cell, or bulk) that produced the reads.
Each flavor is a preset configuration for reading the paired FASTQ (or BAM) files. Other configurations can be provided as a properly formatted .yaml file (see documentation).
Supported values: 'openst', 'stereo_seq', 'slide_seq', 'visium', 'seq_scope_v1', 'sc_10x_v1', 'sc_10x_v3', 'bulk', or a path to a .yaml file.
Default:
'openst'
- --kmer-length
- Length (in nucleotides) of indexed k-mers, non-overlapping.
Default:
24
- --bulk-id
- When the technology is bulk, all reads are assigned this ID. This also applies to Smart-seq and other well-based technologies.
Default:
1
- --chunksize
- Number of consecutive reads accumulated in RAM before being written to the index.
Lower this value to reduce RAM usage (indexing might be slower).
Default:
100000000
- --overlapping
- By default, the index stores non-overlapping k-mers.
With this option, overlapping k-mers are indexed, which increases sensitivity to mutations at query time but also increases the index build time and the index size.
Default:
False
- --merge-chunks
- When the chunk size is smaller than the total number of reads, the index file contains
several separate chunks. When this option is provided, the chunks are merged into a single one, which reduces index size and improves query speed at the cost of some additional processing time.
Default:
False
- --threads
- Number of threads used for parallel processing.
Default:
1
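To illustrate the difference between the default behavior and `--overlapping`, here is a small Python sketch that extracts k-mers from a toy read in both modes. The function `kmers` is hypothetical and only shows the general idea of tiling versus sliding windows; how Malva itself handles read ends or trailing bases is not covered here and may differ.

```python
def kmers(seq, k, overlapping=False):
    """Return the k-mers of seq: step 1 if overlapping, step k otherwise."""
    step = 1 if overlapping else k
    # Stop at len(seq) - k so that only full-length k-mers are produced.
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, step)]


read = "ACGTACGGTCAT"  # toy 12-nt read, for illustration only

tiled = kmers(read, k=4)                      # non-overlapping (default)
sliding = kmers(read, k=4, overlapping=True)  # as with --overlapping

print(len(tiled), tiled)
print(len(sliding), sliding)
```

In this toy case the non-overlapping mode yields 3 k-mers and the overlapping mode yields 9, which is why the overlapping index is larger and slower to build while covering every k-mer offset in the read.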
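The interaction between `--chunksize` and `--merge-chunks` comes down to simple arithmetic, sketched below in Python. The function `n_chunks` is hypothetical, and it assumes that the chunk size is counted in reads and that chunks are filled sequentially, as the `--merge-chunks` description suggests.

```python
def n_chunks(total_reads, chunksize=100_000_000):
    """Number of chunks needed to hold total_reads (ceiling division)."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_reads // chunksize)


# With the default chunk size, 250 million reads span three chunks,
# so --merge-chunks would have something to merge.
print(n_chunks(250_000_000))

# 80 million reads fit into a single chunk; merging changes nothing.
print(n_chunks(80_000_000))

# A smaller chunk size lowers peak RAM but produces more chunks.
print(n_chunks(250_000_000, chunksize=10_000_000))
```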