malva.complexity module
- malva.complexity.overlapping_windows(sequence, L)
Returns overlapping windows of size L from sequence.
- Parameters:
sequence – the nucleotide or protein sequence to scan over.
L – the length of the windows to yield.
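As an illustration of the sliding-window idea, here is a minimal sketch. It is not the malva source: the name `sliding_windows_sketch` is hypothetical, and it assumes that the window advances one residue at a time and that sequences shorter than L yield nothing.

```python
def sliding_windows_sketch(sequence, L):
    """Yield every overlapping window of length L from sequence.

    Assumption: a step of one residue; no partial window at the end.
    """
    for start in range(len(sequence) - L + 1):
        yield sequence[start:start + L]


if __name__ == "__main__":
    print(list(sliding_windows_sketch("ACGTA", 3)))  # ['ACG', 'CGT', 'GTA']
```

A sequence of length n yields n - L + 1 windows under these assumptions.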
- malva.complexity.compute_rep_vector(sequence, N)
Computes the repetition vector (as seen in Wootton, 1993) from a given sequence of a biopolymer with N possible residues.
- Parameters:
sequence – the nucleotide or protein sequence to generate a repetition vector for.
N – the total number of possible residues in the biopolymer that sequence belongs to.
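One possible reading of a repetition vector is the list of per-residue counts sorted in descending order and padded with zeros to length N. The sketch below follows that reading; the name `rep_vector_sketch`, the list return type, and the error handling are assumptions, not the documented behaviour of `compute_rep_vector`.

```python
from collections import Counter


def rep_vector_sketch(sequence, N):
    """Return residue counts sorted high to low, zero-padded to length N.

    Assumption: the vector ignores which residue each count belongs to.
    """
    counts = sorted(Counter(sequence).values(), reverse=True)
    if len(counts) > N:
        raise ValueError("sequence contains more than N distinct residues")
    return counts + [0] * (N - len(counts))


if __name__ == "__main__":
    print(rep_vector_sketch("AAAC", 4))  # [3, 1, 0, 0]
```

Under this reading the entries always sum to the length of the sequence.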
- malva.complexity.complexity(sequence, N)
Computes the Shannon entropy of a given sequence of a biopolymer with N possible residues. See (Wootton, 1993) for more.
- Parameters:
sequence – the nucleotide or protein sequence whose Shannon entropy is to be calculated.
N – the total number of possible residues in the biopolymer that sequence belongs to.
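For reference, Shannon entropy over residue frequencies can be sketched as follows. This is an illustration and not the malva implementation: the name `shannon_entropy_sketch` is hypothetical, and using N as the logarithm base (so that the result falls between 0 and 1) is an assumption about how N enters the calculation.

```python
import math
from collections import Counter


def shannon_entropy_sketch(sequence, N):
    """Shannon entropy of the residue frequencies in sequence.

    Assumption: logarithms are taken to base N, so a sequence that uses
    all N residues equally often scores 1.0 and a homopolymer scores 0.0.
    """
    if N < 2:
        raise ValueError("N must be at least 2")
    length = len(sequence)
    if length == 0:
        return 0.0
    entropy = 0.0
    for count in Counter(sequence).values():
        p = count / length
        entropy -= p * math.log(p, N)
    return entropy


if __name__ == "__main__":
    print(shannon_entropy_sketch("ACGT", 4))  # close to 1.0
    print(shannon_entropy_sketch("AAAA", 4))  # 0.0
```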
- malva.complexity.mask_low_complexity(seq_rec, maskchar='N', N=20, L=12)
Masks low-complexity regions of a nucleic acid or amino acid sequence with a given mask character.
- Parameters:
seq_rec – the sequence to mask, given as a string.
maskchar – Character to mask low-complexity residues with.
N – Number of possible residues in the sequence alphabet (20 for amino acids, 4 for DNA).
L – Length of sliding window that reads the sequence.
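A window-based masking pass can be sketched as below. This is an illustration only: the function name, the helper `_entropy`, and in particular the `threshold` parameter are hypothetical (the documented signature has no threshold), and masking every residue of any window that scores below the threshold is an assumed policy.

```python
import math
from collections import Counter


def _entropy(window, N):
    """Base-N Shannon entropy of a non-empty window (assumed scoring)."""
    size = len(window)
    return -sum((c / size) * math.log(c / size, N)
                for c in Counter(window).values())


def mask_low_complexity_sketch(seq, maskchar="N", N=20, L=12, threshold=0.5):
    """Replace residues in low-entropy windows of length L with maskchar.

    Assumption: threshold is a hypothetical cutoff, not a malva parameter.
    """
    masked = list(seq)
    for start in range(len(seq) - L + 1):
        # Score the original sequence, not the partially masked copy.
        if _entropy(seq[start:start + L], N) < threshold:
            masked[start:start + L] = maskchar * L
    return "".join(masked)


if __name__ == "__main__":
    # DNA example: only the all-A windows fall below the 0.3 cutoff.
    print(mask_low_complexity_sketch("ACGTAAAAAAACGT", N=4, L=4, threshold=0.3))
    # ACGTNNNNNNNCGT
```

Scoring the original string rather than the partially masked copy keeps earlier masking decisions from lowering the entropy of later windows.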