SeqGroup class

class SeqGroup(sequences=None, format='fasta', fix_duplicates=True, **kwargs)

Bases: object

The SeqGroup class stores a set of sequences, aligned or not.

Parameters:
  • sequences – Path to the file containing the sequences or, alternatively, the text string containing the same information.
  • format (fasta) – the format in which sequences are encoded. Currently supported formats are: fasta, phylip (sequential PHYLIP) and iphylip (interleaved PHYLIP). The PHYLIP format limits sequence names to a maximum of 10 characters. To avoid this limit, use the relaxed PHYLIP formats: phylip_relaxed and iphylip_relaxed.
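from ete3 import SeqGroup  # assumption: ETE 3 is installed; adjust the package name for other ETE versions
# Two FASTA records passed as a text string instead of a file path
msf = ">seq1\nAAAAAAAAAAA\n>seq2\nTTTTTTTTTTTTT\n"
seqs = SeqGroup(msf, format="fasta")
print(seqs.get_seq("seq1"))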
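
The 10-character name limit mentioned for the phylip format can be easier to see in a small standalone sketch. The code below is not ETE's writer: `to_phylip` and its `relaxed` flag are hypothetical names, and the relaxed layout shown (full name, two spaces, sequence) is only one common convention.

```python
# Illustrative only: a hand-rolled writer for the sequential PHYLIP layout,
# not ETE's own code. `to_phylip` and `relaxed` are hypothetical names.

def to_phylip(records, relaxed=False):
    """Render (name, sequence) pairs as sequential PHYLIP text."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("PHYLIP requires sequences of equal length")
    # Header line: number of sequences, then alignment length.
    lines = [" %d %d" % (len(records), lengths.pop())]
    for name, seq in records:
        if relaxed:
            # Relaxed variant: keep the full name, separated by whitespace.
            lines.append("%s  %s" % (name, seq))
        else:
            # Strict variant: the name field is exactly 10 characters wide.
            lines.append("%-10s%s" % (name[:10], seq))
    return "\n".join(lines) + "\n"


records = [("Homo_sapiens_chr1", "ACGTACGT"), ("Mus", "ACGAACGT")]
strict = to_phylip(records)
relaxed = to_phylip(records, relaxed=True)
print(strict)
print(relaxed)
```

In the strict output the first name is cut to `Homo_sapie`, so two names that share their first 10 characters would become indistinguishable. That is the effect the relaxed formats are meant to avoid.
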
get_entries()

Returns the list of entries currently stored.

get_seq(name)

Returns the sequence associated with the given entry name.

iter_entries()

Returns an iterator over all sequences in the collection. Each item is a tuple of (sequence name, sequence, sequence comments).
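
To make the shape of those tuples concrete, here is a minimal stand-in generator that yields (name, sequence, comments) from FASTA text. It is a sketch, not ETE's parser: `iter_fasta_entries` is a hypothetical name, and treating everything after the first tab in the header as comments is an assumption made for illustration.

```python
# Illustrative stand-in, not ETE's parser. Assumption: header text after the
# first tab is treated as comments; the real library may split differently.

def iter_fasta_entries(text):
    """Yield (name, sequence, comments) tuples from FASTA-formatted text."""
    name, comments, chunks = None, [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks), comments
            fields = line[1:].split("\t")
            name, comments, chunks = fields[0].strip(), fields[1:], []
        elif name is not None:
            # Sequence data may span several lines; collect the pieces.
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks), comments


fasta = ">seq1\tsample A\nAAAA\nAAAA\n>seq2\nTTTT\n"
entries = list(iter_fasta_entries(fasta))
for name, seq, comments in entries:
    print(name, len(seq), comments)
```

Unpacking each item as `name, seq, comments` in the loop mirrors how the tuples described above would typically be consumed.
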

set_seq(name, seq, comments=None)

Updates an existing sequence or adds a new one.

write(format='fasta', outfile=None)

Returns the text representation of the sequences in the given format (default: fasta). If the “outfile” argument is used, the result is written to the given path.
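
The "return text, or write to a path" behaviour can be sketched with a small standalone helper. This is an illustration under assumptions, not ETE's code: `write_fasta` is a hypothetical function, and returning `None` after writing to a file is a choice made here; the description above does not say what the real method returns in that case.

```python
import os
import tempfile

# Illustrative only: `write_fasta` is a hypothetical helper, not ETE's method.
def write_fasta(entries, outfile=None):
    """Return FASTA text, or write it to `outfile` when a path is given."""
    text = "".join(">%s\n%s\n" % (name, seq) for name, seq in entries)
    if outfile is None:
        return text
    with open(outfile, "w") as handle:
        handle.write(text)
    return None  # assumption: nothing is returned when writing to disk


entries = [("seq1", "AAAA"), ("seq2", "TTTT")]

# Without outfile, the FASTA text comes back as a string.
as_text = write_fasta(entries)

# With outfile, the same text ends up in the file at that path.
path = os.path.join(tempfile.mkdtemp(), "example.fasta")
write_fasta(entries, outfile=path)
with open(path) as handle:
    on_disk = handle.read()
print(on_disk == as_text)
```
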