New in version 2.1.

# Phyloxml Module

## Phyloxml classes linked to ETE

class Phyloxml(*args, **kargs)

class PhyloxmlTree(phyloxml_clade=None, phyloxml_phylogeny=None, **kargs)

PhyloTree object supporting phyloXML format.

## Generic Phyloxml classes

class Accession(source=None, valueOf_=None)

Element Accession is used to capture the local part in a sequence identifier (e.g. ‘P17304’ in ‘UniProtKB:P17304’, in which case the ‘source’ attribute would be ‘UniProtKB’).
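The split between ‘source’ and the local part can be sketched in a few lines of plain Python. The helper name `split_identifier` is hypothetical and is not part of this module; it only illustrates how an identifier such as ‘UniProtKB:P17304’ maps onto the two fields.

```python
def split_identifier(identifier):
    """Split 'source:local' into (source, local part).

    Returns (None, identifier) when no source prefix is present.
    Hypothetical helper, for illustration only.
    """
    source, sep, local = identifier.partition(":")
    if not sep:
        return None, identifier
    return source, local


source, value = split_identifier("UniProtKB:P17304")
print(source, value)
```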

class Annotation(source=None, type_=None, ref=None, evidence=None, desc=None, confidence=None, property=None, uri=None, valueOf_=None)

The annotation of a molecular sequence. It is recommended to annotate by using the optional ‘ref’ attribute (some examples of acceptable values for the ref attribute: ‘GO:0008270’, ‘KEGG:Tetrachloroethene degradation’, ‘EC:1.1.1.1’). Optional element ‘desc’ allows for a free text description. Optional element ‘confidence’ is used to state the type and value of support for an annotation. Similarly, optional attribute ‘evidence’ is used to describe the evidence for an annotation as free text (e.g. ‘experimental’). Optional element ‘property’ allows for further, typed and referenced annotations from external resources.

class BinaryCharacters(lost_count=None, absent_count=None, present_count=None, type_=None, gained_count=None, gained=None, lost=None, present=None, absent=None, valueOf_=None)

The names and/or counts of binary characters present, gained, and lost at the root of a clade.

class BranchColor(red=None, green=None, blue=None, valueOf_=None)

This indicates the color of a clade when rendered (the color applies to the whole clade unless overwritten by the color(s) of sub clades).
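The inheritance rule can be illustrated with a small sketch. It assumes a plain dictionary layout (keys ‘name’, ‘color’, ‘children’) chosen only for this example, not the data structure used by this module, and the function name `effective_colors` is hypothetical.

```python
def effective_colors(clade, inherited=None):
    """Yield (name, color) for a clade and all of its sub clades.

    A clade without its own 'color' takes the color of the nearest
    ancestor that defines one. The dict layout is an assumption made
    for this illustration.
    """
    color = clade.get("color", inherited)
    yield clade.get("name"), color
    for child in clade.get("children", []):
        yield from effective_colors(child, color)


tree = {
    "name": "root",
    "color": (255, 0, 0),
    "children": [
        {"name": "A"},  # no color of its own: inherits from root
        {"name": "B", "color": (0, 0, 255), "children": [{"name": "B1"}]},
    ],
}
colors = dict(effective_colors(tree))
```

Here ‘A’ takes the color of the root, while ‘B1’ takes the overriding color set on ‘B’.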

class Clade(id_source=None, branch_length_attr=None, name=None, branch_length=None, confidence=None, width=None, color=None, node_id=None, taxonomy=None, sequence=None, events=None, binary_characters=None, distribution=None, date=None, reference=None, property=None, clade=None, valueOf_=None)

Element Clade is used in a recursive manner to describe the topology of a phylogenetic tree. The parent branch length of a clade can be described either with the ‘branch_length’ element or the ‘branch_length’ attribute (it is not recommended to use both at the same time, though). Usage of the ‘branch_length’ attribute allows for a less verbose description. Element ‘confidence’ is used to indicate the support for a clade/parent branch. Element ‘events’ is used to describe such events as gene-duplications at the root node/parent branch of a clade. Element ‘width’ is the branch width for this clade (including parent branch). Both ‘color’ and ‘width’ elements apply for the whole clade unless overwritten in sub clades. Attribute ‘id_source’ is used to link other elements to a clade (on the xml-level).
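As a rough illustration of the recursive layout and of the two ways to give a branch length, the following sketch walks a nested clade fragment. It uses the standard library's `xml.etree.ElementTree` rather than the classes of this module, the fragment omits the phyloXML namespace for brevity, and the names `branch_length` and `walk` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment: clade "B" uses the element form, the others
# use the less verbose attribute form.
XML = """
<clade>
  <clade branch_length="0.06">
    <name>AB</name>
    <clade branch_length="0.102"><name>A</name></clade>
    <clade><name>B</name><branch_length>0.23</branch_length></clade>
  </clade>
  <clade branch_length="0.4"><name>C</name></clade>
</clade>
""".strip()


def branch_length(clade):
    """Return the parent branch length from the attribute or the element."""
    value = clade.get("branch_length")
    if value is None:
        value = clade.findtext("branch_length")
    return None if value is None else float(value)


def walk(clade, depth=0):
    """Yield (depth, name, branch_length) for a clade and its descendants."""
    yield depth, clade.findtext("name"), branch_length(clade)
    for child in clade.findall("clade"):
        yield from walk(child, depth + 1)


root = ET.fromstring(XML)
rows = list(walk(root))
for depth, name, length in rows:
    print("  " * depth, name, length)
```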

class CladeRelation(id_ref_0=None, id_ref_1=None, type_=None, distance=None, confidence=None, valueOf_=None)

This is used to express a typed relationship between two clades. For example it could be used to describe multiple parents of a clade.

class Confidence(type_=None, valueOf_=None)

A general purpose confidence element. For example this can be used to express the bootstrap support value of a clade (in which case the ‘type’ attribute is ‘bootstrap’).

class Date(unit=None, desc=None, value=None, minimum=None, maximum=None, valueOf_=None)

A date associated with a clade/node. Its value can be numerical by using the ‘value’ element and/or free text with the ‘desc’ element (e.g. ‘Silurian’). If a numerical value is used, it is recommended to employ the ‘unit’ attribute to indicate the type of the numerical value (e.g. ‘mya’ for ‘million years ago’). The elements ‘minimum’ and ‘maximum’ are used to indicate a range/confidence interval.
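A minimal sketch of the resulting XML shape, built with the standard library's `xml.etree.ElementTree` instead of this module's classes. The helper `make_date` is hypothetical and the numbers are illustrative placeholders, not reference values.

```python
import xml.etree.ElementTree as ET


def make_date(desc=None, value=None, unit=None, minimum=None, maximum=None):
    """Build a <date> element; 'unit' is an attribute, the rest are children."""
    date = ET.Element("date")
    if unit is not None:
        date.set("unit", unit)
    for tag, content in (("desc", desc), ("value", value),
                         ("minimum", minimum), ("maximum", maximum)):
        if content is not None:
            ET.SubElement(date, tag).text = str(content)
    return date


# Illustrative numbers only.
date = make_date(desc="Silurian", value=425, unit="mya",
                 minimum=419, maximum=444)
date_xml = ET.tostring(date, encoding="unicode")
print(date_xml)
```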

class Distribution(desc=None, point=None, polygon=None, valueOf_=None)

The geographic distribution of the items of a clade (species, sequences), intended for phylogeographic applications. The location can be described by free text in the ‘desc’ element and/or by the coordinates of one or more ‘Points’ (similar to the ‘Point’ element in Google’s KML format) or by ‘Polygons’.

class DomainArchitecture(length=None, domain=None, valueOf_=None)

This is used to describe the domain architecture of a protein. Attribute ‘length’ is the total length of the protein.

class Events(type_=None, duplications=None, speciations=None, losses=None, confidence=None, valueOf_=None)

Events at the root node of a clade (e.g. one gene duplication).

class Id(provider=None, valueOf_=None)

A general purpose identifier element. The ‘provider’ attribute indicates the provider (or authority) of an identifier.

class MolSeq(is_aligned=None, valueOf_=None)

Element ‘mol_seq’ is used to store molecular sequences. The ‘is_aligned’ attribute is used to indicate that this molecular sequence is aligned with all other sequences in the same phylogeny for which ‘is_aligned’ is true as well (which, in most cases, means that gaps were introduced, and that all sequences for which ‘is_aligned’ is true must have the same length).
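The equal-length condition implied by ‘is_aligned’ can be checked with a short sketch. The function name `check_aligned` and the list-of-pairs input format are assumptions made for this example; they are not part of this module.

```python
def check_aligned(sequences):
    """Return True if all sequences flagged as aligned share one length.

    `sequences` is a list of (residues, is_aligned) pairs; sequences
    that are not flagged as aligned are ignored.
    """
    lengths = {len(residues) for residues, is_aligned in sequences if is_aligned}
    return len(lengths) <= 1


seqs = [
    ("ACGT--A", True),
    ("AC-TGGA", True),
    ("ACG", False),  # not aligned, so its length does not matter
]
print(check_aligned(seqs))
```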

class Phylogeny(rerootable=None, branch_length_unit=None, type_=None, rooted=None, name=None, id=None, description=None, date=None, confidence=None, clade=None, clade_relation=None, sequence_relation=None, property=None, valueOf_=None)

Element Phylogeny is used to represent a phylogeny. The required attribute ‘rooted’ is used to indicate whether the phylogeny is rooted or not. The attribute ‘rerootable’ can be used to indicate that the phylogeny is not allowed to be rooted differently (e.g. because it is associated with root dependent data, such as gene duplications). The attribute ‘type’ can be used to indicate the type of phylogeny (e.g. ‘gene tree’). It is recommended to use the attribute ‘branch_length_unit’ if the phylogeny has branch lengths. Element clade is used in a recursive manner to describe the topology of a phylogenetic tree.

subclass

alias of PhyloxmlTree
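To show where these attributes sit in a document, here is a hedged sketch that reads a small phyloXML string with the standard library's `xml.etree.ElementTree`. It does not use the classes of this module. The document content, including the ‘branch_length_unit’ value, is made up for illustration, and the namespace URI is assumed to be the one declared by phyloXML files.

```python
import xml.etree.ElementTree as ET

PHY = "http://www.phyloxml.org"  # assumed phyloXML namespace URI
NS = {"phy": PHY}

DOC = """<phyloxml xmlns="http://www.phyloxml.org">
  <phylogeny rooted="true" rerootable="false" branch_length_unit="substitutions/site">
    <name>example tree</name>
    <clade>
      <clade branch_length="0.1"><name>A</name></clade>
      <clade branch_length="0.2"><name>B</name></clade>
    </clade>
  </phylogeny>
</phyloxml>"""

root = ET.fromstring(DOC)
summary = []
for phylogeny in root.findall("phy:phylogeny", NS):
    rooted = phylogeny.get("rooted")
    if rooted is None:
        raise ValueError("phylogeny lacks the required 'rooted' attribute")
    # A leaf is a clade that contains no nested clade.
    leaves = [
        clade.findtext("phy:name", namespaces=NS)
        for clade in phylogeny.iter("{%s}clade" % PHY)
        if clade.find("phy:clade", NS) is None
    ]
    summary.append((phylogeny.findtext("phy:name", namespaces=NS),
                    rooted == "true", leaves))

print(summary)
```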

class Point(geodetic_datum=None, alt_unit=None, lat=None, long=None, alt=None, valueOf_=None)

The coordinates of a point with an optional altitude (used by element ‘Distribution’). The required attribute ‘geodetic_datum’ indicates the geodetic datum (also called ‘map datum’; for example, Google’s KML uses ‘WGS84’). Attribute ‘alt_unit’ is the unit for the altitude (e.g. ‘meter’).

class Polygon(point=None, valueOf_=None)

A polygon defined by a list of ‘Points’ (used by element ‘Distribution’).

class Property(datatype=None, id_ref=None, ref=None, applies_to=None, unit=None, valueOf_=None, mixedclass_=None, content_=None)

Property allows for typed and referenced properties from external resources to be attached to ‘Phylogeny’, ‘Clade’, and ‘Annotation’. The value of a property is its mixed (free text) content. Attribute ‘datatype’ indicates the type of a property and is limited to xsd-datatypes (e.g. ‘xsd:string’, ‘xsd:boolean’, ‘xsd:integer’, ‘xsd:decimal’, ‘xsd:float’, ‘xsd:double’, ‘xsd:date’, ‘xsd:anyURI’). Attribute ‘applies_to’ indicates the item to which a property applies (e.g. ‘node’ for the parent node of a clade, ‘parent_branch’ for the parent branch of a clade). Attribute ‘id_ref’ allows a property to be attached specifically to one element (on the xml-level). Optional attribute ‘unit’ is used to indicate the unit of the property. An example: <property datatype="xsd:integer" ref="NOAA:depth" applies_to="clade" unit="METRIC:m"> 200 </property>
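One way a consumer might interpret the ‘datatype’ attribute is sketched below, using the standard library's `xml.etree.ElementTree` on the example element above. The `CASTS` table and the `property_value` helper are hypothetical, cover only a few of the listed xsd types, and fall back to plain text for the rest.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a few xsd datatypes to Python converters.
CASTS = {
    "xsd:integer": int,
    "xsd:decimal": float,  # simplification; decimal.Decimal would be more exact
    "xsd:float": float,
    "xsd:double": float,
    "xsd:boolean": lambda text: text in ("true", "1"),
}


def property_value(elem):
    """Convert the free text content of a <property> using its datatype."""
    text = (elem.text or "").strip()
    cast = CASTS.get(elem.get("datatype"), str)
    return cast(text)


prop = ET.fromstring(
    '<property datatype="xsd:integer" ref="NOAA:depth" '
    'applies_to="clade" unit="METRIC:m"> 200 </property>'
)
depth = property_value(prop)
print(depth, prop.get("unit"))
```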

class ProteinDomain(to=None, confidence=None, fromxx=None, id=None, valueOf_=None)

Represents an individual domain in a domain architecture. The name/unique identifier is described via the ‘id’ attribute. ‘confidence’ can be used to store, for example, E-values.

class Reference(doi=None, desc=None, valueOf_=None)

A literature reference for a clade. It is recommended to use the ‘doi’ attribute instead of the free text ‘desc’ element whenever possible.

class Sequence(id_source=None, id_ref=None, type_=None, symbol=None, accession=None, name=None, location=None, mol_seq=None, uri=None, annotation=None, domain_architecture=None, valueOf_=None)

Element Sequence is used to represent a molecular sequence (Protein, DNA, RNA) associated with a node. ‘symbol’ is a short (maximal ten characters) symbol of the sequence (e.g. ‘ACTM’) whereas ‘name’ is used for the full name (e.g. ‘muscle Actin’). ‘location’ is used for the location of a sequence on a genome/chromosome. The actual sequence can be stored with the ‘mol_seq’ element. Attribute ‘type’ is used to indicate the type of sequence (‘dna’, ‘rna’, or ‘protein’). One intended use for ‘id_ref’ is to link a sequence to a taxonomy (via the taxonomy’s ‘id_source’) in case of multiple sequences and taxonomies per node.
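The constraints mentioned above (a symbol of at most ten characters and a type of ‘dna’, ‘rna’, or ‘protein’) can be checked before a sequence is written out. This is only a sketch: `sequence_problems` is a hypothetical helper, and nothing here claims that this module performs such validation itself.

```python
VALID_TYPES = {"dna", "rna", "protein"}


def sequence_problems(symbol=None, type_=None):
    """Return a list of human-readable problems; empty when none are found."""
    problems = []
    if symbol is not None and len(symbol) > 10:
        problems.append("symbol longer than ten characters: %r" % symbol)
    if type_ is not None and type_ not in VALID_TYPES:
        problems.append("unknown sequence type: %r" % type_)
    return problems


print(sequence_problems(symbol="ACTM", type_="protein"))
print(sequence_problems(symbol="MUSCLE_ACTIN_1", type_="peptide"))
```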

class SequenceRelation(id_ref_0=None, id_ref_1=None, type_=None, distance=None, confidence=None, valueOf_=None)

This is used to express a typed relationship between two sequences. For example it could be used to describe an orthology (in which case attribute ‘type’ is ‘orthology’).

class Taxonomy(id_source=None, id=None, code=None, scientific_name=None, authority=None, common_name=None, synonym=None, rank=None, uri=None, valueOf_=None)

Element Taxonomy is used to describe taxonomic information for a clade. Element ‘code’ is intended to store UniProt/Swiss-Prot style organism codes (e.g. ‘APLCA’ for the California sea hare ‘Aplysia californica’) or other styles of mnemonics (e.g. ‘Aca’). Element ‘authority’ is used to keep the authority, such as ‘J. G. Cooper, 1863’, associated with the ‘scientific_name’. Element ‘id’ is used for a unique identifier of a taxon (for example ‘6500’ with ‘ncbi_taxonomy’ as ‘provider’ for the California sea hare). Attribute ‘id_source’ is used to link other elements to a taxonomy (on the xml-level).
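The California sea hare example can be written out as XML with the standard library's `xml.etree.ElementTree`. This sketch shows the element layout only and does not use the classes of this module; the ‘id_source’ value ‘tax1’ is a made-up placeholder.

```python
import xml.etree.ElementTree as ET

# 'tax1' is a hypothetical id_source that other elements could refer to.
taxonomy = ET.Element("taxonomy", id_source="tax1")
tax_id = ET.SubElement(taxonomy, "id", provider="ncbi_taxonomy")
tax_id.text = "6500"
ET.SubElement(taxonomy, "code").text = "APLCA"
ET.SubElement(taxonomy, "scientific_name").text = "Aplysia californica"
ET.SubElement(taxonomy, "authority").text = "J. G. Cooper, 1863"
ET.SubElement(taxonomy, "common_name").text = "California sea hare"

taxonomy_xml = ET.tostring(taxonomy, encoding="unicode")
print(taxonomy_xml)
```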

class Uri(type_=None, desc=None, valueOf_=None)

A uniform resource identifier. In general, this is expected to be a URL (for example, to link to an image on a website, in which case the ‘type’ attribute might be ‘image’ and ‘desc’ might be ‘image of a California sea hare’).
