EvolTree class

class EvolNode(newick=None, alignment=None, alg_format='fasta', sp_naming_function=<function _parse_species>, format=0, binpath='', **kwargs)

Bases: ete3.phylo.phylotree.PhyloNode

Re-implementation of the standard TreeNode instance. It adds attributes and methods to work with phylogenetic trees.

Parameters:
  • newick – path to tree in newick format, can also be a string
  • alignment – path to alignment, can also be a string.
  • alg_format (fasta) – alignment format.
  • sp_naming_function – function to infer species name.
  • format – type of newick format
  • binpath – path to binaries, in case codeml or SLR are not in global path.
change_dist_to_evol(evol, model, fill=False)

Changes the dist (branch length) of the tree to a given evolutionary variable (dN, dS, w or bL); the default is bL.

Parameters:
  • evol – evolutionary variable
  • model – Model object from which to retrieve evolutionary variables
  • fill (False) – if True, do not affect only the dist parameter: each node is also annotated with all evolutionary variables (node.dN, node.w...).
get_descendant_by_node_id(idname)

returns node list corresponding to a given idname

get_evol_model(modelname)

returns one precomputed model

Parameters:
  • modelname – string with the name of a stored Model object
Returns: Model object
get_most_likely(altn, null)

Returns the p-value of the likelihood-ratio test (LRT) between an alternative model and a null model.

The usual comparisons are:

Alternative  Null    Test
M2           M1      PS on sites (M2 prone to miss some sites) (Yang 2000)
M3           M0      test of variability among sites
M8           M7      PS on sites (Yang 2000)
M8           M8a     RX on sites?? think so....
bsA          bsA1    PS on sites on specific branch (Zhang 2005)
bsA          M1      RX on sites on specific branch (Zhang 2005)
bsC          M1      different omegas on clades branches sites (Yang Nielsen 2002)
bsD          M3      different omegas on clades branches sites (Yang Nielsen 2002, Bielawski 2004)
b_free       b_neut  foreground branch not neutral (w != 1)
                       • RX if P<0.05 (means that w on frg=1)
                       • PS if P>0.05 and wfrg>1
                       • CN if P>0.05 and wfrg>1
                     (Yang Nielsen 2002)
b_free       M0      different ratio on branches (Yang Nielsen 2002)
Parameters:
  • altn – model with higher number of parameters (np)
  • null – model with lower number of parameters (np)
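
As a rough illustration of the test behind this p-value, the sketch below computes a likelihood-ratio test from two log-likelihoods and two parameter counts using only the standard library. It is not ETE's implementation: the helper names `chi2_sf` and `lrt_pvalue` are hypothetical, and it assumes the usual chi-square approximation with degrees of freedom equal to the difference in np between the two models.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series
    # sum_{r=1}^{(df-1)/2} x^(r-1) / (1*3*...*(2r-1)).
    term, total = 1.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    correction = math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + correction


def lrt_pvalue(lnl_alt, np_alt, lnl_null, np_null):
    """P-value of the LRT between an alternative and a nested null model.

    Assumption: the statistic 2 * (lnL_alt - lnL_null) follows a chi-square
    distribution with (np_alt - np_null) degrees of freedom.
    """
    df = np_alt - np_null
    if df < 1:
        raise ValueError("the alternative model must have more parameters")
    statistic = 2.0 * (lnl_alt - lnl_null)
    return chi2_sf(max(statistic, 0.0), df)


if __name__ == "__main__":
    # Hypothetical log-likelihoods and parameter counts, for illustration only.
    print(lrt_pvalue(lnl_alt=-98.0, np_alt=11, lnl_null=-100.0, np_null=10))
```

A small p-value means the null model is rejected in favour of the alternative; how that is read (PS, RX, ...) depends on the pair of models compared, as listed in the table above.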

link_to_alignment(alignment, alg_format='paml', nucleotides=True, **kwargs)

Same function as for PhyloTree, but translates the sequences if they are nucleotides. The nucleotide sequence is kept under node.nt_sequence.

Parameters:
  • alignment – path to alignment or string
  • alg_format – one of fasta, phylip or paml
  • nucleotides (True) – set to False in case we want to keep it untranslated
link_to_evol_model(path, model)

Links the EvolTree to an evolutionary model:
  • free-branch model (“fb”) will append evol values to tree
  • Site models (M0, M1, M2, M7, M8) will give evol values by site and likelihood
Parameters:
  • path – path to outfile containing model computation result
  • model – either the name of a model, or a Model object (usually empty)
mark_tree(node_ids, verbose=False, **kargs)

Marks branches on the tree so that PAML can interpret them. Takes a “marks” keyword argument that should be a list of marks such as #1, #1, #2, one per node id, e.g.:

tree.mark_tree([2, 3], marks=["#1", "#2"])
Parameters:
  • node_ids – list of node ids (see node.node_id)
  • verbose (False) – warn if marks do not correspond to codeml standard
  • kargs – mainly for the marks keyword, which takes a list of marks (marks=['#1', '#2'])
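
The check that verbose refers to can be approximated before calling mark_tree. The sketch below is an assumption, not ETE's own code: it treats a valid mark as “#” followed directly by an integer, and the helper name `invalid_marks` is hypothetical.

```python
import re

# Assumption: a codeml branch mark is "#" followed directly by an integer.
_MARK_PATTERN = re.compile(r"#[0-9]+")


def invalid_marks(node_ids, marks):
    """Return the marks that do not look like codeml branch labels.

    Raises ValueError when the two lists differ in length, since each
    node id is expected to get exactly one mark.
    """
    if len(node_ids) != len(marks):
        raise ValueError("node_ids and marks must have the same length")
    return [mark for mark in marks if _MARK_PATTERN.fullmatch(mark) is None]


if __name__ == "__main__":
    # "1.5" is reported because it lacks the "#" prefix and is not an integer.
    print(invalid_marks([2, 3], ["#1", "1.5"]))
```

An empty list means every mark has the expected shape and the same lists can be passed on as node_ids and marks.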
render(file_name, layout=None, w=None, h=None, tree_style=None, header=None, histfaces=None)

Calls the parent-class method, adding up and down faces.

Parameters:
  • layout – a layout function
  • tree_style (None) – tree_style object
  • histfaces (None) – a histogram face function. This is only to plot selective pressure among sites
run_model(model_name, ctrl_string='', keep=True, **kwargs)

Computes evolutionary models, e.g.: b_free_lala.vs.lele will launch one free-branch model and store it in the “WORK_DIR/b_free_lala.vs.lele” directory.

WARNING: this functionality needs to create a working directory in “rep”

WARNING: you need to have codeml and/or SLR in your path

The models available are:

Model name  Description                  Model kind
M1          relaxation                   site
M10         beta and gamma + 1           site
M11         beta and normal > 1          site
M12         0 and 2 normal > 2           site
M13         3 normal > 0                 site
M2          positive-selection           site
M3          discrete                     site
M4          frequencies                  site
M5          gamma                        site
M6          2 gamma                      site
M7          relaxation                   site
M8          positive-selection           site
M8a         relaxation                   site
M9          beta and gamma               site
SLR         positive/negative selection  site
M0          negative-selection           null
fb_anc      free-ratios                  branch_ancestor
bsA         positive-selection           branch-site
bsA1        relaxation                   branch-site
bsB         positive-selection           branch-site
bsC         different-ratios             branch-site
bsD         different-ratios             branch-site
b_free      positive-selection           branch
b_neut      relaxation                   branch
fb          free-ratios                  branch
XX          User defined                 Unknown
Parameters:
  • model_name – a string like “model-name[.some-secondary-name]” (e.g.: “fb.my_first_try”, or just “fb”):
      • model-name is compulsory; it is the name of the model (see table above for the full list)
      • the second part is optional; it avoids overwriting models with the same name.
  • ctrl_string – list of parameters that can be used as control file.
  • kwargs – extra parameters should be one of: verbose, CodonFreq, ncatG, cleandata, fix_blength, NSsites, fix_omega, clock, seqfile, runmode, fix_kappa, fix_alpha, Small_Diff, method, Malpha, aaDist, RateAncestor, outfile, icode, alpha, seqtype, omega, getSE, noisy, Mgene, kappa, treefile, model, ndata.
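
To make the “model-name[.some-secondary-name]” convention concrete, here is a minimal sketch that splits such a string at the first dot. The helper `split_model_name` is hypothetical and only illustrates the naming scheme; it does not check the model name against the table above.

```python
def split_model_name(model_name):
    """Split "model-name[.some-secondary-name]" into its two parts.

    The text before the first dot is taken as the model name; everything
    after it is the optional secondary name (empty string when absent).
    """
    base, _, secondary = model_name.partition(".")
    if not base:
        raise ValueError("the model name part is compulsory")
    return base, secondary


if __name__ == "__main__":
    print(split_model_name("fb.my_first_try"))
    print(split_model_name("fb"))
```

Giving each run its own secondary name keeps results from repeated runs of the same model apart.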
sep = '\n'
show(layout=None, tree_style=None, histfaces=None)

Calls the parent show of PhyloTree. histfaces should be a list of models to be displayed as histfaces.

Parameters:
  • layout – a layout function
  • tree_style (None) – tree_style object
  • histfaces (None) – a histogram face function. This is only to plot selective pressure among sites
write(features=None, outfile=None, format=10)

Inherits from Tree but adds the tenth format, which allows displaying marks for CodeML. TODO: internal writing format needs to be something like 0

Returns the newick representation of current node. Several arguments control the way in which extra data is shown for every node:

Parameters:
  • features – a list of feature names to be exported using the Extended Newick Format (i.e. features=["name", "dist"]). Use an empty list to export all available features in each node (features=[])
  • outfile – writes the output to a given file
  • format (10) – defines the newick standard used to encode the tree. See tutorial for details.
  • format_root_node (False) – If True, it allows features and branch information from root node to be exported as a part of the newick text string. For newick compatibility reasons, this is False by default.
  • is_leaf_fn – See TreeNode.traverse() for documentation.

Example:

t.write(features=["species", "name"], format=1)
x = 'XX'
EvolTree

alias of EvolNode
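
The methods documented above can be combined into a branch-site test. The sketch below is written against any object exposing mark_tree, run_model and get_most_likely as described on this page. The function name `branch_site_test` and its `tag` argument are hypothetical. It also assumes that get_most_likely accepts the names under which the models were stored, and that codeml is available when a real EvolTree is passed in.

```python
def branch_site_test(tree, foreground_ids, tag="example"):
    """Compare bsA against bsA1 for the given foreground branches.

    `tree` is expected to behave like an EvolTree; `foreground_ids` is a
    list of node ids (node.node_id) to be marked as foreground.
    """
    # Mark every foreground branch with the same "#1" label.
    tree.mark_tree(foreground_ids, marks=["#1"] * len(foreground_ids))

    # The secondary name keeps this run apart from other bsA/bsA1 runs.
    alt_name = "bsA." + tag
    null_name = "bsA1." + tag
    for name in (alt_name, null_name):
        tree.run_model(name)

    # Assumption: models are looked up by the names they were run under.
    return tree.get_most_likely(alt_name, null_name)
```

With a real tree, the call might look like branch_site_test(tree, [2, 3], tag="my_first_try"), where the node ids are read from node.node_id beforehand.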