Using ete-build to run multiple workflows at once

This recipe shows how to run several phylogenetic workflows on the same dataset with a single ete-build call. This is useful for comparing the results of different aligners or tree inference programs.

The following commands illustrate this feature by testing several aligners with a single command.

Requirements

Recipe

1. Get your sequences ready as a FASTA file

We will use a toy example with 25 sequences. Clean FASTA headers are recommended.

In [5]:
%%bash
wget http://etetoolkit.org/static/data/NUP62.aa.fa 2>/dev/null
head -n 14 NUP62.aa.fa
>Phy003I7ZJ_CHICK
TMSQFNFSSAPAGGGFSFSTPKTAASTTAATGFSFTPAPSSGFTFGGAAPTPASSQPVTP
FSFSTPASSALPTAFSFGTPATATTAAPAASVFPLGGNAPKLNFGGTSTTQATGITGGFG
FGTSAPTSVPSSQAAAPSGFMFGTAATTTTTTTAAQPGTTGGFTFSSGTTTQAGTTGFNI
GATSTAAPQAVPTGLTFGAAPAAAATTTASLGSTTQPAATPFSLGGQSSATLTASTSQGP
TLSFGSKLGVTTTASTTTAASTAPLLGSTGPVLFASIASSSAPASSTSTGLSLGAPSTGT
TGLGTSGFGLKPPGTTAAATSTATSTSASSFALNLKPLTTTGTIGAVTSTAAITTTTPSA
PPVMTYAQLESLINKWSLELEDQEKHFLHQATQVNAWDRTLIENGEKITSLHREVEKVKL
DQKRLDQELDFILSQQKELEDLLTPLEESVKEQSGTIYLQHADEERERTYKLAENIDAQL
KRMAQDLKDIIEHLNTSGRPADTSDPLQQICKILNAHMDSLQWIDQNSALLQRKVEEVTK
VCESRRKEQERSFRITFD
>Phy0054BO3_MELGA
GNAPKLNFGGTSTTQATGITGGFGFGTSAPTSVPSSQAAAPSGFMFGSATATTTTTTAAQ
PGTTGGFTFSSGTTTQAGTTGFNIGTTSTAAPQAAPTGLTFGAAPAAAAATTTASLGSTA
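
Since clean headers are recommended, it can help to check them before launching any workflow. The following is a minimal sketch, not part of ETE: the helper names are ours, and the definition of "clean" (only letters, digits and underscores) is an assumption. The first two records are shortened from the file above; the third is a made-up record with a problematic header.

```python
import re

# Assumption: a "clean" header contains only letters, digits and underscores.
CLEAN_HEADER = re.compile(r"[A-Za-z0-9_]+")


def read_fasta_headers(text):
    """Return the full header line (without '>') of every FASTA record."""
    return [line[1:].strip() for line in text.splitlines() if line.startswith(">")]


def check_headers(text):
    """Return (number of records, unclean headers, duplicated headers)."""
    headers = read_fasta_headers(text)
    unclean = [h for h in headers if not CLEAN_HEADER.fullmatch(h)]
    duplicated = sorted({h for h in headers if headers.count(h) > 1})
    return len(headers), unclean, duplicated


# Toy input: two shortened records from NUP62.aa.fa plus one hypothetical bad header.
sample = """>Phy003I7ZJ_CHICK
TMSQFNFSSAPAGGGFSFST
>Phy0054BO3_MELGA
GNAPKLNFGGTSTTQATGIT
>bad header|with pipes
ACDEFGHIK
"""

n_records, unclean, duplicated = check_headers(sample)
print(n_records, unclean, duplicated)
```

To check the real file, pass `open("NUP62.aa.fa").read()` instead of `sample`.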

2. Check available aligners in ete3

In [3]:
%%bash 
ete3 build apps aligners
--------------------------------------------------------------------------------------------------------------------
                                                      aligners                                                      
--------------------------------------------------------------------------------------------------------------------
name                 | app type   | desc.                                                                           
==================== | ========== | ================================================================================
metaligner_phylomedb | metaligner | Meta-aligning based on head/tail alignments produced by muscle, mafft and dialig
                     |            | n-tx, scanned with M-Coffee. Unconsistent columns are removed and final alignmen
                     |            | t is cleaned with trimAl                                                        
metaligner_trimmed   | metaligner | Meta-aligning based on head/tail alignments produced by muscle, mafft and clusta
                     |            | lomega, scanned with M-Coffee. Unconsistent columns are removed and final alignm
                     |            | ent is cleaned with trimAl                                                      
metaligner_default   | metaligner | Meta-aligning based on head/tail alignments produced by muscle, mafft and clusta
                     |            | lomega, scanned with M-Coffee. Unconsistent columns are removed                 
tcoffee_default      | tcoffee    | (EXPERIMENTAL) tcoffee alignment with default paramerters                       
mcoffee_ensembl      | tcoffee    | (EXPERIMENTAL) mcoffee alignment as used in the Ensembl database                
muscle_default       | muscle     | muscle alignment with default parameters                                        
mafft_default        | mafft      | mafft alignment with default parameters                                         
mafft_einsi          | mafft      | mafft alignment using the E-INS-i mode                                          
mafft_linsi          | mafft      | mafft alignment using the L-INS-i mode                                          
mafft_ginsi          | mafft      | mafft alignment using the G-INS-i mode                                          
clustalo_default     | clustalo   | clustalo with default parameters                                                

3. Compose the names of the target workflows

You can do this by hand, or with some bash scripting.

In [8]:
%%bash
for aligner in mafft_einsi mafft_linsi mafft_ginsi clustalo_default;
    do echo $aligner-none-none-fasttree_default; 
done;
mafft_einsi-none-none-fasttree_default
mafft_linsi-none-none-fasttree_default
mafft_ginsi-none-none-fasttree_default
clustalo_default-none-none-fasttree_default
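
The same names can be composed in Python, which is convenient when the list of aligners grows. This is only a sketch: the function and the parameter names `trimmer`, `tester` and `builder` are our own labels for the last three slots of the workflow name (the recipe only shows the two middle slots set to `none`).

```python
def compose_workflows(aligners, trimmer="none", tester="none", builder="fasttree_default"):
    """Build one four-part workflow name per aligner, joined with '-'."""
    return ["-".join([aligner, trimmer, tester, builder]) for aligner in aligners]


aligners = ["mafft_einsi", "mafft_linsi", "mafft_ginsi", "clustalo_default"]
workflows = compose_workflows(aligners)
for name in workflows:
    print(name)
```

The printed names are the same four strings produced by the bash loop above.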

4. Run all workflows at once

In [10]:
%%bash 
ete3 build -a NUP62.aa.fa -o test_aligners --clearall -w mafft_einsi-none-none-fasttree_default \
    mafft_linsi-none-none-fasttree_default \
    mafft_ginsi-none-none-fasttree_default \
    clustalo_default-none-none-fasttree_default;
Toolchain path: /Users/jhc/anaconda/bin/ete3_apps 
Toolchain version: 2.0.3


      --------------------------------------------------------------------------------
                  ETE build - reproducible phylogenetic workflows 
                                    unknown, unknown.

      If you use ETE in a published work, please cite:

        Jaime Huerta-Cepas, Joaquín Dopazo and Toni Gabaldón. ETE: a python
        Environment for Tree Exploration. BMC Bioinformatics 2010,
        11:24. doi:10.1186/1471-2105-11-24

      (Note that a list of the external programs used to complete all necessary
      computations will be also shown after execution. Those programs should
      also be cited.)
      --------------------------------------------------------------------------------

    
INFO -  Testing x86-64  portable applications...
       clustalo: OK - 1.2.1
Dialign-tx not supported in OS X
       fasttree: OK - FastTree Version 2.1.8 Double precision (No SSE3), OpenMP (1 threads)
         kalign: OK - Kalign version 2.04, Copyright (C) 2004, 2005, 2006 Timo Lassmann
          mafft: OK - MAFFT v6.861b (2011/09/24)
         muscle: OK - MUSCLE v3.8.31 by Robert C. Edgar
          phyml: OK - . This is PhyML version 20160115.
     pmodeltest: OK - pmodeltest.py v1.4
          prank: OK - prank v.100802. Minimal usage: 'prank sequence_file'
       probcons: OK - PROBCONS version 1.12 - align multiple protein sequences and print to standard output
          raxml: OK - This is RAxML version 8.1.20 released by Alexandros Stamatakis on April 18 2015.
 raxml-pthreads: OK - This is RAxML version 8.1.20 released by Alexandros Stamatakis on April 18 2015.
         readal: OK - readAl v1.4.rev6 build[2012-02-02]
         statal: OK - statAl v1.4.rev6 build[2012-02-02]
        tcoffee: OK - PROGRAM: T-COFFEE Version_11.00.8cbe486 (2014-08-12 22:05:29 - Revision 8cbe486 - Build 477)
         trimal: OK - trimAl v1.4.rev6 build[2012-02-02]
INFO -  Starting ETE-build execution at Fri Feb  5 20:23:18 2016
INFO -  Output directory /Users/jhc/_Devel/cookbook/recipes/test_aligners
INFO -  Erasing all existing npr data...
INFO -  Reading aa sequences from NUP62.aa.fa...
WRNG -  25 target sequences
INFO -  ETE build starts now!
INFO -   Updating tasks status: (Fri Feb  5 20:23:18 2016)
INFO -  Thread mafft_ginsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread clustalo_default-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_einsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_linsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -   (W) MultiSeqTask (25 aa seqs, MSF, /mafft_gins...ee_default)
INFO -   (D) MultiSeqTask (25 aa seqs, MSF, /mafft_gins...ee_default)
INFO -  Waiting 2 seconds
INFO -   Updating tasks status: (Fri Feb  5 20:23:20 2016)
INFO -  Thread mafft_ginsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread clustalo_default-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_einsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_linsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -   (W) AlgTask (25 aa seqs, Mafft, /mafft_gins...ee_default)
INFO -   (Q) MultiSeqTask (25 aa seqs, MSF, /clustalo_d...ee_default)
INFO -   (D) MultiSeqTask (25 aa seqs, MSF, /clustalo_d...ee_default)
INFO -  Waiting 2 seconds
INFO -   Updating tasks status: (Fri Feb  5 20:23:22 2016)
INFO -  Thread mafft_ginsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread clustalo_default-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_einsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_linsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -   (W) AlgTask (25 aa seqs, Mafft, /mafft_gins...ee_default)
INFO -   (D) AlgTask (25 aa seqs, Mafft, /mafft_gins...ee_default)
INFO -   (W) AlgTask (25 aa seqs, Clustal-Omega, /clustalo_d...ee_default)
INFO -   (Q) MultiSeqTask (25 aa seqs, MSF, /mafft_lins...ee_default)
INFO -   (D) MultiSeqTask (25 aa seqs, MSF, /mafft_lins...ee_default)
INFO -  Waiting 2 seconds
INFO -   Updating tasks status: (Fri Feb  5 20:23:24 2016)
INFO -  Thread mafft_ginsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread clustalo_default-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_einsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_linsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -   (W) TreeTask (25 aa seqs, FastTree, /mafft_gins...ee_default)
INFO -   (W) AlgTask (25 aa seqs, Clustal-Omega, /clustalo_d...ee_default)
INFO -   (W) AlgTask (25 aa seqs, Mafft, /mafft_lins...ee_default)
INFO -   (Q) MultiSeqTask (25 aa seqs, MSF, /mafft_eins...ee_default)
INFO -   (D) MultiSeqTask (25 aa seqs, MSF, /mafft_eins...ee_default)
INFO -  Waiting 2 seconds
INFO -   Updating tasks status: (Fri Feb  5 20:23:26 2016)
INFO -  Thread mafft_ginsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread clustalo_default-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_einsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_linsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -   (W) TreeTask (25 aa seqs, FastTree, /mafft_gins...ee_default)
INFO -   (Q) AlgTask (25 aa seqs, Clustal-Omega, /clustalo_d...ee_default)
INFO -   (D) AlgTask (25 aa seqs, Clustal-Omega, /clustalo_d...ee_default)
INFO -   (W) AlgTask (25 aa seqs, Mafft, /mafft_lins...ee_default)
INFO -   (W) AlgTask (25 aa seqs, Mafft, /mafft_eins...ee_default)
INFO -  Waiting 2 seconds
INFO -   Updating tasks status: (Fri Feb  5 20:23:28 2016)
INFO -  Thread mafft_ginsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread clustalo_default-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_einsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_linsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -   (R) TreeTask (25 aa seqs, FastTree, /mafft_gins...ee_default)
INFO -   (W) TreeTask (25 aa seqs, FastTree, /clustalo_d...ee_default)
INFO -   (Q) AlgTask (25 aa seqs, Mafft, /mafft_lins...ee_default)
INFO -   (W) AlgTask (25 aa seqs, Mafft, /mafft_eins...ee_default)
INFO -  Waiting 2 seconds
INFO -   Updating tasks status: (Fri Feb  5 20:23:30 2016)
INFO -  Thread mafft_ginsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread clustalo_default-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_einsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_linsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -   (R) TreeTask (25 aa seqs, FastTree, /mafft_gins...ee_default)
INFO -   (D) TreeTask (25 aa seqs, FastTree, /mafft_gins...ee_default)
INFO -   (W) TreeTask (25 aa seqs, FastTree, /clustalo_d...ee_default)
INFO -   (Q) AlgTask (25 aa seqs, Mafft, /mafft_lins...ee_default)
INFO -   (D) AlgTask (25 aa seqs, Mafft, /mafft_lins...ee_default)
INFO -   (Q) AlgTask (25 aa seqs, Mafft, /mafft_eins...ee_default)
INFO -  Waiting 2 seconds
INFO -   Updating tasks status: (Fri Feb  5 20:23:32 2016)
INFO -  Thread mafft_ginsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread clustalo_default-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_einsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_linsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -   (W) TreeMergeTask (25 aa seqs, TreeMerger, /mafft_gins...ee_default)
INFO -   (D) TreeMergeTask (25 aa seqs, TreeMerger, /mafft_gins...ee_default)
INFO -   (Q) TreeTask (25 aa seqs, FastTree, /clustalo_d...ee_default)
INFO -   (W) TreeTask (25 aa seqs, FastTree, /mafft_lins...ee_default)
INFO -   (R) AlgTask (25 aa seqs, Mafft, /mafft_eins...ee_default)
INFO -   (D) AlgTask (25 aa seqs, Mafft, /mafft_eins...ee_default)
INFO -  Waiting 2 seconds
INFO -  Assembling final tree...
INFO -  Done thread mafft_ginsi-none-none-fasttree_default in 1 iteration(s)
INFO -  Writing final tree for mafft_ginsi-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_ginsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.nw
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_ginsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.nwx (newick extended)
INFO -  Writing final tree alignment mafft_ginsi-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_ginsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.used_alg.fa
INFO -  Writing root node alignment mafft_ginsi-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_ginsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.fa
INFO -  Generating tree image for mafft_ginsi-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_ginsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.png
INFO -   Updating tasks status: (Fri Feb  5 20:23:35 2016)
INFO -  Thread clustalo_default-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_einsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_linsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -   (R) TreeTask (25 aa seqs, FastTree, /clustalo_d...ee_default)
INFO -   (D) TreeTask (25 aa seqs, FastTree, /clustalo_d...ee_default)
INFO -   (W) TreeTask (25 aa seqs, FastTree, /mafft_lins...ee_default)
INFO -   (W) TreeTask (25 aa seqs, FastTree, /mafft_eins...ee_default)
INFO -  Waiting 2 seconds
INFO -  Done thread mafft_ginsi-none-none-fasttree_default in 1 iteration(s)
INFO -   Updating tasks status: (Fri Feb  5 20:23:37 2016)
INFO -  Thread clustalo_default-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_einsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_linsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -   (W) TreeMergeTask (25 aa seqs, TreeMerger, /clustalo_d...ee_default)
INFO -   (D) TreeMergeTask (25 aa seqs, TreeMerger, /clustalo_d...ee_default)
INFO -   (R) TreeTask (25 aa seqs, FastTree, /mafft_lins...ee_default)
INFO -   (D) TreeTask (25 aa seqs, FastTree, /mafft_lins...ee_default)
INFO -   (W) TreeTask (25 aa seqs, FastTree, /mafft_eins...ee_default)
INFO -  Waiting 2 seconds
INFO -  Done thread mafft_ginsi-none-none-fasttree_default in 1 iteration(s)
INFO -  Assembling final tree...
INFO -  Done thread clustalo_default-none-none-fasttree_default in 1 iteration(s)
INFO -  Writing final tree for clustalo_default-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/clustalo_default-none-none-fasttree_default/NUP62.aa.fa.final_tree.nw
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/clustalo_default-none-none-fasttree_default/NUP62.aa.fa.final_tree.nwx (newick extended)
INFO -  Writing final tree alignment clustalo_default-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/clustalo_default-none-none-fasttree_default/NUP62.aa.fa.final_tree.used_alg.fa
INFO -  Writing root node alignment clustalo_default-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/clustalo_default-none-none-fasttree_default/NUP62.aa.fa.final_tree.fa
INFO -  Generating tree image for clustalo_default-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/clustalo_default-none-none-fasttree_default/NUP62.aa.fa.final_tree.png
INFO -   Updating tasks status: (Fri Feb  5 20:23:39 2016)
INFO -  Thread mafft_einsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -  Thread mafft_linsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -   (W) TreeMergeTask (25 aa seqs, TreeMerger, /mafft_lins...ee_default)
INFO -   (D) TreeMergeTask (25 aa seqs, TreeMerger, /mafft_lins...ee_default)
INFO -   (R) TreeTask (25 aa seqs, FastTree, /mafft_eins...ee_default)
INFO -   (D) TreeTask (25 aa seqs, FastTree, /mafft_eins...ee_default)
INFO -  Waiting 2 seconds
INFO -  Done thread mafft_ginsi-none-none-fasttree_default in 1 iteration(s)
INFO -  Done thread clustalo_default-none-none-fasttree_default in 1 iteration(s)
INFO -  Assembling final tree...
INFO -  Done thread mafft_linsi-none-none-fasttree_default in 1 iteration(s)
INFO -  Writing final tree for mafft_linsi-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_linsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.nw
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_linsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.nwx (newick extended)
INFO -  Writing final tree alignment mafft_linsi-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_linsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.used_alg.fa
INFO -  Writing root node alignment mafft_linsi-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_linsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.fa
INFO -  Generating tree image for mafft_linsi-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_linsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.png
INFO -   Updating tasks status: (Fri Feb  5 20:23:42 2016)
INFO -  Thread mafft_einsi-none-none-fasttree_default: pending tasks: 1 of sizes: 25
INFO -   (W) TreeMergeTask (25 aa seqs, TreeMerger, /mafft_eins...ee_default)
INFO -   (D) TreeMergeTask (25 aa seqs, TreeMerger, /mafft_eins...ee_default)
INFO -  Waiting 2 seconds
INFO -  Done thread mafft_ginsi-none-none-fasttree_default in 1 iteration(s)
INFO -  Done thread clustalo_default-none-none-fasttree_default in 1 iteration(s)
INFO -  Assembling final tree...
INFO -  Done thread mafft_einsi-none-none-fasttree_default in 1 iteration(s)
INFO -  Writing final tree for mafft_einsi-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_einsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.nw
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_einsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.nwx (newick extended)
INFO -  Writing final tree alignment mafft_einsi-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_einsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.used_alg.fa
INFO -  Writing root node alignment mafft_einsi-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_einsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.fa
INFO -  Generating tree image for mafft_einsi-none-none-fasttree_default
   /Users/jhc/_Devel/cookbook/recipes/test_aligners/mafft_einsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.png
INFO -  Done thread mafft_linsi-none-none-fasttree_default in 1 iteration(s)
INFO -  Done
INFO -  Deleting temporal data...
   ========================================================================
         The following published software and/or methods were used.        
               *** Please, do not forget to cite them! ***                 
   ========================================================================
   Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R,
      McWilliam H, Remmert M, Söding J, Thompson JD, Higgins DG. Fast,
      scalable generation of high-quality protein multiple sequence
      alignments using Clustal Omega. Mol Syst Biol. 2011 Oct 11;7:539.
      doi: 10.1038/msb.2011.75.
   Huerta-Cepas J, Dopazo J, Gabaldón T. ETE: a python Environment for Tree
      Exploration. BMC Bioinformatics. 2010 Jan 13;11:24.
   Katoh K, Kuma K, Toh H, Miyata T. MAFFT version 5: improvement in
      accuracy of multiple sequence alignment. Nucleic Acids Res. 2005 Jan
      20;33(2):511-8.
   Price MN, Dehal PS, Arkin AP. FastTree 2 - approximately maximum-
      likelihood trees for large alignments. PLoS One. 2010 Mar
      10;5(3):e9490.

After a few seconds, you should find four trees under the test_aligners/ directory:

In [11]:
%%bash
find test_aligners/ -name '*.final_tree.nw'
test_aligners/clustalo_default-none-none-fasttree_default/NUP62.aa.fa.final_tree.nw
test_aligners/mafft_einsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.nw
test_aligners/mafft_ginsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.nw
test_aligners/mafft_linsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.nw
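
If you prefer to stay in Python, the same lookup can be sketched with pathlib. The helper name `collect_final_trees` is ours, not part of ETE; it simply relies on the directory layout shown in the listing above, where each workflow writes its tree into a directory named after the workflow.

```python
from pathlib import Path


def collect_final_trees(outdir):
    """Map each workflow directory name to its '*.final_tree.nw' file."""
    return {
        path.parent.name: path
        for path in sorted(Path(outdir).rglob("*.final_tree.nw"))
    }


# Prints nothing if test_aligners/ does not exist yet.
trees = collect_final_trees("test_aligners")
for workflow, path in trees.items():
    print(workflow, path)
```

Keying the result by workflow name makes it easy to pick a reference tree later, for example `trees["clustalo_default-none-none-fasttree_default"]`.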

Let's compare the Clustal Omega-based tree with the three MAFFT-based trees.

In [24]:
%%bash 
ete3 compare -r test_aligners/clustalo_default-none-none-fasttree_default/NUP62.aa.fa.final_tree.nw \
    -t `find test_aligners/ -name '*.final_tree.nw'` --unrooted
source          | ref             | E.size  | nRF     | RF      | maxRF   | src-br+ | ref-br+ | subtre+ | treekoD
==============+ | ==============+ | ======+ | ======+ | ======+ | ======+ | ======+ | ======+ | ======+ | ======+
(..)alo_defaul+ | (..)alo_defaul+ | 25      | 0.00    | 0.00    | 44.00   | 1.00    | 1.00    | 1       | NA     
(..)_einsi-non+ | (..)alo_defaul+ | 25      | 0.59    | 26.00   | 44.00   | 0.72    | 0.72    | 1       | NA     
(..)_ginsi-non+ | (..)alo_defaul+ | 25      | 0.59    | 26.00   | 44.00   | 0.72    | 0.72    | 1       | NA     
(..)_linsi-non+ | (..)alo_defaul+ | 25      | 0.59    | 26.00   | 44.00   | 0.72    | 0.72    | 1       | NA     

Oops, the Clustal Omega tree differs from the rest: the Robinson-Foulds distance (RF column) is greater than 0 in the last three rows (26 out of a maximum of 44, so nRF = 0.59).
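
The numbers in the table can be checked by hand. The sketch below assumes both trees are fully resolved (binary) and compared as unrooted trees, in which case each tree has n - 3 internal branches and the maximum RF distance is 2(n - 3). The function names are ours; this is not a description of how ete3 computes the values internally, only arithmetic that agrees with the RF, maxRF and nRF columns above.

```python
def max_rf_unrooted(n_leaves):
    """Maximum RF distance between two binary unrooted trees on the same leaves."""
    return 2 * (n_leaves - 3)


def normalized_rf(rf, max_rf):
    """RF distance scaled to the range 0..1 (0 means identical topologies)."""
    return rf / max_rf if max_rf else 0.0


n_leaves = 25  # E.size column
rf = 26.0      # RF column, last three rows
max_rf = max_rf_unrooted(n_leaves)
nrf = normalized_rf(rf, max_rf)
print(max_rf, round(nrf, 2))  # prints: 44 0.59
```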

Let's look at where those differences in tree topology and alignment are:

In [25]:
from IPython.display import Image
Image(filename='test_aligners/clustalo_default-none-none-fasttree_default/NUP62.aa.fa.final_tree.png')
Out[25]:
In [23]:
Image(filename='test_aligners/mafft_linsi-none-none-fasttree_default/NUP62.aa.fa.final_tree.png')
Out[23]: