The ETE toolkit
A Python Environment for Tree Exploration


Organized by the Bioinformatics Core Facility at CRG and Toni Gabaldon's lab. Location: Centre for Genomic Regulation, Barcelona (Spain). Duration: 3 days. Dates: April 6th-8th, 2011.

Free registration (limited places). Registration deadline: CLOSED

(Added some location and accommodation info)


Material and examples are available upon request 

 

  1. Course Description
  2. Pre-requisites
  3. Instructors
  4. Course Program
  5. Accommodation info
  6. How to get to CRG

 

COURSE DESCRIPTION: Phylogenetic analyses are gradually reaching genomic scales. Nowadays, many resources and surveys encompass large numbers of trees that often cannot be analyzed manually. Bioinformatics toolkits provide a flexible framework for handling specific data programmatically, thus facilitating the analysis of large data collections. The Environment for Tree Exploration (ETE, http://ete.cgenomics.org) is a Python programming toolkit specifically focused on handling hierarchical trees. It can be used, for instance, to perform a number of operations on phylogenetic trees and to design automatic pipelines. It also provides a highly customizable drawing engine, which can be used to create complex annotated tree images automatically or to explore single trees interactively. Moreover, ETE is not limited to large-scale analyses: it can also be used to easily develop specific analysis methods for single trees. For instance, TreeKo is a novel software tool based on ETE that is designed to perform large-scale comparisons of tree topologies, taking into account speciation and duplication events.

 
The purpose of this course is to provide an introduction to the analysis of phylogenetic trees. It will cover a broad range of tasks that are usually required in any phylogenomic analysis: tree rooting, prediction of orthology and paralogy relationships, tree annotation, calculation of distances among sequences or species, tree pruning, tree comparison, and tree visualization. The use of large-scale phylogenomic resources, such as PhylomeDB or Ensembl Compara, will also be covered through examples and exercises. The course will be mostly practical and will focus on solving real-life examples.

COURSE PRE-REQUISITES:

Course attendees are expected to have basic programming skills (not necessarily in Python, although it is recommended*). All exercises will consist of developing Python scripts to perform different analyses on phylogenetic trees using the ETE toolkit in a GNU/Linux environment.

*Important Note: NO introduction to Python programming is scheduled in the course. However, Python is a very intuitive language that can be learned quickly if you have already programmed in other languages. As a reference, Chapters 3-7 and 9 of this tutorial would be more than enough to follow the whole course.

 

 

INSTRUCTORS:

Toni Gabaldon leads the Comparative Genomics group at the Centre for Genomic Regulation and is Associate Professor at the Universitat Pompeu Fabra. He obtained his PhD in comparative genomics at the Radboud Universiteit Nijmegen (The Netherlands) in 2005. His group has made significant contributions to the development of tools for phylogenomic analysis, including PhylomeDB [2], ETE [3], TreeKo, trimAl, and MetaPhOrs, as well as to the exploitation of phylogenomic data for understanding the evolution and function of complex biological processes. Gabaldon has extensive experience in teaching bioinformatics and molecular evolution at the undergraduate and graduate levels.

homepage: http://big.crg.cat/people/toni_gabaldon


 

Jaime Huerta-Cepas is a postdoctoral researcher in the Comparative Genomics group at the Centre for Genomic Regulation. He obtained his PhD in human genome evolution [1] and large-scale phylogenetic analyses at the Universidad Autónoma de Madrid in 2008. Jaime is the main developer of the PhylomeDB database [2] and the ETE toolkit [3]. His work focuses on applying large-scale phylogenetic analyses to different biological problems, such as understanding gene duplication, the evolution of gene expression, functional genome annotation, orthology and paralogy prediction, and the reconstruction of the species Tree of Life. Personal homepage: http://jhcepas.cgenomics.org


 

Marina Marcet-Houben obtained her degree in Biochemistry at the Rovira i Virgili University (Tarragona, Spain). She did her PhD on fungal phylogenomics in the Comparative Genomics lab at the Centre for Genomic Regulation in Barcelona. Her main research interests are the use of large-scale phylogenomics tools to study the evolution of fungi [4] and the robustness of species trees [5]. Marina is an active collaborator in the PhylomeDB project and the main developer of TreeKo, a tool for comparing phylogenetic tree topologies.


[1] Huerta-Cepas J, Dopazo H, Dopazo J, Gabaldón T. The Human Phylome. Genome Biology. 2007;8:R109.

[2] Huerta-Cepas J, Bueno A, Dopazo J, Gabaldón T. PhylomeDB: A database for complete collections of gene phylogenies. Nucleic Acids Res. 2008 Jan;36(Database issue):D491-6.

[3] Huerta-Cepas J, Dopazo J, Gabaldón T. ETE: A python Environment for Tree Exploration. BMC Bioinformatics. 2010;11:24.

[4] Marcet-Houben M, Gabaldón T. Acquisition of prokaryotic genes by fungal genomes. Trends Genet. 2010 Jan;26(1):5-8.

[5] Marcet-Houben M, Gabaldón T. The tree versus the forest: the fungal tree of life and the topological diversity within the yeast phylome. PLoS One. 2009;4(2):e4357. Epub 2009 Feb 3.

COURSE PROGRAM:

 

Wed, April 6th
Day #1
09:30 - 11:00
  • 1. Lectures: Introduction to phylogenomics
  • 1.1 General phylogenetic pipeline
  • 1.2 Phylogenomics: scope and applications
  • 2. ETE basics (brief introduction)
  • 2.1 Installing ETE
  • 2.2 Using ETE libraries
  • 2.3 Interactive ETE sessions using ipython
  • 2.4 Tree basics
11:00 - 11:30 Coffee Break
11:30 - 13:30
  • 3. Basic Tree operations (examples and exercises)
  • 3.1 Reading and writing trees
  • 3.2 Creating trees from scratch
  • 3.3 Browsing tree topology: performing per node operations
  • 3.4 Searching nodes by their attributes
  • 3.5 Modifying trees
  • 3.6 Rooting, pruning, splitting and concatenating trees
13:30 - 15:00 Lunch Break
15:00 - 16:00
  • 4. Tree annotation (brief introduction, examples and exercises)
  • 4.1 Adding extra information to the tree nodes
  • 4.2 Exporting data using the extended newick format
  • 4.3 Exploring trees attributes using the GUI
16:00 - 16:30 Tea Break
16:30 - 18:00 GLOBAL EXERCISE: Functional annotation of the newly sequenced genome of the pea aphid
Thu, April 7th
Day #2
09:30 - 11:00
  • 5. Lectures: Orthology and paralogy
  • 6. Phylogenetic trees in ETE (brief introduction)
11:00 - 11:30 Coffee Break
11:30 - 13:30

(examples and exercises)

  • 6.1 Associating nodes with multiple sequence alignments
  • 6.2 Species aware trees
  • 6.3 Species guided rooting
  • 6.4 Detecting monophyletic clades
  • 6.5 Detecting orthology and paralogy relationships
  • 6.5.1 Tree reconciliation
  • 6.5.2 The species overlap algorithm
  • 6.5.3 Working with speciation and duplication events
  • 6.6 Relative dating of duplication events
13:30 - 15:00 Lunch Break
15:00 - 16:00
  • 7. Comparing tree topologies (introduction and exercises)
  • 7.1 Phylip package: consensus & treedist programs
  • 7.2 Comparing tree topologies using TreeKo
16:00 - 16:30 Coffee Break
16:30 - 18:00
  • 8. Building clustering trees and phylogenetic profiles (brief introduction, examples)
  • 8.1 Linking trees to numeric profiles and matrices
  • 8.2 Using profiles as node properties
  • 8.3 Visualizing trees with profiles: tree heatmaps and profiling plots

GLOBAL EXERCISE: Functional annotation of the newly sequenced genome of the pea aphid

Fri, April 8th
Day #3
09:30 - 11:00
  • 9. Lectures: Dealing with large collections of trees: Phylomes
  • 9.1 Public phylogenomic resources: PhylomeDB, Ensembl Compara, TreeFam
  • 10. PhylomeDB API (brief introduction)
11:00 - 11:30 Coffee Break
11:30 - 13:30
  • 11. The programmable tree drawing engine
  • 11.1 Controlling node aspect (NodeStyle)
  • 11.2 Adding graphical information to nodes (Faces)
  • 11.2.1 Types of NodeFaces
  • 11.2.2 NodeFace properties
  • 11.2.3 NodeFace placement
  • 11.3 Dynamic control of image rendering (TreeStyle and layout functions)
  • 11.4 Rendering trees as PNG, SVG or PDF images
13:30 - 15:00 Lunch Break
15:00 - 16:00
  • 12. Advanced topics
  • 12.1 Webplugin (interactive tree images on the web)
  • 12.2 Nexml and PhyloXML formats
  • 12.3 Creating custom Node Faces
16:00 - 16:30 Tea Break
16:30 - 18:00
  • 13. Course wrap-up
  • 13.1 Comments on the exercises
  • 13.2 Questions & requests

* Some useful information about accommodation

 There are two student residences within walking distance:

*Residència Campus del Mar*
Passeig Salvat Papasseit, 4
08003 - Barcelona
Tel +34 93 390 4000
Fax +34 93 310 6627
campusdelmar@resa.es
http://www.resa.es/eng/residencias/campus_del_mar#
MAP - R Campus DEL MAR- PRBB.pdf


*Residència Ciutadella*
Pg. Pujades 33-37
ciutadella@resa.es
www.resa.es/esl/residencias/la_ciutadella

MAP - Ciutadella guesthouse.

 

 

 

ETE is developed as an academic free software tool. If you find ETE useful for your work, please cite:

Jaime Huerta-Cepas, Joaquín Dopazo and Toni Gabaldón. ETE: a python Environment for Tree Exploration. BMC Bioinformatics 2010, 11:24. doi:10.1186/1471-2105-11-24

Support mailing list: etetoolkit@googlegroups.com
Contact: huerta@embl.de


The ETE Toolkit was originally developed at the bioinformatics department of CIPF and greatly improved at the comparative genomics unit of CRG. At present, ETE is maintained by Jaime Huerta-Cepas at the Structural and Computational Biology unit of EMBL (Heidelberg, Germany).