Botany 563 Phylogenetic Analysis of Molecular Data (UW-Madison)
A course in the theory and practice of phylogenetic inference from DNA sequence data. Students will learn all the necessary components of state-of-the-art phylogenomic analyses and apply that knowledge to the analysis of data from their own organisms.
- Spring 2022: Wednesday and Friday 2:30-3:45pm (Russell A228)
- Instructor: Claudia Solis-Lemus, PhD
- Email: [email protected]
- Website: https://solislemuslab.github.io/
- Office hours: Wednesday 3:45-4:30pm, or by appointment
Learning outcomes
By the end of the course, you will be able to
- Explain in detail all the steps in the pipeline for phylogenetic inference and how different data and model choices affect the inference outcomes
- Plan and produce reproducible scripts for the analysis of your own biological data
- Justify the data and model choices in your own data analysis
- Interpret the results of the most widely used phylogenetic methods in biological terms
- Orally present the results of your own phylogenomic data analyses, following best practices for science and reproducibility
Textbooks and references
- Phylogenetics in the Genomic Era (open access book) by Celine Scornavacca, Frederic Delsuc and Nicolas Galtier (denoted HAL in the schedule)
- Tree thinking: an introduction to phylogenetic biology by David Baum and Stacey Smith (optional: denoted Baum in the schedule)
- The Phylogenetic Handbook by Philippe Lemey, Marco Salemi and Anne-Mieke Vandamme (optional: denoted HB in the schedule)
- The full list of papers used in this class can be found in this link
Schedule 2022
Session | Topic | Pre-class work | At the end of the session | Lecture notes | Homework | HW due |
---|---|---|---|---|---|---|
01/26 | Introduction | | You will know the structure of the class, the learning outcomes, and the grading | lecture1.md | Go over ready-for-class checklist | |
01/28 | Motivation: why learn phylogenomics? | Read HAL 2.1 | You will identify the different components in phylogenomic analyses | lecture2.md | Read HAL 2.1 and do canvas quiz and read Jermiin2020 | 01/28 |
02/02 | Reproducibility crash course | Review shell resources and do canvas quiz | You will prioritize reproducibility and good computing practices throughout the semester (and beyond) | lecture3.md | ||
02/04 | Continue with reproducibility | Have git installed | | | Reproducibility HW | 02/09 |
02/09 | Introduction to sequences | Watch video1, video2, and do canvas quiz | You will be able to describe the next-generation sequencing pipeline (and UCE pipeline) as well as the post-processing bioinformatics steps for quality control | lecture4.md | Sequencing HW | 02/18 |
02/11 | Alignment | | You will be able to explain the most widely used algorithms for multiple sequence alignment | lecture5.md | Needleman-Wunsch HW and canvas quiz | 02/23 |
02/16 | Continue with alignment | | | lecture5-2.md | 1) Read Alignathon paper; 2) Choose and run an alignment method on your data (github commit) | 03/02 |
02/18 | Continue with alignment | One paper assigned per student: 1) ClustalW, 2) MUSCLE, 3) T-Coffee | | lecture5-3.md | | |
02/23 | Filtering and Orthology detection | Optional HAL 2.2, 2.4; Make sure to add info on your data in the slides | You will know about the different filtering and orthology inference methods | lecture6.md | 1) Read Nichio2017; 2) Choose one orthology detection method, read its paper and add one slide about it in the class google slides | 03/09 |
02/25 | Overview of phylogenetic inference | | You will be able to explain the overall methodology of phylogenetic inference as well as the main weaknesses | lecture7.pdf | | |
03/02 | Distance and parsimony methods | Install R and optional readings: HB Ch 5-6, Baum Ch 7-8 | You will be able to explain both algorithms to reconstruct trees: 1) based on distances and 2) based on parsimony | lecture8.md | ||
03/04 | Continue with distance and parsimony methods | | | | Run distance and parsimony methods on your own data (git commit) | 03/23 |
03/09 | Models of evolution | HAL 1.1 and canvas quiz | You will be able to explain the main characteristics and assumptions of the substitution models | lecture9.pdf | ||
03/11 | Continue with models of evolution | Make sure to add info on your orthology method in the slides | ||||
03/16 | Spring break | |||||
03/18 | Spring break | |||||
03/23 | Maximum likelihood | HAL 1.2 and canvas quiz | You will be able to explain the main steps in maximum likelihood inference and the strengths and weaknesses of the approach | lecture10.pdf | ||
03/25 | Continue maximum likelihood | Two papers assigned per student: 1) IQ-Tree papers: one, two; 2) RAxML papers: one, two | | lecture10-2.md | Choose an ML method to run on your own data | 04/08 |
03/30 | Bayesian inference | HAL 1.4 and canvas quiz | You will be able to explain the main components of Bayesian inference and their effect on the inference performance | lecture12.pdf | ||
04/01 | Continue Bayesian inference | Read Nascimento et al, 2017 and quiz | | | Read YangRannala1997 | |
04/06 | Continue Bayesian inference | Read depending on your canvas group: 1) MrBayes papers: one, two; 2) Larget and Simon, 1999 | | lecture12-2.md | Run MrBayes on your own data | 04/20 |
04/08 | The coalescent | HAL 3.1 and quiz, HAL 3.3 and quiz | You will be able to explain the coalescent model for species trees and networks | lecture14.pdf | ||
04/13 | Continue with the coalescent | One paper per student: ASTRAL or BUCKy | | lecture14-2.md | Run ASTRAL or BUCKy on your own data | 04/29 |
04/15 | Continue with the coalescent | SNaQ chapter and quiz | | lecture14-3.pdf | | |
04/20 | Co-estimation methods | Optional reading: HB 18 | You will be able to explain the main components of co-estimation methods and follow the BEAST tutorial | lecture15.md | ||
04/22 | Continue with co-estimation methods | Read BEAST papers: one, two | | lecture15-2.md | | |
04/27 | Discussion: Measures of support | One per group: 1) Stenz2015, 2) Lemoine2018, 3) Anisimova2006, 4) Sayyari2016 | You will be able to compare and contrast the different ways in which we can measure confidence in our phylogenetic estimates | Slides | ||
04/29 | Discussion: Coalescent vs concatenation | All: HAL 3.4. One per group: 1) Springer2018, 2) Mendes2018, 3) Philippe2017, 4) Springer2016, 5) Edwards2016 | You will be able to justify the choice of concatenation vs coalescent in specific scenarios | Slides | ||
05/04 | Discussion: Phylogenomics pitfalls | One per group: 1) Bravo2019, 2) Shen2017, 3) Young2020, 4) Steel2005 | You will be able to describe and analyze some of the main pitfalls of phylogenomic analysis of big data | Slides | ||
05/06 | What else is out there? | Read Jermiin2020 again | You will hear a brief overview of topics not covered in this class and will have access to resources to learn more | lecture16.md | ||
05/09 | Final project due | |||||
05/11 | Project presentations | |||||
05/13 | Project presentations |
More details
See the syllabus for the list of topics, grading, and academic policies