Course Information & Syllabus
This course is a fast-paced survey of modern bioinformatics. The topics covered will be of interest to computer scientists, statisticians, and biologists alike, and we will have several guest lecturers from the forefront of research. Motivation is important, as we are going to cover a lot of material. Some background in statistics at the level of Stat 116, programming at the level of CS106B, and basic bioinformatics at the level of CS262 or BMI214 is recommended. See below for formal prerequisites.

Bulletin description: (Listed as STATS 166; Graduate students register for STATS 366; same as BIOMEDIN 366.) Emphasis is on analysis of genomic scale data sets with multivariate methods. Comparative genomics: whole genome phylogeny, genome alignment, identification of constrained elements, genomic motif finding, ENCODE project. Functional genomics: interaction networks, random networks, data integration, tests for functional enrichment, network alignment, deterministic and stochastic genetic circuits, data-driven circuit reconstruction. Population genomics: principles of molecular evolution, tests for selection, HapMap, structural variation. Recommended: familiarity with R, Perl, Unix, basic bioinformatics (sequence alignment and protein structure) at the level of BIOC 218, BMI214/CS274, or CS 262.

Class Schedule
Tuesdays and Thursdays, 9:15 - 10:30 am, Clark Center S361
Exceptions: 1/9, 1/16, 1/30 (in Clark S360), 1/23 (starts at 9:30 am in S361), 3/8 (in Clark S362)

Instructor
Dr. Balaji S. Srinivasan
balajis_at_stanford_dot_edu
Office Hours: Tuesday 10:30 - 11:30 am, Clark Center S251

Teaching Assistant
Hua Zhou
hwachou_at_stanford_dot_edu
Office Hours: Thursdays 4 - 5:30 pm, Sequoia 234

Feel free to contact us about general course content, but please ask HW questions on the blog, as your fellow students may already have asked the same question.

Mailing List
Please use mailman to subscribe to stat366-win0607-announce if you have not already been subscribed.

Course requirements:

  • Two homework assignments
  • Final project and presentation
Homework: There will be two homework assignments at the beginning of the course, focusing on tools and methods. Homework will normally be assigned on Wednesdays and due 2 weeks later. Late homework will incur a 10% penalty for each day it is late.

You are allowed, even encouraged, to work on the homework in small groups, but you must write up your own homework to hand in. Homework will be discussed in class, and homework questions should be posted on the blog. Assignments will involve significant programming in Perl and R in a Unix environment. These tools will be introduced over the course of the class, but previous programming experience is recommended. Any coding assignment must include a printout of the full source code when handed in. Homework will be graded on a scale of roughly 100 points.

Grading: Homework 40%, Final Project 60%. These weights are approximate; we reserve the right to change them later.

Prerequisites
We are going to cover a lot of ground in this class, so motivation is the most important prerequisite.

  1. Soft prerequisite: while you are not expected to be an expert in any of the following, ideally you should have exposure to a scripting language such as Python or Perl, some familiarity with Unix, and experience with a high-level programming environment such as Matlab, Mathematica, or R. You should also have a basic familiarity with bioinformatics at the level of BIOC 218, BMI 214/CS 274, or CS 262.
  2. Hard prerequisite: You should also have a fairly solid grounding in basic probability and statistics at the level of Stat 110 or Stat 116.
  3. Recommended: Exposure to machine learning and linear algebra will be very useful. You will learn by doing, but some background will help.
I list programming and basic bioinformatics as soft prerequisites because you can pick them up over the course of the quarter. The probability/statistics requirement, however, is a hard prerequisite. If you don't have at least one or two courses in probability under your belt, the course will likely be too fast, as we are going to dive into multivariate statistics from day one. Of course, if you're willing to work, you are the best judge of what you can handle.

Textbook and optional references: This is a research-level course, and no one textbook covers all the computational biology material. Lecture notes will be available from the class web page. However, the following textbooks on programming tools and on human evolutionary genetics are required. These will be of use to you in any future work you do in bioinformatics, and are well worth the investment.

Lecture Notes
Homeworks
Please ask HW questions on the blog to benefit your fellow students! Thanks!
Code