| This course will be
a fast-paced survey of modern bioinformatics. The topics covered
will be of interest to computer scientists, statisticians, and
biologists alike, and we will have several guest lecturers from
the forefront of research. Motivation is important, as we are
going to cover a lot of material; some background in statistics
at the level of Stat 116, programming at the level of CS106B, and
basic bioinformatics at the level of CS262 or BMI214 is
recommended. See below for formal prerequisites.
Bulletin description: (Listed as STATS 166; Graduate students register for
STATS 366; same as BIOMEDIN 366.) Emphasis is on analysis of genomic scale
data sets with multivariate methods. Comparative genomics: whole
genome phylogeny, genome alignment, identification of constrained
elements, genomic motif finding, ENCODE project. Functional genomics:
interaction networks, random networks, data integration, tests for
functional enrichment, network alignment, deterministic and stochastic
genetic circuits, data-driven circuit reconstruction. Population
genomics: principles of molecular evolution, tests for selection,
HapMap, structural variation. Recommended: familiarity with R, Perl,
Unix, basic bioinformatics (sequence alignment and protein structure)
at the level of BIOC 218, BMI 214/CS 274, or CS 262.
Class Schedule: Tuesdays and Thursdays, 9:15 - 10:30 am, Clark Center S361
Exceptions: 1/9, 1/16, 1/30 (in Clark S360), 1/23 (starts at 9:30 am in
S361), 3/8 (Clark S362)
Instructor
Dr. Balaji S. Srinivasan
balajis_at_stanford_dot_edu
Office Hours: Tuesday 10:30 - 11:30 am, Clark Center S251
Teaching Assistant
Hua Zhou
hwachou_at_stanford_dot_edu
Office Hours: Thursdays 4-5:30, Sequoia 234
Feel free to contact us about general course content, but please ask HW questions on the blog, as your fellow students may have already
asked the same question.
Mailing List: Please use mailman
to subscribe to stat366-win0607-announce if you have
not already been subscribed.
Course requirements:
- Two homework assignments
- Final project and presentation
Homework: There will be two homework assignments at the
beginning of class, focusing on tools and methods. Homework will normally be assigned on Wednesdays and
due 2 weeks later. Late homework will incur a 10% penalty for
every day it is late.
You are allowed, even encouraged, to work on the homework in small
groups, but you must write up your own homework to hand
in. Homework will be discussed in class, and
homework questions should be posted on the blog.
Assignments will involve significant programming in Perl and
R in a Unix environment. These tools will be introduced over the
course of the class, but previous programming experience is
recommended. Any coding assignments must include a
printout of the full source code when handed in. Homework will be
graded on a roughly 100-point scale.
Grading: Homework 40%, Final Project 60%. These
weights are approximate; we reserve the right to change them
later.
Prerequisites
We are going to cover a lot of
ground in this class, so motivation is the most important
prerequisite.
- Soft prerequisite: while you are not expected to be an expert
in any of the following, ideally you should have exposure
to a scripting language such as Python or Perl, some familiarity
with Unix, and experience with a high-level programming environment
such as Matlab, Mathematica, or R. Ideally, you should also have a
basic familiarity with bioinformatics at the level of BIOC 218,
BMI 214/CS 274, or CS 262.
- Hard prerequisite: You should also have a
fairly solid grounding in basic probability and statistics
at the level of Stat 110 or Stat 116.
- Recommended: Exposure to machine learning and linear algebra
will be very useful. You will learn by doing, but some background
will help.
The reason that I list programming and basic bioinformatics
as soft prereqs is that these are things that you can pick
up over the course of the quarter. However, I would say that
the probability/statistics requirement is a hard prereq. If
you don't have at least one or two courses in probability
under your belt, the course will likely be too fast, as we are
going to dive into multivariate statistics from day one. Of
course, if you're willing to work, then you are the best judge
of what you can handle.
Textbook and optional references: This is a research-level
course, and no one textbook covers all the computational biology material.
Lecture notes will be available from the class
web page. However, the following textbooks on programming
tools and on human evolutionary genetics are required. These
will be of use to you in any future work you do in
bioinformatics, and are well worth the investment.
Optional
- Unix
- Perl
- R Language
- Probability and Statistics
- Comparative Genomics
- There is no textbook that really covers this material as it
is bleeding edge. The lecture notes should suffice.
- Functional Genomics and Systems Biology
- Population Genomics
|