Computational Molecular Biology

CSE 5615/4510
Florida Institute of Technology
Instructor: Debasis Mitra


Department: Computer Sciences

Abstract

Ever since DNA was found to be composed of four letters (a, t, c, and g), "life" has been showing its digital face. Life seems to be partly biology and partly computing, and for that reason the computer is becoming an increasingly important instrument for doing biology, particularly molecular biology.

In this course on computational biology we will introduce some important problems and algorithms in computational molecular biology. Some introduction to data structures and algorithms may be provided, but students are expected to have a background in the fundamentals of programming. Similarly, some necessary introduction to biology will also be provided. A sample of past activities can be seen below.



The syllabus is here.

The new text is An Introduction to Bioinformatics Algorithms by Neil Jones and Pavel Pevzner, MIT Press, 2004, ISBN: 0-262-10106-8

Background
SeqAlignment
Fragment Assembly
Phylogenetic Tree
Structure Prediction
A tutorial on HMM

Spring 2006
---------------
Assignment 1 (due 1/19/06)

Assignment 2 (points 30): (1) Answer the questions on the GenBank and Swiss-Prot databases (print the questions too).
(2) Questions 4.15, 4.16 and 4.17 from the text (p. 122-3).
(3) Analyze the complexities of the algorithms "BruteForceMotifSearch" (p. 109) and "SimpleMedianSearch" (p. 113). Do not use the book's analyses, even if you arrive at the same results. (due 2/10/06)

Assignment 3 (points 50, FINALLY Due: May 4, 06)

There will be a guest lecture by Dr. Leonard on Tuesday, February 28.

Projects
--(Due: Presentation on May 2, '06, 7:30-10:30 pm) SEE ANNOUNCEMENT -- Presentation schedule (Room Olin EC 239-240):
System biology: 7:30-8:30 pm. (Gary Hrezo and Weijung Huang)
Protein Docking: 8:30-9:30 pm. (Johannes Nangolo and Christopher Roach)
Correlogram method in protein classification: 9:30-10:30 pm. (Kyle Cacciatore and Stephen Jonsson)


Spring 2005
---------------
Assignment 1 (due 2/8/05): Text Exc. 1, 2, 3 on page 30

Biology Presentation schedule:
Robert Asfar - 2/8/05 Florent Launay - 2/8/05
Park Sung Hoon - 2/10/05 Ram, Anjali - 2/10/05

Programming assignment:
Implement the global alignment dynamic programming algorithm. (Due: 2/12/05)
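
As a starting point for the global alignment assignment above, here is a minimal sketch in Python of the standard dynamic programming approach (often called Needleman-Wunsch). The function name `global_alignment` and the scoring values (match +1, mismatch -1, linear gap penalty -2) are illustrative assumptions, not part of the assignment statement; substitute whatever scoring scheme the class uses.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment.

    The scoring parameters are assumed defaults, not values from the course.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1] (1-based table indices).
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # b[:j] aligned against gaps only

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # a[i-1] against a gap
                score[i][j - 1] + gap,            # b[j-1] against a gap
            )

    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(global_alignment("ACGT", "AGT"))
```

The table takes O(nm) time and space for sequences of lengths n and m. When several alignments tie for the best score, this traceback prefers the diagonal move, so it reports only one of them.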

Projects.
Project proposal due 3/17/05, Thursday.

Presentations (BLAST: Rob, PAM: Anjali, Suffix tree: Park): 3/15/05 Tuesday (15 min)

Quiz on Fragment assembly: 3/24/05 Thursday
I will let you complete it in the next class Thursday 3/31/05, for about 20 minutes at the end of the class

Programming assignment 2 (Due: 4/20/05 Thursday): Implement the dynamic programming algorithm for RNA base-pairing prediction under the simplest assumption. Use alpha values as follows: alpha(ri,rj) = -2 if (ri,rj) = (A,U), (U,A), (G,C), or (C,G); alpha(ri,rj) = 0 otherwise. The program should work on any string of length up to 100.
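
The recurrence behind this assignment can be sketched as follows in Python. This is a minimal sketch under stated assumptions, not the text's exact formulation: it treats base pairs as independent, forbids crossing pairs (knots), and enforces no minimum hairpin loop length, so adjacent bases may pair. The names `alpha` and `rna_min_energy` are hypothetical, and the function returns only the minimum total energy, not the pairing itself.

```python
# Base pairs that receive the favorable energy of -2, as given in the assignment.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def alpha(x, y):
    """Energy contribution of pairing bases x and y."""
    return -2 if (x, y) in PAIRS else 0


def rna_min_energy(r):
    """Minimum total energy over knot-free pairings of the RNA string r.

    E[i][j] holds the minimum energy of the substring r[i..j] (inclusive).
    """
    n = len(r)
    if n == 0:
        return 0

    # Single bases (and empty spans) have energy 0.
    E = [[0] * n for _ in range(n)]

    # Fill the table by increasing substring length.
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            # Energy of the part strictly between i and j (0 if it is empty).
            inner = E[i + 1][j - 1] if i + 1 <= j - 1 else 0
            best = min(
                E[i + 1][j],                   # r[i] left unpaired
                E[i][j - 1],                   # r[j] left unpaired
                inner + alpha(r[i], r[j]),     # r[i] paired with r[j]
            )
            # Split r[i..j] into two independent substructures.
            for k in range(i + 1, j):
                best = min(best, E[i][k] + E[k + 1][j])
            E[i][j] = best

    return E[0][n - 1]


if __name__ == "__main__":
    print(rna_min_energy("GGGAAACCC"))
```

The three nested loops give O(n^3) time and O(n^2) space, which is comfortable for strings of length up to 100. Recovering the actual base pairs would require a traceback through the table, which this sketch leaves out.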

Presentation schedule:
Protein structure prediction: Rob Asfar: 4/14-19/05
Anjali Ram: 4/19-21/05
System biology: Park: 4/21-26/05

PROJECT PRESENTATION: THURSDAY 5/5/05 EXAM TIME
POWER-POINT PRESENTATION+DEMO, MAX 40 MIN, MIN 20 MIN
IN CLASSROOM OR IN MY OFFICE

-----------------------------
Resources:

A decent introduction to molecular biology by Hunter.

Some information on Human Genome Project is here.

Some collection of important web databases / tools prowl.

Conferences:

Intelligent Systems for Molecular Biology ISMB.

International Conference on Research in Computational Molecular Biology RECOMB.

IEEE Computational Systems Bioinformatics Conference CSB.

-----------------------------
A tutorial on BLAST.

-----------------------------
Spring 2005:

Class Time: Tuesday Thursday 6:30-7:45 pm
Room: E250
-----------------------------
Spring 2003:

(The notes below are primarily from the submissions from the students in Spring 2003, particularly those from Michael Smith.)

Class schedule: Monday-Wednesday 11:00 am - 12:15 pm
Meets at: Room 132EC
Chapter 1: Introduction to Biology lecture notes.

Some database search procedures: here.

String comparison algorithms: from the algorithms textbook by Cormen et al., embedded in my lecture notes for the Algorithms class.

Chapter 3: Sequence comparison lecture slides.

Chapter 4: Fragment Assembly lecture slides.

Chapter 6: Phylogenetic Trees lecture notes.

Chapter 8: Molecular Structure Prediction was not covered this time.

Chapter 9: DNA computing lecture slides.

Project description.

A self study done on Sickle Cell Anemia, some notes.

-----------------------------
Spring 2004:

Projects:

Expectation: (1) a literature survey on the current status of the field, evidenced by bibliography development and one or more presentations, and (2) a software implementation. Both (1) and (2) are required of graduate students, and only (2) of undergraduate students.
A typed report of approximately 5 pages, data from the experiments, and (outside the 5 pages) source code will be due. E-mail, CD, or floppy in any format is acceptable.

Due date for report submission: April 15 or next class to that date.

System Biology of the E. coli cell division process. (Data: Prof. Leonard) Michel Lacle

Implementation of the BLAST alignment algorithm and sequence distance measurement between protein chains (subsequently to be expanded toward use of the Correlogram method of Huang et al., as an MS Thesis). (Data: Protein Data Bank) Gandhali Samant
Microarray data clustering algorithm implementation. (Data: ??) Sunjit Bir

Instance-based learning implementation for clustering sequences. (Data: Prof. Leonard / PDB) Manav Rattan

Fragment assembly implementation.(Data: ??) Aditi Gupta

Phylogeny reconstruction implementation. (Data: ??) Lalit Samant

Helix, sheet (secondary structure) prediction, and solvent accessibility prediction (1D structure) using DSP and homology modeling techniques. (Data: PDB) Seema Gandhi

Deploying the matrix method and the dynamic programming method to detect motifs in some nucleotide sequences, and then representing the sequences based on existing motifs (Ref: Gaurv Tandon's work on computer security). (Data: Prof. Leonard) Carl Harroch

-----------------------------


Materials are copyrighted to me (year 2003), or shared with the acknowledged students, as the case may be. E-mail: dmitra@zach.fit.edu