CS 6293 Advanced Topics: Current Bioinformatics

News and Announcements

11/24: Homework 4 is now available. Electronic submission due 11:59pm, Dec 12, Sunday.

10/20: Homework 3 is now available. Electronic submission due 8:30pm, Nov 17, Wednesday.

10/10: Homework 2 is now available. The problem-solving part is due on Monday, Oct 25 before class starts (the presentation will be on Nov 1).

10/07: NGS analysis papers for HW2 presentation.

9/20: Homework 1 is available for download. Due on Monday, Oct 4 before class starts.

8/25: Week 1 lecture slides and reading materials uploaded.

8/21: Welcome! Please take some time to complete a survey form and email it to me. Thank you!

Overview | Prerequisite | Time and Location | Instructor | Textbooks and Resources | Policies | Lecture Schedule and Slides

Overview

This course provides a review of the most recent developments and open research problems in bioinformatics and computational biology. The course is self-contained and does not assume prior background in biology or bioinformatics. However, you do need a good background in algorithms and statistics.

This year we will focus primarily on data mining and combinatorial optimization algorithms for the analysis of next-generation DNA sequencing data (translation: a huge number of short strings) and biological networks (translation: complex graphs with tens of thousands of vertices). The material of the course will be tailored to the interests of the participants. Some topics that will be covered are:

  1. Short-read assembly problem - how to efficiently and accurately assemble billions of short DNA subsequences (20-30 chars) into the much longer original sequence (e.g. the human genome)?
  2. Peak detection for next-generation sequencing data - how to detect peak signals against a noisy background?
  3. Gene regulatory network construction and modeling - how to predict the wiring between genes and use it to model the behavior of the cell?
  4. Machine learning methods for disease classification and drug discovery - how to cluster/classify diseases into subgroups and predict more effective personalized drugs, based on molecular markers and, more importantly, interactions between individual markers?
  5. Network topology analysis - what makes biological networks (and other naturally evolved networks) different from human-made networks, and what can we learn from them?
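
To give a feel for topic 1, here is a toy sketch of greedy overlap-based assembly: repeatedly merge the two reads whose suffix/prefix overlap is longest. This is only an illustration under simplifying assumptions (error-free reads, a single strand, no repeats); the function names `overlap` and `greedy_assemble` and the `min_len` parameter are made up for this example, and it is not necessarily the approach covered in class. Its quadratic pairwise comparison would not scale to billions of reads, which is part of what makes the problem interesting.

```python
def overlap(a: str, b: str, min_len: int = 1) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` characters exists.
    """
    max_k = min(len(a), len(b))
    for k in range(max_k, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 1) -> list[str]:
    """Greedily merge the pair of sequences with the largest overlap.

    Stops when one sequence remains or no pair overlaps by `min_len`.
    Returns the remaining sequences (contigs).
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        # Compare every ordered pair (i, j); O(n^2) per round, toy-sized only.
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    # Hypothetical reads sampled from the string "ACGTACGGATTC".
    print(greedy_assemble(["ACGTAC", "TACGGA", "GGATTC"], min_len=3))
```

With the three made-up reads above and `min_len=3`, the sketch recovers the single string `ACGTACGGATTC`. Greedy merging is not guaranteed to find the shortest or the correct reconstruction in general, especially when the sequence contains repeats.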

Prerequisite

This course is primarily designed for graduate students and advanced undergraduates in the Computer Science department. There is NO formal prerequisite for the course. However, participants are expected to have a background in data structures and algorithms, programming experience in at least one of the following languages (C/C++/C#/Java/Perl/Python/Matlab), and some knowledge of probability and statistics. Some prior exposure to biology is preferred, but not required, as we will introduce basic biological concepts and terms along the way. Students from other backgrounds who are comfortable with programming and algorithms are also welcome. Talk to me if you are not sure.

Time and Location

We meet in room SB 1.02.08. Lectures are Monday and Wednesday, 8:30-9:45 PM.

Instructor

Dr. Jianhua Ruan
Office location: S.B. 4.01.48
Office hours: Wednesday 2:00-3:00pm, or by appointment
Email: jruan@cs.utsa.edu
Phone: (210) 458-6819

Textbooks and Resources

No textbook is required for this course. The instructor will provide the needed materials, including chapters from textbooks, journal papers, and review articles, in class or on the course homepage.

Online Reading Materials and Resources

Grading Policy

You may miss at most 3 classes without affecting your grade, unless additional absences are approved by the instructor.
Late assignments will not be accepted and will receive a score of zero.

Collaboration Policy

Lecture Schedule and Slides

Week 1 (Aug 25, 30): Introduction to molecular biology

Week 2 (Sept 1, 8): Sequence alignment

Week 3-4 (Sept 13, 15, 20): String matching

Week 4-6 (Sept 22, 27, 29, Oct 4, 6): Next-generation sequencing

Week 7 (Oct 11, 13): DNA motif finding

Nov 8 - 22: Gene expression data analysis

Nov 24 - Dec 1: Biological networks

Tentative lecture topics

Topics | Number of weeks
Introduction to molecular biology and bioinformatics | 1
Basic sequence analysis algorithms | 1
Overview of next-generation sequencing (NGS) and other high-throughput technology | 1
Algorithms for NGS sequence alignment | 1
Algorithms for NGS sequence assembly | 1
ChIP-seq data analysis (peak detection, motif finding) | 1
Survey of data mining methods - clustering, classification | 2
Gene expression data analysis (Microarray and RNA-seq), Gene ontology and Gene set enrichment analysis | 2
Disease classification and drug discovery | 1
Algorithms for biological network analysis | 2
TBD | 1