**CS5263 & CS4233 (Bioinformatics)**

4/25: HW3A description. Data file. MATLAB code output. MATLAB code.

4/13: Example test questions.

3/20: MATLAB Tutorial

2/23: Midterm project is here. Electronic submission in Blackboard is due at 11:59 pm, ~~March 21~~ (extended to Friday, March 24), 2017. Notes.

2/16: Fixed typos in HW2 problems 1(c), 1(d), and 4(g), and revised 2(b) and 2(d) to give more accurate instructions.

2/12: HW2 due on Thursday Feb 23. Download DNA sequence file for problem 1E (optional).

1/22: HW1 due on Tuesday Feb 7. Download DNA sequence file for problem 3, text file 1 and text file 2 for problem 4. Solution

1/10: Welcome to CS4233 & CS5263 (Bioinformatics)! Please take some time to complete a background survey.

This course is a survey of algorithms and methods in bioinformatics and computational biology, approached from a (more or less) computational viewpoint. Topics covered include: fundamental biology, sequence comparison (dynamic programming), motif finding (combinatorial algorithms, stochastic heuristic search algorithms, suffix trees), next-generation sequencing (suffix arrays, Burrows-Wheeler transform), gene expression data analysis (statistics, data mining), and gene network/pathway analysis (graph algorithms).

This course is primarily designed for graduate students and advanced
undergraduate students in the Computer Science department. A fundamental
understanding of data structures and algorithms, strong programming experience
in at least one programming language, and some knowledge of probability
and statistics are expected. Some prior exposure to molecular biology is
preferred, but not required, as we will introduce basic biological concepts and
terms along the way. **Students without a background in algorithms or
statistics should consult the instructor prior to taking the course.**

We meet in room MH 3.02.30. Lectures are Tuesday and Thursday, 2:30-3:45 PM.

Instructor: Dr. Jianhua Ruan

Office location: NPB 3.318

Office hours: Tuesday 9-11am or by appointment

Email: jianhua.ruan 'at' utsa 'dot' edu

Phone: (210) 458-6819

Teaching Assistant: TBD

Office location: TBD

Office hours: TBD

Email: TBD

There is NO textbook required for this course. Parts of the course are based on the following texts:

- *Bioinformatics: A Computing Perspective* by Gopal, Haake, Jones and Tymann (GHJT)
- *Basics of Bioinformatics - Lecture Notes of the Graduate Summer School on Bioinformatics of China*, edited by Jiang, Zhang and Zhang (JZZ)
- Additional Readings and Resources

- 10% Attendance and participation
- 40% Homework (~biweekly) and in-class exercises
- 20% Midterm exam / project
- 30% Final exam / project

Late assignments will not be accepted and will receive a score of zero unless approved by the instructor.

**Part I: Introduction to bioinformatics and molecular biology**

- Slides
- Slides: NGS
- Required reading: JZZ Ch 1
- Optional reading:
  - GHJT Ch 2, 3
  - Molecular biology for computer scientists, in *Artificial intelligence and molecular biology*, Lawrence Hunter.

- Slides: Pairwise sequence alignment
- Slides: Multiple sequence alignment
- Required reading: GHJT Ch 5.4, 5.7-5.9
- Optional reading:
  - GHJT Ch 5.5-5.6
  - What is dynamic programming? (if you have no background in dynamic programming)
  - Linear-space alignment algorithm
  - Alignment statistics
  - Statistics and Probability Primer for Computational Biologist

- Slides
- Reading:
  - Statistics and Probability Primer for Computational Biologist

- Additional reading:

- Slides set 1
- Slides set 2
- Additional reading:
  - Exact string matching
  - Weeder algorithm
  - Next-generation DNA sequencing, Shendure and Ji, Nat Biotechnol. 2008.
  - Sense from sequence reads: methods for alignment and assembly, Flicek & Birney, Nature Methods, 2009
  - Ultrafast and memory-efficient alignment of short DNA sequences to the human genome, Langmead et al., Genome Biology 2009, 10:R25

- Slides
- Reading:

- Slides
- Additional reading material:

**Part VIII: Biological networks**

- Slides

| Topics | Number of weeks |
| --- | --- |
| Introduction to molecular biology | 1 |
| Sequence alignment | 2 |
| String matching algorithms and applications | 2 |
| Motif finding | 1 |
| RNA structure prediction | 1 |
| Transcriptomic data analysis and data mining | 4 |
| Next-generation sequencing | 2 |
| Biological networks | 1 |