CS 5263 Final project page

Projects should be done individually. Group projects combining people from different backgrounds are acceptable and encouraged, but need to be discussed with me in advance. Group projects will be evaluated differently from individual projects, and your final report must clearly describe the contribution of each member of the group.

Timeline (all submissions via Blackboard):

To look for possible topics

         CS 5263 Final Project Ideas

         List of bioinformatics topics in Wikipedia

To search for papers

To browse for papers from bioinformatics-related journals

To browse for papers from bioinformatics-related conferences

Advice on how to read and present a paper (adapted from this web page)

When you present a paper in this course (or elsewhere), your goal is to get your audience to appreciate the contribution that the paper makes to scientific knowledge. Generally, you need to explain the following three things about the paper to do that. It often makes sense to present each point in order, but it is more important to focus on the essence of the contribution than it is to follow any particular format.

  1. What is the problem the paper is trying to address? You should both define the problem and explain its broader significance. In addressing this question, consider things like: What is the biological nature of the problem? Is it reconstructing evolutionary history, or identifying genes relevant to the prognosis or treatment of a disease? Why is that important? What does the paper contribute to our understanding of the biology? Then you may want to talk about the computational nature of the problem. How was the biological problem reformulated as a computational problem? Is that the main contribution (it often is)? Are there aspects of the computational problem that are particularly interesting? Is a previous (or obvious) computational formulation too slow or not accurate enough? If so, what kind of improvement in the computational approach would be important, and why? Or is this a comparison of alternative approaches? If so, why were those approaches selected and not others? How are they to be compared?
  2. What were the methods used in the paper? Often, this is where you have to spend the most time in your presentation, since new methods are the essence of most bioinformatics publications. You want to carefully explain exactly what was done. It may require a very close reading of the paper to figure this out; often important facts are buried in seeming asides. When you are working on this part of your presentation, imagine you were trying to replicate the work. What would you need to know?
  3. What were the results reported? Ideally, it would be straightforward to compare the results presented with the problem statement, but it is not always that easy. Discuss the evaluation method(s) as well as the results. It is often interesting to consider how the authors chose to evaluate their contribution: Was it fair? Was it indicative of "real world" performance?

Try to identify where the main contribution of the paper is. For example, some papers define interesting new problems, but apply relatively straightforward methods to address them. For a paper like that, focus on work on related problems, and how the new problem statement differs from them. Are there better approaches developed for related problems that can be applied to the new problem? Some papers present a new approach to a well-studied problem. For those papers, carefully compare the new method to other approaches people have taken to the problem. In that situation, the choice of evaluation method (used to compare the new approach to existing methods) is also an important place to focus.

Look for unstated assumptions made in the paper, and try to make them explicit. For example, does a paper on finding cis-regulatory elements from sequence and gene expression data assume that the elements are independent of each other? That the position of the element with respect to the start of transcription is unimportant? Reading alternative approaches to the same problem will make it easier for you to identify these assumptions.

After you have communicated these facts about the paper, you can discuss the aspects you thought were most important or interesting. Is this a method that belongs in your "bioinformatics toolkit"? Can it be applied to related problems straightforwardly, or is it highly specialized? Was there something particularly impressive about the method, the evaluation, the translation of the problem into computational terms, etc.?

In general, bioinformatics papers have an "engineering" flavor that fits well into this problem / method / results paradigm. However, some papers have more of a "basic science" flavor, where a particular claim is being made, and evidence is presented to support that claim. Providing evidence for a claim is closely related to testing a particular hypothesis. If you feel that this better fits the paper you are presenting, then rather than using the problem / method / results paradigm, you can explain it in terms of claims and evidence.