CS 6243 Machine Learning

News and Announcements

4/16: A description of the final project is available here. Some ideas / suggestions. Download data.

3/5: A draft description of the midterm project is available here.

1/18: Welcome! Please take some time to complete a survey form and give it to me in class. Thank you!

Overview | Prerequisite | Time and Location | Instructor | Textbooks and Resources | Policies | Slides | Lecture Schedule | Assignment

Overview

Description of course (from catalog): "This course studies machine learning techniques in the area of artificial intelligence. Topics include inductive learning, unsupervised learning, speedup learning, and computational learning theory."

* For more accurate information about the material that will be covered in the course, see the tentative list of topics.

In particular, we will spend considerable time on applications of machine learning algorithms in biology, and you can expect to read many papers in biology-related areas.

There may be some overlap between this course and the Bioinformatics and Advanced Bioinformatics courses that I offered in previous years, although the emphasis in this course will be mainly on algorithmic issues and some practical issues.

Prerequisite

CS 5233 (Artificial Intelligence) or CS 5633 (Analysis of Algorithms). You also need to be fluent in Java programming, be prepared to learn some MATLAB, and have some experience working in a Linux/Unix environment.

Time and Location

We meet in room SB 3.02.02. Lectures are Monday and Wednesday, 7:00-8:15 PM.

Instructor

Dr. Jianhua Ruan
Office location: SB 4.01.48
Office hours: Wednesday 1:00-2:00pm, and by appointment
Email: jruan@cs.utsa.edu
Phone: (210) 458-6819

Textbooks and Resources

Data Mining: Practical Machine Learning Tools and Techniques (3rd edition), by Ian H. Witten, Eibe Frank, and Mark A. Hall. Morgan Kaufmann. January 2011.

Throughout the semester we will also use other materials, including chapters from other textbooks, journal papers, and review articles, which will be available in class or on the course homepage.

Software

We will be using Weka, a collection of machine learning algorithms implemented in Java.

Grading Policy

You may have at most two unexcused absences without affecting your final grade; any further absences must be approved by the instructor.

In general, late assignments will not be accepted and will receive a score of zero.

Collaboration Policy

Slides

Lecture 1 (1/18): Introduction

Lectures 2 and 3 (1/23, 1/25): Input, output, WEKA

Lectures 4 and 5 (1/30, 2/1): 1R, decision tree

Lectures 6 and 7 (2/6, 2/8): Instance-based learning

Lectures 8, 9 (2/13, 2/15): Performance evaluation

Lectures 10 - 12 (2/20, 2/23, 2/27): Linear models and SVM

Lectures 13, 14 (2/29, 3/5): Ensemble Learning

Lectures 15, 17 (3/7, 3/21): Naive Bayes Classifier

Lectures 18, 19 (3/26, 3/28): Unsupervised learning and semi-supervised learning

Lectures 20 - 23 (4/2 - 4/11): Hidden Markov Models

Lecture 24 (4/16): DNA motif finding, Final Project

Lectures 25, 26 (4/18, 4/23): Rule-based Learning

Lectures 27, 28 (4/18, 4/23): Feature selection and data transformation

Tentative list of topics

Topic | Readings | Number of lectures
Introduction, admin stuff, input and output | Chap 1-3 | 3
Decision trees | Chap 4.1, 4.3, 6.1 | 2
Instance-based learning | Chap 4.7, 6.5 | 2
Evaluating hypotheses | Chap 5 | 2
Linear models and SVM | Chap 4, 6, handouts | 3
Data transformation and feature selection | Chap 7 | 2
Ensemble learning | Chap 8 | 2
Unsupervised learning | Chap 4, 6, handouts | 4
Association rule learning | Chap 4, 6, handouts | 2
Statistical learning | Chap 4, 6, handouts | 3
Hidden Markov models | Handouts | 4