UofA Computing Science, Semester 2002-3

Dimensionality Reduction for Text Categorization and Grouping
(Independent Study)
Instructor: Osmar R. Zaïane

OBJECTIVE/DESCRIPTION:

This course will provide students with (1) an overview of advanced document categorization and clustering, (2) a thorough study of classification evaluation when text categories are unevenly distributed, and (3) a comprehensive comparison of dimensionality reduction techniques used specifically for text. Students will receive enough background to develop a term project in which they implement the different dimensionality reduction techniques and study their effect on categorization algorithms, in terms of accuracy and the other evaluation measures investigated in the course. The goal is to cover the most recent research published in the field and to gain enough knowledge of dimensionality reduction to write a survey paper that summarizes a taxonomy of recent techniques and includes a comparative study in the context of text categorization and clustering.

The course will consist mainly of a series of discussions on the topics listed below, which serve as a general guideline. Throughout the course, recent relevant research papers will also be read and discussed.

TOPICS:

The course will cover the following topics:

GRADING:

Annotated Bibliography (20%)
Discussions (20%)
Implementation and Testing (20%)
Final Term Paper (40%)

TEXTBOOK and REFERENCES:


Distributed: July, 2002