CMPUT 695: Principles of Knowledge Discovery in Data

Assignment 2

CMPUT 695 (Fall 2004)

Due Date: November 5th, 2004 at 17:00 (hard deadline)
Percentage overall grade: 5%
Penalties: 20% off per day for late assignments
Maximum Marks: 10

This assignment is about evaluating some data mining tools that are available off the shelf. Some are commercial tools; others are publicly or freely available.

In this assignment, assume that you work in a company that needs to do some data mining on its own customer data. The decision makers in your company need to analyze the data and discover patterns related to association analysis, clustering analysis, classification, and other patterns. Unfortunately, there is no real study comparing the existing tools. One older report, dating from 1998, can be found here: http://www.datamininglab.com/pubs/kdd98_elder_abbott_nopics_bw.pdf. You are asked to evaluate a particular tool and write a brief report that summarizes the functionality of the tool, its positive and negative aspects, its user interface, the flexibility of its API, etc.
Basically, your report and the reports of your colleagues should help the decision makers in your company decide which tool to acquire, or whether to outsource the data mining tasks elsewhere.

Students should work in teams of two, with each team evaluating one tool. The tools and team assignments are as follows:
Tool             Resource                                                               Students               Report
Weka             WEKA http://www.cs.waikato.ac.nz/ml/weka/index.html                    Leila & Nasimeh        weka.pdf
Xelopes          XELOPES http://www.prudsys.com/Produkte/Algorithmen/Xelopes/           Ben & Paul             xelopes.pdf
MCubiX           http://www.diagnos.ca/en/index1.html
ADaM             ADaM http://datamining.itsc.uah.edu/adam/                              Sheldon & Dean         adam.pdf
Gnome Data Mine  TOGAWARE http://www.togaware.com/datamining/gdatamine/                 Junfeng & Haobin       GDMT.pdf
Tanagra          TANAGRA http://chirouble.univ-lyon2.fr/~ricco/tanagra/en/tanagra.html  Jonathan & Jessica     tanagra.pdf
PolyAnalyst      MEGAPUTER http://www.megaputer.com/products/pa/index.php3              Hongqin Fan & Yunping  polyanalist.pdf
XLMiner          XLMINER http://www.xlminer.net/                                        Rafal & Wojciech       xlminer.pdf

IMPORTANT:

You may want to try other tools for comparison. Doing so may help you identify the positive and negative points of the tool you are evaluating.

Deliverables:

This assignment is to be submitted via email. Send one PDF file containing your report. A PostScript or MS Word file is also acceptable, but PDF is preferred. If you take snapshots of the tool, include them in the report. The report will be put online for all students to access.

The report should conclude with two points: (1) whether the tool is worth acquiring; (2) in your opinion, the ideal features and characteristics a data mining tool should have.



Posted on Oct 19. Last updated (Oct. 19, 2004 - 12:00)