Assignment 4
CMPUT 695 (Fall 2004)
Due Date: See table below (one day before the
presentation date, at 17:00, by e-mail)
Percentage overall grade: 5%
Penalties: 20% off per day for late assignments
Maximum Marks: 10
One of the major activities in this course is that each student is
expected to read and present one paper from the provided research
literature. The other students still have to read the paper to better
understand and follow during the presentation and hopefully have a
discussion after the presentation. Only the designated presenter,
however, is to prepare slides and a report outlining a review of the
paper.
As a fourth assignment, students are required to prepare a report with
the review of an additional paper that they are not assigned to
present in class. The review is to be handed in the day before the
presentation of the paper in question.
The review should be about 2 pages (maximum 5) and should be written as if you were reviewing a journal article or a paper submitted to a conference program committee.
The review should contain at least these sections:
1-Brief summary of the main contributions of the paper
2-Elaboration on the positive aspects presented in the paper
3-Elaboration on the negative aspects presented in the paper
4-Comments on how to improve the ideas/issues/experiments presented in the paper.
The list of papers, the assigned students and the deadlines for the
assignment are as follows:
Deadline | Paper | Student |
October 27, 2004 | Closet+: Searching for the best strategies for mining frequent closed itemsets, J. Wang, J. Han, and J. Pei, Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Washington, DC, USA, 2003. | Haobin Li |
October 27, 2004 | CHARM: An Efficient Algorithm for Closed Itemset Mining, M. Zaki and C-J Hsiao, SIAM SDM 2002, Arlington, VA, April 2002. | Jon Klippenstein |
November 8, 2004 | Efficiently mining long patterns from databases, R. J. Bayardo, ACM SIGMOD International Conference on Management of Data, 1998. | Leila Homaeian |
November 8, 2004 | Mafia: A maximal frequent itemset algorithm for transactional databases, D. Burdick, M. Calimlim, and J. Gehrke, 17th International Conference on Data Engineering (ICDE), April 2001. See also: MAFIA: A Performance Study of Mining Maximal Frequent Itemsets, Doug Burdick, Manuel Calimlim, Jason Flannick, Johannes Gehrke, and Tomi Yiu, Workshop on Frequent Itemset Mining Implementations (FIMI'03), Melbourne, Florida, November 2003. | Hongqin Fan |
November 15, 2004 | Dualminer: A dual-pruning algorithm for itemsets with constraints, C. Bucila, J. Gehrke, D. Kifer, and W. White, Data Mining and Knowledge Discovery, Vol. 7, Issue 4, July 2003, pages 241-272. | Jessica Enright |
November 15, 2004 | Constrained Frequent Pattern Mining: A Pattern-Growth View, J. Han and J. Pei, ACM SIGKDD Explorations (Special Issue on Constraints in Data Mining), June 2002. | John Sheldon |
November 17, 2004 | Mining Sequential Patterns, R. Agrawal and R. Srikant, International Conference on Data Engineering (ICDE), 1995. | Paul Nalos |
November 17, 2004 | PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth, J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M. Hsu, 17th International Conference on Data Engineering (ICDE), April 2001. | Rafal Rak |
November 22, 2004 | A Robust Outlier Detection Scheme in Large Data Sets, J. Tang, Z. Chen, A. Fu, and D. Cheung, the Sixth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), Taipei, 6-8 May, 2002. | Wojciech Stach |
November 22, 2004 | LOF: Identifying Density-Based Local Outliers, M. Breunig, H.-P. Kriegel, R. Ng, and J. Sander, ACM SIGMOD Int. Conf. on Management of Data, 2000. | Yunping Wang |
November 24, 2004 | Mining Top-n Local Outliers in Large Databases, W. Jin, K.H. Tung, and J. Han, ACM SIGKDD 2001, San Jose, California, Aug. 2001. | Ben Chu |
November 25, 2004 | Rainforest - a framework for fast decision tree construction of large datasets, J. Gehrke, R. Ramakrishnan, and V. Ganti, Proc. Very Large DataBases (VLDB), 1998. | Dean Cheng |
November 29, 2004 | Data Bubbles for Non-Vector Data: Speeding-up Hierarchical Clustering in Arbitrary Metric Spaces, J. Zhou and J. Sander, Conf. on Very Large DataBases (VLDB), 2003. | Junfeng Wu |
November 29, 2004 | Privacy-Preserving Data Mining, R. Agrawal and R. Srikant, ACM SIGMOD 2000, Dallas, May 2000. | Nasimeh Asgarian |
Deliverables:
This assignment is to be submitted via e-mail. Send one PDF file
containing your report. A PostScript or MS Word file is also
acceptable, but PDF is preferred.
The report should be about 2 pages long (maximum 5).