Department of Computing Science
CMPUT 497: Cluster Challenge and Computational Science
January 2009

Lecture: B3 - T,Th 11:00 AM to 12:20 PM, Room CSC 215 and GSB 315
Instructor: Paul Lu, Associate Professor, Athabasca Hall 3-40, 492-7760
E-mail: paullu <at> cs.ualberta.ca
Except for emergencies, please use email instead of phone calls.
Office Contact Times: Wednesday, Friday 3:30 to 4 PM, or by appointment.
Instructor's Home Page: http://www.cs.ualberta.ca/~paullu

Course's Home Page (i.e., this document): http://www.cs.ualberta.ca/~paullu/C497

Publicly available Papers, notes, other downloads

Course-only Papers and notes
Contact professor for password.

Purpose

This course is an applied systems course on the configuration and use of clusters (i.e., networked collections of compute servers) for computational science. As a platform, clusters are becoming more important in computing science, computational science, and industry. For example, clusters are used to handle the large amounts of data in bioinformatics, to simulate ocean currents and climate, and to calculate risks associated with complicated financial transactions.

The course will cover the hardware, software, and conceptual principles of cluster computing.

Prerequisites

Any 200-level CMPUT course, or permission of the instructor.

Course Outline

Some of the topics to be covered (not necessarily in the following order) include:

  1. Basics of parallelism and Amdahl's Law
  2. Cluster hardware architecture
    1. commodity processors, multi-core
    2. interconnection networks
  3. Basics of systems administration for clusters
    1. distributions: OSCAR, NPACI ROCKS, etc.
    2. batch schedulers
    3. file system issues
    4. power issues
  4. Performance metrics
    1. speed-up
    2. response time
    3. throughput
  5. Benchmarks
    1. HPCC
  6. Scientific computation
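
As a brief illustration of two of the listed topics (Amdahl's Law and speed-up), here is a minimal Python sketch. It is not course material: the function names `amdahl_speedup` and `parallel_efficiency` are hypothetical, and the model assumes a fixed problem size in which a fraction of the work parallelizes perfectly and the rest is strictly serial.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Upper bound on speed-up under Amdahl's Law.

    parallel_fraction: share of the run time that can be parallelized (0 to 1).
    processors: number of processors applied to the parallel part (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # Serial part is unchanged; parallel part is divided among the processors.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def parallel_efficiency(parallel_fraction: float, processors: int) -> float:
    """Speed-up per processor (1.0 would mean perfect scaling)."""
    return amdahl_speedup(parallel_fraction, processors) / processors


if __name__ == "__main__":
    # Illustrative values only: a program that is 90% parallelizable.
    for p in (1, 2, 10, 100, 1000):
        s = amdahl_speedup(0.9, p)
        e = parallel_efficiency(0.9, p)
        print(f"{p:5d} processors: speed-up {s:6.2f}, efficiency {e:5.2f}")
```

Under these assumptions, a program that is 90% parallelizable cannot exceed a speed-up of 10 no matter how many processors are added, because the serial 10% stays fixed.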

Marking Scheme

Assignment 1 (12.5%): Due Thursday, March 12, in class, plus demo.

Assignment 2 (12.5%): Due Thursday, April 2, in class. No demo.

Midterm exam (25%): Thursday, February 26, in class. Short answers, based on research papers and class discussion.

Project with report (50%): Due Tuesday, April 7, in class. Student's choice, with guidance of the instructor.

Readings and Textbooks

  1. Various research papers and resources, provided by the instructor (THE ONLY REQUIRED READINGS).

  2. B. Wilkinson and M. Allen, Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers, 2/e, Prentice Hall, 2005. (OPTIONAL).

    Web site for textbook.
  3. W.R. Stevens and S.A. Rago, Advanced Programming in the UNIX Environment, 2nd Edition, Addison-Wesley, 2005. (OPTIONAL). Alternatively, any equivalent book.

Useful Resources

  1. Conferences:
    1. ACM SIGPLAN Principles and Practice of Parallel Programming (PPOPP)
    2. International Parallel and Distributed Processing Symposium (IPDPS)
    3. Supercomputing (SC)
    4. International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA)
    5. International Conference on High Performance Computing (HiPC)
  2. Journals:
    1. U of Alberta Library's Electronic Journals in Computing Science
    2. IEEE Transactions on Parallel and Distributed Systems
    3. Journal of Parallel and Distributed Computing
    4. Concurrency and Computation: Practice and Experience
    5. Parallel Computing