CMPUT 399: Computational Science and Clusters

Paul Lu
Department of Computing Science
January 2008

Assignment 2: Tools and Systems for Clusters

Due Date: Tuesday, February 26, 2008, in class

Description:

This assignment is to be done individually and is worth 12.5% of your mark in the course.

The purpose of the assignment is for you to learn more about one tool or piece of systems software that is beneficial for the use or management of clusters in high-performance computing (HPC). Whereas the last assignment was about applications, this one is about tools.

There are three parts to this assignment: reading, summarizing, and presenting.

  1. Reading: Choose any significant tool or systems software used for clusters. Some suggestions are given below, but you can discuss other possibilities (in advance) with the instructor.

    Find between 1 and 3 substantial articles on your chosen topic. The best articles include research papers (from academic conferences and journals; see list at end of Course Outline) and articles from science-oriented magazines such as Scientific American and IEEE Computer.

    Read and understand your articles. NOTE: You will have to hand in copies of your articles to the instructor.

  2. Summarizing: Write a 3-page report (1 inch margins, at least 12 point font, single-spaced) on the topic and articles that you have chosen. Use your own words and analysis to synthesize the ideas from the article(s) so as to answer 3 main questions:

    1. What is the main technical problem (of using a cluster) being solved? Why is this technical problem important to cluster computing and/or computational science?
    2. What is the main tool, software system, or algorithmic technique used to solve the problem (e.g., security system, systems management, load balancer/scheduling)?
    3. In what way is the problem specifically related to clusters, or does the problem apply equally to other kinds of computer architectures? What other techniques or systems exist to solve the same technical problem? Why are those techniques either better or worse than the one you have chosen to discuss?

    Be sure to use proper citation and referencing techniques (any academic style of citation is acceptable). Be aware of the Code of Student Behaviour and how it applies to referencing source material.

  3. Presenting: Based on your report, prepare a presentation of approximately 5 to 10 slides (e.g., PowerPoint, Keynote, PDF, or any other appropriate format). Your presentation should also answer (obviously, in abbreviated form) the same 3 main questions as your report.

    You will be presenting your slides in class, at a time to be arranged.

What to hand in:

On the due date, hand in both your report and your presentation, as a paper copy and as electronic files sent by email.

Also, hand in copies of the articles that you used.

Marking:

The assignment is worth 12.5% of your final mark in the course. This is an individual assignment. Do not work in groups. You may discuss the assignment with other students, but the report and presentation must be your individual work.

70% of the marks will be for the report. 30% of the marks will be for the presentation slides. No marks are allocated for the class presentation, but the class presentation will still be required.

Suggestions and Hints:

  1. Web pages found through Google, and Wikipedia, are good places to start searching for information. However, they are not reliable enough by themselves, so seek out actual papers or articles.
  2. Suggestions for topics: More to come...
    1. Batch schedulers, such as OpenPBS, Sun Grid Engine (SGE)
    2. Cluster monitoring systems, such as Ganglia
    3. Middleware, such as the Globus Toolkit
    4. Systems such as Amazon's EC2 and S3