Department of Computing Science
CMPUT 681/497: Parallel and Distributed Systems
September 2009

Lecture: A1 - MWF 1 PM, Room CSC B-41
Instructor: Paul Lu, Associate Professor, Athabasca Hall 3-40, 492-7760
E-mail: paullu <at> cs.ualberta.ca
Except for emergencies, please use email instead of phone calls.
Office Contact Times: Mondays and Thursdays, 4:00 to 4:30 p.m.
Instructor's Home Page: http://www.cs.ualberta.ca/~paullu

Course's Home Page (i.e., this document): http://www.cs.ualberta.ca/~paullu/C681
Newsgroup: ualberta.courses.cmput.681 (Please read it and use it regularly.)

On-Line Lecture Notes

Podcasts (user=c681, passwd=podcasts)

Purpose

This course provides a graduate-level introduction to parallel programming and parallel and distributed systems. Both shared-memory parallel computers and distributed-memory multicomputers (e.g., clusters) will be studied. Aspects of the practice and research issues in parallelism will be covered.

For students registered in CMPUT 497, the lectures are the same as for CMPUT 681, and you will write the same take-home exam. However, I will assign you a defined project, rather than the student-chosen project that the CMPUT 681 students will do; you can also discuss your project with me, if you prefer. Finally, I will assign grades separately for the 497 and the 681 students; undergraduate and graduate students are not in "competition" with each other when it comes to the final grade.

Prerequisites

  1. CMPUT 201 (Practical Programming Methodology) or equivalent.
  2. CMPUT 379 (Operating System Concepts) or equivalent.
NOTE: There will be a substantial programming component to this course. Fluency in C or C++ and Unix will be important. Also, familiarity with concurrency and basic operating system concepts is assumed.

If you have any concerns about your prerequisites for this course, please see the Instructor as soon as possible.

Course Outline

We will cover fundamental and current research topics in the design, implementation, and evaluation of parallel and distributed systems. Our focus will be on the systems software and parallel programming systems, but some hardware issues will also be covered. Topics will include parallel algorithms, parallelization strategies, distributed shared memory (and related ideas), system area networks (SAN), and operating system support.

Approximately 2/3 of the course will be on the practical and hands-on aspects of parallel programming. The other 1/3 of the course will be on current research issues. A number of research papers will be discussed and critically evaluated.

Some of the topics (in rough order) to be covered include:

  1. Introduction
    1. Why use parallel and distributed systems? Why not use them?
    2. Speedup and Amdahl's Law (a short worked example follows this outline)
    3. Hardware architectures: multiprocessors (shared memory), networks of workstations (distributed memory), clusters (latest variation)
    4. Software architectures: threads and shared memory, processes and message passing, distributed shared memory (DSM), distributed shared data (DSD)
    5. Possible research and project topics
  2. Parallel Algorithms
    1. Concurrency and synchronization (review)
    2. Data and work partitioning
    3. Common parallelization strategies
    4. Granularity
    5. Load balancing
    6. Examples: parallel search, parallel sorting, etc.
  3. Shared-Memory Programming: Threads
    1. Pthreads (see the minimal sketch after this outline)
    2. Locks and semaphores
  4. Distributed-Memory Programming: Message Passing and MapReduce
    1. MPI (see the minimal sketch after this outline)
    2. Google's MapReduce
    3. Hadoop
  5. Other Parallel Programming Systems
    1. TreadMarks: Distributed shared memory
    2. Aurora: Scoped behaviour and abstract data types
    3. Enterprise: Process templates
  6. Research Topics
    1. Protocols for DSM systems
    2. Impact of network protocols (TCP/IP, UDP/IP, bulk-data transfer, etc.)
    3. System area networks (SAN) (e.g., Myrinet)
    4. Operating system issues
    5. More to come
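
As a brief preview of topic 1.2 above: Amdahl's Law says that if a fraction f of a program's running time is inherently serial, then the speedup on p processors is bounded by

    S(p) = \frac{1}{f + (1 - f)/p} \le \frac{1}{f}

For example (numbers purely illustrative), with f = 0.1 and p = 16, S(16) = 1/(0.1 + 0.9/16) = 6.4, well short of the ideal speedup of 16. Reducing the serial fraction often matters more than adding processors.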
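
A minimal Pthreads sketch for topic 3 (an illustrative example written for this outline, not course-provided code): two threads increment a shared counter, with a mutex serializing the updates.

    /* Compile with: gcc -std=c99 -pthread counter.c -o counter */
    #include <pthread.h>
    #include <stdio.h>

    static long counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        (void)arg;  /* unused */
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);   /* protect the shared counter */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld\n", counter); /* 200000 with the lock */
        return 0;
    }

Removing the lock/unlock pair introduces a classic race condition, which is a good first experiment on any of the shared-memory machines.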
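
Similarly, a minimal MPI sketch for topic 4 (again illustrative, not course-provided code): every process reports its rank, which is the usual first step before adding explicit sends and receives.

    /* Compile with: mpicc hello.c -o hello
     * Run with:     mpirun -np 4 ./hello */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's id, 0..size-1 */
        MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of processes */
        printf("Hello from rank %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }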

Marking Scheme

Programming assignment (20%): Shared-memory programming. Released on Oct. 7; due Wednesday, Oct. 28, 2009, in class.
Take-home exam (30%): Short answers, based on research papers and class discussion.
Project with report (50%): For 681, the student's choice of project, with the guidance of the instructor. Groups of 4 to 6 students; start forming groups now. Individual projects are possible only if you discuss it with me beforehand. For 497, the instructor will provide a project.

Equipment for Programming Assignments

We will be using various departmental and university computers for the programming assignments and the project.

Textbooks

  1. A. Grama, G. Karypis, V. Kumar and A. Gupta, Introduction to Parallel Computing, 2/e, Addison Wesley, 2003. (OPTIONAL).
  2. B. Wilkinson and M. Allen, Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers, 2/e, Prentice Hall, 2005. (OPTIONAL). Web site for the textbook.
  3. W.R. Stevens and S.A. Rago, Advanced Programming in the Unix Environment, 2/e, Addison Wesley, 2005. (OPTIONAL). Alternatively, any equivalent book.
  4. Various research papers.

Useful Resources

  1. Shared-Memory Thread Programming:
    1. LLNL Pthread Programming Tutorial (suggested by Bob Beck)
    2. SGI's Irix 6.5 Programming Manual (Local copy of PDF; Chp. 12: Process-level, incl. sproc() (gzipped, PostScript); Chp. 13: Pthreads (gzipped, PostScript))
    3. Stupid Barrier Tricks (SBT)
    4. All Irix documentation can be found here.
    5. For caslan/Aurora/Borealis Only: SGI's Irix 6.5 Debugger User's Guide ( Local copy of PDF )
    6. For caslan/Aurora/Borealis Only: SGI's Irix 6.5 SpeedShop User's Guide ( Local copy of PDF )
    7. For caslan/Aurora/Borealis Only: SGI's Irix 6.5 Origin2000 and Onyx2 Performance Tuning and Optimization Guide ( Local copy of PDF )
    8. New threads FAQ and Old comp.programming.threads FAQ
    9. Mark Hay's POSIX Thread Tutorial (1997) ( Local copy of HTML )
    10. Wagner and Towsley's Pthreads tutorial (1995)
    11. Adve and Gharachorloo's Tutorial on Memory Consistency Models (1-per-page, gzipped PS)
  2. Distributed-Memory Message-Passing Programming:
    1. MPI Man pages (Web format)
    2. MPICH User's Guide
    3. MPICH Home Page
    4. Book: MPI: The Complete Reference
    5. For caslan/Aurora/Borealis Only: SGI's Irix 6.5 Message Passing Toolkit: MPI Programmer's Manual ( Local copy of PDF )
  3. Performance Analysis and Tuning (General References):
    1. Supercomputing 2000 Tutorial by Gerndt, Mohr, and Miller. Excellent! ( local copy of PDF)
    2. Links from SC2000 tutorial ( local copy of HTML)
  4. Glossary of Terms ( Local copy )
  5. Greg Wilson's Parallel Computing Timeline (1955 to 1993) ( Local copy )
  6. IEEE Computer's ParaScope
  7. Java and IEEE 754 Floating Point (in a phrase: not quite, not yet):
    1. Dr. Kahan's Home Page (Turing Award Winner 1989 for the IEEE 754 Standard)
    2. Talk: How JAVA's Floating-Point Hurts Everyone Everywhere (1998/2001) (PDF) (local copy)
    3. Joseph Darcy, first Java Floating-Point Czar and a former student of Dr. Kahan
    4. Borneo Project (Darcy & Kahan)
    5. Talk: What Everybody Using the Java(TM) Programming Language Should Know About Floating-Point Arithmetic (JavaOne 2001) (PDF) (local copy)
    6. Talk: What Some People Using the Java(TM) Programming Language Want to Know About Floating-Point Arithmetic (JavaOne 2001) (PDF) (local copy)
    7. From JavaSpec 1.0: Java and IEEE FP
    8. General IEEE 754 Information
  8. Conferences:
    1. ACM SIGPLAN Principles and Practice of Parallel Programming (PPOPP)
    2. International Parallel and Distributed Processing Symposium (IPDPS)
    3. Supercomputing (SC)
    4. International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA)
    5. International Conference on High Performance Computing (HiPC)
  9. Journals:
    1. U of Alberta Library's Electronic Journals in Computing Science
    2. IEEE Transactions on Parallel and Distributed Systems
    3. Journal of Parallel and Distributed Computing
    4. Concurrency and Computation: Practice and Experience
    5. Parallel Computing