CMPUT 379: Operating System Concepts
Department of Computing Science
January 2005

Assignment #2: A More Sophisticated Directory Nanny
(Unix Pipes, Message Passing, and Signals)

Due Date: Wednesday, March 2, 2005
On-line submission using the astep program before 9 P.M. on due date.

What if my Assignment #1 was incomplete?

If you were unable to complete Assignment #1 and would like a solution to build upon for this assignment, you may purchase a solution for a penalty of 20% (i.e., 20 marks out of 100) deducted from your mark for this assignment. You cannot arbitrarily adopt another student's (or anyone else's) code; that is considered plagiarism. The purchased solution is not guaranteed to be bug free, but we will provide a solution of reasonable quality. Please contact your instructor if you would like to do this. Of course, read this assignment description first.

Cleaning Up Your Processes

When using fork() (and related functions) for the first time, it is easy to have bugs that leave processes on the system, even after you log out of the workstation. It is your responsibility to clean up (i.e., kill) extraneous processes from your workstation before you log out. Learn how to use the ps and kill (and related) commands.

Marks will be deducted if you leave processes on a workstation after you log out.


Standard Comment About Design Decisions

Although many details about this assignment are given in this description, there are many other design decisions that are left for you to make. In those cases, you should make reasonable design decisions (e.g., that do not contradict what we have said and do not significantly change the purpose of the assignment), document them in your source code, and discuss them in your report. Of course, you may ask questions about this assignment (for example, in the newsgroup) and we may choose to provide more information or provide some clarification. However, the basic requirements of this assignment will not change.

Assignments in this Course:

All three assignments in CMPUT 379 this term will be related to each other. Although it may not be obvious now, you will be using the knowledge (and some code) from this assignment in your future assignments. Therefore, please take the time to understand the concepts and to write clean, readable code.

Also, as a student in this course, it is essential to have a solid understanding of how memory needs to be managed when aids such as Java's garbage collection are not present; operating systems and most systems software do not use garbage collection.

Overview:

In this assignment, you will be extending and improving the dirnanny program from Assignment #1. The main changes are:

  1. Pipes: Parent and child processes communicate using Unix pipes (i.e., IPC).
  2. Messages: Only the parent process produces output. Child processes no longer create a log file. This is related to the use of Unix pipes.
  3. Signals: The parent process must have a signal handler to catch a SIGHUP signal when the configuration file has been changed. The parent process must have a signal handler to catch a SIGINT signal and exit gracefully. The child processes do not have to have signal handlers.

Your program must use fork() to create child processes. The name of your executable program must be dirnanny.
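
As a rough illustration of the pipe-based communication listed above, the following sketch forks one child that reports a count to its parent over a pipe. The function name report_via_pipe and the idea of sending a raw int are assumptions made for this example only; the assignment leaves the actual message format to you.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks one child that reports `removed` (a stand-in
 * for the number of files it deleted) to the parent over a pipe.
 * Returns the value the parent read, or -1 on error. */
int report_via_pipe(int removed)
{
    int fds[2];               /* fds[0] = read end, fds[1] = write end */
    int value = -1;
    pid_t pid;

    assert(removed >= 0);
    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: keep only the write end, send the count, and exit. */
        close(fds[0]);
        if (write(fds[1], &removed, sizeof removed) != (ssize_t)sizeof removed)
            _exit(1);
        close(fds[1]);
        _exit(0);
    }

    /* Parent: keep only the read end. */
    close(fds[1]);
    if (read(fds[0], &value, sizeof value) != (ssize_t)sizeof value)
        value = -1;
    close(fds[0]);
    waitpid(pid, NULL, 0);    /* reap the child so no zombie is left */
    return value;
}
```

Each side closes the pipe end it does not use, and the parent reaps the child with waitpid() so that no process is left behind.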

Input/Output and Behaviour Specification:

As with Assignment #1, the program dirnanny takes exactly one command-line argument that specifies the full pathname to a configuration file. An example of how the program is started from the command line is:

% dirnanny /tmp/dirnn.config

where the contents of file /tmp/dirnn.config might be, for example:

10:00:00
/tmp/onebigcache 00:01:00 lessthan 2M
/tmp/othercache 00:00:20 mostrecent 03:00:00

The format and semantics of the configuration file are the same as in Assignment #1. Note that the above description of dirnanny is taken verbatim from Assignment #1.

Some of the other key aspects of the behaviour/functionality include:

  1. Recall: "Upon terminating, dirnanny reports the number of files removed during the entire run (over all the directories monitored)." With the introduction of Unix pipes, this information can now be passed from child to parent via the pipe.
  2. When a file in a monitored directory is removed, the parent process should output information to a single, global logfile. The child process that actually removes the file sends a message to the parent process, which generates the output.

    The location of the global logfile is determined by the environment variable DIRNANNYLOGS. If the environment variable is not set, then the default location $HOME/.dirnanny/ is used. The logfile from a new run of dirnanny is free to clobber the previously existing logfile. The filename of the logfile is dnnylog.global.

  3. For each directory being monitored, a line of output such as:

    [Sun Feb 6 11:27:10 MST 2005]  Info: Initializing monitoring of directory /tmp/onebigcache.

    is generated.
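
The logfile location rule and the bracketed timestamp could be handled along the following lines. This is only a sketch: the function names build_log_path and format_timestamp are hypothetical, DIRNANNYLOGS is assumed to name a directory, and an empty DIRNANNYLOGS is treated as unset (a design decision you would need to document).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Builds the global logfile path into buf. Returns 0 on success, -1 if
 * the path does not fit or neither DIRNANNYLOGS nor HOME is set.
 * Assumption: DIRNANNYLOGS names a directory. */
int build_log_path(char *buf, size_t len)
{
    const char *dir = getenv("DIRNANNYLOGS");
    int n;

    assert(buf != NULL);
    if (dir != NULL && strlen(dir) > 0) {
        n = snprintf(buf, len, "%s/dnnylog.global", dir);
    } else {
        const char *home = getenv("HOME");
        if (home == NULL)
            return -1;
        n = snprintf(buf, len, "%s/.dirnanny/dnnylog.global", home);
    }
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Formats tm in the style "[Sun Feb 6 11:27:10 MST 2005]".
 * Returns 0 on success, -1 if buf is too small. */
int format_timestamp(char *buf, size_t len, const struct tm *tm)
{
    char day_mon[16];   /* e.g. "Sun Feb" */
    char zone[16];      /* e.g. "MST" */
    int n;

    assert(buf != NULL && tm != NULL);
    if (strftime(day_mon, sizeof day_mon, "%a %b", tm) == 0)
        return -1;
    if (strftime(zone, sizeof zone, "%Z", tm) == 0)
        zone[0] = '\0';   /* no time zone name available */

    /* The day of the month is printed by hand to avoid padding. */
    n = snprintf(buf, len, "[%s %d %02d:%02d:%02d %s %d]",
                 day_mon, tm->tm_mday, tm->tm_hour, tm->tm_min,
                 tm->tm_sec, zone, tm->tm_year + 1900);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

In the parent, the timestamp would typically be produced from localtime() of the current time and written to the file opened with fopen() on the path from build_log_path().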

Please refer to the Assignment #1 description for more details on the configuration file.

SOME NEW FUNCTIONALITY:

In this assignment, the child processes never send output to standard out. Any output (that would have been sent to a log file in Assignment #1) must now be made by the parent process. Therefore, the child process must provide the required information to the parent process via the Unix pipe. (For debugging purposes, child processes can still use standard out, but that code should be removed before your final assignment submission.)

Your source code comments and your report should discuss the format, semantics, and use of all your messages (from parent-to-child and child-to-parent).
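
One possible message design, shown here only as a sketch, is a fixed-size struct sent over the pipe. Everything in it (the enum msg_type values, struct message, and the helpers write_full, read_full, and send_message) is hypothetical; the assignment does not prescribe a format.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical message layout; the format is a design decision. */
enum msg_type {
    MSG_CONFIG,        /* parent -> child: directory to monitor */
    MSG_EXIT,          /* parent -> child: terminate */
    MSG_FILE_REMOVED   /* child -> parent: a file was removed */
};

struct message {
    enum msg_type type;
    char path[256];    /* directory or file name, NUL-terminated */
};

/* Writes exactly len bytes, retrying on short writes and EINTR. */
int write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Reads exactly len bytes. Returns 0 on success, -1 on error or EOF. */
int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;

    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n == 0)
            return -1;             /* all write ends are closed */
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Fills in a message and sends it; over-long paths are truncated. */
int send_message(int fd, enum msg_type type, const char *path)
{
    struct message m;

    assert(path != NULL);
    memset(&m, 0, sizeof m);
    m.type = type;
    strncpy(m.path, path, sizeof m.path - 1);
    return write_full(fd, &m, sizeof m);
}
```

Fixed-size messages keep framing simple: the reader always asks for sizeof(struct message) bytes. POSIX also guarantees that a pipe write of at most PIPE_BUF bytes is atomic, which matters if several children share one pipe to the parent.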

At any time, the user is allowed to change the configuration file to add, remove, or change any line in the file. The user is allowed to send a hangup signal (SIGHUP) to the parent dirnanny process which forces dirnanny to re-read the configuration file (which has the same filename). Then, the parent dirnanny should print to stdout the message:

[Sun Feb 6 11:27:16 MST 2005] Info: Caught SIGHUP. Configuration file '/tmp/dirnn.config' re-read.

Without exiting, the parent dirnanny must reconfigure itself and reconfigure its child processes according to the new configuration file. Also, child processes may not exit, unless there are more child processes than directories to monitor.

For example, suppose there are currently 3 child processes to monitor 3 directories.

  1. If the configuration is changed in order to monitor 3 different directories (and a SIGHUP is sent to the parent), then neither the parent nor child processes can (or need to) exit in order to handle the new configuration.
  2. If the configuration is changed in order to monitor 4 different directories (and a SIGHUP is sent to the parent), one extra child process is created, but neither the parent nor the existing child processes can (or need to) exit.
  3. If the configuration is changed in order to monitor 2 directories (and a SIGHUP is sent to the parent), one child process exits, but the parent and the other existing child processes do not exit.

Note that appropriate output is sent to the global logfile each time that the configuration file is read or re-read.
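
A common way to catch SIGHUP and SIGINT, sketched below under the assumption that the parent has a main loop that polls flags, is to keep the handler trivial and do the real work (re-reading the configuration, shutting down) outside the handler. The names got_sighup, got_sigint, install_handlers, and take_flag are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Flags set by the handler and polled by the parent's main loop. */
volatile sig_atomic_t got_sighup = 0;
volatile sig_atomic_t got_sigint = 0;

/* The handler only records which signal arrived. */
static void on_signal(int signo)
{
    if (signo == SIGHUP)
        got_sighup = 1;
    else if (signo == SIGINT)
        got_sigint = 1;
}

/* Installs on_signal for SIGHUP and SIGINT. Returns 0 on success. */
int install_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;   /* no SA_RESTART: blocking calls fail with EINTR */

    if (sigaction(SIGHUP, &sa, NULL) == -1)
        return -1;
    if (sigaction(SIGINT, &sa, NULL) == -1)
        return -1;
    return 0;
}

/* Returns 1 and clears *flag if it was set, otherwise returns 0. */
int take_flag(volatile sig_atomic_t *flag)
{
    assert(flag != NULL);
    if (!*flag)
        return 0;
    *flag = 0;
    return 1;
}
```

The main loop would call take_flag(&got_sighup) and, when it returns 1, re-read the configuration file and message the children. Several SIGHUPs arriving close together collapse into one re-read, which is usually acceptable because the file is read after the last of them.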

Also, the user is allowed to send an interrupt signal (SIGINT) to the parent dirnanny process which forces dirnanny to close any open files that it might have, kill each of its children via a message (even if they are currently monitoring a directory), and free up all resources (e.g., memory, return values from children that have exited). In essence, an interrupt signal is used to cleanly exit from the parent dirnanny process. Of course, if the child process is sleeping, it may be some time before it reads the exit message from the parent. For this assignment, this delay in the exit of the child process is an acceptable design decision.

Once all of the cleaning up has been completed, the parent dirnanny process prints the following (example) output and exits:

[Sun Feb 6 11:37:16 MST 2005] Info: Caught SIGINT. 33 files removed. Exiting cleanly.
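
The clean-up sequence described above (tell each child to exit via a message, close the pipes, and collect the children's return values) might look roughly like this. The struct child, the one-byte CMD_EXIT command, and both function names are assumptions made for the sketch, and the child here does nothing except wait for the exit command.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one child process. */
struct child {
    pid_t pid;
    int to_child;   /* write end of the parent -> child pipe */
};

/* Hypothetical one-byte exit command. */
#define CMD_EXIT 'X'

/* Forks a child that blocks until it reads CMD_EXIT (or read() stops
 * returning data), then exits. A real child would monitor its
 * directory between reads. Returns 0 on success, -1 on error. */
int spawn_child(struct child *c)
{
    int fds[2];

    assert(c != NULL);
    if (pipe(fds) == -1)
        return -1;
    c->pid = fork();
    if (c->pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (c->pid == 0) {
        char cmd;
        close(fds[1]);
        while (read(fds[0], &cmd, 1) == 1 && cmd != CMD_EXIT)
            ;
        close(fds[0]);
        _exit(0);
    }
    close(fds[0]);
    c->to_child = fds[1];
    return 0;
}

/* Sends CMD_EXIT to every child, closes the pipes, and reaps each child.
 * Returns the number of children that exited with status 0. */
int shutdown_children(struct child *kids, int n)
{
    const char cmd = CMD_EXIT;
    int i, clean = 0;

    assert(kids != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (write(kids[i].to_child, &cmd, 1) != 1) {
            /* The message was not delivered; the child is still
             * waited for below. */
        }
        close(kids[i].to_child);
    }
    for (i = 0; i < n; i++) {
        int status;
        if (waitpid(kids[i].pid, &status, 0) == kids[i].pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            clean++;
    }
    return clean;
}
```

Two caveats about this sketch: a later child inherits the write ends of pipes created for earlier children, so a fuller version would close those in the child; and writing to a child that has already exited raises SIGPIPE, which a fuller version would ignore or handle.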

Required Design:

As with Assignment #1, when the program dirnanny is started from the command line, it must read the configuration file. dirnanny checks all the directories specified in the configuration file. If the corresponding directory does not exist, it is not considered an error. Instead, dirnanny reports to stdout:

dirnanny: Warning: /tmp/othercache does not currently exist.

The child process will execute the exact same program as dirnanny; that is, the children should not invoke exec(). Each child process is now entirely focussed on a single monitored directory. The parent communicates the frequency and name of the monitored directory to the child using a Unix pipe. The child puts itself to sleep (see sleep()) for the specified monitoring period, awakens, implements the file removal policy, and sleeps again. This cycle continues until the termination time specified in the first line of the configuration file. If the monitored directory has been removed, the child process of dirnanny continues checking periodically regardless (the directory may get created and erased an arbitrary number of times).

You must write a Makefile for your program. When someone types make, your Makefile should build the executable program dirnanny. When someone types make clean, your Makefile should remove the executable dirnanny (if any), all .o files (if any), and all core files (if any).

Note that there are synchronization issues within the parent process. For example, what happens if a SIGHUP or SIGINT signal is received asynchronously when the parent is about to fork off a child process?
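
One way to approach the question above, offered as a sketch rather than the required design, is to block SIGHUP and SIGINT with sigprocmask() while the parent forks and records the new child, so that a signal cannot arrive between the two steps. The function name fork_and_record and the slot parameter (standing in for an entry in a child table) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Forks with SIGHUP and SIGINT blocked and, in the parent, stores the
 * child's pid in *slot before the signals are unblocked. Returns what
 * fork() returned: 0 in the child, the child's pid in the parent, or
 * -1 on error. */
pid_t fork_and_record(pid_t *slot)
{
    sigset_t block, old;
    pid_t pid;
    int saved_errno;

    assert(slot != NULL);
    sigemptyset(&block);
    sigaddset(&block, SIGHUP);
    sigaddset(&block, SIGINT);
    if (sigprocmask(SIG_BLOCK, &block, &old) == -1)
        return -1;

    pid = fork();
    saved_errno = errno;
    if (pid > 0)
        *slot = pid;    /* bookkeeping is done before any handler runs */

    /* Restore the previous mask in both parent and child. A signal that
     * arrived in the parent meanwhile is delivered at this point. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    errno = saved_errno;
    return pid;
}
```

With this approach a signal is never lost, only delayed: it stays pending while blocked. The child starts with an empty set of pending signals, so a SIGHUP sent to the parent during the fork is handled by the parent alone.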

Other Implementation Details:

As appropriate, you must use C memory allocation (e.g., malloc(), free()) and C file I/O functions (e.g., fopen(), fscanf(), fclose()). You cannot use streams or the STL/the C++ stdlib (e.g., cannot use type/class string). Also, your particular TA may not have any expertise in C++ and therefore we cannot guarantee lab support for languages other than C.

It is IMPERATIVE that your program properly deallocates ALL dynamic memory in a correct fashion (i.e., using free()) before your program terminates, or else your assignment will LOSE marks. To check that your program properly allocates and de-allocates ALL dynamic memory it uses, you must use the MEMWATCH package, which is simple to do.

Using MEMWATCH:

This term, we will use Version 2.71 (stable) of MEMWATCH. You can download the package yourself.

The TAs will expect that your files have been compiled with the header file memwatch.h and file memwatch.c in your working directory. MEMWATCH also has a README and FAQ.

In all of your source files (either directly or indirectly), you must add the directive

#include "memwatch.h"

and when you compile, you must compile memwatch.c along with your source file with the variables MEMWATCH and MW_STDIO defined.  As an example:

gcc -Wall -DMEMWATCH -DMW_STDIO main.c memwatch.c

When you run your program, if you get a message in your output that reads something like:

MEMWATCH detected 5 anomalies

it means you have not de-allocated dynamic memory properly.  In particular, this message indicates that 5 allocated structures have not been de-allocated. You should also check the MEMWATCH log file for any reports. If your assignment is not properly compiled with MEMWATCH enabled or if MEMWATCH reports that your memory allocation/deallocation was incorrect, then you will lose marks.

What to hand in:

All elements are to be handed in on-line via the astep program. Use the command:

unix-prompt% astep -c c379 -p as2 submit.tar

All of the following must be packaged into a tar file with the name submit.tar. Information about tar is available from the manual page (see man tar). For example, tar cvf submit.tar Makefile main.c my.h is an archetypal command; be very, very careful of the tar cvf submit.tar part. Before you submit, make sure your tar file works from within a fresh directory.

  1. A README file (ASCII text is fine) for your assignment with: (1) your name, (2) student number, (3) Unix id, (4) lecture section, (5) instructor's name, (6) lab section, and (7) TA's name clearly labelled. Marks will be deducted if any of these items are missing. The README file must also include a short description of your program, as well as a description of the relevant commands to build (e.g. make all) and how to execute your programs including command line parameters.

  2. A report in HTML file format, in a file called report.html, describing the design, implementation, and testing of your assignment. The report should contain no more than 1,000 words. (If /usr/local/bin/dehtml < report.html | wc -w is greater than 1,050 (i.e., 1,000 + a small margin; 1,051 is too many words), then marks will be deducted.) You do not need to repeat any information contained in this assignment description. I recommend you spend 25% of your report on an overview of your assignment, 50% on your design and implementation, and 25% on how you tested your program, and some concluding remarks. Note the emphasis on testing your program.

  3. Your source code file(s) for dirnanny, including all header files. Do NOT submit any MEMWATCH files, as the TA will use his/her own fresh copy of that code, but the use of MEMWATCH should be enabled in your code and Makefile.

  4. Your Makefile.

NOTE: Do not submit files or test data not described above. Only submit what is requested and what is required to compile your program (except, of course, the MEMWATCH files).

Also, make sure that your program does not produce any debugging or extraneous output during normal execution. Only the requested output should be generated. Marks will be deducted for incorrect and other unrequested output. It is acceptable to have output to report an actual error.

Note:  All files in your submission must contain the identification information labelled (1) to (7) in point 1 above (e.g., as a C or Makefile comment).

Marking:

The assignment is worth 15% of your final mark in the course. This is an individual assignment. Do not work in groups. Review the Course Outline (extracted below) on this matter.

The assignment itself will be marked as follows: 20% for your report (clarity, technical accuracy, completeness, thoroughness of the testing, etc.), 60% for the correctness of the program when we test it on the CSC 225 workstations using gcc, and 20% for the quality of the implementation (design, modularity, good software engineering, coding style, useful and appropriate comments, etc.).

If your source code, as submitted, does not compile and run (using the submitted Makefile) on the CSC 225 workstations using gcc, you will receive a mark of zero for correctness. Review the Course Outline (extracted below) on this matter.

All that you have learned about good technical communication (e.g., for your report) will apply. All that you have learned about good programming style and comments in your code will apply. Having correct code is important, but good style, design, and documentation are also important. We cannot provide an exhaustive list of what we will look for, but an incomplete list includes: a comment for each source code file, a comment for each procedure/function, a comment for each significant (global or local) variable, good choice of names/identifiers, proper modularity (e.g., do NOT put all/most of the code in main()) etc.

NOTE: There are a number of programs that you can download off the Internet that provide similar functionality to what you are asked to implement for this assignment. We are familiar with them. Therefore, do not download these programs; write your own solution to this problem. Modifying someone else's program (including programs that you can download) is against the requirements of this assignment and is an Academic Offense. If you have any doubts about whether your actions are permissible or not, you should ask a professor before proceeding.

Hints:

You may also want to learn about the following Unix programs: ps, grep, kill

Learn about the select() system call.
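
As a starting point for the select() hint, the sketch below waits until one of several pipe read ends has data or a timeout expires. The function name wait_readable and its return-value convention are assumptions for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Waits up to timeout_sec seconds for any descriptor in fds[0..n-1] to
 * become readable. Returns the index of the first readable descriptor,
 * -1 on timeout, or -2 on error (including interruption by a signal). */
int wait_readable(const int *fds, int n, int timeout_sec)
{
    fd_set readset;
    struct timeval tv;
    int i, ready, maxfd = -1;

    assert(fds != NULL && n > 0);
    FD_ZERO(&readset);
    for (i = 0; i < n; i++) {
        FD_SET(fds[i], &readset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    /* The first argument is the highest descriptor number plus one. */
    ready = select(maxfd + 1, &readset, NULL, NULL, &tv);
    if (ready == -1)
        return -2;
    if (ready == 0)
        return -1;
    for (i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &readset))
            return i;
    return -2;
}
```

A parent could use something like this to watch all child-to-parent pipes at once instead of blocking in read() on a single pipe. Note that select() modifies the descriptor set, so the set is rebuilt on every call.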

Further hints may be given later on in the newsgroup, if warranted. Be sure to read the newsgroup on a regular basis.

Cleaning up runaway processes is good etiquette when using a shared computer. For this course, it is a necessity. Make sure you know how to use the ps and kill commands.


Important Extracts from the Course Outline:

The University of Alberta is committed to the highest standards of academic integrity and honesty. Students are expected to be familiar with these standards regarding academic honesty and to uphold the policies of the University in this respect. Students are particularly urged to familiarize themselves with the provisions of the Code of Student Behaviour and avoid any behaviour which could potentially result in suspicions of cheating, plagiarism, misrepresentation of facts and/or participation in an offence. Academic dishonesty is a serious offence and can result in suspension or expulsion from the University. (GFC 29 SEP 2003)

NOTE: All assignments must be completed individually. Some high-level discussion of concepts between students is allowed. Do not work in groups. Do not share or discuss specific code in any way with other students; seek help from your TAs or instructor on these matters. Do not post code fragments longer than about 5 lines of code to the newsgroup. Note that we may use automated tools, such as MOSS, to detect potential cases of plagiarism. Note the definition of plagiarism and cheating in the Code of Student Behaviour.

VERY IMPORTANT: Your programming assignments, as submitted, must work on the department's laboratory machines in CSC 225 (Linux, uj01 to uj19) and with the gcc compiler. All testing will be done on these machines using gcc. We also recommend that you use the gdb debugger. A program that does not work in CSC 225 with gcc, even if it works on a different Unix-like machine (e.g., other versions of Linux or BSD) or compiler, will be considered incorrect. It is your responsibility to double check your tar files on the lab workstations before submitting them. Any mistakes in the above procedures, Makefiles, missing files, improper pathnames, and "last-minute changes" to the files that prevent proper compilation will result in a mark of zero for correctness (approximately 60% of the total marks for each assignment). If you find an error in your submission, you can use the Late Policy (see below) to correct the mistake.

LATE POLICY: All programming assignments must be submitted electronically before 9 P.M. on the due date. (Note that even 1 second past 9 P.M. will be considered late.) Though not advised, it is possible to submit assignments late, with a penalty. The penalty for being late 1 day (i.e., up to 24 hours) is 10% of the maximum possible mark. Similarly, the late penalty for 2 days (i.e., more than 24 and up to 48 hours) is 20% of the maximum possible mark. No assignments will be accepted after 2 days past the deadline, except under extraordinary conditions and only with the approval of the instructor in advance.