CIS307: Guidelines for the solution of Homework 5

Organization of the program

We now have three processes - HMW_MAIN, RAND_PROC1, and RAND_PROC2. HMW_MAIN creates the child processes RAND_PROC1 and RAND_PROC2. RAND_PROC1 and RAND_PROC2 cooperate with each other to have their virtual address spaces overlap on a segment of shared memory that accommodates the data item STORE previously maintained by STORE_MANAGER. Thereafter they perform (an unbounded number of) read and update operations on STORE, invoking the functions SSTORE_READ() and SSTORE_UPDATE(). HMW_MAIN interacts with the child processes to make available to the user statistical information about their operation, or to terminate all processes according to the user's choice.

RAND_PROC1 and RAND_PROC2 execute identical algorithms but use different seeds for the random number generator. Given this setup, it would be natural to organize the program into two files, hmw_main.c and rand_proc.c (additional header files may be used to group together data structures and operations needed in each process). Each child process begins execution in a copy of the address space of the parent process HMW_MAIN. It immediately invokes one of the exec functions to execute (the machine code version of) the program in rand_proc.c. Thus RAND_PROC1 and RAND_PROC2 would invoke

  
           execlp("rand_proc","rand_proc",arg1,arg2,(char *)NULL)
where the second "rand_proc" is the conventional argv[0] (the name of the
program itself), and
    arg1 - character string representation of the identifier for the child  
           process (e.g., 1 for RAND_PROC1, and 2 for RAND_PROC2), and  

    arg2 - character string representation of the seed for the random 
           number generator,  
are the arguments to the program in rand_proc.c. The program (or the function main() to be more precise) in rand_proc.c would accept the arguments from execlp() and convert them into the respective integer values. The file rand_proc.c contains the source code for rand_proc.

Structure of STORE

STORE holds a set of ordered pairs (SSTORE_ID, SSTORE_ELEM) where SSTORE_ID and SSTORE_ELEM are character strings. In order to simplify the read and update operations on STORE, we will define SSTORE_ID and SSTORE_ELEM to be fixed-length strings. For uniformity we shall use the following definitions :
  const

    STRING_LENGTH = 20;
    SSTORE_SIZE = 10;
     
  type

    STRING_TYPE = array [0 .. STRING_LENGTH - 1] of char;

    SSTORE_ENTRY_TYPE = record
                         SSTORE_ID : STRING_TYPE;
                         SSTORE_ELEM : STRING_TYPE;
                       end;

    SSTORE_TYPE = array [0 .. SSTORE_SIZE - 1] of SSTORE_ENTRY_TYPE;

    SSTORE_PTR_TYPE = pointer to SSTORE_TYPE;
  
  var
    
    STORE_PTR : SSTORE_PTR_TYPE;
Since STORE is not visible to HMW_MAIN, these declarations should be local to rand_proc.c.

We note that a variable of type SSTORE_TYPE is NOT created STATICALLY. Instead a shared memory segment with the size of SSTORE_TYPE is created at run time. Both RAND_PROC1 and RAND_PROC2 attach this segment to their address space. The kernel returns to each process the address at which the segment is attached in its address space. Each process saves this address in STORE_PTR after converting it into a pointer of type SSTORE_PTR_TYPE. The shared memory segment may then be deemed to hold an item of type SSTORE_TYPE that can be accessed through STORE_PTR. In what follows we take STORE to mean the variable accessed through STORE_PTR although the name STORE is not explicitly associated with any data item in the declarations above.

Initialization of STORE

STORE is initialized by either RAND_PROC1 or RAND_PROC2 (but not both) before read and update operations are performed on it. The SSTORE_ID component of each element of STORE is assigned a unique value that is never modified by subsequent (read or update) operations. The values chosen are fixed-length strings (of STRING_LENGTH characters) and may represent internet host addresses such as "jedi.cis.temple.edu" (padded with enough spaces at the end to make up a total of STRING_LENGTH characters). This set of values is available to both RAND_PROC1 and RAND_PROC2 (for example, through a header file specifying these values that is attached to rand_proc.c through a #include directive), which enables them to generate read and update requests that can be satisfied. The SSTORE_ELEM component of each element of STORE is assigned the same value: a fixed-length string consisting of STRING_LENGTH 0's.

Establishing STORE in shared memory

A simple way to establish STORE in shared memory and enable RAND_PROC1 and RAND_PROC2 to attach it to their address spaces is to get the parent process HMW_MAIN to create a shared memory segment and initialize it before spawning the child processes so that they automatically receive the identifier for the segment while executing in a copy of the parent's address space. However we are forbidden to follow such an approach, and RAND_PROC1 and RAND_PROC2 should manage between them the task of getting their address spaces to overlap over STORE.

The problem becomes a little more tractable if one of the processes is designated to carry out the task of creating a shared memory segment, initializing it, and communicating the identifier for the segment to the other process by, for example, writing it into a file. This approach works fine, but would not be acceptable for the reason that it introduces a non-symmetric relationship between RAND_PROC1 and RAND_PROC2. The two processes should behave like peer processes and execute identical algorithms.

The requirement that the two processes behave symmetrically and the fact that no assumptions can be made about their relative speeds give rise to a number of complications. The processes should cooperate with each other to create a single shared memory segment that each can attach to its address space. One way to accomplish this is to have both processes invoke shmget with the SAME value for the key, for instance, by defining the key in a common header file. Unfortunately this approach will not work if the key is already associated with a shared memory segment created by another user. Invoking shmget with the value IPC_PRIVATE for the key guarantees the creation of a new segment. However, EVERY call to shmget with IPC_PRIVATE creates a new segment, and so two independent segments would be created if each process were to invoke shmget specifying IPC_PRIVATE as the key.

Therefore exactly one process should invoke shmget with IPC_PRIVATE as the key. The identifier returned for the segment should then be made available to the other process through some means such as storing it in a file. For this approach to work, each process should first check whether the file is empty, read the identifier if it is in the file, and proceed to create a segment otherwise. The test and the associated actions must be carried out as an atomic operation. Advisory locking may be used to accomplish this. The following algorithm embodies the approach described.

  procedure GetSharedRegion
  begin
    Open (creating if necessary) the (initially empty) file SHMIDFILE
      for reading and writing;
    Obtain an exclusive lock on SHMIDFILE;
    if ISEMPTY(SHMIDFILE) then  { shared memory segment not created yet }
    begin
      Invoke shmget as 
        shmid := shmget(IPC_PRIVATE,sizeof(SSTORE_TYPE),SHM_R|SHM_W);
      Invoke shmat to attach segment to address space as 
        StartAddressOfSTORE := shmat(shmid,0,0);
      Initialize STORE;
      Write shmid into SHMIDFILE;
    end
    else { shared memory segment already created }
    begin
      Read shmid from SHMIDFILE;
      Invoke shmat to attach segment to address space as 
        StartAddressOfSTORE := shmat(shmid,0,0);
    end
    Release lock on SHMIDFILE;
  end { GetSharedRegion }

Choice of arguments for SSTORE_READ() and SSTORE_UPDATE() operations

As stated above, STORE is initialized before read and update operations are performed on it. The SSTORE_ID component of each element is assigned a fixed value, and all the SSTORE_ELEM components are initialized with a fixed-length string of 0's.

Read and update operations performed by RAND_PROC1 and RAND_PROC2 specify a value for the SSTORE_ID component through the parameter WHO in the functions SSTORE_READ() and SSTORE_UPDATE(). If an entry with that value for the SSTORE_ID component is found in STORE, the operation is performed on the entry, otherwise a failure status is returned for the operation. RAND_PROC1 and RAND_PROC2 select values for the parameter WHO randomly from a list maintained in each process. This list contains ALL values held in the SSTORE_ID components in STORE, and also an additional value that is not present in STORE. Thus the list contains SSTORE_SIZE + 1 values, and the value not contained in STORE is selected, on the average, once every (SSTORE_SIZE + 1) trials, causing the operations requested by the process to fail at the above rate.

Update operations specify through the parameter WHAT, a value to be assigned to the SSTORE_ELEM component of a specified entry in STORE, obliterating the value currently held in it. We will stipulate that RAND_PROC1 would ALWAYS use a string of 1's (of length STRING_LENGTH) and RAND_PROC2, a string of 2's (also of length STRING_LENGTH), for the parameter WHAT when requesting an update operation. Recall that the SSTORE_ELEM component of each element of STORE is assigned a string of 0's during initialization. Thus the SSTORE_ELEM component of an entry in STORE can hold one of three values - a string of 0's (the initial value), a string of 1's (last written by RAND_PROC1), or a string of 2's (last written by RAND_PROC2).

This makes it easier to ascertain whether the read operations returned the values last written into the entries in STORE by scanning the log generated by RAND_PROC1 and RAND_PROC2. Correctness requires that the read and update operations performed by RAND_PROC1 and RAND_PROC2 be serializable for the integrity of the data in STORE to be preserved.

Guaranteeing the atomicity of read and update operations

Since RAND_PROC1 and RAND_PROC2 directly access the shared data item STORE while performing read and update operations on it, two concurrent operations (of which at least one is an update) that access the same element of STORE may potentially interfere with each other. Such interference would result in the violation of the integrity of STORE, or cause read operations to return incorrect results. Operations on STORE can be prevented from interfering with one another if each operation is made to appear as an atomic action to the others so that its intermediate effects on STORE are not visible to them. This may be done by treating sections of code in SSTORE_READ() and SSTORE_UPDATE() that access STORE as critical sections and synchronizing access to STORE to ensure mutual exclusion in the critical sections.

We use advisory locking to control entry into the critical sections. We associate with each element of STORE a distinct region of a file, and stipulate that the read and update operations of both RAND_PROC1 and RAND_PROC2 should obtain a lock on the region of the file associated with an element before attempting to access that element, and release the lock upon completion of access. Regions of a file may be locked and unlocked in Unix through lockf or fcntl.

We use a file named STORELOCKFILE for advisory locking, associating each element of STORE with a distinct byte position in the file. The file may be created manually before you run your program. It should hold a minimum of SSTORE_SIZE bytes, but the bytes themselves may be arbitrary. A file with the above attributes would be present in the directory where I run your program, and can be opened by the program if it uses the name STORELOCKFILE with all letters in uppercase. The critical sections in SSTORE_READ() and SSTORE_UPDATE() are now bracketed by lock and unlock operations on individual byte positions in STORELOCKFILE as shown below.

  { The elements of STORE are assumed to be indexed 0 .. SSTORE_SIZE - 1 }

  function FindEntry (TARGET_ID : STRING_TYPE) : integer;
  {  Returns the index of the element of STORE with the value specified 
     by TARGET_ID for the SSTORE_ID component if one exists, otherwise 
     -1.
  }

  function SSTORE_READ (WHO : STRING_TYPE; var WHAT : STRING_TYPE) : integer;
  begin
    i := FindEntry(WHO);
    if (i in 0 .. SSTORE_SIZE - 1 ) then
    begin
      Obtain an exclusive lock on the byte at offset i relative to the 
       beginning of STORELOCKFILE;
      WHAT := STORE[i].SSTORE_ELEM;
      Append record summarizing operation to LOG.DAT, invoking write;
      Unlock the byte at offset i relative to the beginning of STORELOCKFILE;
      return (0);
    end 
    else
      return (1);
  end; { SSTORE_READ }
 

  function SSTORE_UPDATE (WHO : STRING_TYPE; WHAT : STRING_TYPE) : integer;
  begin
    i := FindEntry(WHO);
    if (i in 0 .. SSTORE_SIZE - 1) then
    begin
      Obtain an exclusive lock on the byte at offset i relative to the 
       beginning of STORELOCKFILE;
      STORE[i].SSTORE_ELEM := WHAT;
      Append record summarizing operation to LOG.DAT, invoking write;
      Unlock the byte at offset i relative to the beginning of STORELOCKFILE;
      return (0);
    end 
    else
      return (1);
  end; { SSTORE_UPDATE }
Note that the write to the log file is issued by the process before it leaves the critical section. This is to ensure that log records for operations accessing the SAME element of STORE appear in LOG.DAT in the order in which they were actually performed. Since we rely solely on the log file to help us verify whether the operations on STORE were performed correctly, i.e., in a serializable manner, it is essential that the log file accurately reflect the order in which (conflicting) operations were performed on STORE.

Note that the use of exclusive locks prevents two read operations requiring access to the same item in STORE from accessing STORE concurrently, even though the semantics of the operations or the integrity of STORE would not be violated if they were allowed to do so. We can improve the degree of achievable concurrency by having read operations lock regions of the file in a shared mode instead of an exclusive mode. Since multiple processes may hold a shared lock on the same region in a file, two read operations requiring access to the same element of STORE may execute concurrently. However, a shared lock conflicts with an exclusive lock, and so an update operation will not access STORE concurrently with either a read or an update operation that also requires access to the same element of STORE. This is the maximum amount of concurrency we can SAFELY permit without risking the violation of the integrity of STORE. Region locking in shared and exclusive modes may be accomplished by invoking the function fcntl.

A simpler approach towards achieving mutual exclusion in critical sections using advisory locking would be to associate EVERY element of STORE with the ENTIRE file instead of a distinct region of the file. Thus each read and update operation would lock the entire file while accessing STORE. This serializes ALL operations on STORE, regardless of whether they access the SAME element or not, and the resulting loss in concurrency leads to a degradation in the response time and throughput associated with the operations.

Format of entries in the log file LOG.DAT

RAND_PROC1 and RAND_PROC2 write into the file LOG.DAT a summary of each read and update operation they perform on STORE. Since the log file is our primary source of information in determining whether operations on STORE are performed consistently, we must ensure that all the relevant information about each operation is written to the log file. Each entry in the log file should identify the writer (RAND_PROC1 or RAND_PROC2), the operation performed (read or update), the parameters of the operation, the status returned for the operation, and a timestamp. Each entry may be written on a separate line for clarity.

The following are examples of the suggested format :

  1  R  jedi.cis.temple.edu   00000000000000000000  OK   < timestamp >
  2  U  jedi.cis.temple.edu   22222222222222222222  OK   < timestamp >
  2  U  yoda.cis.temple.edu   22222222222222222222  OK   < timestamp >
  1  R  yoda.cis.temple.edu   22222222222222222222  OK   < timestamp >
  1  U  none.cis.temple.edu   11111111111111111111  ERR  < timestamp >
  2  R  none.cis.temple.edu   --------------------  ERR  < timestamp >
The last two operations fail because "none.cis.temple.edu" is not a valid SSTORE_ID value in STORE.

It should be noted that the file LOG.DAT is used only by RAND_PROC1 and RAND_PROC2. Since HMW_MAIN does not need to access LOG.DAT, it should not open the file. Each child process opens the file independently, enabling read and write permissions for the owner while creating the file, so that the contents of the file may be examined by the user when the program terminates.

Interaction of the processes with the user

The parent process HMW_MAIN, after creating the child processes, prompts the user repeatedly to ask whether he wants statistical information from either RAND_PROC1 or RAND_PROC2, or to have all processes terminated.

If statistical information is requested, HMW_MAIN sends the signal SIGUSR1, a signal with user defined semantics, to the appropriate process, which in turn catches the signal and displays on the screen the number of read and update operations it has completed. The response from RAND_PROC1, for example, would look like

  RAND_PROC 1  :  Number of Reads = 8.  Number of Updates = 6.
Since the signal catcher is not allowed to take any user defined arguments, the variables recording the number of read and update operations completed should be made global to it. When the process returns from the signal catching function, it resumes its normal operation, going back to where it was before being interrupted by the signal.

If the user indicates that he would like all processes terminated, HMW_MAIN sends SIGTERM signals to both child processes, and waits for each child to terminate, invoking one of the wait functions as many times as necessary. Each child process informs the user of the receipt of the SIGTERM signal, performs some clean-up operations such as detaching the shared memory segment from its address space and removing the segment if it is the last process to detach the segment, and terminates itself invoking exit(). HMW_MAIN confirms the termination of a child process each time it returns from the wait function, and finally invokes exit() to terminate itself. Here is an example of screen output resulting from the user's choice of terminating all processes :


  RAND_PROC2       :  Terminating in response to SIGTERM signal.
  RAND_PROC1       :  Terminating in response to SIGTERM signal.
  HMW_MAIN         :  RAND_PROC2 terminated.
  HMW_MAIN         :  RAND_PROC1 terminated.
  HMW_MAIN         :  Terminating after all child processes.

Graceful Termination

It is the responsibility of a process that has been granted a resource to release the resource when it is no longer needed, so that the requests of other processes for the resource may be satisfied. In keeping with this principle, the shared memory segment created by your program should be removed when the program terminates so that the system tables do not retain entries for segments that are no longer in use. The signal catching function for SIGTERM in RAND_PROC1 and RAND_PROC2 should carry out the operations necessary to remove the segment before terminating the process.

A process can detach a segment from its address space by invoking shmdt, which causes the attachment count for the segment, stored in the shm_nattch component of the shmid_ds structure associated with the segment, to be decremented. A process can examine the attachment count by having the shmid_ds structure copied into a private variable, invoking shmctl with IPC_STAT as the second argument. A process can remove a segment by invoking shmctl with IPC_RMID as the second argument. This causes the identifier associated with the segment to be recycled, rendering the old value invalid, but the segment remains in the system until the last process using it terminates or detaches it.

Therefore exactly one of the processes RAND_PROC1 and RAND_PROC2 should attempt to remove the segment, although both processes may detach it from their address spaces. This requires that each process atomically check the attachment count for the segment and perform either a detach or a remove operation depending on the value of the attachment count. This may be accomplished through advisory locking. The following is an outline of the steps involved in removing a shared memory segment.
  procedure RemoveSegment
  var
    buf : pointer to struct of type shmid_ds; 
  begin
    Obtain an exclusive lock on the file SHMIDFILE;
    Invoke shmctl as 
       shmctl(shmid,IPC_STAT,buf);
    if shm_nattch component of buf equals 2 then 
      Invoke shmdt to detach the segment;
    else { attachment count is 1 } 
      Invoke shmctl as 
         shmctl(shmid,IPC_RMID,NULL);
    Release lock on SHMIDFILE;
  end { RemoveSegment }
We note that these actions are performed from within the signal catching function for SIGTERM. Since the signal catching function cannot take user defined arguments, data items such as the descriptor for SHMIDFILE should be made global.

An alternate approach to removing the segment is to delete it manually from the command line. The command ipcs displays a list of shared memory segments (among other IPC structures) currently in the system along with their attributes such as the id of the segment and its owner's id. A segment created by your program can be deleted by invoking the command ipcrm with the option flag -m to indicate a shared memory segment, and specifying the identifier for the segment.

amandala@astro.ocis.temple.edu