CIS 4307: Threads I

[Return Value of Thread Functions], [Pthreads], [Threadsafe Functions], [Two Examples], [Locks for Pthreads], [Posix Semaphores]

Threads

We now consider the use of POSIX threads, as they are provided on OSF/1 on our alphas with the pthreads package. Note that these commands may not be available in this form on all Unix systems.

Good sources of information on threads are the books:

S.Kleiman,D.Shah,B.Smaalders: Programming with Threads, Prentice-Hall,1996

B.Nichols,D.Buttlar,J.Proulx: Pthreads Programming, O'Reilly, 1996

D.R.Butenhof: Programming with POSIX Threads, Addison-Wesley, 1997

Before I forget, on our systems remember to compile programs that use threads as follows

cc sourcename.c -o execname -threads

gcc -D_REENTRANT sourcename.c -o execname -L/usr/shlib -lpthread -lexc

Beware that cc likes "pthread_attr_default" while gcc likes "NULL".

On Linux I use

gcc -Wall sourcename.c -o execname -lpthread

All threads defined within a process share the address space of that process. This is very convenient since each thread can communicate with the other threads in that process through memory. It is also very dangerous because we get into problems of mutual exclusion and synchronization.

POSIX has a large number of functions for managing threads, using mutexes, using condition variables, etc. Check these services with the command

man pthreads

Threads are not easy to understand well. There are all sorts of complexities hidden behind names such as user threads, kernel threads, and light-weight threads, as discussed in the textbook. There are obvious questions we will not explore. For example:

A parent process where multiple threads are executing forks. What threads will be active in the child? [Only the thread that executed the fork call, but mutexes remain in their current state, i.e. they may be held by threads that have disappeared!]
Can distinct threads in a process have different blocking masks? And who will handle an occurrence of a signal? [A signal handler function is unique, process-wide. Each thread can have its own signal blocking mask. A signal directed to a specific thread will be handled by that thread. A synchronous signal will be handled by the thread where it arose. Asynchronous signals may be handled by any thread that does not block that signal (we do not know a priori which one it will be).]
How are threads scheduled? [Just assume random but fair.]

When using threads aim for simplicity and avoid being in situations where you have to answer questions such as the above.
An interesting fact is that in recent thread packages it is possible to have different threads of a single process execute on different processors at the same time. That is, within a process two or more threads may be executing with true concurrency.

Return Value of Thread Functions

Usually system calls that detect an error condition set the external variable errno to an integer value that identifies the error. Thread functions do not set errno; instead they return the error code (or 0 if there is no error). Thus where we used to do things like

   if ((pid=fork()) < 0) {
	perror("cannot fork");
	exit(1);
   }

where perror used the errno information, with thread functions we have to do things like

   if ((rc = pthread_create(..,..,..,..)) != 0) {
	fprintf(stderr, "Cannot create thread %s\n", strerror(rc));
	exit(1);
   }
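
As a minimal sketch of this convention, the helpers below wrap pthread_create and pthread_join and report failures from the returned code rather than from errno. The names create_checked, join_checked and noop are made up for this illustration; they are not part of the pthreads library.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Trivial start routine used only for this illustration:
   it hands its argument straight back as the thread's return value. */
void *noop(void *arg) {
    return arg;
}

/* Create a thread with default attributes. On failure, report the
   error using the code returned by pthread_create (not errno).
   Returns 0 on success, or that error code. */
int create_checked(pthread_t *t, void *(*fn)(void *), void *arg) {
    int rc = pthread_create(t, NULL, fn, arg);
    if (rc != 0)
        fprintf(stderr, "Cannot create thread: %s\n", strerror(rc));
    return rc;
}

/* Join a thread with the same style of error reporting. */
int join_checked(pthread_t t, void **status) {
    int rc = pthread_join(t, status);
    if (rc != 0)
        fprintf(stderr, "Cannot join thread: %s\n", strerror(rc));
    return rc;
}
```

A caller would typically write "if (create_checked(&t, noop, &x) != 0) exit(1);" where the earlier code called perror.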

By the way, for traditional system calls that set the int errno, originally this errno was global to all threads, but in recent Unix implementations of threads, errno has become local to each thread.

Threadsafe Functions

Since threads execute concurrently in the same address space and control can be transferred between them at any time, we have to be very cautious and make sure that concurrent executions of functions do not cause problems. We say that a function is threadsafe when it can be executed without problems by concurrent threads. Usually this means that the function uses only local variables and read-only global variables. Functions that write to global storage may be made threadsafe if they use appropriate locks. If we protect a thread-unsafe function with a lock (i.e. we precede the call with a lock and follow it with an unlock), the result may or may not be threadsafe.
Suppose I write a function

   /* keep a running total of the values passed in calls*/
   int adder(int n) {
	static int sum = 0;
	sum += n;
	return sum; }

This function may fail when used by concurrent threads (it is not threadsafe) because "sum += n;" is a critical section. But even if we use locks the function remains unsafe in the following sense: in one thread Alice uses "adder" to keep track of her deposits, while in another thread Bob would like to keep track of his deposits. Clearly that will not work, since both calls update the same static total. In general, whenever we have stateful functions (i.e. functions that maintain state across calls) we are dealing with potentially unsafe functions. In this case we need to pass the state as a parameter. In our case adder could be rewritten:

   int adder(int *sum, int n) {
	*sum = *sum +n;
	return *sum;}
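
The sketch below puts the two ideas together, under some assumptions of mine: Alice and Bob can each keep a private total by passing their own variable to adder, while a total that several threads really do share (a hypothetical joint account) still needs a lock around the update. The names struct account, adder_locked, depositor and run_joint_account are invented for the example.

```c
#include <pthread.h>

/* State-passing version from the text: the caller owns the running total. */
int adder(int *sum, int n) {
    *sum = *sum + n;
    return *sum;
}

/* Hypothetical shared account: a total plus the lock that guards it. */
struct account {
    pthread_mutex_t lock;
    int sum;
};

/* For a total that several threads update, the read-modify-write
   inside adder is performed while holding the account's lock. */
int adder_locked(struct account *acct, int n) {
    int result;
    pthread_mutex_lock(&acct->lock);
    result = adder(&acct->sum, n);
    pthread_mutex_unlock(&acct->lock);
    return result;
}

enum { DEPOSITS = 10000 };

/* Each depositor adds 1 to the shared account DEPOSITS times. */
static void *depositor(void *arg) {
    struct account *acct = arg;
    for (int i = 0; i < DEPOSITS; i++)
        adder_locked(acct, 1);
    return NULL;
}

/* Run two depositors on one joint account and return the final total,
   or -1 if a thread could not be created. The account lives on this
   function's stack, which is safe only because both threads are joined
   before the function returns. */
int run_joint_account(void) {
    struct account joint;
    pthread_t a, b;
    int ok_a, ok_b;

    joint.sum = 0;
    pthread_mutex_init(&joint.lock, NULL);
    ok_a = pthread_create(&a, NULL, depositor, &joint) == 0;
    ok_b = pthread_create(&b, NULL, depositor, &joint) == 0;
    if (ok_a) pthread_join(a, NULL);
    if (ok_b) pthread_join(b, NULL);
    pthread_mutex_destroy(&joint.lock);
    return (ok_a && ok_b) ? joint.sum : -1;
}
```

With the lock in place the final total is always 2 * DEPOSITS; without it, updates from the two threads could be lost.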

Many functions in the standard C library are not threadsafe, though threadsafe versions may also be available as we have seen for rand and rand_r.

Another thing to be aware of when using threads is that it is dangerous to use pointers from one thread to locations on the stack of another thread. For example, suppose thread A calls a function moo, declares a variable x there, and passes the address of x as a parameter to a thread B. Then moo returns. Now B is accessing a location that has been deallocated by A, and perhaps reallocated with a different meaning. Moral: pass to a thread only dynamically allocated data or static data.
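
One way to follow this moral is sketched below: the creating function copies the argument into heap storage and the new thread takes ownership of it, so nothing depends on the creator's stack frame. The names worker and spawn_worker are hypothetical, and returning a small integer through the void * result is just a convenience for the example.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Worker: takes ownership of a heap-allocated int, doubles it,
   frees the storage, and returns the result as the thread's value. */
static void *worker(void *arg) {
    int *p = arg;
    intptr_t doubled = 2 * (intptr_t)*p;
    free(p);
    return (void *)doubled;
}

/* Start a worker for `value`. The argument lives on the heap, so it
   stays valid even after this function has returned.
   Returns 0 on success or an error code. */
int spawn_worker(pthread_t *t, int value) {
    int rc;
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return ENOMEM;
    *p = value;
    rc = pthread_create(t, NULL, worker, p);
    if (rc != 0)
        free(p);        /* no thread was created, so we still own p */
    return rc;
}
```

Had spawn_worker passed &value instead, the worker could read the location after spawn_worker returned, which is exactly the situation described for moo and x.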

As an example, when using threads we call rand_r instead of rand to generate random numbers because rand_r is re-entrant (i.e. threadsafe).

Here are the random functions we use in non-threaded programs:

       void srand(unsigned int seed);
       int rand(void);

srand sets some global variable to seed and each call to rand updates and returns the value of the global variable. If srand and rand are called concurrently by more than one thread the global location holding the seed is clobbered and the threads get unpredictable [unrepeatable] sequences of integers. When using threads we call:

       int rand_r(unsigned int *seedptr);

and pass the address of a seed local to the calling thread [i.e. different threads use different seed variables and initialize them directly with an assignment, not srand].
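
A small sketch of this usage follows. Each thread owns a record holding its seed and the numbers it draws, so no state is shared between threads and the sequence obtained from a given seed is repeatable. The names struct rng_state, draw and same_sequence are made up for the illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* makes rand_r visible in strict ISO mode */
#include <pthread.h>
#include <stdlib.h>

enum { DRAWS = 5 };

/* Hypothetical per-thread record: each thread owns its seed and output. */
struct rng_state {
    unsigned int seed;
    int values[DRAWS];
};

/* Thread body: draw DRAWS numbers using only this thread's seed. */
static void *draw(void *arg) {
    struct rng_state *st = arg;
    for (int i = 0; i < DRAWS; i++)
        st->values[i] = rand_r(&st->seed);
    return NULL;
}

/* Run two threads with the given seeds. Returns 1 if they produced
   identical sequences, 0 if not, -1 if a thread could not be created.
   The records are on this stack frame, which is safe because both
   threads are joined before returning. */
int same_sequence(unsigned int seed_a, unsigned int seed_b) {
    struct rng_state a = { .seed = seed_a };
    struct rng_state b = { .seed = seed_b };
    pthread_t ta, tb;
    int ok_a, ok_b;

    ok_a = pthread_create(&ta, NULL, draw, &a) == 0;
    ok_b = pthread_create(&tb, NULL, draw, &b) == 0;
    if (ok_a) pthread_join(ta, NULL);
    if (ok_b) pthread_join(tb, NULL);
    if (!ok_a || !ok_b)
        return -1;
    for (int i = 0; i < DRAWS; i++)
        if (a.values[i] != b.values[i])
            return 0;
    return 1;
}
```

Two threads started with the same seed obtain the same sequence no matter how they are interleaved, which rand with its hidden global state cannot promise.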

Two Examples

We write two simple programs with threads using the functions pthread_create and nanosleep. Here are the specifications of pthread_create and nanosleep.

    #include <pthread.h>
    int pthread_create(
	pthread_t *thread,	// The thread that is created
	const pthread_attr_t *attr,//attributes for thread; usually
                                  // we use pthread_attr_default (or NULL)
	void * (*start_routine)(void *), //function executed by thread.
	  // The value returned by the start_routine takes the role of
	  // the status parameter in pthread_exit (see below) and can be 
	  // collected with pthread_join (see below).
	void * arg); // address of argument passed to start_routine
          // Returns 0 iff successful

    The created thread is ready to run as soon as it is created, and it
    inherits the scheduling discipline and signal mask of its creator.
    For the definition of the various pthread types, look in the file
    /usr/include/bits/pthreadtypes.h.

    For pthread_attr_t it is best to go with the default: either pass
    NULL or initialize an attribute object with pthread_attr_init. You can
    also use functions to set particular aspects of the attribute.

    #include <time.h>
    int nanosleep(
	const struct timespec *req, struct timespec *rem); 
    Delays the calling thread for the time req, or until interrupted by a
    signal, in which case the remaining time is stored in rem
    (if rem is not NULL).
        struct timespec {
	    time_t tv_sec;   /* seconds */
	    long   tv_nsec;  /* nanoseconds */
        };

Our first example is a threaded version of the Hello World example. The principal thread (i.e. the only thread that exists when we start a process) creates a new thread then waits some time before terminating. The created thread prints in a loop "Hello World!".

The program will print the string "Hello World!" 16 times and then terminate. Notice that when the main thread of the program terminates, so do all other threads.

In the second example we run three concurrent threads [this number can be easily changed]. Each thread executes code that writes to different locations so as not to have race conditions. The exception is the variable shrd which is write-shared to demonstrate that concurrent threads actually share the same address space.

If you run this program you will notice:

While a thread sleeps, other threads run, and the advancement of time affects everybody.
Clearly shrd is read and written by all the threads.
Threads execute functions that are threadsafe [see above] since each is given its own set of variables with the states parameter.
The way that threads are scheduled limits the way they interleave.
When the main thread terminates, all threads terminate. You can easily see this by running the program and observing that not all final iterations run to completion.
Clearly the shared variable shrd should be protected by a lock.

A thread can terminate its own execution with the command:

   #include <pthread.h>
   void pthread_exit(void *status);
   
   It exits the current thread and returns status to the thread waiting
   in pthread_join, if any.
   This command does not by default return the thread's resources to the
   system. If the main function of a threaded program uses "return"
   to terminate, all running threads are ended. If instead the main
   function uses "pthread_exit", then the program continues executing
   until all its threads have terminated.
   A thread can also request the termination of another thread with
   the call pthread_cancel, but the termination is not
   immediate, taking place only when the thread being cancelled reaches
   a cancellation point.

We can wait for termination of a specific thread with the function pthread_join:

   #include <pthread.h>
   int pthread_join(pthread_t  who, void **status);

   It suspends the calling thread until the thread who has terminated.
   Status will receive the value returned by the terminating thread
   when it exited with the pthread_exit command. pthread_join returns
   0 iff successful.
   When a terminated thread is joined, its resources, including memory, 
   are reclaimed by the system.
   We cannot wait in a join for a thread that was detached.
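
The following sketch shows one way pthread_exit and pthread_join can cooperate to pass a result from a thread to whoever waits for it. The functions square and squared_in_thread are invented for the example; the result is heap-allocated so that it outlives the exiting thread.

```c
#include <pthread.h>
#include <stdlib.h>

/* Thread body: computes the square of its argument and hands the
   result back through pthread_exit. The result is allocated on the
   heap because the thread's own stack disappears when it exits. */
void *square(void *arg) {
    int n = *(int *)arg;
    int *result = malloc(sizeof *result);
    if (result != NULL)
        *result = n * n;
    pthread_exit(result);   /* same effect here as "return result;" */
}

/* Start a thread on square(), wait for it with pthread_join, and
   return its answer (or -1 on any failure). Passing &n is safe here
   only because we join before this function returns. */
int squared_in_thread(int n) {
    pthread_t t;
    void *status = NULL;
    int answer = -1;

    if (pthread_create(&t, NULL, square, &n) != 0)
        return -1;
    if (pthread_join(t, &status) == 0 && status != NULL) {
        answer = *(int *)status;   /* value given to pthread_exit */
        free(status);
    }
    return answer;
}
```

The joiner receives exactly the pointer that the thread passed to pthread_exit, and the join also lets the system reclaim the thread's resources.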

We can modify the previous program so that the main thread waits for the termination of all the created threads by replacing in main the lines involving nanosleep with the lines

      /* Wait for all other threads to terminate */
      for (i=0; i < THREADSCOUNT; i++) {
          pthread_join(states[i].t, NULL);
          printf("Thread %d has terminated\n", i);}
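
For context, here is a rough sketch of how a program around that loop might be organized. It is not the program from these notes: only THREADSCOUNT, states, the field t and shrd are names taken from the text, while struct state, ITERATIONS, body, run_all and the lock on shrd are assumptions made for the illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* makes nanosleep visible in strict ISO mode */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define THREADSCOUNT 3
#define ITERATIONS   4

/* Hypothetical per-thread record; only the field t is named in the text. */
struct state {
    pthread_t t;     /* thread handle, used by pthread_join */
    int count;       /* private counter, written only by this thread */
};

static struct state states[THREADSCOUNT];
static int shrd;                                   /* write-shared variable */
static pthread_mutex_t shrd_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread body: update private state freely, shared state under the
   lock, and sleep 10 ms between iterations so that threads interleave. */
static void *body(void *arg) {
    struct state *st = arg;
    struct timespec pause = { 0, 10 * 1000 * 1000 };
    for (int i = 0; i < ITERATIONS; i++) {
        st->count++;
        pthread_mutex_lock(&shrd_lock);
        shrd++;
        pthread_mutex_unlock(&shrd_lock);
        nanosleep(&pause, NULL);
    }
    return NULL;
}

/* Create the threads, then wait for all of them.
   Returns the final value of shrd, or -1 if a thread was not created. */
int run_all(void) {
    int i, created;
    shrd = 0;
    for (created = 0; created < THREADSCOUNT; created++) {
        states[created].count = 0;
        if (pthread_create(&states[created].t, NULL, body,
                           &states[created]) != 0)
            break;
    }
    /* Wait for all other threads to terminate */
    for (i = 0; i < created; i++) {
        pthread_join(states[i].t, NULL);
        printf("Thread %d has terminated\n", i);
    }
    return created == THREADSCOUNT ? shrd : -1;
}
```

Because every thread is joined, all iterations run to completion and shrd ends at THREADSCOUNT * ITERATIONS.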

You can mark for deletion and reclaim the storage and other resources associated with a thread (of course, after it has terminated executing) with the command:

   #include <pthread.h>
   int pthread_detach(pthread_t thread);

This command will not terminate a thread that is executing; it only indicates that we want its storage reclaimed automatically when it terminates.
Other ways of reclaiming the resources of a thread are:

If the thread was created with attribute set to PTHREAD_CREATE_DETACHED, or
If this thread is waited for with a pthread_join call.
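
The attribute route can be sketched as follows, assuming a system such as Linux where unnamed POSIX semaphores are available. Since a detached thread cannot be joined, the example uses a semaphore so the creator can tell that the thread has done its work. The names done, background and run_detached are hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t done;   /* posted by the detached thread when its work is finished */

/* Body of the detached thread. Nobody joins it, so its resources
   are reclaimed automatically when it terminates. */
static void *background(void *arg) {
    (void)arg;
    sem_post(&done);
    return NULL;
}

/* Start background() as a detached thread and wait until it signals.
   Returns 0 on success or an error code. */
int run_detached(void) {
    pthread_attr_t attr;
    pthread_t t;
    int rc;

    if (sem_init(&done, 0, 0) != 0)
        return errno;
    rc = pthread_attr_init(&attr);
    if (rc == 0) {
        rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
        if (rc == 0)
            rc = pthread_create(&t, &attr, background, NULL);
        pthread_attr_destroy(&attr);
    }
    if (rc == 0) {
        /* Retry if the wait is interrupted by a signal. */
        while (sem_wait(&done) == -1 && errno == EINTR)
            continue;
    }
    sem_destroy(&done);
    return rc;
}
```

Calling pthread_detach(t) right after a default pthread_create would have the same effect on resource reclamation as the attribute used here.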

We can send a signal from a thread to another thread:

   #include <pthread.h>
   #include <signal.h>
   int pthread_kill(pthread_t thread, int signal);

The specified 'signal' is sent to the specified 'thread'. The pthread_sigmask function can be used to set up the signal mask for the current thread and the sigwait function can be used to make the current thread wait for a signal in a specified set.
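
A minimal sketch of these three calls working together is given below. The creating thread blocks SIGUSR1 before creating a second thread, which therefore inherits the blocked mask; the signal sent with pthread_kill then stays pending for that thread until sigwait collects it. The names waiter and signal_roundtrip, and the choice of SIGUSR1, are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L  /* makes the signal-set functions visible in strict ISO mode */
#include <pthread.h>
#include <signal.h>
#include <stdint.h>

/* Thread body: wait synchronously for SIGUSR1 and return the number
   of the signal received (or -1 if sigwait fails). */
static void *waiter(void *arg) {
    sigset_t set;
    int sig = 0;
    (void)arg;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (sigwait(&set, &sig) != 0)
        sig = -1;
    return (void *)(intptr_t)sig;
}

/* Block SIGUSR1 in the calling thread, start a waiter (it inherits the
   mask), send it the signal, and return what the waiter received.
   Returns -1 on error. SIGUSR1 stays blocked in the caller afterwards. */
int signal_roundtrip(void) {
    sigset_t set;
    pthread_t t;
    void *status = NULL;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (pthread_sigmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;
    if (pthread_create(&t, NULL, waiter, NULL) != 0)
        return -1;
    if (pthread_kill(t, SIGUSR1) != 0)
        return -1;
    if (pthread_join(t, &status) != 0)
        return -1;
    return (int)(intptr_t)status;
}
```

Blocking the signal before the thread is created matters: it guarantees the signal is never delivered asynchronously, whether it arrives before or after the waiter reaches sigwait.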

Locks for Pthreads

We can use mutual exclusion semaphores, or locks, or mutexes with pthreads. These locks should be global to the threads.

    #include <pthread.h>
    int pthread_mutex_init(
	pthread_mutex_t *mutex,    /* The mutex being created */
	const pthread_mutexattr_t *attr); /* usually the default, NULL, 
                                      i.e. pthread_mutexattr_default */
    int pthread_mutex_lock(pthread_mutex_t *mutex);
    int pthread_mutex_unlock(pthread_mutex_t *mutex);
    int pthread_mutex_destroy(pthread_mutex_t *mutex);
	/* When done with a mutex we can free its resources this way */

There are three kinds of mutexes depending on the value of the pthread_mutexattr_t attribute. We could have MUTEX_FAST_NP (the default), to be used in the standard lock..unlock protocol; MUTEX_RECURSIVE_NP: which allows one thread to do things like "lock .. lock .. unlock .. unlock"; MUTEX_NONRECURSIVE_NP is like the fast lock, but with better debugging facilities. One normally uses for the attribute the default value NULL, i.e. pthread_mutexattr_default.
Mutexes are intended to be used in the pattern "lock .. unlock" where the locking and unlocking operations are made by the same thread. This pattern is enforced in all mutexes except the fast (or normal) mutexes, where no check is made that unlock is done by the same thread that locked the mutex. Thus on these systems fast mutexes can be used as if they were "blocking semaphores" to enforce precedence constraints between activities. For example, if I want thread 1 to do A before thread 2 can start B, we create, initialize and lock a global fast mutex m. Then thread 2 before executing B will try to lock m, and thread 1 after finishing A will unlock m. [This is much easier than what would be needed if the ownership constraint is enforced.] Note that the POSIX standard leaves unlocking a normal mutex from a thread that does not own it undefined, so portable programs should use a semaphore for this purpose.
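
The mutex kinds above carry the names used by our thread package. In current POSIX the corresponding types are PTHREAD_MUTEX_NORMAL, PTHREAD_MUTEX_RECURSIVE and PTHREAD_MUTEX_ERRORCHECK, selected with pthread_mutexattr_settype. The sketch below, which assumes a system providing those POSIX names, shows how the kind changes what happens on "lock .. lock" by one thread. The helpers mutex_init_typed and relock_result are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L  /* makes the mutex type constants visible in strict ISO mode */
#include <errno.h>
#include <pthread.h>

/* Initialize *m with the given type, e.g. PTHREAD_MUTEX_RECURSIVE.
   Returns 0 on success or an error code. */
int mutex_init_typed(pthread_mutex_t *m, int type) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, type);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Lock a fresh mutex of the given type twice from the calling thread
   and return the result of the second lock attempt (-1 if the mutex
   could not be initialized). Do not call this with
   PTHREAD_MUTEX_NORMAL: the second lock would block forever. */
int relock_result(int type) {
    pthread_mutex_t m;
    int second;

    if (mutex_init_typed(&m, type) != 0)
        return -1;
    pthread_mutex_lock(&m);
    second = pthread_mutex_lock(&m);   /* 0 if recursive, EDEADLK if error-checking */
    if (second == 0)
        pthread_mutex_unlock(&m);      /* undo the nested lock */
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return second;
}
```

A recursive mutex counts nested locks and needs a matching number of unlocks, while an error-checking mutex refuses the second lock with EDEADLK instead of deadlocking.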

Here is a program with threads that use locks to share a resource.
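
As a stand-in illustration (not the program these notes refer to), the sketch below has several threads draw from a shared pool of tickets. Both the test "are any left?" and the decrement happen while holding the same global mutex, so no ticket is handed out twice. SELLERS, TICKETS, tickets_left, seller and sell_all are names assumed for the example.

```c
#include <pthread.h>

#define SELLERS 4
#define TICKETS 1000

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; /* global to all threads */
static int tickets_left;                                 /* the shared resource */

/* Each seller takes tickets one at a time until none remain. The
   check and the decrement are done under the same lock. The seller's
   private tally is kept in the int that arg points to. */
static void *seller(void *arg) {
    int *sold = arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        if (tickets_left == 0) {
            pthread_mutex_unlock(&lock);
            break;
        }
        tickets_left--;
        pthread_mutex_unlock(&lock);
        (*sold)++;             /* private to this thread, no lock needed */
    }
    return NULL;
}

/* Run all sellers and return the total number of tickets they sold,
   or -1 if some thread could not be created. */
int sell_all(void) {
    pthread_t t[SELLERS];
    int sold[SELLERS] = { 0 };
    int created, i, total = 0;

    tickets_left = TICKETS;
    for (created = 0; created < SELLERS; created++)
        if (pthread_create(&t[created], NULL, seller, &sold[created]) != 0)
            break;
    for (i = 0; i < created; i++) {
        pthread_join(t[i], NULL);
        total += sold[i];
    }
    return created == SELLERS ? total : -1;
}
```

The per-seller tallies always add up to TICKETS; if the test and the decrement were not protected together, two sellers could both see the last ticket and sell it.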

Posix Semaphores

For locking we can also use Posix Semaphores, which have the basic properties we studied in concurrent programming. At present on Linux they can only be used to synchronize threads within a single process, but potentially they can be used across processes. They are more general than pthread mutexes, but they cannot be used in conjunction with condition variables (see next lecture note on threads). Here are the basic operations on semaphores:

       #include <semaphore.h>

       int sem_init(sem_t *sem, int pshared, unsigned int value);

       It initializes sem to the count value (1 for a mutual exclusion semaphore).
       pshared is 0 as long as Linux does not support use of semaphores across
       processes (1 for sharing). It returns 0 in case of success, -1 otherwise.

       int sem_wait(sem_t * sem);

       The P operation. The caller blocks if the count is not positive.
       In all cases the count is decremented. It returns 0 on success, -1 on
       error (for example if the wait is interrupted by a signal handler).
       
       int sem_post(sem_t * sem);

       The V operation. The count is incremented. If threads were waiting,
       one is woken up. It returns 0 on success, or -1 if the count would
       become too large (greater than SEM_VALUE_MAX).

Here is the program referred to above using a Posix semaphore instead of a lock.
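
As a stand-in illustration (again, not the program from these notes), the sketch below uses a semaphore initialized to 1 as a lock around a shared counter, assuming a system such as Linux where unnamed semaphores are supported. WORKERS, ROUNDS, mutex_sem, counter, worker and count_with_semaphore are names assumed for the example.

```c
#include <pthread.h>
#include <semaphore.h>

#define WORKERS 4
#define ROUNDS  1000

static sem_t mutex_sem;   /* semaphore with initial count 1, used as a lock */
static int counter;       /* shared by all workers */

/* Each worker increments the shared counter ROUNDS times, bracketing
   every update with P (sem_wait) and V (sem_post). */
static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        sem_wait(&mutex_sem);   /* P: enter the critical section */
        counter++;
        sem_post(&mutex_sem);   /* V: leave the critical section */
    }
    return NULL;
}

/* Run all workers and return the final counter value,
   or -1 if the semaphore or a thread could not be created. */
int count_with_semaphore(void) {
    pthread_t t[WORKERS];
    int created, i;

    counter = 0;
    if (sem_init(&mutex_sem, 0, 1) != 0)   /* pshared = 0, count = 1 */
        return -1;
    for (created = 0; created < WORKERS; created++)
        if (pthread_create(&t[created], NULL, worker, NULL) != 0)
            break;
    for (i = 0; i < created; i++)
        pthread_join(t[i], NULL);
    sem_destroy(&mutex_sem);
    return created == WORKERS ? counter : -1;
}
```

Unlike a mutex, the semaphore has no owner, so the same object initialized to 0 could also be used to make one thread wait until another has finished some step.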

ingargio@joda.cis.temple.edu