CIS 4307: Unix III

Select

At times a process needs to wait for messages from more than one source, say two sources. If the process blocks on one source, it cannot know whether a message has arrived from the other. If it checks each source in turn without blocking, it runs a busy loop and wastes the processor. Unix gives us a mechanism, select, for blocking on more than one source and waking up when any of them has information. In fact select lets us wait at the same time for descriptors to become readable, for descriptors to become writable, and for exceptional conditions, optionally with a timeout.

Select is a complex service:

    #include <sys/select.h> /* POSIX; older systems need only the two below */
    #include <sys/types.h>
    #include <sys/time.h>
    int select(
          int nfds,          /* Highest-numbered descriptor in any set, plus 1 */
          fd_set *readfds,   /* Set of file descriptors to check for reading */
          fd_set *writefds,  /* Set of file descriptors to check for writing */
          fd_set *exceptfds, /* Set of file descriptors to check for exception conditions. Always 0 for us. */
          struct timeval *timeout); /* Structure specifying timeout */
        If timeout is null, then we block until one of the fd objects
        becomes ready. If timeout points to a structure with timeout
        equal to 0, then we do not block at all. Otherwise we will block
        up to the time specified by timeout.
    FD_ZERO(&fdset)   /* fdset becomes the empty set */
    FD_SET(fd,&fdset) /* fd is added to fdset */
    FD_CLR(fd,&fdset) /* fd is removed from fdset */
    FD_ISSET(fd,&fdset) /* it determines if fd is in fdset */
    The structure timeval, defined in sys/time.h, is
	struct timeval { time_t      tv_sec;     /* seconds */
			 suseconds_t tv_usec; }; /* microseconds */
    Upon return from select, readfds, writefds, and exceptfds have set only
    the bits that correspond to ready files, so the sets must be rebuilt
    before each call. The select function returns the total number of bits
    that are set in readfds, writefds, and exceptfds (0 if the timeout
    expired, -1 on error).
    [fd_set is typically just an array of integers, where each integer is
     interpreted as a bit vector. More information about fd_set and the
     operations FD_ZERO, FD_SET, etc. is in sys/select.h ]
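
To see how these pieces fit together, here is a minimal sketch that waits for a single descriptor to become readable. The helper name wait_readable and its return convention are assumptions made for illustration; they are not part of the select interface.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable or the timeout expires.
   Returns 1 if fd is ready, 0 on timeout, -1 on error. */
int wait_readable(int fd, long sec, long usec)
{
    fd_set readset;
    struct timeval tv;
    int n;

    /* select is only defined for descriptors below FD_SETSIZE */
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&readset);        /* start from the empty set */
    FD_SET(fd, &readset);     /* monitor just this descriptor */
    tv.tv_sec = sec;
    tv.tv_usec = usec;

    /* nfds is the highest descriptor in any set, plus 1 */
    n = select(fd + 1, &readset, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &readset)) ? 1 : 0;
}
```

Calling wait_readable(fd, 0, 0) polls without blocking, while a nonzero timeout blocks for at most that long.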

Here are four examples of the use of select.

The first is a simple program from Stevens for determining the maximum size for a pipe. [When run on my alpha the program prints that the size is 65536.]

    #include  <stdio.h>
    #include  <stdlib.h>
    #include  <unistd.h>
    #include  <sys/types.h>
    #include  <sys/time.h>

    int main(void)
    {
	int		i, n, fd[2];
	fd_set		writeset;
	struct timeval	tv;

	if (pipe(fd) < 0){
		perror("pipe");
		exit(1);}
	FD_ZERO(&writeset);  /* set to zero the writeset */

	for (n = 0; ; n++) { /* write 1 byte at a time until pipe is full */
		FD_SET(fd[1], &writeset); /*add write-end of pipe to writeset*/
		tv.tv_sec = tv.tv_usec = 0; /* don't wait at all */
		/* select returns the number of objects (in readset, */
		/* writeset, or exceptionset) that are ready. */
		/* In our case there will be at most one ready object.*/
		if ( (i = select(fd[1]+1, NULL, &writeset, NULL, &tv)) < 0) {
			perror("select");
			exit(1);}
		else if (i == 0) /*We cannot write to pipe, i.e. it is full*/
			break;
		if (write(fd[1], "a", 1) != 1){
			perror("write error");
			exit(1);}
	}
	printf("pipe capacity = %d\n", n);
	exit(0);
    }

The second is a program also from Stevens for waiting on a timer.

    #include	<stdio.h>
    #include	<stdlib.h>
    #include	<sys/types.h>
    #include	<sys/time.h>
    #include	<sys/select.h>

    int main(int argc, char *argv[])
    {
	struct timeval	timeout;

	if (argc != 3) {
		fprintf(stderr, "usage: timer <#seconds> <#microseconds>\n");
		exit(1);}
	timeout.tv_sec  = atol(argv[1]);
	timeout.tv_usec = atol(argv[2]);

	/* With no descriptors to watch, select simply blocks */
	/* until the timeout expires. */
	if (select(0, NULL, NULL, NULL, &timeout) < 0){
		perror("select error");
		exit(1);}
	exit(0);
    }

The third is a simple program in which a process reads fixed-size messages from two pipes. Terminate the program from the terminal with CONTROL-C.
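
The core of such a program might look like the following sketch. It is not the course program itself: the function name read_two_sources and the message size MSGSIZE are assumptions, and unlike the program described above it stops by itself once both write ends have been closed. It also assumes that each message arrives whole, which holds for pipes when the writer sends each message in one write no larger than PIPE_BUF.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

#define MSGSIZE 8   /* assumed fixed message size, in bytes */

/* Read fixed-size messages from two descriptors until both reach
   end-of-file. Returns the number of messages read, or -1 on error. */
int read_two_sources(int fda, int fdb)
{
    int fds[2], open_count = 2, total = 0, i, maxfd;
    char buf[MSGSIZE];
    fd_set readset;
    ssize_t got;

    assert(fda >= 0 && fda < FD_SETSIZE && fdb >= 0 && fdb < FD_SETSIZE);
    fds[0] = fda;
    fds[1] = fdb;

    while (open_count > 0) {
        /* select overwrites the set, so rebuild it on every pass */
        FD_ZERO(&readset);
        maxfd = -1;
        for (i = 0; i < 2; i++) {
            if (fds[i] >= 0) {
                FD_SET(fds[i], &readset);
                if (fds[i] > maxfd)
                    maxfd = fds[i];
            }
        }
        /* null timeout: block until at least one source is ready */
        if (select(maxfd + 1, &readset, NULL, NULL, NULL) < 0)
            return -1;

        for (i = 0; i < 2; i++) {
            if (fds[i] < 0 || !FD_ISSET(fds[i], &readset))
                continue;
            got = read(fds[i], buf, MSGSIZE);
            if (got < 0)
                return -1;
            if (got == 0) {          /* end-of-file: stop watching it */
                fds[i] = -1;
                open_count--;
                continue;
            }
            printf("source %d: %.*s\n", i, (int) got, buf);
            total++;
        }
    }
    return total;
}
```

Note that a descriptor at end-of-file counts as ready for reading, which is why the sketch drops it from the set as soon as read returns 0.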

The fourth example combines the third example with the use of a timeout.
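
As a rough sketch of the idea (again not the course program), the function below waits on two descriptors but gives up after a number of seconds. The names next_ready_source, SRC_TIMEOUT, and SRC_ERROR are made up for this illustration.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

#define SRC_TIMEOUT (-1)   /* assumed return code: nothing arrived in time */
#define SRC_ERROR   (-2)   /* assumed return code: select failed */

/* Wait up to `seconds` for either descriptor to become readable.
   Returns 0 if fda is ready, 1 if only fdb is ready,
   SRC_TIMEOUT if the time ran out, SRC_ERROR on error. */
int next_ready_source(int fda, int fdb, long seconds)
{
    fd_set readset;
    struct timeval tv;
    int n, maxfd = (fda > fdb) ? fda : fdb;

    assert(fda >= 0 && fda < FD_SETSIZE && fdb >= 0 && fdb < FD_SETSIZE);

    FD_ZERO(&readset);
    FD_SET(fda, &readset);
    FD_SET(fdb, &readset);

    /* Set tv afresh before every call: some systems (Linux, for one)
       overwrite it with the time that was left. */
    tv.tv_sec = seconds;
    tv.tv_usec = 0;

    n = select(maxfd + 1, &readset, NULL, NULL, &tv);
    if (n < 0)
        return SRC_ERROR;
    if (n == 0)
        return SRC_TIMEOUT;
    return FD_ISSET(fda, &readset) ? 0 : 1;
}
```

A return value of 0 from select is how we tell a timeout apart from a ready descriptor, so the caller can do periodic work between messages.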

Note that a more realistic program will deal with messages of variable size and with more than two pipes. (Think of how you would deal with many file descriptors without having to increase the size of your program. Can you also think of ways to be fair to the message sources?)
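
One possible answer, offered only as a sketch: keep the descriptors in an array, so the code does not grow with the number of sources, and start each scan of the ready set just past the source served last, so a busy source cannot starve the others. The function name pick_ready_fair and the round-robin policy are choices made here for illustration; other fairness policies are possible.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Block until at least one of fds[0..nfds-1] is readable and return the
   index of a ready one, or -1 on error. The scan starts at *start, and
   *start is advanced past the chosen index (round-robin). */
int pick_ready_fair(const int fds[], int nfds, int *start)
{
    fd_set readset;
    int i, idx, maxfd = -1;

    assert(nfds > 0);
    FD_ZERO(&readset);
    for (i = 0; i < nfds; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &readset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    if (select(maxfd + 1, &readset, NULL, NULL, NULL) <= 0)
        return -1;

    /* Scan in circular order beginning at *start */
    for (i = 0; i < nfds; i++) {
        idx = (*start + i) % nfds;
        if (FD_ISSET(fds[idx], &readset)) {
            *start = (idx + 1) % nfds;
            return idx;
        }
    }
    return -1;
}
```

The caller keeps one integer, initially 0, and passes its address on every call; the same loop then serves two descriptors or two hundred, up to the FD_SETSIZE limit.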

There will be other occasions to see select in action. Also, much of the need to use select calls to wait for a number of possible concurrent events can be obviated by the use of concurrent threads, each waiting for a specific event.
A more substantial example of the use of select is in the notes dealing with sockets: http://knight.cis.temple.edu/~ingargio/cis307/readings/tst.c.

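Often when using select we set the sockets to non-blocking mode, so that a read or write that cannot proceed returns at once with an error (EAGAIN) instead of stalling the whole loop. We do this by using the system call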

   int fcntl(int fd, int cmd, ... /* arg */);
	where cmd can be
	      F_GETFL
        	Read the file status flags.
       	      F_SETFL
                Set the file status flags to the value specified by arg.
	and arg will include the O_NONBLOCK flag
as shown here
   if ((n = fcntl (clientfd, F_GETFL)) < 0
            || fcntl (clientfd, F_SETFL, n | O_NONBLOCK) <0) {
	// error condition
   }
where n is an integer.
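
The same two calls can be wrapped in a small function. This is a sketch; the name set_nonblocking is an assumption, not a library routine.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to the file status flags of fd,
   leaving the other flags unchanged. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);   /* a negative descriptor is a caller bug */

    flags = fcntl(fd, F_GETFL);              /* read current status flags */
    if (flags < 0)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    return 0;
}
```

After set_nonblocking(fd) succeeds, a read on an empty pipe or socket returns -1 with errno set to EAGAIN (or EWOULDBLOCK) instead of blocking.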

The select call helps us write programs that do not use many threads. Normally we tend to dedicate one thread to each file descriptor we expect to read from, and that thread blocks in a read on it. With select we can use a single thread to wait on all of these descriptors. You could look up an application of this idea in the SEDA web server. It is written in Java using the New I/O primitives. Although written in Java, it is reported to be faster than servers written in C, such as Apache and Flash: the advantage of doing away with threads outweighs the overhead of Java interpretation.

ingargio@joda.cis.temple.edu