CSC443 Jan15

Contents

  1. Multithreaded Programming
  2. What is a thread?
  3. Remote Login Server Example
  4. Thread Version
  5. Single Threaded Database Server
  6. Multithreaded Database Server
  7. Complications
  8. When are Thread Arguments deallocated?
  9. Termination
  10. Detached Threads
  11. pthread Signature (i.e., parameter types)
  12. Thread Attributes
  13. Stack Size
  14. Compiling
  15. Mutual Exclusion
  16. Multiple Locks
  17. Lock Hierarchies
  18. Conditional Locking
  19. Problems Requiring more than Mutexes
  20. Semaphores
  21. POSIX Semaphores
  22. Condition Variables
  23. Readers Writers Problem
  24. A 'Solution' with POSIX Threads
  25. Ensuring Writers get to write
  26. New(er) pthread functions

Multithreaded Programming[1] [top]

Threads are a natural means of dealing with concurrency. In Systems II, basic thread primitives were introduced and alternatives were presented using a simple "echo" server.

The first programming assignment revisits (reviews) multithreading in the context of a simple database. The complications beyond creating multiple threads - one for each client - lie in managing concurrent threads that can both query and modify the database.

What is a thread?[2] [top]

A thread is the abstraction of a processor. It is a thread of control. We are accustomed to writing single-threaded programs and to having multiple single-threaded programs running on our computers. Why does one want multiple threads running in the same program?

Programming with threads is a natural means for dealing with concurrency. As we will see, concurrency comes up in numerous situations. A common misconception is that it is a useful concept only on multiprocessors. Threads do allow us to exploit the features of a multiprocessor, but they are equally useful on uniprocessors; in many instances a multithreaded solution to a problem is simpler to write, simpler to understand, and simpler to debug than a single-threaded solution to the same problem.

Remote Login Server Example[3] [top]

A slightly simplified description of a remote login server:

  1. reads input from the remote user
  2. writes this input to a local application
  3. reads the response from the local application
  4. writes the response back to the remote user

Which of the 4 operations (2 reads, 2 writes) should be attempted first?

This problem can be handled sequentially using the select system call. (See the code on pages 40 - 42)

The select-based code is rather complex looking: it is rather good at obscuring what it does and is admittedly complicated.

Thread Version[4] [top]

The remote login problem can be recast as two threads (see the bottom of page 42); each thread takes only a few lines and is easy to understand (each thread is sequential):

  1. One thread just reads from the remote user and writes to the local application:

           void incoming(int r_in, int l_out) {
               int eof = 0;
               char buf[BSIZE];
               int size;

               while (!eof) {
                   size = read(r_in, buf, BSIZE);
                   if (size <= 0)
                       eof = 1;        /* EOF or read error from the remote user */
                   else if (write(l_out, buf, size) <= 0)
                       eof = 1;        /* write error toward the local application */
               }
           }
    	
  2. The other thread just reads the response from the local application and writes back to the remote user:

           void outgoing(int l_in, int r_out) {
               int eof = 0;
               char buf[BSIZE];
               int size;

               while (!eof) {
                   size = read(l_in, buf, BSIZE);
                   if (size <= 0)
                       eof = 1;        /* EOF or read error from the local application */
                   else if (write(r_out, buf, size) <= 0)
                       eof = 1;        /* write error toward the remote user */
               }
           }
    	

Single Threaded Database Server[5] [top]

The code provided for the first assignment consists initially of a single threaded database server handling a single client.

A single threaded approach to extending this to multiple clients might multiplex the clients using the select system call to give each "ready" client a bit of service in turn.

This can work, but it is messy and requires identifying how much partial service to give a client before checking the other ready clients. Alternatively, if each client request is satisfied completely before moving on, clients with short requests may be penalized by having to wait for clients with lengthy ones. So, to be fair, the more complicated approach is needed.

As in the remote login example, the code would also be complex and prone to errors.

Multithreaded Database Server[6] [top]

Your assignment is to extend the original version to handle multiple clients by converting the server to use multiple threads, one for each client.

The code should be essentially as simple as the original sequential version and as fair as the complicated 'select' version.

Some synchronization of access to the database is necessary, but this turns out not to add substantially to the code if done properly.

Thread Creation

To create a thread call pthread_create:

 pthread_create(&thread,   // thread id
		0,         // ptr to thread attributes (e.g. stack size?)
		server,    // start function
		argument); // ptr to arguments for start function
    

Return type of pthread_create is int (0 means success)

0 (or NULL) for the thread attributes gets the default attributes, which are usually acceptable. E.g., the default stack size for the thread is usually satisfactory.

server must be a function that returns void *, and its declared parameter must also be of type void *.
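
For concreteness, a minimal sketch of creating and waiting for one thread (the names server_thread and client_fd are illustrative, not part of the assignment code):

    #include <pthread.h>
    #include <stdio.h>

    /* start function: signature required by pthread_create */
    void *server_thread(void *arg) {
        int fd = *(int *)arg;               /* recover the real argument type */
        printf("serving descriptor %d\n", fd);
        return 0;
    }

    int main(void) {
        pthread_t thread;
        int client_fd = 4;                  /* made-up descriptor for illustration */

        if (pthread_create(&thread, 0, server_thread, &client_fd) != 0) {
            fprintf(stderr, "pthread_create failed\n");
            return 1;
        }
        pthread_join(thread, 0);            /* wait for the thread to terminate */
        return 0;
    }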

Complications[7] [top]

The remote login incoming function would need two descriptors, one to read from and one to write to.

To give this function to a thread as its start function, we have to modify it so that it takes only one parameter - a pointer to a struct holding the two expected descriptors.

Since the declared type of the parameter is void *, the incoming function must first cast the pointer back to a pointer to such a struct before it can access the two descriptors.
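
A sketch of that pattern, assuming a hypothetical struct two_fds to carry the two descriptors (the struct name and BSIZE value are illustrative):

    #include <pthread.h>
    #include <unistd.h>

    #define BSIZE 1024

    struct two_fds {
        int in;      /* descriptor to read from */
        int out;     /* descriptor to write to  */
    };

    /* start function: unpacks the two descriptors from its single void * argument */
    void *incoming(void *arg) {
        struct two_fds *fds = (struct two_fds *)arg;
        char buf[BSIZE];
        int size;

        while ((size = read(fds->in, buf, BSIZE)) > 0)
            if (write(fds->out, buf, size) <= 0)
                break;
        return 0;
    }

The caller fills in a struct two_fds and passes its address as the last argument to pthread_create.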

When are Thread Arguments deallocated?[8] [top]

If the caller of pthread_create passes the address of a struct allocated locally in the caller function and the caller terminates, the thread might be pointing to deallocated storage - a dangling pointer!

This can be handled either by ensuring the caller waits for the thread it created to terminate:

      pthread_join(thread, 0);
    

Or by allocating the arguments on the heap (instead of on the caller's stack); the thread then copies the arguments to local variables and frees the heap copy.
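
A sketch of the heap approach, reusing the hypothetical struct two_fds from the previous section (the helper name start_incoming is also illustrative):

    #include <pthread.h>
    #include <stdlib.h>

    struct two_fds { int in; int out; };    /* as in the earlier sketch */

    void *incoming(void *arg) {
        struct two_fds *p = (struct two_fds *)arg;
        int in = p->in, out = p->out;       /* copy the arguments to locals ... */
        free(p);                            /* ... then free the heap copy      */

        /* ... copy loop from in to out, as before ... */
        (void)in; (void)out;
        return 0;
    }

    /* the creator allocates the arguments on the heap, so they outlive its stack frame */
    void start_incoming(int r_in, int l_out) {
        pthread_t thread;
        struct two_fds *args = malloc(sizeof *args);

        args->in = r_in;
        args->out = l_out;
        pthread_create(&thread, 0, incoming, args);
        /* safe to return without joining: the thread no longer points at this frame */
    }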

Termination[9] [top]

Individual threads can terminate without terminating all the other threads in the process by calling:

      pthread_exit((void *) value);
    

or (except for main) by a return statement

      return((void *) value);
    

Note: If main() terminates by the return statement:

      return value;
    

This implicitly calls exit(value), not pthread_exit(value).

Calling exit(..) instead of pthread_exit will as usual terminate the process and consequently all its threads.
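
For example, a main that has started server threads and wants to go away without killing them can end with pthread_exit rather than return (a minimal sketch):

    #include <pthread.h>

    int main(void) {
        /* ... create server threads here ... */

        /* return 0; would call exit(0) and terminate every thread in the process */
        pthread_exit(0);    /* terminates only main; the other threads keep running */
    }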

Detached Threads[10] [top]

If many threads are to be created without waiting for existing threads to terminate (i.e., without calling pthread_join), we need some way of cleaning up after threads when they do terminate.

For multiple child processes this is usually done in a signal handler since a parent process receives a signal when a child terminates.

A different method is used for threads. Assuming we don't need to do anything when a created thread terminates, we can arrange that the thread implementation simply cleans up the thread's resources completely when the thread terminates.

To do this, it is only necessary to call:

    pthread_detach(thread); // thread is the thread's id (of type pthread_t)
    

This call can either be done in the caller just after creation, or in the thread's own start function.
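
A sketch of a thread detaching itself at the start of its own start function (pthread_self returns the calling thread's own id):

    #include <pthread.h>

    void *server(void *arg) {
        pthread_detach(pthread_self());   /* resources reclaimed automatically at termination */

        /* ... handle one client ... */
        return 0;
    }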

pthread Signature (i.e., parameter types)[11] [top]

      int pthread_create(pthread_t *thridptr, 
                         const pthread_attr_t *attrptr, 
                         void *(*start)(void *), 
                         void *arg);
    

Casting to the expected types only works if the arguments behave as those types.

For example, if a function has void return type but is cast as returning void *, problems may or may not occur depending on the calling protocol of the underlying machine.

Thread Attributes[12] [top]

A number of properties of a thread can be specified via the attributes argument when the thread is created. Some of these properties are specified as part of the POSIX specification, others are left up to the implementation. By burying them inside the attributes structure, we make it straightforward to add new types of properties to threads without having to complicate the parameter list of pthread_create. To set up an attributes structure, one must call pthread_attr_init. As seen in the next slide, one then specifies certain properties, or attributes, of threads. One can then use the attributes structure as an argument to the creation of any number of threads.

Note that the attributes structure only affects the thread when it is created. Modifying an attributes structure has no effect on already-created threads, but only on threads created subsequently with this structure as the attributes argument.

Storage may be allocated as a side effect of calling pthread_attr_init. To ensure that it is freed, call pthread_attr_destroy with the attributes structure as argument. Note that if the attributes structure goes out of scope, not all storage associated with it is necessarily released; to release this storage you must call pthread_attr_destroy.

Stack Size[13] [top]

Among the attributes that can be specified is a thread's stack size. The default attributes structure specifies a stack size that is probably good enough for most applications. How big is it? The default stack size is not mandated by POSIX. In Digital Unix 4.0, the default stack size is 21,120 bytes, while in Solaris it is one megabyte.

How large a stack is necessary? The answer, of course, is that it depends. If the stack size is too small, there is the danger that a thread will attempt to overwrite the end of its stack. There is no problem with specifying too large a stack, except that, on a 32-bit machine, one should be careful about using up too much address space (one thousand threads, each with a megabyte stack, use a fair portion of the address space).
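
A sketch of requesting a specific stack size through an attributes structure (the helper name and the one-megabyte figure are just examples):

    #include <pthread.h>

    int create_with_big_stack(pthread_t *thread, void *(*start)(void *), void *arg) {
        pthread_attr_t attr;
        int err;

        pthread_attr_init(&attr);
        pthread_attr_setstacksize(&attr, 1024 * 1024);  /* ask for a 1 MB stack */
        err = pthread_create(thread, &attr, start, arg);
        pthread_attr_destroy(&attr);    /* the structure only matters at creation time */
        return err;
    }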

Compiling[14] [top]

Compiling requires adding the -pthread option to gcc:

     gcc prog.c -o prog -pthread
    

Mutual Exclusion[15] [top]

In part 2 of the program assignment, you have to deal with operations that modify the database - add and delete - not just queries.

With multiple threads accessing the same database structure, some mutual exclusion is required in order not to corrupt the database structure.

POSIX threads provide several functions for achieving mutual exclusion, the simplest being:

      pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER; // shared 
 
      pthread_mutex_lock(&m);

      pthread_mutex_unlock(&m);
    

Code to be executed by only one thread at a time should be surrounded by the two calls.

The pthread_mutex_lock(&m) call blocks the caller if any other thread has obtained the lock and not yet released it.
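
For example, a hedged sketch of guarding a database update (db_add and its parameters are placeholders, not the assignment's actual interface):

    #include <pthread.h>

    pthread_mutex_t db_mutex = PTHREAD_MUTEX_INITIALIZER;   /* shared by all client threads */

    /* only one thread at a time may run the update between lock and unlock */
    void locked_add(const char *name, const char *value) {
        pthread_mutex_lock(&db_mutex);
        /* db_add(name, value);   whatever the real modifying operation is */
        pthread_mutex_unlock(&db_mutex);
    }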

Multiple Locks[16] [top]

If two threads each want to lock the same two mutexes, there is a danger of deadlock.

This can be prevented if the code is written so that all threads try to lock the mutexes in the same order.

Prevention this way is easy (if we remember to do it). Detecting deadlock after the fact, or allowing locks to be taken in any order and checking whether each lock request would cause deadlock, is hard.
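
A sketch of the convention: if every thread that needs both mutexes takes m1 before m2, the circular wait needed for deadlock cannot arise:

    #include <pthread.h>

    pthread_mutex_t m1 = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_t m2 = PTHREAD_MUTEX_INITIALIZER;

    /* every thread obtains the two mutexes in the same order: m1, then m2 */
    void use_both(void) {
        pthread_mutex_lock(&m1);
        pthread_mutex_lock(&m2);
        /* ... use both shared objects ... */
        pthread_mutex_unlock(&m2);
        pthread_mutex_unlock(&m1);
    }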

Lock Hierarchies[17] [top]

If many mutexes are to be used, they could be arranged in levels. Then if a thread is holding a lock at level k, it should not be allowed to request another lock at level k or smaller.

This will also prevent deadlock over the mutex resources.

Conditional Locking[18] [top]

As just noted, the simplest (and best) approach to preventing deadlock is to make certain that all threads that will hold multiple mutexes simultaneously obtain those mutexes in a prescribed order. Ordinarily this can be done, but occasionally it might turn out to be impossible.

For example, we might not know which mutex to take second until the first mutex has already been obtained.

To avoid deadlock in such situations, we can use the approach shown here:

   proc1( ) {
      pthread_mutex_lock(&m1);
      /* use object 1 */
      pthread_mutex_lock(&m2);
      /* use objects 1 and 2 */
      pthread_mutex_unlock(&m2);
      pthread_mutex_unlock(&m1);
   }


   proc2( ) {
      while (1) {
        pthread_mutex_lock(&m2);

        if (!pthread_mutex_trylock(&m1))
            break;
         pthread_mutex_unlock(&m2);
      }

      /* use objects 1 and 2 */

      pthread_mutex_unlock(&m1);
      pthread_mutex_unlock(&m2);
   }
    

Here thread 1, executing proc1, obtains the mutexes in the correct order. Thread 2, executing proc2, must for some reason take the mutexes out of order. If it is holding mutex 2, it must be careful about taking mutex 1. So, rather than call pthread_mutex_lock, it calls pthread_mutex_trylock, which always returns without blocking. If the mutex is available, pthread_mutex_trylock locks the mutex and returns 0. If the mutex is not available (i.e., it is locked by another thread), then pthread_mutex_trylock returns a nonzero error code (EBUSY). In the example, if mutex 1 is not available, it is probably because it is currently held by thread 1. If thread 2 were to block waiting for the mutex, we have an excellent chance for deadlock. So, rather than block, thread 2 not only quits trying for mutex 1 but also unlocks mutex 2 (since thread 1 could well be waiting for it). It then starts all over again, first taking mutex 2, then trying for mutex 1.

Problems Requiring more than Mutexes[19] [top]

Semaphores[20] [top]

Semaphores can be thought of as extended mutexes that can have more than the two states - locked and unlocked.

Semaphores conceptually represent an abstract data type with two members:

Producer Consumer with a Bounded Buffer

A producer thread inserts items, one at a time, into a buffer of capacity B.

A consumer thread removes items, one at a time, from the same buffer.

The producer must block if the buffer is full.

The consumer must block if the buffer is empty.

The insert and remove operations may interfere with the buffer structure if their executions are allowed to overlap.

Do we then also need mutual exclusion?

POSIX Semaphores[21] [top]

      sem_t s;

      sem_init(&s, 0, B); // initialize s to value B

      sem_wait(&s);

      ...

      sem_post(&s);
    

For sem_init, a second parameter of 0 means the semaphore is used only by threads in the same process; a nonzero value means it may be shared by threads in multiple processes.
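
A sketch of the bounded-buffer producer and consumer using two counting semaphores plus a mutex for the buffer structure itself (the buffer, its item type, and B are placeholders). This also answers the earlier question: the semaphores block the producer when the buffer is full and the consumer when it is empty, but mutual exclusion on insert and remove is still needed if several producers or consumers run concurrently.

    #include <pthread.h>
    #include <semaphore.h>

    #define B 8                         /* buffer capacity (example value) */

    int buffer[B];                      /* placeholder item type */
    int in = 0, out = 0;                /* circular-buffer indices */

    sem_t empty_slots;                  /* sem_init(&empty_slots, 0, B); during setup */
    sem_t full_slots;                   /* sem_init(&full_slots, 0, 0);  during setup */
    pthread_mutex_t buf_mutex = PTHREAD_MUTEX_INITIALIZER;

    void produce(int item) {
        sem_wait(&empty_slots);         /* block if the buffer is full */
        pthread_mutex_lock(&buf_mutex);
        buffer[in] = item;
        in = (in + 1) % B;
        pthread_mutex_unlock(&buf_mutex);
        sem_post(&full_slots);          /* one more item available */
    }

    int consume(void) {
        int item;

        sem_wait(&full_slots);          /* block if the buffer is empty */
        pthread_mutex_lock(&buf_mutex);
        item = buffer[out];
        out = (out + 1) % B;
        pthread_mutex_unlock(&buf_mutex);
        sem_post(&empty_slots);         /* one more free slot */
        return item;
    }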

Condition Variables[22] [top]

Condition variables are another means for synchronization in POSIX; they represent queues of threads waiting to be woken by other threads. Though they are rather complicated at first glance, they are even more complicated when you really get into them.

A thread puts itself to sleep and joins the queue of threads associated with a condition variable by calling pthread_cond_wait. When it places this call, it must have some mutex locked, and it passes the mutex as the second argument. As part of the call, the mutex is unlocked and the thread is put to sleep, all in a single atomic step: i.e., nothing can happen that might affect the thread between the moments when the mutex is unlocked and when the thread goes to sleep. Threads queued on a condition variable are released in first-in-first-out order. They are released in response to calls to pthread_cond_signal (which releases the first thread in line) and pthread_cond_broadcast (which releases all threads).

However, before a released thread may return from pthread_cond_wait, it first relocks the mutex. Thus only one thread at a time actually returns from pthread_cond_wait. If a call to either routine is made when no threads are queued on the condition variable, nothing happens -- the fact that a call had been made is not remembered.
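
The usual pattern is a guarded wait: test the condition in a while loop, waiting while it does not hold, and signal (or broadcast) after changing the state it depends on. A minimal sketch (the variable ready is just an illustration):

    #include <pthread.h>

    pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
    int ready = 0;                       /* the condition the waiter cares about */

    void waiter(void) {
        pthread_mutex_lock(&m);
        while (!ready)                   /* re-test: waking up does not guarantee the condition */
            pthread_cond_wait(&cv, &m);  /* atomically unlocks m and sleeps; relocks before returning */
        /* ... ready holds and m is locked ... */
        pthread_mutex_unlock(&m);
    }

    void notifier(void) {
        pthread_mutex_lock(&m);
        ready = 1;
        pthread_cond_signal(&cv);        /* wake the first queued waiter, if any */
        pthread_mutex_unlock(&m);
    }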

Readers Writers Problem[23] [top]

Another standard synchronization problem is the readers-writers problem. Here we have some sort of data structure to which any number of threads may have simultaneous access, as long as they are just reading. But if a thread is to write in the data structure, it must have exclusive access.

Part 2 of your program assignment is exactly this.

A 'Solution' with POSIX Threads[24] [top]

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t readerQ = PTHREAD_COND_INITIALIZER;
pthread_cond_t writerQ = PTHREAD_COND_INITIALIZER;
int readers = 0;
int writers = 0;

reader( ) {
  pthread_mutex_lock(&m);
  while (!(writers == 0)) {
    pthread_cond_wait(&readerQ, &m);
  }
  readers++;
  pthread_mutex_unlock(&m);
  /* read */
  pthread_mutex_lock(&m);
  if (--readers == 0) {
    pthread_cond_signal(&writerQ);
  }
  pthread_mutex_unlock(&m);
}

writer( ) {
  pthread_mutex_lock(&m);
  while(!((readers == 0) && (writers == 0))) {
    pthread_cond_wait(&writerQ, &m);
  }
  writers++;
  pthread_mutex_unlock(&m);
  /* write */
  pthread_mutex_lock(&m);
  writers--;
  pthread_cond_signal(&writerQ);
  pthread_cond_broadcast(&readerQ);
  pthread_mutex_unlock(&m);
}
    

Writers can 'starve' with this code. That is, a waiting writer may be indefinitely prevented from writing. How might this happen?

Ensuring Writers get to write[25] [top]

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t readerQ = PTHREAD_COND_INITIALIZER;
pthread_cond_t writerQ = PTHREAD_COND_INITIALIZER;
int readers = 0;
int writers = 0;
int active_writers = 0;

reader( ) {
  pthread_mutex_lock(&m);
  while (!(writers == 0)) {
    pthread_cond_wait(&readerQ, &m);
  }
  readers++;
  pthread_mutex_unlock(&m);
  /* read */
  pthread_mutex_lock(&m);
  if (--readers == 0) {
    pthread_cond_signal(&writerQ);
  }
  pthread_mutex_unlock(&m);
}

writer( ) {
  pthread_mutex_lock(&m);
  writers++; // number that want to write
  while(!((readers == 0) && (active_writers == 0))) {
    pthread_cond_wait(&writerQ, &m);
  }
  active_writers++; // number allowed to write
  pthread_mutex_unlock(&m);
  /* write */
  pthread_mutex_lock(&m);
  active_writers--;
  if (--writers == 0) {
    pthread_cond_broadcast(&readerQ);
  } else {
    pthread_cond_signal(&writerQ);
  }
  pthread_mutex_unlock(&m);
}
    

New(er) pthread functions[26] [top]

 pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;  // static initialization

 // or

 pthread_rwlock_t lock;
 pthread_rwlock_init(&lock, 0);   // dynamic initialization, default attributes

 // Then

 pthread_rwlock_rdlock(&lock);    // acquire for reading (shared with other readers)

 pthread_rwlock_wrlock(&lock);    // acquire for writing (exclusive)

 pthread_rwlock_unlock(&lock);    // release either kind of lock
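
With these, the readers-writers code above collapses to something like the following sketch (read_db and write_db stand in for the assignment's actual query and update operations):

    #include <pthread.h>

    pthread_rwlock_t db_lock = PTHREAD_RWLOCK_INITIALIZER;

    void reader(void) {
        pthread_rwlock_rdlock(&db_lock);   /* any number of readers may hold this together */
        /* read_db(...); */
        pthread_rwlock_unlock(&db_lock);
    }

    void writer(void) {
        pthread_rwlock_wrlock(&db_lock);   /* exclusive: excludes readers and other writers */
        /* write_db(...); */
        pthread_rwlock_unlock(&db_lock);
    }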