Threads are a natural means of dealing with concurrency. In
Systems II, basic thread primitives were introduced and
alternatives were presented using a simple "echo" server:
- sequential - no concurrency
- concurrency by forking separate processes
- concurrency using threads
- multiplexing clients with the select system call
- concurrency using a pool of threads, etc.
The first program assignment revisits (reviews) multithreading
in the context of a simple database. The complication, beyond
creating multiple threads (one for each client), is managing
concurrent threads that can both query and modify the database.
A thread is the abstraction of a processor. It is a thread of
control. We are accustomed to writing single-threaded programs and
to having multiple single-threaded programs running on our
computers. Why does one want multiple threads running in the same
program?
Programming with threads is a natural means for dealing with
concurrency. As we will see, concurrency comes up in numerous
situations. A common misconception is that it is a useful
concept only on multiprocessors. Threads do allow us to exploit
the features of a multiprocessor, but they are equally useful on
uniprocessors; in many instances a multithreaded solution to a
problem is simpler to write, simpler to understand, and simpler
to debug than a single-threaded solution to the same problem.
A slightly simplified description of a remote login server:
- reads input from the remote user
- writes this input to a local application
- reads the response from the local application
- writes the response back to the remote user
Which of the 4 operations (2 reads, 2 writes) should be
attempted first?
This problem can be handled sequentially using the
select system call. (See the code on pages 40 - 42)
It is rather complex looking and involves these steps:
- Make each operation be non-blocking
- Create two file descriptor sets - one set for input (reads)
and one for writes, initially empty.
- Keep track of which operations are wanted next (e.g., after
a write to an application, next we want a read of the
application's response)
- Add the descriptors for the wanted operations to the
sets.
- call select
- Now check which wanted operations are actually
ready
- It is safe to execute those ready operations (if any)
without blocking. (The select call itself will block if NO
wanted operations are ready.)
The code is rather good at obscuring what it does and is
admittedly complicated.
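To make the core of those steps concrete, here is a minimal sketch of the "fill the sets, call select, check what is ready" part, covering reads only. The helper names wait_readable and select_demo are hypothetical, and two pipes stand in for the remote user and the local application; the real server would also need a write set and non-blocking descriptors.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: block until fd_a or fd_b is readable.
   Returns 0 if fd_a is ready, 1 if only fd_b is ready, -1 on error. */
int wait_readable(int fd_a, int fd_b) {
    fd_set rfds;
    FD_ZERO(&rfds);                 /* start with an empty read set */
    FD_SET(fd_a, &rfds);            /* add the wanted operations */
    FD_SET(fd_b, &rfds);
    int maxfd = fd_a > fd_b ? fd_a : fd_b;
    /* select blocks while NO wanted descriptor is ready. */
    if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
        return -1;
    if (FD_ISSET(fd_a, &rfds))      /* check which wanted ops are ready */
        return 0;
    return FD_ISSET(fd_b, &rfds) ? 1 : -1;
}

/* Demo: only the second pipe has data, so only it is reported ready. */
int select_demo(void) {
    int a[2], b[2];
    int rc = pipe(a);
    assert(rc == 0);
    rc = pipe(b);
    assert(rc == 0);
    ssize_t n = write(b[1], "x", 1);
    assert(n == 1);
    int ready = wait_readable(a[0], b[0]);
    close(a[0]); close(a[1]); close(b[0]); close(b[1]);
    return ready;
}
```

Even this reduced version needs bookkeeping that the threaded version avoids entirely.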
The remote login problem can be recast as two threads (see the
bottom of page 42). Each thread takes only a few lines and is
easy to understand (each thread is sequential):
One thread just reads from the remote user and writes to the
local application:
void incoming(int r_in, int l_out) {
    int eof = 0;
    char buf[BSIZE];
    int size;
    while (!eof) {
        size = read(r_in, buf, BSIZE);
        if (size <= 0)
            eof = 1;    /* end of input or read error: nothing to write */
        else if (write(l_out, buf, size) <= 0)
            eof = 1;
    }
}
The other thread just reads the response from the local
application and writes back to the remote user:
void outgoing(int l_in, int r_out) {
    int eof = 0;
    char buf[BSIZE];
    int size;
    while (!eof) {
        size = read(l_in, buf, BSIZE);
        if (size <= 0)
            eof = 1;    /* end of input or read error: nothing to write */
        else if (write(r_out, buf, size) <= 0)
            eof = 1;
    }
}
The code provided for the first assignment consists initially
of a single-threaded database server handling a single client.
A single-threaded approach to extending this to multiple
clients might multiplex the clients using the select
system call to give each "ready" client a bit of service in
turn.
This can work, but it is messy: it requires deciding how much
partial service to give a client before checking the other
ready clients. Alternatively, if each client request is satisfied
completely, clients with short requests may be penalized by
having to wait for clients with lengthy ones. So to be fair, the
more complicated approach is needed.
As in the remote login example, the code would also be complex
and prone to errors.
Your assignment is to extend the original version to handle
multiple clients by converting the server to use
multiple threads, one for each client.
The code should be essentially as simple as the original
sequential version and as fair as the complicated 'select'
version.
Some synchronization of access to the database is necessary,
but this will turn out not to add substantially to the code bloat
if done properly.
Thread Creation
To create a thread call pthread_create:
pthread_create(&thread, // thread id
0, // ptr to thread attributes (e.g. stack size?)
server, // start function
argument); // ptr to arguments for start function
Return type of pthread_create is int (0 means success)
0 (or NULL) for the thread attributes gets the default
attributes, which are usually acceptable. E.g., the default stack
size for the thread is usually satisfactory.
server must be a function that returns void * and its declared
argument must also be of type void *
The remote login incoming function would need two
descriptors, one to read from and one to write to.
To give this function to a thread as its start function, we
have to modify it so that it takes only one parameter: a
pointer to a struct with the two expected descriptors.
Since the declared type of the parameter is void *, the
incoming function would have to first cast the pointer back
to a pointer to such a struct and finally have access to the two
descriptors.
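A minimal sketch of that conversion follows. The struct relay_args, the function incoming_thread, and the demo relay_demo are hypothetical names chosen for illustration, BSIZE is given an arbitrary value, and pipes stand in for the remote user and the local application.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

#define BSIZE 1024               /* arbitrary value for this sketch */

/* Hypothetical struct bundling the two descriptors. */
struct relay_args {
    int in;     /* descriptor to read from */
    int out;    /* descriptor to write to */
};

/* Start function with the signature pthread_create expects. */
void *incoming_thread(void *arg) {
    struct relay_args *a = (struct relay_args *) arg;   /* cast back */
    char buf[BSIZE];
    ssize_t size;
    while ((size = read(a->in, buf, BSIZE)) > 0) {
        if (write(a->out, buf, (size_t) size) <= 0)
            break;
    }
    return NULL;
}

/* Demo: relay "hello" from one pipe to another through the thread. */
int relay_demo(void) {
    int remote[2], local[2];
    int rc = pipe(remote);
    assert(rc == 0);
    rc = pipe(local);
    assert(rc == 0);

    struct relay_args args = { remote[0], local[1] };
    pthread_t thread;
    rc = pthread_create(&thread, 0, incoming_thread, &args);
    assert(rc == 0);

    ssize_t n = write(remote[1], "hello", 5);
    assert(n == 5);
    close(remote[1]);            /* EOF ends the thread's loop */
    pthread_join(thread, 0);     /* args stays in scope until here */

    char buf[BSIZE];
    n = read(local[0], buf, BSIZE);
    close(remote[0]); close(local[0]); close(local[1]);
    return (int) n;              /* number of bytes relayed */
}
```

Because relay_demo joins the thread before returning, passing the address of the local struct is safe here.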
If the caller of pthread_create passes the address of a struct
allocated locally in the caller function and the caller
terminates, the thread might be pointing to deallocated storage -
a dangling pointer!
This can be handled either by ensuring the caller waits for the
thread it created to terminate:
pthread_join(thread, 0);
Or possibly by having the arguments allocated on the heap
(instead of the caller's stack) and then the thread copies the
arguments to local variables and deallocates the heap
versions!
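The heap-based alternative might look like the sketch below. The names worker, start_worker, heap_args_demo, and the results array are hypothetical; the demo joins the threads only so that it can report a result, not because the arguments require it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NTHREADS 4

static int results[NTHREADS];    /* one slot per thread, so no locking */

/* The thread copies its argument to a local, then frees the heap copy. */
void *worker(void *arg) {
    int id = *(int *) arg;
    free(arg);
    results[id] = id * 10;
    return NULL;
}

/* Starts one thread with a heap-allocated argument. The caller may
   return immediately; nothing on its stack is handed to the thread. */
int start_worker(pthread_t *thread, int id) {
    int *arg = malloc(sizeof *arg);
    if (arg == NULL)
        return -1;
    *arg = id;
    int rc = pthread_create(thread, 0, worker, arg);
    if (rc != 0)
        free(arg);               /* thread never started; we still own arg */
    return rc;
}

int heap_args_demo(void) {
    pthread_t threads[NTHREADS];
    for (int i = 0; i < NTHREADS; i++) {
        int rc = start_worker(&threads[i], i);
        assert(rc == 0);
    }
    int sum = 0;
    for (int i = 0; i < NTHREADS; i++) {
        pthread_join(threads[i], 0);
        sum += results[i];
    }
    return sum;                  /* 0 + 10 + 20 + 30 */
}
```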
Individual threads can terminate without terminating all the other
threads in the process by calling:
pthread_exit((void *) value);
or (except for main) by a return statement
return((void *) value);
Note: If main() terminates by the return statement:
return value;
This implicitly calls exit(value), not
pthread_exit(value).
Calling exit(..) instead of pthread_exit will as
usual terminate the process and consequently all its threads.
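As a small illustration of the two ways a thread can terminate, the sketch below passes integer values back through the void * result and collects them with pthread_join. The function names are hypothetical, and carrying an integer inside the pointer via intptr_t is a convenience assumed for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Terminates by returning from the start function. */
void *by_return(void *arg) {
    (void) arg;
    return (void *) (intptr_t) 40;
}

/* Terminates by calling pthread_exit. */
void *by_exit(void *arg) {
    (void) arg;
    pthread_exit((void *) (intptr_t) 2);
}

int exit_value_demo(void) {
    pthread_t t1, t2;
    void *v1, *v2;
    int rc = pthread_create(&t1, 0, by_return, 0);
    assert(rc == 0);
    rc = pthread_create(&t2, 0, by_exit, 0);
    assert(rc == 0);
    pthread_join(t1, &v1);       /* second argument receives the value */
    pthread_join(t2, &v2);
    return (int) ((intptr_t) v1 + (intptr_t) v2);
}
```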
If many threads are to be created without waiting for existing
threads to terminate (i.e., without calling pthread_join), we need
some way of cleaning up after threads when they do terminate.
For multiple child processes this is usually done in a signal
handler since a parent process receives a signal when a child
terminates.
A different method is used for threads. Assuming we don't need
to do anything when a created thread terminates, we can arrange
that the thread implementation simply cleans up the thread
resources completely when the thread terminates.
To do this, it is only necessary to call:
pthread_detach(thread); // thread is the pthread_t id of the thread
This call can either be done in the caller just after creation,
or in the thread's own start function.
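A minimal sketch of detaching from the caller is shown here. Since a detached thread cannot be joined, the example has the thread report completion through a flag guarded by a mutex and a condition variable; those names (done, done_m, done_cv) and the function names are hypothetical and exist only so the demo can observe that the thread ran.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t done_m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cv = PTHREAD_COND_INITIALIZER;
static int done = 0;             /* protected by done_m */

/* A detached thread cannot be joined, so it reports completion itself. */
void *detached_worker(void *arg) {
    (void) arg;
    pthread_mutex_lock(&done_m);
    done = 1;
    pthread_cond_signal(&done_cv);
    pthread_mutex_unlock(&done_m);
    return NULL;
}

int detach_demo(void) {
    pthread_t thread;
    int rc = pthread_create(&thread, 0, detached_worker, 0);
    assert(rc == 0);
    rc = pthread_detach(thread); /* resources are reclaimed automatically */
    assert(rc == 0);

    pthread_mutex_lock(&done_m);
    while (!done)
        pthread_cond_wait(&done_cv, &done_m);
    pthread_mutex_unlock(&done_m);
    return done;
}
```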
int pthread_create(pthread_t *thridptr,
                   const pthread_attr_t *attptr,
                   void *(*start)(void *),
                   void *arg);
Casting to the expected types only works if the
arguments behave as those types.
For example, if a function has void return type but is cast as
returning void *, problems may or may not occur depending on the
calling protocol of the underlying machine.
A number of properties of a thread can be specified via the
attributes argument when the thread is created. Some of these
properties are specified as part of the POSIX specification,
others are left up to the implementation. By burying them inside
the attributes structure, we make it straightforward to add new
types of properties to threads without having to complicate the
parameter list of pthread_create. To set up an attributes
structure, one must call pthread_attr_init. As seen in the next
slide, one then specifies certain properties, or attributes, of
threads. One can then use the attributes structure as an
argument to the creation of any number of threads.
Note that the attributes structure only affects the thread when
it is created. Modifying an attributes structure has no effect
on already-created threads, but only on threads created
subsequently with this structure as the attributes argument.
Storage may be allocated as a side effect of calling
pthread_attr_init. To ensure that it is freed, call
pthread_attr_destroy with the attributes structure as
argument. Note that if the attributes structure goes out of
scope, not all storage associated with it is necessarily
released; to release this storage you must call
pthread_attr_destroy.
Among the attributes that can be specified is a thread's stack
size. The default attributes structure specifies a stack size
that is probably good enough for most applications. How big is
it? The default stack size is not mandated by POSIX. In Digital
Unix 4.0, the default stack size is 21,120 bytes, while in
Solaris it is one megabyte.
How large a stack is necessary? The answer, of course, is that it
depends. If the stack size is too small, there is the danger
that a thread will attempt to overwrite the end of its
stack. There is no problem with specifying too large a stack,
except that, on a 32-bit machine, one should be careful about
using up too much address space (one thousand threads, each with
a megabyte stack, use a fair portion of the address space).
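The life cycle of an attributes structure can be sketched as follows. The one-megabyte STACK_BYTES value and the names small_task and attr_demo are assumptions made for illustration, not recommendations.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define STACK_BYTES (1024 * 1024)   /* illustrative value only */

void *small_task(void *arg) {
    return arg;
}

/* Returns the stack size recorded in the attributes structure. */
long attr_demo(void) {
    pthread_attr_t attr;
    pthread_t thread;
    size_t size = 0;

    int rc = pthread_attr_init(&attr);
    assert(rc == 0);
    rc = pthread_attr_setstacksize(&attr, STACK_BYTES);
    assert(rc == 0);
    rc = pthread_attr_getstacksize(&attr, &size);
    assert(rc == 0);

    /* The attributes are consulted only at creation time. */
    rc = pthread_create(&thread, &attr, small_task, 0);
    assert(rc == 0);
    pthread_attr_destroy(&attr);    /* the running thread is unaffected */
    pthread_join(thread, 0);
    return (long) size;
}
```

The same attr could be passed to any number of pthread_create calls before it is destroyed.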
Compiling requires adding the -pthread option to gcc:
gcc prog.c -o prog -pthread
In part 2 of the program assignment, you have to deal with
operations that modify the database - add and delete - not just
queries.
With multiple threads accessing the same database structure,
some mutual exclusion is required in order not to corrupt the
database structure.
POSIX threads provide several functions to achieve mutual
exclusion, the simplest being:
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER; // shared
pthread_mutex_lock(&m);
pthread_mutex_unlock(&m);
Code to be executed by only one thread at a time should be
surrounded by the two calls.
The pthread_mutex_lock(&m) call blocks the caller if any
other thread has obtained the lock and not yet released
it.
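As a small illustration, the sketch below has several threads increment one shared counter, with the increment surrounded by the lock and unlock calls. The counter, the constants, and the function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

#define NTHREADS 4
#define NINCR 100000

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;   /* shared */
static long counter = 0;                                /* protected by m */

void *increment(void *arg) {
    (void) arg;
    for (int i = 0; i < NINCR; i++) {
        pthread_mutex_lock(&m);
        counter++;               /* only one thread at a time gets here */
        pthread_mutex_unlock(&m);
    }
    return NULL;
}

long counter_demo(void) {
    pthread_t threads[NTHREADS];
    counter = 0;
    for (int i = 0; i < NTHREADS; i++) {
        int rc = pthread_create(&threads[i], 0, increment, 0);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(threads[i], 0);
    return counter;
}
```

Without the mutex, concurrent increments could be lost and the final count would be unpredictable.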
If two threads each want to lock the same two mutexes, there is
a danger of deadlock.
This can be prevented if the code is written so that all threads
try to lock the mutexes in the same order.
Prevention this way is easy (if we remember to do it).
Detecting deadlock after the fact or allowing any order of locks
and trying to check if each lock will cause deadlock is hard.
If many mutexes are to be used, they could be arranged in
levels. Then if a thread is holding a lock at level k, it should
not be allowed to request another lock at level k or smaller.
This will also prevent deadlock over the mutex resources.
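One way to picture the fixed-order rule is the sketch below, which assumes a hypothetical pair of accounts whose array index serves as the lock level. Two threads transfer in opposite directions, yet both always take the lower-numbered lock first.

```c
#include <assert.h>
#include <pthread.h>

#define NACCOUNTS 2
#define NTRANSFERS 50000

static pthread_mutex_t locks[NACCOUNTS] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};
static long balance[NACCOUNTS] = { 100, 100 };

/* Always lock the lower-numbered account first, whatever the direction. */
void transfer(int from, int to, long amount) {
    int first = from < to ? from : to;
    int second = from < to ? to : from;
    pthread_mutex_lock(&locks[first]);
    pthread_mutex_lock(&locks[second]);
    balance[from] -= amount;
    balance[to] += amount;
    pthread_mutex_unlock(&locks[second]);
    pthread_mutex_unlock(&locks[first]);
}

void *zero_to_one(void *arg) {
    (void) arg;
    for (int i = 0; i < NTRANSFERS; i++)
        transfer(0, 1, 1);
    return NULL;
}

void *one_to_zero(void *arg) {
    (void) arg;
    for (int i = 0; i < NTRANSFERS; i++)
        transfer(1, 0, 1);
    return NULL;
}

long ordering_demo(void) {
    pthread_t a, b;
    int rc = pthread_create(&a, 0, zero_to_one, 0);
    assert(rc == 0);
    rc = pthread_create(&b, 0, one_to_zero, 0);
    assert(rc == 0);
    pthread_join(a, 0);
    pthread_join(b, 0);
    return balance[0] + balance[1];   /* money is conserved */
}
```

Had each thread locked its "from" account first, the two could each hold one lock and wait forever for the other.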
As just noted, the simplest (and best)
approach to preventing deadlock is to make certain that all
threads that will hold multiple mutexes simultaneously obtain
these mutexes in a prescribed order. Ordinarily this can be done,
but occasionally it might turn out to be impossible.
For example, we
might not know which mutex to take second until the first mutex
has already been obtained.
To avoid deadlock in such situations,
we can use the approach shown here:
void proc1(void) {
    pthread_mutex_lock(&m1);
    /* use object 1 */
    pthread_mutex_lock(&m2);
    /* use objects 1 and 2 */
    pthread_mutex_unlock(&m2);
    pthread_mutex_unlock(&m1);
}
void proc2(void) {
    while (1) {
        pthread_mutex_lock(&m2);
        if (!pthread_mutex_trylock(&m1))
            break;                    /* got both mutexes */
        pthread_mutex_unlock(&m2);    /* back off and retry */
    }
    /* use objects 1 and 2 */
    pthread_mutex_unlock(&m1);
    pthread_mutex_unlock(&m2);
}
Here thread 1,
executing proc1, obtains the mutexes in the correct
order. Thread 2, executing proc2, must for some reason take the
mutexes out of order. If it is holding mutex 2, it must be
careful about taking mutex 1. So, rather than call
pthread_mutex_lock, it calls pthread_mutex_trylock, which always
returns without blocking. If the mutex is available,
pthread_mutex_trylock locks the mutex and returns 0. If the
mutex is not available (i.e., it is locked by another thread),
then pthread_mutex_trylock returns a nonzero error code
(EBUSY). In the example, if mutex 1 is not available, it is
probably because it is currently held by thread 1. If thread 2
were to block waiting for the mutex, we have an excellent chance
for deadlock. So, rather than block, thread 2 not only quits
trying for mutex 1 but also unlocks mutex 2 (since thread 1
could well be waiting for it). It then starts all over again,
first taking mutex 2, then mutex 1.
- Producer Consumer Problem
- Readers Writers Problem
Semaphores can be thought of as extended mutexes that can have
more than the two states - locked and unlocked.
Semaphores conceptually represent an abstract data type with
two members:
- an integer
- a list of waiting threads (possibly empty)
Producer Consumer with a Bounded Buffer
A producer thread inserts items, one at a time, into a buffer of
capacity B.
A consumer thread removes items, one at a time, from the same
buffer.
The producer must block if the buffer is full.
The consumer must block if the buffer is empty.
The insert and remove operations may interfere with the buffer
structure if their execution is allowed to overlap.
Do we then also need mutual exclusion?
sem_t s;
sem_init(&s, 0, B); // initialize s to value B
sem_wait(&s);
...
sem_post(&s);
For sem_init, the second parameter 0 means the semaphore is
only used by threads in the same process. A nonzero value
(e.g., 1) means it is shared by threads in multiple processes.
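Putting these calls together, a bounded buffer might be sketched as below. It assumes a platform that supports unnamed semaphores (sem_init), and all names other than B are hypothetical. One semaphore counts empty slots, another counts full slots, and a mutex guards the buffer itself.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

#define B 4                  /* buffer capacity */
#define NITEMS 1000

static int buffer[B];
static int in = 0, out = 0;
static sem_t empty_slots;    /* counts free slots, starts at B */
static sem_t full_slots;     /* counts filled slots, starts at 0 */
static pthread_mutex_t buf_m = PTHREAD_MUTEX_INITIALIZER;

void *producer(void *arg) {
    (void) arg;
    for (int item = 1; item <= NITEMS; item++) {
        sem_wait(&empty_slots);          /* blocks if the buffer is full */
        pthread_mutex_lock(&buf_m);
        buffer[in] = item;
        in = (in + 1) % B;
        pthread_mutex_unlock(&buf_m);
        sem_post(&full_slots);
    }
    return NULL;
}

void *consumer(void *arg) {
    long *sum = arg;
    for (int i = 0; i < NITEMS; i++) {
        sem_wait(&full_slots);           /* blocks if the buffer is empty */
        pthread_mutex_lock(&buf_m);
        *sum += buffer[out];
        out = (out + 1) % B;
        pthread_mutex_unlock(&buf_m);
        sem_post(&empty_slots);
    }
    return NULL;
}

long bounded_buffer_demo(void) {
    pthread_t p, c;
    long sum = 0;
    int rc = sem_init(&empty_slots, 0, B);
    assert(rc == 0);
    rc = sem_init(&full_slots, 0, 0);
    assert(rc == 0);
    rc = pthread_create(&p, 0, producer, 0);
    assert(rc == 0);
    rc = pthread_create(&c, 0, consumer, &sum);
    assert(rc == 0);
    pthread_join(p, 0);
    pthread_join(c, 0);
    sem_destroy(&empty_slots);
    sem_destroy(&full_slots);
    return sum;                          /* 1 + 2 + ... + NITEMS */
}
```

The mutex becomes essential once several producers or several consumers share the in and out indices; with exactly one of each, as in this demo, it is included mainly to show the general pattern.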
Condition variables are another means for synchronization in POSIX;
they represent queues of threads waiting to be woken by other
threads. Though they are rather complicated at first
glance, they are even more complicated when you really get into
them.
A thread puts itself to sleep and joins the queue of threads
associated with a condition variable by calling
pthread_cond_wait. When it places this call, it must have some
mutex locked, and it passes the mutex as the second argument. As
part of the call, the mutex is unlocked and the thread is put to
sleep, all in a single atomic step: i.e., nothing can happen
that might affect the thread between the moments when the mutex
is unlocked and when the thread goes to sleep. Threads queued on
a condition variable can be thought of as being released in
first-in-first-out order, although POSIX leaves the actual order
to the scheduling policy. They are released in response to calls
to pthread_cond_signal (which releases at least one waiting
thread, conceptually the first in line) and
pthread_cond_broadcast (which releases all threads).
However, before a released thread may return from
pthread_cond_wait, it first relocks the mutex. Thus only one
thread at a time actually returns from pthread_cond_wait. If
pthread_cond_signal or pthread_cond_broadcast is called when no
threads are queued on the condition variable, nothing happens:
the fact that a call had been made is not remembered.
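Because a signal is not remembered, the usual pattern is to keep the condition itself in a shared variable and recheck it in a loop around pthread_cond_wait. A minimal sketch, with the hypothetical names ready, ready_cv, observed, waiter, and cond_demo:

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cv = PTHREAD_COND_INITIALIZER;
static int ready = 0;        /* the condition itself, protected by m */
static int observed = 0;

void *waiter(void *arg) {
    (void) arg;
    pthread_mutex_lock(&m);
    while (!ready)                         /* recheck after every wakeup */
        pthread_cond_wait(&ready_cv, &m);  /* unlocks m while asleep */
    observed = ready;                      /* m is locked again here */
    pthread_mutex_unlock(&m);
    return NULL;
}

int cond_demo(void) {
    pthread_t thread;
    int rc = pthread_create(&thread, 0, waiter, 0);
    assert(rc == 0);

    pthread_mutex_lock(&m);
    ready = 1;                             /* change the state first */
    pthread_cond_signal(&ready_cv);        /* then wake a waiter, if any */
    pthread_mutex_unlock(&m);

    pthread_join(thread, 0);
    return observed;
}
```

If the signal happens before the waiter ever reaches pthread_cond_wait, the waiter still sees ready set to 1 and never goes to sleep, so nothing is lost.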
Another standard synchronization
problem is the readers-writers problem. Here we have some sort of
data structure to which any number of threads may have
simultaneous access, as long as they are just reading. But if a
thread is to write in the data structure, it must have exclusive
access.
Part 2 of your program assignment is exactly this.
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t readerQ = PTHREAD_COND_INITIALIZER;
pthread_cond_t writerQ = PTHREAD_COND_INITIALIZER;
int readers = 0;
int writers = 0;
void reader(void) {
    pthread_mutex_lock(&m);
    while (!(writers == 0)) {
        pthread_cond_wait(&readerQ, &m);
    }
    readers++;
    pthread_mutex_unlock(&m);
    /* read */
    pthread_mutex_lock(&m);
    if (--readers == 0) {
        pthread_cond_signal(&writerQ);
    }
    pthread_mutex_unlock(&m);
}
void writer(void) {
    pthread_mutex_lock(&m);
    while (!((readers == 0) && (writers == 0))) {
        pthread_cond_wait(&writerQ, &m);
    }
    writers++;
    pthread_mutex_unlock(&m);
    /* write */
    pthread_mutex_lock(&m);
    writers--;
    pthread_cond_signal(&writerQ);
    pthread_cond_broadcast(&readerQ);
    pthread_mutex_unlock(&m);
}
Writers can 'starve' with this code. That is, a waiting writer
may be indefinitely prevented from writing. How might this happen?
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t readerQ = PTHREAD_COND_INITIALIZER;
pthread_cond_t writerQ = PTHREAD_COND_INITIALIZER;
int readers = 0;
int writers = 0;
int active_writers = 0;
void reader(void) {
    pthread_mutex_lock(&m);
    while (!(writers == 0)) {
        pthread_cond_wait(&readerQ, &m);
    }
    readers++;
    pthread_mutex_unlock(&m);
    /* read */
    pthread_mutex_lock(&m);
    if (--readers == 0) {
        pthread_cond_signal(&writerQ);
    }
    pthread_mutex_unlock(&m);
}
void writer(void) {
    pthread_mutex_lock(&m);
    writers++; // number that want to write
    while (!((readers == 0) && (active_writers == 0))) {
        pthread_cond_wait(&writerQ, &m);
    }
    active_writers++; // number allowed to write
    pthread_mutex_unlock(&m);
    /* write */
    pthread_mutex_lock(&m);
    active_writers--;
    if (--writers == 0) {
        pthread_cond_broadcast(&readerQ);
    } else {
        pthread_cond_signal(&writerQ);
    }
    pthread_mutex_unlock(&m);
}
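POSIX also provides readers-writer locks directly:
// Static initialization (only valid in the declaration)
pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
// or, at run time
pthread_rwlock_t lock;
pthread_rwlock_init(&lock, 0);
// Then (each call returns an int, 0 on success)
pthread_rwlock_rdlock(&lock);
pthread_rwlock_wrlock(&lock);
pthread_rwlock_unlock(&lock);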