Sep 08

Written Homework: should be submitted to Course Online Site

Discussion Forum: I will check, but not necessarily monitor. If you have specific questions, it is still better to email me. I may post comments on the discussion forum if a topic is of appropriately general interest or concern.

Assignments: will be posted on the Course Online. Comments and grades will also be posted there.

Due Dates: Online Learning section due dates will generally be 1 day later In Class section(s).

Email: Starting with the first class, I will check email regularly. Emails received on day x should have responses by the end of day x + 1.

Office Hours: 4:00 - 5:30 Tu and by appointment. Visit, email, or phone me during this time.

Office Phone: 312-362-8718. But if you don't get me, you may leave a message and it will be sent to my email.

Overview [2] [top]

Operating Systems: Software to facilitate running programs on some hardware. But how is operating system software different from other complicated systems such as compilers, editors, integrated development environments, etc.

Operating system code is event driven; that is, o.s. code executes in response to events rather than being executed directly the way a compiler is.

Approach 1: Could study the general concepts that most operating systems have in common. That is, abstract away many of the details.

Approach 2: Examine general concepts, but then probe how they are implemented in a real operating system.

We will attempt to take the second approach.

What real OS? Until recently, details (e.g., source code) of Windows, not available.

Linux source available is available, and a good deal of examination and explication of its structure.

Both are committed to be robust, production operating systems.

This goal tends to make a full understanding of how various components work together quite opaque.

Minix [3] [top]

Minix: Minix was developed by Andrew Tanenbaum and released in 1987. Its name stands for mini-UNIX, with the goal of looking and working like UNIX, but being small enough so that one could understand how it works.

Subsequently, the IEEE organization published a standard for UNIX systems called POSIX. Minix has since evolved toward the new standard.

Linux: Linus Torvalds got a copy of Minix, installed it and made modifications to suit his needs.

Eventually, after writing device drivers and a file system, he had the basis for the kernel of what was to become Linux.

The Linux development emphasized performance and its internal design is monolithic and not at all like that of Minix.

Minix development has recently placed emphasis on building a reliability, and its structure remains much more transparent.

Perspectives [4] [top]

Application programmer perspective: write programs that make system calls. View the OS as an application interface.

Operating system perspective: implementation of abstractions including the API of system call functions available to application programmers.

Application Programmer's Perspective [5] [top]

Using System Calls Example (a shell)

The shell process reads a command from the user (the read waits for input)
Creates a child process to execute the command
The child loads the program and executes it, while
the parent (the shell) waits for the child to finish

Services Needed [6] [top]

Applications view OS services as obtainable through a library of functions.

Needed:

Create process
Have process execute a particular program
Wait for the process to terminate

Signals [7] [top]

Need: Interrupt a process during its normal execution.

Waiting for a child. The child has terminated.
Send a signal to the parent.
A process is taking too long.
Send a terminate signal.
System is shutting down.
Send a terminate signal to all processes.
Wait for a specified amount of time before continuing.
Sleep and wait for a signal to wake up.

Example [8] [top]

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>

void handler(int n)
{
  printf("I'm melting ....\n");
  exit(0);

}

int main(int argc, char* argv[])
{
  int n = 0;
  signal(SIGALRM, handler);
  alarm(2);

  while(1) {
    printf(".");
    if ( ++n % 30 == 0 ) {
      printf("\n%d ", n);
    }
  }

  return 0;
}

How does this program terminate? or does it?
Who sends the SIGALRM signal to this process?

Operating System Perspective: Design [9] [top]

Systems are more flexible if they carefully separate:

Mechanisms
Policies

The implementation of the mechanism of making system calls such as fork, execvp, etc. is radically different in Linux versus Minix and illustrates two distinct designs.

Monolithic Design [10] [top]

 char buffer[N];
 int nread;   /* Assuming type ssize_t is defined as int */
 nread =   read(fd, buffer, nbytes);

Continued...[11] [top]

1 - 4: user calls c library read routine
5: read library routine doesn't read; it executes in user mode, puts the system call number for read in a register and executes a trap.
The system trap handler takes over executing in kernel mode.
6 - 8: The kernel routine uses the value in the register to switch to the appropriate routine to execute the read system call (do_read) on behalf of the user (still in kernel mode) copying the data into the user's buffer variable.
9: kernel routine returns from the trap call (a special return) that restores the processor to user mode as well as returning to the user routine.
11: The user level read library routine returns to the user.

Layered Design [12] [top]

In this monolithic design, the "kernel" contains all of the operating system code that will execute in privileged (kernel) mode in one compiled address space.

However, the functions it contains are not called from a main but are typically invoked through the system call mechanism just described as needed by different user processes..

Although in a monolithic design, the compiled image is one big glob, the functions can still logically have a "layered" structure.

User level processes

System call routines

Lower level routines
used by the system call routines

Monolithic Design Comments [13] [top]

In the monolithic design, all the operating system functions compiled into one address space and is effectively part of each user process.

A system call in the monolithic case results in a mode switch from user mode to kernel mode, but is not a context switch to a different process.

The user's memory address is still usually directly accessible.

Linux has a monolithic design structure.

Client Server Design [14] [top]

Minix uses a client server design.

It has a small kernel - routines that are compiled and linked together in one image.

Device drivers are NOT part of the kernel.

File "servers" and process "server" are also not part of the kernel.

The device drivers and servers execute in their own separate address spaces.

The kernel implements a message passing facility.

System calls must use this message passing to communicate with the appropriate server.

Process Management [15] [top]

Each design still implements the process abstraction.

We will first look at some aspects of this abstraction, before examining its implementation.

Processes [16] [top]

The usual definition of process is: a running program

In order to switch the processor from one process to another, the operating system must be able to save the state of the current process and to restart it in the same state when it switches the processor back to this process.

What constitutes the state of a process?

which instruction is next
memory where the program code and data is located
values stored in registers
files it has open
which process created it
is it running or waiting for some resource
...

Data Structures for Processes [17] [top]

Each process has an entry in a process table to hold this information about the process.

This table must be accessible to the operating system code, but protected from user processes.

The dual mode of a processor can support this.

Processor Mode [18] [top]

For a dual mode processor, some register has a bit that determines the current mode of the processor.

We'll refer to these two modes generically as user mode and kernel mode.

If the processor is in kernel mode, then all machine instructions for the processor can be executed.

In user mode, some of the instructions will "trap" and not execute. These are called privileged instructions.

The term kernel will refer to the operating system code or at least to the part that does the work that requires extra privilige.

The mode of the processor should be user when user code is executing and kernel mode when kernel code is executing.

Process States [19] [top]

A process has a state property whose value changes and at any time provides some information about the current activity of the process.

A simple set of possible values of process state is

        { READY, BLOCKED, RUNNING }

The process can itself cause changes of its state. E.g., [2] A RUNNING process makes a system call to request some operating system service causing its state to become BLOCKED until [3] the service is made available.

[1] The operating system can cause the state of a READY process to become RUNNING by scheduling that process to execute.

If the operating system can get control of the cpu, it can also [4] preempt the cpu from the previously running process, sending it back to the READY state and select a different process to run.

Process State Transitions [20] [top]

Running -> Blocked
Running -> Ready
Ready -> Running
Blocked -> Ready

Events:

timer interrupt
Running process makes a system call
A system call completes (e.g. a resource becomes available)

Scheduler: Handles transition 3. But note that transition 4 goes from Blocked to Ready, not Blocked to Running.

This typically gives more chance of implementing whatever scheduling policy is desired.

Process Table [21] [top]

In Minix, three separate modules manage interprocess communication, memory, and file resources.

The process table is partitioned into three parts, with each part being managed by one of the modules

System Calls [22] [top]

A system call by a user process (executing in user mode) executes some operating system function on behalf of that user process.

To the programmer, calling a system function looks like calling any user written function: pass parameters, get a return value back.

But the code that is linked to the user program has to somehow bridge the gap between user mode to kernel mode at the call and back to user mode on return.

That's all the linked code does.

Note: There can't be a way for user code to change the mode to kernel and still be executing user code!

The actual kernel code that does the work of the system call is never linked into the user program.

Minix Organization [23] [top]

The Minix operating system has kernel code and data, but also a number of operating system processes it calls server processes which have their own memory and code, but don't have direct access to the kernel data/code.

Two Minix server processes

Process Manager
FileSystem

The server processes execute in user mode.

Kernel code does the critical steps for system calls; that is, the steps that require kernel mode privilege.

Some parts of a system call may be handled by one or more of the operating server processes.

Sounds complicated!

Message Passing in Minix [24] [top]

The Minix kernel provides a message passing facility.

For user mode processes (user processes and server processes), sending or receiving a message is a system call.

It is the only system call that doesn't itself use message passing.

User processes make Minix system calls by sending a message to the appropriate Minix server process.

Server processes can handle some system calls by themselves because the server process has all the information needed in its address space and can execute its code for the system call in user mode.

System Tasks in Minix [25] [top]

Minix has two other "processes" besides the server processes, but they are called "tasks":

System Task
Clock Task

In contrast to the server processes, the tasks run in kernel mode and have access to all the kernel data structures and code.

That is, the clock task and the system task share the kernel address space, but each one has its own call stack and is scheduled as separate processes.

Summary: System Calls in Minix [26] [top]

A user makes a system call by passing a message and receiving a reply from a Minix server process.

The Minix server process either handles the system call and replies on its own in user mode or else sends a message to the Minix System Task and gets a reply.

User Message Passing [27] [top]

User processes can send messages to Minix server processes using this function:

int syscall(int who, int syscallnr, message *msgptr);

who identifies which server process
syscallnr is the number of the system call function
msgptr should contain the address of a message struct containing parameters for the the system call.

Syscall [28] [top]

The syscall function is just a library function that can be linked into user mode programs.

It executes in user mode.

What does it do? Not much.

It calls another function:

int sendrec(int who, message *msgptr)

Ok, so how does the message really get sent?

The sendrec function [29] [top]

This function is in assembly code, not C:

__sendrec:
        push  %ebp
        mov   %esp, %ebp
        push  %ebx
        mov   SRC_DST(%ebp), %eax    ! eax = dest-src
        mov   MESSAGE(%ebp), %ebx    ! ebx = message pointer
        mov   SENDREC, %ecx        ! _sendrec(srcdest, ptr)
        int   SYSVEC              ! trap to the kernel
        pop   %ebx
        pop   %ebp
        ret

The routine pushes arguments on the stack and then executes the int machine instruction with operand SYSVEC, which is just an integer identifying an entry in the interrupt vector table.

This instruction int is an instruction (software), but has the effect of a hardware interrupt.

In particular it saves part of the state of the current process and loads the program counter (register eip) and the flags register for a kernel routine that handles all system calls.

Important! This switches from user mode to kernel mode, but also from user code to kernel code.

The int instruction can't be used to somehow magically switch to kernel mode but to user written code.

Process Related System Calls [30] [top]

fork
execvp
waitpid
exit
getpid
getppid
kill
signal

Signals [31] [top]

Unix (and Linux and Minx) use signals to notify a parent that a child has terminated. Signals are also used for keyboard interrupts, to terminate another process, etc.

The kill system call is used to "send" a signal from one process to another.

A few of the symbolic names for the integer signals:

SIGKILL
SIGINT
SIGCHLD

fork [32] [top]

pid_t fork();

Creates a new process - the child
Child process gets a copy of code and data of parent
Child and parent both continue at the return from fork() call
fork returns 0 to the child (if it succeeds)
fork returns the pid of the child to the parent

Example (fork)[33] [top]

    1	
    2	#include <stdio.h>
    3	#include <stdlib.h>
    4	#include <unistd.h>
    5	#include <sys/wait.h>
    6	#include <signal.h>
    7	
    8	int main(int argc, char* argv[])
    9	{
   10	  pid_t n;
   11	  int i;
   12	
   13	  n = fork();
   14	  printf("fork returned %d, my pid is %d\n", n, getpid());
   15	  printf("pid %d: Hello\n", getpid());
   16	
   17	  return 0;
   18	}

What is the output?

How many processes?

Which process finishes first?

exit [34] [top]

	int exit(int status);

A SIGCHLD signal is sent to the parent process.
This process is terminated and all memory is released, any open files are closed
If the parent hasn't waited for this child, the process table entry remains and holds the exit status value and this process becomes a zombie.
If this process has any zombie children, they are deleted completely.
If this process has any children that have not terminated, each child is "adopted" by the init process.

execvp, execlp [35] [top]

	int execvp(char * filePath, char * args[])
        int execlp(char *filePath, char * arg0, char * arg1, ..., 0)

replace this process's program by the executable file given by filePath
for execvp the array should have 0 in the last entry; the non-zero entries are counted and passed to the main routine of the new program as the value of argc (number of command line arguments) and args is passed to main as the array of command line arguments
for execlp each command line arguments is passed as a separate parameter and the last parameter should be 0. The non-zero parameters are counted and passed as the value of argc to the main function of the new program.

Example (child)[36] [top]

    1	
    2	#include <stdio.h>
    3	#include <stdlib.h>
    4	#include <unistd.h>
    5	
    6	int main(int argc, char *argv[])
    7	{
    8	  char ch = 'X';
    9	  int i;
   10	
   11	  if ( argc >= 2 ) {
   12	    ch = argv[1][0];
   13	  }
   14	
   15	  printf("argc = %d\n", argc);
   16	  for(i = 0; i < argc; i++) {
   17	    printf("%s ", argv[i]);
   18	  }
   19	  printf("\n");
   20	
   21	  printf("pid %d: %c\n", getpid(), ch);
   22	  if (ch == 'X') {
   23	    return 0;
   24	  } else if (ch == 'A') {
   25	    return 1;
   26	  } else if (ch == 'B') {
   27	    return 2;
   28	  } else if (ch == 'C') {
   29	    return 3;
   30	  } else {
   31	    return 99;
   32	  }
   33	
   34	}

waitpid [37] [top]

      pid_t waitpid(pid_t pid, int *status, int option)

If option is 0,

If pid is -1, wait for any child process to terminate
If pid > 0 wait for a specific child to terminate
The integer pointed to by status will be filled with bits indicating how the process terminated and the exit value if the process terminated normally.

It is also possible to wait for a child that has stopped by setting option to the appropriate value.

If no unwaited for children exist, waitpid returns -1 without waiting.

Example (parent: fork + execlp)[38] [top]

    1	
    2	#include <stdio.h>
    3	#include <stdlib.h>
    4	#include <unistd.h>
    5	#include <sys/wait.h>
    6	#include <sys/types.h>
    7	#include <signal.h>
    8	
    9	int main(int argc, char* argv[])
   10	{
   11	  pid_t pid;
   12	
   13	  int status;
   14	  char str[32];
   15	  printf("Enter a letter: ");
   16	  scanf("%s", str);
   17	
   18	  if ( (n = fork()) == 0 ) {
   19	    execlp("printCh", "printCh", str, (char *) 0);
   20	    printf("Unable to execute printCh\n");
   21	    exit(0);
   22	  }
   23	
   24	  pid = waitpid(-1, &status, 0);
   25	  if ( WIFEXITED(status) ) {
   26	    printf("child %d exited with status = %d\n", pid,
   27	  WEXITSTATUS(status));
   28	  }
   29	
   30	  return 0;
   31	}

Execlp: Copies Executable Program into Memory?[39] [top]

Straight foward copy is time and space consuming
Different processes should be able to share the read-only code (not the data) segment.

Rather than copying a program into a process's address space, a better choice is to map the program file into the address space.

       void *mmap(void *start, size_t length, int prot, int flags,
                  int fd, off_t offset);

start - just a suggestion to the operating system. Use 0.
length - size of the file in bytes! (see stat or fstat functions)
prot
- PROT_EXEC Pages may be executed.
- PROT_READ Pages may be read.
- PROT_WRITE Pages may be written (and read).
- PROT_NONE Pages may not be accessed.
flags
- MAP_SHARED
- MAP_PRIVATE

Example (mmap)[40] [top]

(source)

    1	
    2	#include <stdio.h>
    3	#include <stdlib.h>
    4	#include <unistd.h>
    5	#include <fcntl.h>     // open and O_RDWR
    6	#include <sys/stat.h>  // struct stat
    7	#include <sys/mman.h>  // mmap, munmap, ...
    8	
    9	
   10	
   11	
   12	int main(int argc, char* argv[])
   13	{
   14	  int fd = open("test.bak", O_RDWR);
   15	
   16	  if ( fd < 0 ) {
   17	    fprintf(stderr, "Cannot open input file test.bak for update\n");
   18	    exit(1);
   19	  }
   20	  struct stat s;
   21	
   22	  fstat(fd, &s);
   23	
   24	  char *buf;
   25	
   26	  buf = mmap(0, s.st_size, PROT_WRITE, MAP_SHARED, fd, 0);
   27	
   28	  if ( buf == (void *) -1 ) {
   29	    fprintf(stderr, "mmap failed size = %lu\n", s.st_size);
   30	    exit(2);
   31	  }
   32	  int i;
   33	  printf("\nInitial test.bak contents:\n\n");
   34	  for(i = 0; i < s.st_size; i++) {
   35	    printf("%c", buf[i]);
   36	  }
   37	
   38	  // Reverse the characters in the file
   39	  char ch;
   40	  int j = s.st_size - 1;
   41	  for(i = 0; i < s.st_size / 2; i++, j--) {
   42	    // swap buf[i] and buf[j]
   43	    ch = buf[j];
   44	    buf[j] = buf[i];
   45	    buf[i] = ch;
   46	  }
   47	
   48	  printf("\n\nModified test.bak contents:\n\n");
   49	  for(i = 0; i < s.st_size; i++) {
   50	    printf("%c", buf[i]);
   51	  }
   52	  
   53	
   54	  munmap(buf, s.st_size);
   55	  close(fd);
   56	
   57	
   58	
   59	  return 0;
   60	}

CSC443/CSC343 Sep07

slide version

single file version

Contents

Administration[1] [top]

Overview[2] [top]

Minix[3] [top]

Perspectives[4] [top]

Application Programmer's Perspective[5] [top]

Services Needed[6] [top]

Signals[7] [top]

Example[8] [top]

Operating System Perspective: Design[9] [top]

Monolithic Design[10] [top]

Continued...[11] [top]

Layered Design[12] [top]

Monolithic Design Comments[13] [top]

Client Server Design[14] [top]

Process Management[15] [top]

Processes[16] [top]

Data Structures for Processes[17] [top]

Processor Mode[18] [top]

Process States[19] [top]

Process State Transitions[20] [top]

Process Table[21] [top]

System Calls[22] [top]

Minix Organization[23] [top]

Message Passing in Minix[24] [top]

System Tasks in Minix[25] [top]

Summary: System Calls in Minix[26] [top]

User Message Passing[27] [top]

Syscall[28] [top]

The sendrec function[29] [top]

Process Related System Calls[30] [top]

Signals[31] [top]

fork[32] [top]