CSC443/343 Project #4

Adding Profiling to Minix


Introduction

Profilers gather statistics about how much time is spent in which part of a program.

Once the program has been run and the statistics gathered, a report can produce valuable information about where to try to improve the efficiency of the program.

Gathering time measurements at the user level is not practical, and so the operating system needs to provide this service.

Although there are system calls related to timing, this involves message passing between the user level program and the system servers processes and system tasks.

Minix doesn't provide any separate profiling functionality.

Your job is to add profiling capability to Minix.

To profile a process, the operating system allocates a large number of counters, called bins, each corresponding to a small range of bytes in the program's code. The more counters, the finer the profiling granularity. You need to take advantage of clock interrupts to have the operating system to periodically check (many times per second) which instruction the process is executing, and increase the respective bin by one. Eventually, the bins' values should give a reasonably good approximation of the percentage of time the CPU spent within each corresponding address range of the program code.

You will also need to write a utility program, prof that runs a program while profiling it and then prints the profiling statistics for that run.

 

Part I: New Minix System Calls.

You have to modify the kernel to support profiling. You should define and implement two new system calls, profile and getprofile:

Part II: The profil Utility

The profil utility can be written using the new system calls. It behaves like a temporary shell that

  1. Starts profiling on a target program
  2. Executes the target program with profiling active
  3. When the target program finishes, retrieves the profile statistics
  4. Prints a report of the profile statistics

The profil utility should take several command line arguments:


profil [ options ] profiledProg [ profiledProg_args]

options: (in any order)

   -s startaddr 
         The starting address of the address range in the profiledProg
         to collect profile statistics. Default is 0 if omitted.


   -e endaddr 
         The ending address of the address range in the profiledProg
         to collect profile statistics. Default is the end of the
         profiledProg's text segment if omitted.

   -i interval
         Profile statistics are to be accumulated for each successive
         block (called a bin) of length 'interval' bytes within the
         profiled address range beginning with the startaddr. Default
         is the minimum possible so that the address range in the
         profiledProg can be covered by the maximum number of bin
         counters.

Note: For -s and -e, addresses may be given in decimal or in hex. To
specify the address in hex the address should be preceded by 0x.
The interval size specified by -i may also be given in either decimal or hex.

Examples:

   profil fib
         profile the fib program
         start address = 0
         end address = end of text segment for fib
         bin size =  smallest possible bin size so that
                     the profiled segment can be covered by the fixed
                     number of bins available in the kernel.

   profil -i 0x100 fib
         profile the fib program
         start address = 0
         end address = end of text segment for fib
         bin size = 0x100 = 256 bytes

   profil -s 0x80 -e 0x1000 -i 128 fib
         profile the fib program
         start address 0x80 = 128
         end address 0x1000 = 4096
         bin size = 128 bytes

   profil -i 128 -s 0x80 -e 0x1000 fib
         same as previous example

   profil -i 0x100 /bin/ls -Rl /
         execute and profile the ls command with ls command line arguments -RL /
	   The -l ls option means long listing of each directory entry.
	   The -R option means recursively list subdirectories
	   The / command argument means ls should begin in the root
	   directory.
         start address = 0 (in the ls text segment)
         end address = end of the text segment for ls
         bin size = 0x100 = 256 bytes

A Test Program

For short simple programs, the profile statistics can easily be 0 since clock ticks are relatively infrequent (only 1 each 16.67ms).

So to observe statistics in user code, test programs must be written take some time.

A second problem is in knowing what part of the program corresponds to what addresses.

Here is a sample program that solves the first problem by:

The second problem is addressed by having the function print the addresses of main, the 3 functions above, and the address of printf.

Here is the program fib.c (source):

    1	#include <stdio.h>
    2	#include <stdlib.h>
    3	#include <math.h>
    4	
    5	double fib(int n)
    6	{
    7	  double r;
    8	
    9	  if (n == 0 || n == 1 ) {
   10	    return 1.0;
   11	  } else {
   12	    return fib(n-1) + fib(n-2);
   13	  }
   14	}
   15	
   16	void loop(int n)
   17	{
   18	  int i, k;
   19	  double x;
   20	  double d, a;
   21	
   22	  for(i = 0; i < n; i++) {
   23	    for(k = 0; k < n; k++) {
   24	      x = (i < k )? k : i + 1;
   25	      a = x/2;
   26	      d = a*a - x;
   27	      d = ( d < 0 ) ? -d : d;
   28	      while( d > .0001 ) {
   29		a = (x/a + a)/2.0;
   30		d = a * a - x;
   31		d = (d < 0)? -d : d;
   32	      }
   33	    }
   34	  }
   35	}
   36	void overflow(void)
   37	{
   38	  int n = 1;
   39	  while( n > 0 ) {
   40	    n++;
   41	  }
   42	}
   43	int main(int argc, char *argv[])
   44	{
   45	  double x;
   46	  int n = 38;
   47	  x = fib(n);
   48	  loop(1000);
   49	  overflow();
   50	  printf("fib(%d) = %f\n", n, x);
   51	  printf("fib.main: %p\n", main);
   52	  printf("fib.fib: %p\n", fib);
   53	  printf("fib.loop: %p\n", loop);
   54	  printf("fib.overflow: %p\n", overflow);
   55	  printf("fib.printf: %p\n", printf);
   56	  return 0;
   57	}

Sample Run of profil

Here are two sample runs of profil, profiling the fib program.


$ profil -s 0x50 -e 0x300 -i 128 fib

************************* Output of 'fib' ****************************
fib(38) = 63245986.000000
fib.main: 0x247
fib.fib: 0x50
fib.loop: 0x99
fib.overflow: 0x22d
fib.printf: 0x1794

************************* Profile Statistics ****************************
1024 maximum bins
Profiled Text Size = 688 = 0x2b0
688/1024 = 0.672
Minimum bytes/bin: 1
Bin Size: 128
Bins used: 6
Last bin represents last 80 bytes

start_addr = 0x50, end_addr = 0x300, nr_bins = 6, bin_sz = 128
Total ticks in all bins: 296
....................................maximum percentage interval: 70%

 0050-00cf (28.04%) : ************                              (83)
 00d0-014f ( 0.68%) : *                                         (2)
 0150-01cf ( 2.03%) : *                                         (6)
 01d0-024f (69.26%) : ****************************              (205)
Percentage time in shown intervals: 100.00


$ profil -s 0x50 -e 0x300 -i 0x49 fib

************************* Output of 'fib' ****************************
fib(38) = 63245986.000000
fib.main: 0x247
fib.fib: 0x50
fib.loop: 0x99
fib.overflow: 0x22d
fib.printf: 0x1794

************************* Profile Statistics ****************************
1024 maximum bins
Profiled Text Size = 688 = 0x2b0
txtSize/1024 = 0.672
Minimum bytes/bin: 1
Bin Size: 73
Bins used: 10
Last bin represents last 42 bytes
fib
start_addr = 0x50, end_addr = 0x300, nr_bins = 10, bin_sz = 73
Total of all bins: 305
..................................................maximum percentage
interval: 69%
interval = 73
*** ====> prf.bin[0] = 76
 0050-0098 (24.92%) : **********                                (76)
 00e2-012a ( 0.66%) : *                                         (2)
 0174-01bc ( 3.28%) : **                                        (10)
 01bd-0205 ( 2.30%) : *                                         (7)
 0206-024e (68.85%) : ****************************              (210)
Percentage time in shown intervals: 100.00

System Call Implementation Notes

1. User Library Functions

The profile and getprofile user library functions follow the same pattern we've seen before, passing messages to the process manager.

Assuming PROFILE and GETPROF are selected as the system call numbers for the two system calls, profile and getprofile follow the usual pattern:

int profile(int pid, int startaddr, int endaddr, int interval)
{
  message m;
  int ret;

  /* put pid, startaddr, endaddr, and interval into message m
     using an appropriate message type.
  */
  ...
  ret = _syscall(MM, PROFILE, &m);
  return ret;
}

int getprofile(struct prof_data *p)
{
  message m;
  int ret;

  /* put p into message m using an appropriate message type. */
  ...
  ret = _syscall(MM, GETPROF, &m);
  return ret;
}

2. Process Manager's do_prof

The process manager's do_prof function:

  1. Extract pid, startaddr, endaddr, and interval from the message.
  2. If pid is 0, set the endpoint to NONE.
  3. Otherwise,
    1. find the corresponding endpoint value.
    2. If pid is not found or the endpoint value is not ok, return ESRCH.

      The pm_isokendpt can be used to check that the endpoint value corresponds to a pid that is valid and in use.

    3. If endaddr is 0, use the mem_map to set endaddr to the last address in the text segment.
    4. Check that 0 <= startaddress <= endadddress. If not return EFAULT.
  4. call sys_prof with the endpoint of the process to be profiled, the startaddr, the endaddr, and the interval:
    ret = sys_prof(endpt, startaddr, endaddr, interval);
    
  5. Return ret

3. The sys_prof Server Library Function

The sys_prof function should just package the 4 parameters into a message and send it to the system task:

  ...
  ret = _taskcall(SYSTASK, SYS_PROF, &m);
  return ret;

4. System's New Data

Most of the kernel's data structures other than the process table are defined in /usr/src/kernel/table.c.

However, these data structures are declared in /usr/src/kernel/glo.h so other files can just include glo.h.

The table.c file also includes glo.h, but uses the EXTERN trick so that in table.c, these declarations become definitions.

  1. New data is needed for profiling in the both the system and clock tasks.

    1. struct prof_data prof_data; - structure for recording profiling data.
    2. int profiling; - 0 means not profilling; 1, profiling.
    3. int profiled_endpt; - If profiling, the endpoint of the profiled process.

    This data can be declared in /usr/kernel/glo.h again using the EXTERN trick which will make it be defined in /usr/kernel/table.c.

    It will then be accessible to both the system task (for the do_prof and do_getprof functions) and also in the clock task (for the clock_handler function).

5. The System Task's do_prof Function

  1. Extract the endpoint, startaddr, endaddr, and interval ( = bin size) from the message.
  2. If endpoint is NONE, stop profiling and return OK.
  3. Check if already profiling, if so return EBUSY.
  4. Start profiling if endpoint is valid. (Use isokendpt to check validity and simultaneously get the proc number.)
  5. Zero out the bin counters:
      physaddr = umap_local(proc_addr(-2), D, (vir_bytes) &prof_data,
      sizeof(prof_data));
      if ( physaddr == 0 ) {
        return EGENERIC;
      }
      phys_memset(physaddr, 0, sizeof(prof_data));
    
    Note 1: -2 is the proc number (and endpoint value) for the System Task and proc_addr converts
    this to a pointer to the entry in the proc table.
    
    Note 2: umap_local should be used to get the physical address of the
    prof_data so that the fast phys_memset function can be used to
    reinitialize the prof_data struct's bin counters to all 0's when
    profiling a new process. 
    
  6. Next initialize the prof_data's start_addr and end_addr.
  7. Calculate the minimum bin size so that 1024 bins will cover the the address range end_addr - start_addr and set the prof_data bin_sz to this minimum size.
  8. If the interval is > minimum bin_size, set the bin_sz = interval.
  9. Calculate the prof_data.nr_bins, making sure to round up if necessary:
      int ar = prof_data.end_addr - prof_data.start_addr;
      prof_data.nr_bins = ar/prof_data.bin_sz;
      if ( ar > prof_data.bin_sz * prof_data.nr_bins ) 
        prof_data.nr_bins++;
    
    
  10. Turn profiling on.
  11. return OK.

Since there are a fixed number of 1024 bins, profiling granularity will depend on the length of the address range being profiled. Each bin should represent the minimum number of bytes so that the whole range being profiled can be contained in the range covered by no more than 1024 bins.

Concretely, here are a few examples of the bit of calculation required for the bin size and number of bins used.

Divide the length of the address range to be profiled by 1024 and round up to find how many bytes fall in one bin. I.e. to profile a region of 40008 bytes of code,

Integer division:
prof_data.bin_sz = 40008 / 1024 = 39    
(39.0703125, but integer division discards the decimal part)

Round up:
Since prof_data.bin_sz * 1024 = 39 * 1024 = 39936 < 40008
we must increase the bin size by 1. So final prof_data.bin_sz = 40.

So we get 40 bytes per bin is the minimum bin size to cover the 40008 address range. How many bins are used?

Divide the address range by the bin size and round up:

prof_dat.nr_bins = 40008/40 = 1000  (1000.2, but integer division
                                     discards the .2)

40 * 1000 = 40000 < 40008, 
So prof_data.nr_bins++; and final value of prof_data.nr_bins is 1001

Summary: 
Bin size is 40
Number of bins actually used is 1001
First 1000 bins cover 40 * 1000 = the first 40000 bytes,
The last bin used only covers the last 8 bytes of the address range.

If the size of the code to be profiled were 1025 bytes, then only the first 513 bins should be used, each bin representing 2 bytes (the 513th bin representing just 1 byte), and the last 511 bins would remain unused.

6. The getprofile System Call

This is similar to getting the mem_maps except that the data is in the kernel (system task), rather than in a server process.

Note that the caller of getprofile is typically not the same as the process that was profiled.

  1. getprofile sends message to process manager with the address of a user's struct prof_data.
  2. do_getprof in process manager extracts the same address, addr, and calls sys_getprof with the endpoint of the caller and this address:
    ret = sys_getprof(who_e, addr);
    return ret;
    
  3. do_getprof in the system task:
    1. Extract the address of the caller's struct prof_data from the message.
    2. Initialize two struct vir_addr's one for the source (System task's prof_data struct) and one for the caller's prof_data struct.
      struct vir_addr {
        int proc_nr_e;  /* process endpoint */
        int segment;    /* Data segment: D=1 */
        vir_bytes offset; /* virtual address */
      };
      
      

      The endpoint for the system task is SYSTEM = -2:

        src.proc_nr_e = SYSTEM;
      
    3. Use the vir_copy function in the kernel to copy the struct kernel's prof_data struct to the caller's. Return EFAULT if vir_copy fails:
      
      
        ret = virtual_copy(&src, &dst, sizeof(prof_data));
        if ( ret < 0 ) {
          return EFAULT;
        } else {
          return OK;
        }
      
      

7. Collecting Profile Data: The clock_handler Function

The clock_handler function in /usr/src/kernel/clock.c executes on each clock tick (once every 16.67ms):

The relevant section of clock_handler for profiling after a few instructions is:

PUBLIC void clock_handler(irq_hook_t *hook)
{
   ...

  /* Update user and system accounting times. Charge the current process for
   * user time. If the current process is not billable, that is, if a non-user
   * process is running, charge the billable process for system time as well.
   * Thus the unbillable process' user time is the billable user's system time.
   */
  proc_ptr->p_user_time += ticks;
  if (priv(proc_ptr)->s_flags & PREEMPTIBLE) {
    proc_ptr->p_ticks_left -= ticks;

    /* ADD CODE HERE FOR COLLECTING DATA IF PROFILING */
    
  }
  if (! (priv(proc_ptr)->s_flags & BILLABLE)) {
      bill_ptr->p_sys_time += ticks;
      bill_ptr->p_ticks_left -= ticks;
  }
  ...
}
  1. Check if profiling is on (profiling == 1). If not do nothing.
  2. If profiling
    1. if the profiled_endpt is for a process that is no longer alive (use the isokendpt function) or it is alive, but it's p_rts_flags entry in the proc table has NO_PRIORITY set, then stop profiling. NO_PRIORITY is set when a process is stopped.
    2. Otherwise check if the current process (pointed to by proc_ptr) is the the process being profiled (check if proc_ptr's endpoint is the same as the profiled_endpoint value.)
    3. If the current process is the profiled process:
      1. get the program counter address from the proc table entry: addr = proc_ptr->reg.pc;
      2. check if this address is in the range to profile specified by the prof_data struct.
      3. If so, calculate the bin number this address belongs to.
      4. Check if incrementing the bin counter will result in overflow of the counter. If so, turn off profiling.
      5. Otherwise, if overflow will not occur, increment the bin counter.

    Notes:

    • In the kernel, use isokendpt to check that an endpoint value corresponds to a live process and if so, convert the endpoint value to a process number.

    • After getting the process number, it is more reliable to use proc_addr to convert the process number to a pointer to the process table entry for the process:

    For example:

        int proc_nr;
        struct proc *ptr;
        if ( isokendpt(endpt, &proc_nr) ) {
            /* process is alive */
           ptr = proc_addr(proc_nr);
           
          /**  Now access some proc table entry for the process using ptr **/
        }
    

Implementation of profil Utility

The new system calls profile and getprofile can be used to implement the profil utility:

profil [ options ] profiledProg [ profiledProg_args]

options: (in any order)

   -s startaddr 

   -e endaddr 

   -i interval

Outline

  1. Process the command line arguments to get:

    • startaddress
    • endaddress
    • name (path) of the program to profile
    • command line arguments if any for the profiled program
  2. Read the header record of the profiled program file.

  3. Check that it is really an executable file (check the magic number), print error and exit if not.

  4. Fork a child to execute the program to be profiled.

  5. Parent calls waitpid to wait for the profiled program to terminate.

  6. Meanwhile, the child should check addresses. If the endaddress is 0, assign the text size - 1. The text size is in the a_text member of the struct exec header record. Then check make the address checks. If they fail set errno = EFAULT, call the new profile system call function with its own pid and the startaddr, endaddr, and interval that were determined from profil's command line arguments.

  7. If profile returned failure (-1) or the address check failed the child should use a switch statement on errno to determine the reason and print an appropriate error message and exit with a special exit value, e.g. 0xCC.

    Possible messages:

    errno value Message
    ESRCH Profil Error: No process for profiled program
    EFAULT Profil Error: Address range invalid
    EBUSY Profil Error: Already profiling another process
    EGENERIC Profil Error: Generic error
  8. If profile's return is ok (not < 0), the child should call execvp with the profiled program name and an array of the command line arguments for the profiled program.

  9. If execvp returns, it failed. In this case the child should print a message that the program can't be executed and exit with a value of 0xCC

  10. The parent needs to see if the child exited with the 0xCC value or not. If the child exited with value 0xCC, an error message should have been printed by the child and profiling data was not gathered.

        waitpid(child, &status, 0);
        if ( WEXITSTATUS(status) != 0xCC ) {
          printRange();
        }
    
    

What to turn in.

  1. A copy of the functions you added to Minix to implement the two system calls, profile and getprofile.
  2. A copy of your modified clock_handler function.
  3. Sample output runs using the profil utility:
    1. profil -e 0x20 -s 0x300 fib
    2. profil -i 128 fib

      while already running the same command in another Minix's virtual terminals (or other login terminal).

    3. profil fib.c
    4. profil -i 256 -e 0x8000 fib
    5. profil -i 512 -s 0x200 -e 0 fib
    6. profil -s 0x50 -i 0x1a -e 0x247 fib
    7. profil -s 0x50 -e 0x247 -i 0x49 fib

    An easy way to do this:

    Run the test, and use vmw to copy the output to the clipboard.

          $ profil -s 0x50 -i 0x1a -e 0x247 fib  | vmw co
    

    Then in the host machine open an editor and paste.

  4. A readme.txt file. Give a brief explanation of the output you observed for each of the tests above in 3.

    If your profiling additions are not working correctly or not complete, give (a) a brief discussion of what you have added that you think is correct and (b) what problems remain and if you have ideas about solving these problems.