CSC443/343 Project 4

CSC443/343 Project #4

Adding Profiling to Minix

Introduction

Profilers gather statistics about how much time is spent in which part of a program.

Once the program has been run and the statistics gathered, a report can produce valuable information about where to try to improve the efficiency of the program.

Gathering time measurements at the user level is not practical, and so the operating system needs to provide this service.

Although there are system calls related to timing, this involves message passing between the user level program and the system servers processes and system tasks.

Minix doesn't provide any separate profiling functionality.

Your job is to add profiling capability to Minix.

To profile a process, the operating system allocates a large number of counters, called bins, each corresponding to a small range of bytes in the program's code. The more counters, the finer the profiling granularity. You need to take advantage of clock interrupts to have the operating system to periodically check (many times per second) which instruction the process is executing, and increase the respective bin by one. Eventually, the bins' values should give a reasonably good approximation of the percentage of time the CPU spent within each corresponding address range of the program code.

You will also need to write a utility program, prof that runs a program while profiling it and then prints the profiling statistics for that run.

Part I: New Minix System Calls.

You have to modify the kernel to support profiling. You should define and implement two new system calls, profile and getprofile:

profile
```
int profile(int pid, int startaddr, int endaddr, int interval)
```
Starts profiling. Takes three parameters: the PID of the process to be profiled, and the lowest and highest addresses of the range of code to be profiled and the size of size of the intervals in this range.
- pid - the pid of the process to be profiled
- startaddr - the beginning instruction address of the range of the text segment to be profiled.
- endaddr - the ending instruction address of the range of the text segment to be profiled.
- interval - the size of the intervals into which the profiled address range should be split.
  E.g., if staraddr is 0 endaddr is 400, and interval is 64, the bins would be:
```
bin 0:  0 to 63
bin 1: 64 to 127
bin 2: 128 to 191
bin 3: 192 to 255
bin 4: 256 to 319
bin 5: 320 to 383
bin 6: 384 to 447  

Note: Only address references up to address 400 
will be counted in bin 6, the last bin.
```
  There is a maximum of 1024 bins. If the interval size requested is too small, 1024 bins may be insufficient to cover the address range from startaddr to endaddr.
  
  This should be treated as an error; that is, the number of bins is fixed at a maximum of 1024.
Calling profile with PID 0 (zero), should stop the ongoing profiling (if any).
Calling profile with PID != 0 while already profiling some process, should result in an error.
Calling profile with highest address 0, should profile the text segment from startaddr to the end of the text segment. In particular, if both startaddr and endaddr are 0, the entire text segment should be profiled.

Return values: Upon successful completion, profile returns 0. Otherwise, a value of -1 is returned and errno is set to indicate the error. The following error codes are defined for profile:
- ESRCH: no process with the given pid exists (except for the special case of pid 0, which should never return any errors)
- EFAULT: if there is some error related to the memory arguments: E.g,
  - endaddr not 0 and startaddr > endaddr
  - interval too small - would require more than the maximum number of bins.
  - startaddr and/or endaddr not within the text segment
- EBUSY: if the kernel is currently profiling some other process
- EGENERIC: for any other type of failure (if any)
Note: Profiling of a process stops when profile with pid=0 is called, -OR- automatically by the kernel when the process terminates, -OR- automatically by the kernel when any bin counter exceeds its maximum value (i.e., a bin overflows).
getprofile
```
int getprofile(struct prof_data *p)
```
Retrieves the information collected by the last profiling, or the current profiling data if the kernel is currently profiling a process.

Takes one argument: the address of a struct to receive the profiling data of the last process profiled.
- p - a pointer to a struct prof_data to hold the profiling data of the last process profiled.
Return values: Upon successful completion, GETPROF returns 0. If an error occurs, GETPROF returns -1. Errno is set to indicate the error. The following error codes are defined:

EFAULT: a memory error has occurred (e.g., trying to write to memory not belonging to the calling user process)
profile.h

The information gathered in the kernel should be stored in the the struct type struct prof_data. The user of getprofile function will need to know this type and so its declaration should be in the header file, profile.h, that has the declarations of the two system calls.

Here is the profile.h header to use with profile and getprofile:
```
#ifndef PROFILE_H
#define PROFILE_H
struct prof_data
{
  int start_addr;
  int end_addr;
  int nr_bins;
  int bin_sz;
  unsigned long bin[1024];
} prof_data;

int profile(int pid, int start_addr, int end_addr, interval);
int int getprofile(struct prof_data *p);

#endif
```

Part II: The profil Utility

The profil utility can be written using the new system calls. It behaves like a temporary shell that

Starts profiling on a target program
Executes the target program with profiling active
When the target program finishes, retrieves the profile statistics
Prints a report of the profile statistics

The profil utility should take several command line arguments:


profil [ options ] profiledProg [ profiledProg_args]

options: (in any order)

   -s startaddr 
         The starting address of the address range in the profiledProg
         to collect profile statistics. Default is 0 if omitted.


   -e endaddr 
         The ending address of the address range in the profiledProg
         to collect profile statistics. Default is the end of the
         profiledProg's text segment if omitted.

   -i interval
         Profile statistics are to be accumulated for each successive
         block (called a bin) of length 'interval' bytes within the
         profiled address range beginning with the startaddr. Default
         is the minimum possible so that the address range in the
         profiledProg can be covered by the maximum number of bin
         counters.

Note: For -s and -e, addresses may be given in decimal or in hex. To
specify the address in hex the address should be preceded by 0x.
The interval size specified by -i may also be given in either decimal or hex.

Examples:

   profil fib
         profile the fib program
         start address = 0
         end address = end of text segment for fib
         bin size =  smallest possible bin size so that
                     the profiled segment can be covered by the fixed
                     number of bins available in the kernel.

   profil -i 0x100 fib
         profile the fib program
         start address = 0
         end address = end of text segment for fib
         bin size = 0x100 = 256 bytes

   profil -s 0x80 -e 0x1000 -i 128 fib
         profile the fib program
         start address 0x80 = 128
         end address 0x1000 = 4096
         bin size = 128 bytes

   profil -i 128 -s 0x80 -e 0x1000 fib
         same as previous example

   profil -i 0x100 /bin/ls -Rl /
         execute and profile the ls command with ls command line arguments -RL /
	   The -l ls option means long listing of each directory entry.
	   The -R option means recursively list subdirectories
	   The / command argument means ls should begin in the root
	   directory.
         start address = 0 (in the ls text segment)
         end address = end of the text segment for ls
         bin size = 0x100 = 256 bytes

A Test Program

For short simple programs, the profile statistics can easily be 0 since clock ticks are relatively infrequent (only 1 each 16.67ms).

So to observe statistics in user code, test programs must be written take some time.

A second problem is in knowing what part of the program corresponds to what addresses.

Here is a sample program that solves the first problem by:

Calling a the fibonnaci function fib that is doubly recursive for a moderately large value (38).
Calling a function loop that repeatedly computes square roots (without calling a library function), and
Calling a function overflow that increments a counter until integer overflow occurs

The second problem is addressed by having the function print the addresses of main, the 3 functions above, and the address of printf.

Here is the program fib.c (source):

    1	#include <stdio.h>
    2	#include <stdlib.h>
    3	#include <math.h>
    4	
    5	double fib(int n)
    6	{
    7	  double r;
    8	
    9	  if (n == 0 || n == 1 ) {
   10	    return 1.0;
   11	  } else {
   12	    return fib(n-1) + fib(n-2);
   13	  }
   14	}
   15	
   16	void loop(int n)
   17	{
   18	  int i, k;
   19	  double x;
   20	  double d, a;
   21	
   22	  for(i = 0; i < n; i++) {
   23	    for(k = 0; k < n; k++) {
   24	      x = (i < k )? k : i + 1;
   25	      a = x/2;
   26	      d = a*a - x;
   27	      d = ( d < 0 ) ? -d : d;
   28	      while( d > .0001 ) {
   29		a = (x/a + a)/2.0;
   30		d = a * a - x;
   31		d = (d < 0)? -d : d;
   32	      }
   33	    }
   34	  }
   35	}
   36	void overflow(void)
   37	{
   38	  int n = 1;
   39	  while( n > 0 ) {
   40	    n++;
   41	  }
   42	}
   43	int main(int argc, char *argv[])
   44	{
   45	  double x;
   46	  int n = 38;
   47	  x = fib(n);
   48	  loop(1000);
   49	  overflow();
   50	  printf("fib(%d) = %f\n", n, x);
   51	  printf("fib.main: %p\n", main);
   52	  printf("fib.fib: %p\n", fib);
   53	  printf("fib.loop: %p\n", loop);
   54	  printf("fib.overflow: %p\n", overflow);
   55	  printf("fib.printf: %p\n", printf);
   56	  return 0;
   57	}

Sample Run of profil

Here are two sample runs of profil, profiling the fib program.


$ profil -s 0x50 -e 0x300 -i 128 fib

************************* Output of 'fib' ****************************
fib(38) = 63245986.000000
fib.main: 0x247
fib.fib: 0x50
fib.loop: 0x99
fib.overflow: 0x22d
fib.printf: 0x1794

************************* Profile Statistics ****************************
1024 maximum bins
Profiled Text Size = 688 = 0x2b0
688/1024 = 0.672
Minimum bytes/bin: 1
Bin Size: 128
Bins used: 6
Last bin represents last 80 bytes

start_addr = 0x50, end_addr = 0x300, nr_bins = 6, bin_sz = 128
Total ticks in all bins: 296
....................................maximum percentage interval: 70%

 0050-00cf (28.04%) : ************                              (83)
 00d0-014f ( 0.68%) : *                                         (2)
 0150-01cf ( 2.03%) : *                                         (6)
 01d0-024f (69.26%) : ****************************              (205)
Percentage time in shown intervals: 100.00


$ profil -s 0x50 -e 0x300 -i 0x49 fib

************************* Output of 'fib' ****************************
fib(38) = 63245986.000000
fib.main: 0x247
fib.fib: 0x50
fib.loop: 0x99
fib.overflow: 0x22d
fib.printf: 0x1794

************************* Profile Statistics ****************************
1024 maximum bins
Profiled Text Size = 688 = 0x2b0
txtSize/1024 = 0.672
Minimum bytes/bin: 1
Bin Size: 73
Bins used: 10
Last bin represents last 42 bytes
fib
start_addr = 0x50, end_addr = 0x300, nr_bins = 10, bin_sz = 73
Total of all bins: 305
..................................................maximum percentage
interval: 69%
interval = 73
*** ====> prf.bin[0] = 76
 0050-0098 (24.92%) : **********                                (76)
 00e2-012a ( 0.66%) : *                                         (2)
 0174-01bc ( 3.28%) : **                                        (10)
 01bd-0205 ( 2.30%) : *                                         (7)
 0206-024e (68.85%) : ****************************              (210)
Percentage time in shown intervals: 100.00

System Call Implementation Notes

1. User Library Functions

The profile and getprofile user library functions follow the same pattern we've seen before, passing messages to the process manager.

Assuming PROFILE and GETPROF are selected as the system call numbers for the two system calls, profile and getprofile follow the usual pattern:

int profile(int pid, int startaddr, int endaddr, int interval)
{
  message m;
  int ret;

  /* put pid, startaddr, endaddr, and interval into message m
     using an appropriate message type.
  */
  ...
  ret = _syscall(MM, PROFILE, &m);
  return ret;
}

int getprofile(struct prof_data *p)
{
  message m;
  int ret;

  /* put p into message m using an appropriate message type. */
  ...
  ret = _syscall(MM, GETPROF, &m);
  return ret;
}

2. Process Manager's do_prof

The process manager's do_prof function:

Extract pid, startaddr, endaddr, and interval from the message.
If pid is 0, set the endpoint to NONE.
Otherwise,
1. find the corresponding endpoint value.
2. If pid is not found or the endpoint value is not ok, return ESRCH.
  The pm_isokendpt can be used to check that the endpoint value corresponds to a pid that is valid and in use.
3. If endaddr is 0, use the mem_map to set endaddr to the last address in the text segment.
4. Check that 0 <= startaddress <= endadddress. If not return EFAULT.
call sys_prof with the endpoint of the process to be profiled, the startaddr, the endaddr, and the interval:
```
ret = sys_prof(endpt, startaddr, endaddr, interval);
```
Return ret

3. The sys_prof Server Library Function

The sys_prof function should just package the 4 parameters into a message and send it to the system task:

  ...
  ret = _taskcall(SYSTASK, SYS_PROF, &m);
  return ret;

4. System's New Data

Most of the kernel's data structures other than the process table are defined in /usr/src/kernel/table.c.

However, these data structures are declared in /usr/src/kernel/glo.h so other files can just include glo.h.

The table.c file also includes glo.h, but uses the EXTERN trick so that in table.c, these declarations become definitions.

New data is needed for profiling in the both the system and clock tasks.
1. struct prof_data prof_data; - structure for recording profiling data.
2. int profiling; - 0 means not profilling; 1, profiling.
3. int profiled_endpt; - If profiling, the endpoint of the profiled process.
This data can be declared in /usr/kernel/glo.h again using the EXTERN trick which will make it be defined in /usr/kernel/table.c.

It will then be accessible to both the system task (for the do_prof and do_getprof functions) and also in the clock task (for the clock_handler function).

5. The System Task's do_prof Function

Extract the endpoint, startaddr, endaddr, and interval ( = bin size) from the message.
If endpoint is NONE, stop profiling and return OK.
Check if already profiling, if so return EBUSY.
Start profiling if endpoint is valid. (Use isokendpt to check validity and simultaneously get the proc number.)

Zero out the bin counters:

  physaddr = umap_local(proc_addr(-2), D, (vir_bytes) &prof_data,
  sizeof(prof_data));
  if ( physaddr == 0 ) {
    return EGENERIC;
  }
  phys_memset(physaddr, 0, sizeof(prof_data));

Note 1: -2 is the proc number (and endpoint value) for the System Task and proc_addr converts
this to a pointer to the entry in the proc table.

Note 2: umap_local should be used to get the physical address of the
prof_data so that the fast phys_memset function can be used to
reinitialize the prof_data struct's bin counters to all 0's when
profiling a new process.

Next initialize the prof_data's start_addr and end_addr.
Calculate the minimum bin size so that 1024 bins will cover the the address range end_addr - start_addr and set the prof_data bin_sz to this minimum size.
If the interval is > minimum bin_size, set the bin_sz = interval.

Calculate the prof_data.nr_bins, making sure to round up if necessary:

  int ar = prof_data.end_addr - prof_data.start_addr;
  prof_data.nr_bins = ar/prof_data.bin_sz;
  if ( ar > prof_data.bin_sz * prof_data.nr_bins ) 
    prof_data.nr_bins++;

Turn profiling on.
return OK.

Since there are a fixed number of 1024 bins, profiling granularity will depend on the length of the address range being profiled. Each bin should represent the minimum number of bytes so that the whole range being profiled can be contained in the range covered by no more than 1024 bins.

Concretely, here are a few examples of the bit of calculation required for the bin size and number of bins used.

Divide the length of the address range to be profiled by 1024 and round up to find how many bytes fall in one bin. I.e. to profile a region of 40008 bytes of code,

Integer division:
prof_data.bin_sz = 40008 / 1024 = 39    
(39.0703125, but integer division discards the decimal part)

Round up:
Since prof_data.bin_sz * 1024 = 39 * 1024 = 39936 < 40008
we must increase the bin size by 1. So final prof_data.bin_sz = 40.

So we get 40 bytes per bin is the minimum bin size to cover the 40008 address range. How many bins are used?

Divide the address range by the bin size and round up:

prof_dat.nr_bins = 40008/40 = 1000  (1000.2, but integer division
                                     discards the .2)

40 * 1000 = 40000 < 40008, 
So prof_data.nr_bins++; and final value of prof_data.nr_bins is 1001

Summary: 
Bin size is 40
Number of bins actually used is 1001
First 1000 bins cover 40 * 1000 = the first 40000 bytes,
The last bin used only covers the last 8 bytes of the address range.

If the size of the code to be profiled were 1025 bytes, then only the first 513 bins should be used, each bin representing 2 bytes (the 513th bin representing just 1 byte), and the last 511 bins would remain unused.

6. The getprofile System Call

This is similar to getting the mem_maps except that the data is in the kernel (system task), rather than in a server process.

Note that the caller of getprofile is typically not the same as the process that was profiled.

getprofile sends message to process manager with the address of a user's struct prof_data.
do_getprof in process manager extracts the same address, addr, and calls sys_getprof with the endpoint of the caller and this address:
```
ret = sys_getprof(who_e, addr);
return ret;
```
do_getprof in the system task:
1. Extract the address of the caller's struct prof_data from the message.
2. Initialize two struct vir_addr's one for the source (System task's prof_data struct) and one for the caller's prof_data struct.
```
struct vir_addr {
  int proc_nr_e;  /* process endpoint */
  int segment;    /* Data segment: D=1 */
  vir_bytes offset; /* virtual address */
};
```
  The endpoint for the system task is SYSTEM = -2:
```
  src.proc_nr_e = SYSTEM;
```
3. Use the vir_copy function in the kernel to copy the struct kernel's prof_data struct to the caller's. Return EFAULT if vir_copy fails:
```
  ret = virtual_copy(&src, &dst, sizeof(prof_data));
  if ( ret < 0 ) {
    return EFAULT;
  } else {
    return OK;
  }
```

7. Collecting Profile Data: The clock_handler Function

The clock_handler function in /usr/src/kernel/clock.c executes on each clock tick (once every 16.67ms):

The relevant section of clock_handler for profiling after a few instructions is:

PUBLIC void clock_handler(irq_hook_t *hook)
{
   ...

  /* Update user and system accounting times. Charge the current process for
   * user time. If the current process is not billable, that is, if a non-user
   * process is running, charge the billable process for system time as well.
   * Thus the unbillable process' user time is the billable user's system time.
   */
  proc_ptr->p_user_time += ticks;
  if (priv(proc_ptr)->s_flags & PREEMPTIBLE) {
    proc_ptr->p_ticks_left -= ticks;

    /* ADD CODE HERE FOR COLLECTING DATA IF PROFILING */
    
  }
  if (! (priv(proc_ptr)->s_flags & BILLABLE)) {
      bill_ptr->p_sys_time += ticks;
      bill_ptr->p_ticks_left -= ticks;
  }
  ...
}

Check if profiling is on (profiling == 1). If not do nothing.
If profiling
1. if the profiled_endpt is for a process that is no longer alive (use the isokendpt function) or it is alive, but it's p_rts_flags entry in the proc table has NO_PRIORITY set, then stop profiling. NO_PRIORITY is set when a process is stopped.
2. Otherwise check if the current process (pointed to by proc_ptr) is the the process being profiled (check if proc_ptr's endpoint is the same as the profiled_endpoint value.)
3. If the current process is the profiled process:
  1. get the program counter address from the proc table entry: addr = proc_ptr->reg.pc;
  2. check if this address is in the range to profile specified by the prof_data struct.
  3. If so, calculate the bin number this address belongs to.
  4. Check if incrementing the bin counter will result in overflow of the counter. If so, turn off profiling.
  5. Otherwise, if overflow will not occur, increment the bin counter.
Notes:
- In the kernel, use isokendpt to check that an endpoint value corresponds to a live process and if so, convert the endpoint value to a process number.
- After getting the process number, it is more reliable to use proc_addr to convert the process number to a pointer to the process table entry for the process:
For example:
```
    int proc_nr;
    struct proc *ptr;
    if ( isokendpt(endpt, &proc_nr) ) {
        /* process is alive */
       ptr = proc_addr(proc_nr);
       
      /**  Now access some proc table entry for the process using ptr **/
    }
```

Implementation of profil Utility

The new system calls profile and getprofile can be used to implement the profil utility:

profil [ options ] profiledProg [ profiledProg_args]

options: (in any order)

   -s startaddr 

   -e endaddr 

   -i interval

Outline

Process the command line arguments to get:
- startaddress
- endaddress
- name (path) of the program to profile
- command line arguments if any for the profiled program
Read the header record of the profiled program file.
Check that it is really an executable file (check the magic number), print error and exit if not.
Fork a child to execute the program to be profiled.
Parent calls waitpid to wait for the profiled program to terminate.
Meanwhile, the child should check addresses. If the endaddress is 0, assign the text size - 1. The text size is in the a_text member of the struct exec header record. Then check make the address checks. If they fail set errno = EFAULT, call the new profile system call function with its own pid and the startaddr, endaddr, and interval that were determined from profil's command line arguments.

If profile returned failure (-1) or the address check failed the child should use a switch statement on errno to determine the reason and print an appropriate error message and exit with a special exit value, e.g. 0xCC.

Possible messages:

errno value	Message
ESRCH	Profil Error: No process for profiled program
EFAULT	Profil Error: Address range invalid
EBUSY	Profil Error: Already profiling another process
EGENERIC	Profil Error: Generic error

If profile's return is ok (not < 0), the child should call execvp with the profiled program name and an array of the command line arguments for the profiled program.
If execvp returns, it failed. In this case the child should print a message that the program can't be executed and exit with a value of 0xCC
The parent needs to see if the child exited with the 0xCC value or not. If the child exited with value 0xCC, an error message should have been printed by the child and profiling data was not gathered.
```
    waitpid(child, &status, 0);
    if ( WEXITSTATUS(status) != 0xCC ) {
      printRange();
    }
```

What to turn in.

A copy of the functions you added to Minix to implement the two system calls, profile and getprofile.
A copy of your modified clock_handler function.
Sample output runs using the profil utility:
1. profil -e 0x20 -s 0x300 fib
2. profil -i 128 fib
  
  while already running the same command in another Minix's virtual terminals (or other login terminal).
3. profil fib.c
4. profil -i 256 -e 0x8000 fib
5. profil -i 512 -s 0x200 -e 0 fib
6. profil -s 0x50 -i 0x1a -e 0x247 fib
7. profil -s 0x50 -e 0x247 -i 0x49 fib
An easy way to do this:

Run the test, and use vmw to copy the output to the clipboard.
```
      $ profil -s 0x50 -i 0x1a -e 0x247 fib  | vmw co
```
Then in the host machine open an editor and paste.
A readme.txt file. Give a brief explanation of the output you observed for each of the tests above in 3.

If your profiling additions are not working correctly or not complete, give (a) a brief discussion of what you have added that you think is correct and (b) what problems remain and if you have ideas about solving these problems.