CSC443/343 Project #4
Adding Profiling to Minix
Introduction
Profilers gather statistics about how much time is spent in which part of a program.
Once the program has been run and the statistics gathered, a report can produce valuable information about where to try to improve the efficiency of the program.
Gathering time measurements at the user level is not practical, and so the operating system needs to provide this service.
Although there are system calls related to timing, this involves message passing between the user level program and the system servers processes and system tasks.
Minix doesn't provide any separate profiling functionality.
Your job is to add profiling capability to Minix.
To profile a process, the operating system allocates a large number of counters, called bins, each corresponding to a small range of bytes in the program's code. The more counters, the finer the profiling granularity. You need to take advantage of clock interrupts to have the operating system to periodically check (many times per second) which instruction the process is executing, and increase the respective bin by one. Eventually, the bins' values should give a reasonably good approximation of the percentage of time the CPU spent within each corresponding address range of the program code.
You will also need to write a utility program, prof that runs a program while profiling it and then prints the profiling statistics for that run.
Part I: New Minix System Calls.
You have to modify the kernel to support profiling. You should define and implement two new system calls, profile and getprofile:
profile
int profile(int pid, int startaddr, int endaddr, int interval)
Starts profiling. Takes three parameters: the PID of the process to be profiled, and the lowest and highest addresses of the range of code to be profiled and the size of size of the intervals in this range.
- pid - the pid of the process to be profiled
- startaddr - the beginning instruction address of the range of the text segment to be profiled.
- endaddr - the ending instruction address of the range of the text segment to be profiled.
- interval - the size of the intervals into which the profiled
address range should be split.
E.g., if staraddr is 0 endaddr is 400, and interval is 64, the bins would be:
bin 0: 0 to 63 bin 1: 64 to 127 bin 2: 128 to 191 bin 3: 192 to 255 bin 4: 256 to 319 bin 5: 320 to 383 bin 6: 384 to 447 Note: Only address references up to address 400 will be counted in bin 6, the last bin.
There is a maximum of 1024 bins. If the interval size requested is too small, 1024 bins may be insufficient to cover the address range from startaddr to endaddr.
This should be treated as an error; that is, the number of bins is fixed at a maximum of 1024.
Calling profile with PID 0 (zero), should stop the ongoing profiling (if any).
Calling profile with PID != 0 while already profiling some process, should result in an error.
Calling profile with highest address 0, should profile the text segment from startaddr to the end of the text segment. In particular, if both startaddr and endaddr are 0, the entire text segment should be profiled.
Return values: Upon successful completion, profile returns 0. Otherwise, a value of -1 is returned and errno is set to indicate the error. The following error codes are defined for profile:
ESRCH: no process with the given pid exists (except for the special case of pid 0, which should never return any errors)
EFAULT: if there is some error related to the memory arguments: E.g,
- endaddr not 0 and startaddr > endaddr
- interval too small - would require more than the maximum number of bins.
- startaddr and/or endaddr not within the text segment
EBUSY: if the kernel is currently profiling some other process
EGENERIC: for any other type of failure (if any)
Note: Profiling of a process stops when profile with pid=0 is called, -OR- automatically by the kernel when the process terminates, -OR- automatically by the kernel when any bin counter exceeds its maximum value (i.e., a bin overflows).
getprofile
int getprofile(struct prof_data *p)
Retrieves the information collected by the last profiling, or the current profiling data if the kernel is currently profiling a process.
Takes one argument: the address of a struct to receive the profiling data of the last process profiled.
- p - a pointer to a struct prof_data to hold the profiling data of the last process profiled.
Return values: Upon successful completion, GETPROF returns 0. If an error occurs, GETPROF returns -1. Errno is set to indicate the error. The following error codes are defined:EFAULT: a memory error has occurred (e.g., trying to write to memory not belonging to the calling user process)
-
profile.h
The information gathered in the kernel should be stored in the the struct type struct prof_data. The user of getprofile function will need to know this type and so its declaration should be in the header file, profile.h, that has the declarations of the two system calls.
Here is the profile.h header to use with profile and getprofile:
#ifndef PROFILE_H #define PROFILE_H struct prof_data { int start_addr; int end_addr; int nr_bins; int bin_sz; unsigned long bin[1024]; } prof_data; int profile(int pid, int start_addr, int end_addr, interval); int int getprofile(struct prof_data *p); #endif
Part II: The profil Utility
The profil utility can be written using the new system calls. It behaves like a temporary shell that
- Starts profiling on a target program
- Executes the target program with profiling active
- When the target program finishes, retrieves the profile statistics
- Prints a report of the profile statistics
The profil utility should take several command line arguments:
profil [ options ] profiledProg [ profiledProg_args] options: (in any order) -s startaddr The starting address of the address range in the profiledProg to collect profile statistics. Default is 0 if omitted. -e endaddr The ending address of the address range in the profiledProg to collect profile statistics. Default is the end of the profiledProg's text segment if omitted. -i interval Profile statistics are to be accumulated for each successive block (called a bin) of length 'interval' bytes within the profiled address range beginning with the startaddr. Default is the minimum possible so that the address range in the profiledProg can be covered by the maximum number of bin counters. Note: For -s and -e, addresses may be given in decimal or in hex. To specify the address in hex the address should be preceded by 0x. The interval size specified by -i may also be given in either decimal or hex. Examples: profil fib profile the fib program start address = 0 end address = end of text segment for fib bin size = smallest possible bin size so that the profiled segment can be covered by the fixed number of bins available in the kernel. profil -i 0x100 fib profile the fib program start address = 0 end address = end of text segment for fib bin size = 0x100 = 256 bytes profil -s 0x80 -e 0x1000 -i 128 fib profile the fib program start address 0x80 = 128 end address 0x1000 = 4096 bin size = 128 bytes profil -i 128 -s 0x80 -e 0x1000 fib same as previous example profil -i 0x100 /bin/ls -Rl / execute and profile the ls command with ls command line arguments -RL / The -l ls option means long listing of each directory entry. The -R option means recursively list subdirectories The / command argument means ls should begin in the root directory. start address = 0 (in the ls text segment) end address = end of the text segment for ls bin size = 0x100 = 256 bytes
A Test Program
For short simple programs, the profile statistics can easily be 0 since clock ticks are relatively infrequent (only 1 each 16.67ms).
So to observe statistics in user code, test programs must be written take some time.
A second problem is in knowing what part of the program corresponds to what addresses.
Here is a sample program that solves the first problem by:
- Calling a the fibonnaci function fib that is doubly recursive for a moderately large value (38).
- Calling a function loop that repeatedly computes square roots (without calling a library function), and
- Calling a function overflow that increments a counter until integer overflow occurs
The second problem is addressed by having the function print the addresses of main, the 3 functions above, and the address of printf.
Here is the program fib.c (source):
1 #include <stdio.h> 2 #include <stdlib.h> 3 #include <math.h> 4 5 double fib(int n) 6 { 7 double r; 8 9 if (n == 0 || n == 1 ) { 10 return 1.0; 11 } else { 12 return fib(n-1) + fib(n-2); 13 } 14 } 15 16 void loop(int n) 17 { 18 int i, k; 19 double x; 20 double d, a; 21 22 for(i = 0; i < n; i++) { 23 for(k = 0; k < n; k++) { 24 x = (i < k )? k : i + 1; 25 a = x/2; 26 d = a*a - x; 27 d = ( d < 0 ) ? -d : d; 28 while( d > .0001 ) { 29 a = (x/a + a)/2.0; 30 d = a * a - x; 31 d = (d < 0)? -d : d; 32 } 33 } 34 } 35 } 36 void overflow(void) 37 { 38 int n = 1; 39 while( n > 0 ) { 40 n++; 41 } 42 } 43 int main(int argc, char *argv[]) 44 { 45 double x; 46 int n = 38; 47 x = fib(n); 48 loop(1000); 49 overflow(); 50 printf("fib(%d) = %f\n", n, x); 51 printf("fib.main: %p\n", main); 52 printf("fib.fib: %p\n", fib); 53 printf("fib.loop: %p\n", loop); 54 printf("fib.overflow: %p\n", overflow); 55 printf("fib.printf: %p\n", printf); 56 return 0; 57 }
Sample Run of profil
Here are two sample runs of profil, profiling the fib program.
$ profil -s 0x50 -e 0x300 -i 128 fib ************************* Output of 'fib' **************************** fib(38) = 63245986.000000 fib.main: 0x247 fib.fib: 0x50 fib.loop: 0x99 fib.overflow: 0x22d fib.printf: 0x1794 ************************* Profile Statistics **************************** 1024 maximum bins Profiled Text Size = 688 = 0x2b0 688/1024 = 0.672 Minimum bytes/bin: 1 Bin Size: 128 Bins used: 6 Last bin represents last 80 bytes start_addr = 0x50, end_addr = 0x300, nr_bins = 6, bin_sz = 128 Total ticks in all bins: 296 ....................................maximum percentage interval: 70% 0050-00cf (28.04%) : ************ (83) 00d0-014f ( 0.68%) : * (2) 0150-01cf ( 2.03%) : * (6) 01d0-024f (69.26%) : **************************** (205) Percentage time in shown intervals: 100.00 $ profil -s 0x50 -e 0x300 -i 0x49 fib ************************* Output of 'fib' **************************** fib(38) = 63245986.000000 fib.main: 0x247 fib.fib: 0x50 fib.loop: 0x99 fib.overflow: 0x22d fib.printf: 0x1794 ************************* Profile Statistics **************************** 1024 maximum bins Profiled Text Size = 688 = 0x2b0 txtSize/1024 = 0.672 Minimum bytes/bin: 1 Bin Size: 73 Bins used: 10 Last bin represents last 42 bytes fib start_addr = 0x50, end_addr = 0x300, nr_bins = 10, bin_sz = 73 Total of all bins: 305 ..................................................maximum percentage interval: 69% interval = 73 *** ====> prf.bin[0] = 76 0050-0098 (24.92%) : ********** (76) 00e2-012a ( 0.66%) : * (2) 0174-01bc ( 3.28%) : ** (10) 01bd-0205 ( 2.30%) : * (7) 0206-024e (68.85%) : **************************** (210) Percentage time in shown intervals: 100.00
System Call Implementation Notes
1. User Library Functions
The profile and getprofile user library functions follow the same pattern we've seen before, passing messages to the process manager.
Assuming PROFILE and GETPROF are selected as the system call numbers for the two system calls, profile and getprofile follow the usual pattern:
int profile(int pid, int startaddr, int endaddr, int interval) { message m; int ret; /* put pid, startaddr, endaddr, and interval into message m using an appropriate message type. */ ... ret = _syscall(MM, PROFILE, &m); return ret; } int getprofile(struct prof_data *p) { message m; int ret; /* put p into message m using an appropriate message type. */ ... ret = _syscall(MM, GETPROF, &m); return ret; }
2. Process Manager's do_prof
The process manager's do_prof function:
- Extract pid, startaddr, endaddr, and interval from the message.
- If pid is 0, set the endpoint to NONE.
- Otherwise,
- find the corresponding endpoint value.
- If pid is not found or the endpoint value is not ok, return
ESRCH.
The pm_isokendpt can be used to check that the endpoint value corresponds to a pid that is valid and in use.
- If endaddr is 0, use the mem_map to set endaddr to the last address in the text segment.
- Check that 0 <= startaddress <= endadddress. If not return EFAULT.
- call sys_prof with the endpoint of the process to be
profiled, the startaddr, the endaddr, and the interval:
ret = sys_prof(endpt, startaddr, endaddr, interval);
- Return ret
3. The sys_prof Server Library Function
The sys_prof function should just package the 4 parameters into a message and send it to the system task:
... ret = _taskcall(SYSTASK, SYS_PROF, &m); return ret;
4. System's New Data
Most of the kernel's data structures other than the process table are defined in /usr/src/kernel/table.c.
However, these data structures are declared in /usr/src/kernel/glo.h so other files can just include glo.h.
The table.c file also includes glo.h, but uses the EXTERN trick so that in table.c, these declarations become definitions.
-
New data is needed for profiling in the both the system and clock tasks.
- struct prof_data prof_data; - structure for recording profiling data.
- int profiling; - 0 means not profilling; 1, profiling.
- int profiled_endpt; - If profiling, the endpoint of the profiled process.
This data can be declared in /usr/kernel/glo.h again using the EXTERN trick which will make it be defined in /usr/kernel/table.c.
It will then be accessible to both the system task (for the do_prof and do_getprof functions) and also in the clock task (for the clock_handler function).
5. The System Task's do_prof Function
- Extract the endpoint, startaddr, endaddr, and interval ( = bin size) from the message.
- If endpoint is NONE, stop profiling and return OK.
- Check if already profiling, if so return EBUSY.
- Start profiling if endpoint is valid. (Use isokendpt to check validity and simultaneously get the proc number.)
- Zero out the bin counters:
physaddr = umap_local(proc_addr(-2), D, (vir_bytes) &prof_data, sizeof(prof_data)); if ( physaddr == 0 ) { return EGENERIC; } phys_memset(physaddr, 0, sizeof(prof_data)); Note 1: -2 is the proc number (and endpoint value) for the System Task and proc_addr converts this to a pointer to the entry in the proc table. Note 2: umap_local should be used to get the physical address of the prof_data so that the fast phys_memset function can be used to reinitialize the prof_data struct's bin counters to all 0's when profiling a new process.
- Next initialize the prof_data's start_addr and end_addr.
- Calculate the minimum bin size so that 1024 bins will cover the the address range end_addr - start_addr and set the prof_data bin_sz to this minimum size.
- If the interval is > minimum bin_size, set the bin_sz = interval.
- Calculate the prof_data.nr_bins, making sure to round up
if necessary:
int ar = prof_data.end_addr - prof_data.start_addr; prof_data.nr_bins = ar/prof_data.bin_sz; if ( ar > prof_data.bin_sz * prof_data.nr_bins ) prof_data.nr_bins++;
- Turn profiling on.
- return OK.
Since there are a fixed number of 1024 bins, profiling granularity will depend on the length of the address range being profiled. Each bin should represent the minimum number of bytes so that the whole range being profiled can be contained in the range covered by no more than 1024 bins.
Concretely, here are a few examples of the bit of calculation required for the bin size and number of bins used.
Divide the length of the address range to be profiled by 1024 and round up to find how many bytes fall in one bin. I.e. to profile a region of 40008 bytes of code,
Integer division: prof_data.bin_sz = 40008 / 1024 = 39 (39.0703125, but integer division discards the decimal part) Round up: Since prof_data.bin_sz * 1024 = 39 * 1024 = 39936 < 40008 we must increase the bin size by 1. So final prof_data.bin_sz = 40.
So we get 40 bytes per bin is the minimum bin size to cover the 40008 address range. How many bins are used?
Divide the address range by the bin size and round up:
prof_dat.nr_bins = 40008/40 = 1000 (1000.2, but integer division discards the .2) 40 * 1000 = 40000 < 40008, So prof_data.nr_bins++; and final value of prof_data.nr_bins is 1001 Summary: Bin size is 40 Number of bins actually used is 1001 First 1000 bins cover 40 * 1000 = the first 40000 bytes, The last bin used only covers the last 8 bytes of the address range.
If the size of the code to be profiled were 1025 bytes, then only the first 513 bins should be used, each bin representing 2 bytes (the 513th bin representing just 1 byte), and the last 511 bins would remain unused.
6. The getprofile System Call
This is similar to getting the mem_maps except that the data is in the kernel (system task), rather than in a server process.
Note that the caller of getprofile is typically not the same as the process that was profiled.
- getprofile sends message to process manager with the address of a user's struct prof_data.
- do_getprof in process manager extracts the same address, addr, and
calls sys_getprof with the endpoint of the caller and this address:
ret = sys_getprof(who_e, addr); return ret;
- do_getprof in the system task:
- Extract the address of the caller's struct prof_data from the message.
- Initialize two struct vir_addr's one for the source
(System task's prof_data struct) and one for the caller's
prof_data struct.
struct vir_addr { int proc_nr_e; /* process endpoint */ int segment; /* Data segment: D=1 */ vir_bytes offset; /* virtual address */ };
The endpoint for the system task is SYSTEM = -2:
src.proc_nr_e = SYSTEM;
- Use the vir_copy function in the kernel to copy
the struct kernel's prof_data struct to the
caller's. Return EFAULT if vir_copy fails:
ret = virtual_copy(&src, &dst, sizeof(prof_data)); if ( ret < 0 ) { return EFAULT; } else { return OK; }
7. Collecting Profile Data: The clock_handler Function
The clock_handler function in /usr/src/kernel/clock.c executes on each clock tick (once every 16.67ms):
The relevant section of clock_handler for profiling after a few instructions is:
PUBLIC void clock_handler(irq_hook_t *hook) { ... /* Update user and system accounting times. Charge the current process for * user time. If the current process is not billable, that is, if a non-user * process is running, charge the billable process for system time as well. * Thus the unbillable process' user time is the billable user's system time. */ proc_ptr->p_user_time += ticks; if (priv(proc_ptr)->s_flags & PREEMPTIBLE) { proc_ptr->p_ticks_left -= ticks; /* ADD CODE HERE FOR COLLECTING DATA IF PROFILING */ } if (! (priv(proc_ptr)->s_flags & BILLABLE)) { bill_ptr->p_sys_time += ticks; bill_ptr->p_ticks_left -= ticks; } ... }
- Check if profiling is on (profiling == 1). If not do nothing.
- If profiling
- if the profiled_endpt is for a process that is no longer alive (use the isokendpt function) or it is alive, but it's p_rts_flags entry in the proc table has NO_PRIORITY set, then stop profiling. NO_PRIORITY is set when a process is stopped.
- Otherwise check if the current process (pointed to by proc_ptr) is the the process being profiled (check if proc_ptr's endpoint is the same as the profiled_endpoint value.)
- If the current process is the profiled process:
- get the program counter address from the proc table entry: addr = proc_ptr->reg.pc;
- check if this address is in the range to profile specified by the prof_data struct.
- If so, calculate the bin number this address belongs to.
- Check if incrementing the bin counter will result in overflow of the counter. If so, turn off profiling.
- Otherwise, if overflow will not occur, increment the bin counter.
Notes:
In the kernel, use isokendpt to check that an endpoint value corresponds to a live process and if so, convert the endpoint value to a process number.
After getting the process number, it is more reliable to use proc_addr to convert the process number to a pointer to the process table entry for the process:
For example:
int proc_nr; struct proc *ptr; if ( isokendpt(endpt, &proc_nr) ) { /* process is alive */ ptr = proc_addr(proc_nr); /** Now access some proc table entry for the process using ptr **/ }
Implementation of profil Utility
The new system calls profile and getprofile can be used to implement the profil utility:
profil [ options ] profiledProg [ profiledProg_args] options: (in any order) -s startaddr -e endaddr -i interval
Outline
Process the command line arguments to get:
- startaddress
- endaddress
- name (path) of the program to profile
- command line arguments if any for the profiled program
Read the header record of the profiled program file.
Check that it is really an executable file (check the magic number), print error and exit if not.
Fork a child to execute the program to be profiled.
Parent calls waitpid to wait for the profiled program to terminate.
Meanwhile, the child should check addresses. If the endaddress is 0, assign the text size - 1. The text size is in the a_text member of the struct exec header record. Then check make the address checks. If they fail set errno = EFAULT, call the new profile system call function with its own pid and the startaddr, endaddr, and interval that were determined from profil's command line arguments.
If profile returned failure (-1) or the address check failed the child should use a switch statement on errno to determine the reason and print an appropriate error message and exit with a special exit value, e.g. 0xCC.
Possible messages:
errno value Message ESRCH Profil Error: No process for profiled program EFAULT Profil Error: Address range invalid EBUSY Profil Error: Already profiling another process EGENERIC Profil Error: Generic error If profile's return is ok (not < 0), the child should call execvp with the profiled program name and an array of the command line arguments for the profiled program.
If execvp returns, it failed. In this case the child should print a message that the program can't be executed and exit with a value of 0xCC
The parent needs to see if the child exited with the 0xCC value or not. If the child exited with value 0xCC, an error message should have been printed by the child and profiling data was not gathered.
waitpid(child, &status, 0); if ( WEXITSTATUS(status) != 0xCC ) { printRange(); }
What to turn in.
- A copy of the functions you added to Minix to implement the two system calls, profile and getprofile.
- A copy of your modified clock_handler function.
- Sample output runs using the profil utility:
- profil -e 0x20 -s 0x300 fib
- profil -i 128 fib
while already running the same command in another Minix's virtual terminals (or other login terminal).
- profil fib.c
- profil -i 256 -e 0x8000 fib
- profil -i 512 -s 0x200 -e 0 fib
- profil -s 0x50 -i 0x1a -e 0x247 fib
- profil -s 0x50 -e 0x247 -i 0x49 fib
An easy way to do this:
Run the test, and use vmw to copy the output to the clipboard.
$ profil -s 0x50 -i 0x1a -e 0x247 fib | vmw co
Then in the host machine open an editor and paste.
A readme.txt file. Give a brief explanation of the output you observed for each of the tests above in 3.
If your profiling additions are not working correctly or not complete, give (a) a brief discussion of what you have added that you think is correct and (b) what problems remain and if you have ideas about solving these problems.