typedef struct {
  endpoint_t m_source;    /* who sent the message */
  int m_type;             /* what kind of message is it */
  union {
    mess_1 m_m1;
    mess_2 m_m2;
    mess_3 m_m3;
    mess_4 m_m4;
    mess_5 m_m5;
    mess_7 m_m7;
    mess_8 m_m8;
    mess_6 m_m6;
    mess_9 m_m9;
  } m_u;
} message;
Minix message structures are C structs with two fixed members, one for the message source and one for the message type, plus a union member. The union member is a union of 9 different struct types.
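For example, a sender fills in one variant of the union and the receiver reads the same variant back. A minimal sketch (the m1_i1 shorthand macro from minix/ipc.h expands to m_u.m_m1.m1_i1; the request code 42 is made up):

message m;
m.m_type = 42;     /* agreed-upon request code (made-up value) */
m.m1_i1 = 123;     /* shorthand macro for m.m_u.m_m1.m1_i1 */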
Instead of paging, Minix 3 uses 3 virtual segments for each process:
Text (code) segment
Data segment
Stack segment
Unlike pages, segments are not all the same
size.
A segment is specified by its beginning address and its length.
A Minix3 process's segments are loaded completely into memory;
that is, there are no page faults or segment faults to deal
with.
However, a segment does have a virtual address and can be loaded anywhere in physical memory where there is room for it. The beginning physical address for the segment must be recorded in a table.
This table is represented by the C struct mem_map:
/* Memory map for local text, stack, data segments. */
struct mem_map {
  vir_clicks mem_vir;      /* virtual address */
  phys_clicks mem_phys;    /* physical address */
  vir_clicks mem_len;      /* length */
};
Note: what is a click? It is determined in /usr/src/include/minix/const.h:

#define CLICK_SIZE 4096    /* unit in which memory is allocated */
The pm's process table begins with an array of 3 mem_maps:

EXTERN struct mproc {
  struct mem_map mp_seg[NR_LOCAL_SEGS];  /* points to text, data, stack */
  char mp_exitstatus;      /* storage for status when process exits */
  char mp_sigstatus;       /* storage for signal # for killed procs */
  pid_t mp_pid;            /* process id */
  ...
mp_seg[0] is for the text segment,
mp_seg[1] is for the data segment,
mp_seg[2] is for the stack segment.
There are #defines for 0, 1, and 2 to help avoid getting the segments mixed up:

#define NR_LOCAL_SEGS 3    /* # local segments per process (fixed) */
#define T 0                /* proc[i].mem_map[T] is for text */
#define D 1                /* proc[i].mem_map[D] is for data */
#define S 2                /* proc[i].mem_map[S] is for stack */
EXTERN struct proc proc[NR_TASKS + NR_PROCS];  /* process table */
Compare this with pm's process table mproc:
struct mproc mproc[NR_PROCS];
NR_TASKS and NR_PROCS are defined:
#define NR_PROCS 100
#define NR_TASKS 4
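For example, with these declarations the stack segment length (in clicks) of the process in pm slot k could be read as follows (a sketch, not code from the source tree):

vir_clicks stack_len = mproc[k].mp_seg[S].mem_len;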
/* Kernel tasks. These all run in the same address space. */
#define IDLE -4      /* runs when no one else can run */
#define CLOCK -3     /* alarms and other clock functions */
#define SYSTEM -2    /* request system functionality */
#define KERNEL -1    /* pseudo-process for IPC and scheduling */

/* Number of tasks. Note that NR_PROCS is defined in <minix/config.h>. */
#define NR_TASKS 4
This just means that the kernel manages 4 more processes than
the process manager. That is, the process manager only manages the
processes that run in user mode. The kernel manages all processes
including the tasks which run in kernel mode.
Translating between Kernel's proc and PM's mproc
A bit of bookkeeping is necessary to match up the entries in the two tables. Since the first 4 entries (at indices 0 - 3) of the kernel's table describe the tasks, the entry at index 4 of the kernel's table corresponds to entry 0 of the pm's table. Since NR_TASKS is the number of tasks (4), in general the kernel's proc[k + NR_TASKS] corresponds to the pm's mproc[k].
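As a sketch, the translation in both directions could be captured in two macros (hypothetical names; Minix does not define these):

/* Hypothetical helpers: convert between a pm mproc[] index and the
 * corresponding kernel proc[] index, which is offset by NR_TASKS. */
#define KERNEL_INDEX(pm_index)    ((pm_index) + NR_TASKS)
#define PM_INDEX(kernel_index)    ((kernel_index) - NR_TASKS)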
When a hardware interrupt occurs or a trap instruction is executed, the registers for the RUNNING process are saved in the p_reg member of the process's proc table entry (shown below). Conversely, when a process is scheduled and dispatched, the values from this entry are reloaded into the cpu registers.
However, endpoint values are in general larger numbers and depend on a "generation" value. For a valid endpoint, the generation number is

generation = endpoint / M

where M is the generation "size":

M = NR_TASKS + _MAX_MAGIC_PROC + 1
NR_TASKS = 4
_MAX_MAGIC_PROC = 0x8ace

(See /usr/src/include/minix/endpoint.h.)
When a process is forked, a generation value one
greater than the parent's generation is used to compute the
child's endpoint value together with the index value of a free
slot in the process table.
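In endpoint.h the combination of generation and slot is captured by a macro along these lines (a sketch of the header's definition; check your source tree for the exact form):

#define _ENDPOINT(g, p) \
  ((endpoint_t) ((g) * _ENDPOINT_GENERATION_SIZE + (p)))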
The endpoint for a process can be set to an invalid endpoint value (e.g., NONE). For example, the kernel may need to prevent a process from receiving any more messages if it is terminating.
Here is a portion of the proc struct. The process
table managed by the kernel tasks is also called
proc.
It is an array of these proc structs of size NR_TASKS
+ NR_PROCS.
#ifndef PROC_H
#define PROC_H
/* Here is the declaration of the process table. It contains all process
* data, including registers, flags, scheduling priority, memory map,
* accounting, message passing (IPC) information, and so on.
*
* Many assembly code routines reference fields in it. The offsets to these
* fields are defined in the assembler include file sconst.h. When changing
* struct proc, be sure to change sconst.h to match.
*/
#include <minix/com.h>
#include <minix/portio.h>
#include "const.h"
#include "priv.h"
struct proc {
struct stackframe_s p_reg; /* process' registers saved in stack frame */
struct segframe p_seg; /* segment descriptors */
proc_nr_t p_nr; /* number of this process (for fast access) */
struct priv *p_priv; /* system privileges structure */
short p_rts_flags; /* process is runnable only if zero */
short p_misc_flags; /* flags that do not suspend the process */
char p_priority; /* current scheduling priority */
char p_max_priority; /* maximum scheduling priority */
char p_ticks_left; /* number of scheduling ticks left */
char p_quantum_size; /* quantum size in ticks */
struct mem_map p_memmap[NR_LOCAL_SEGS]; /* memory map (T, D, S) */
struct pagefault p_pagefault; /* valid if PAGEFAULT in p_rts_flags set */
struct proc *p_nextpagefault; /* next on PAGEFAULT chain */
clock_t p_user_time; /* user time in ticks */
clock_t p_sys_time; /* sys time in ticks */
clock_t p_virt_left; /* number of ticks left on virtual timer */
clock_t p_prof_left; /* number of ticks left on profile timer */
struct proc *p_nextready; /* pointer to next ready process */
struct proc *p_caller_q; /* head of list of procs wishing to send */
struct proc *p_q_link; /* link to next proc wishing to send */
int p_getfrom_e; /* from whom does process want to receive? */
int p_sendto_e; /* to whom does process want to send? */
sigset_t p_pending; /* bit map for pending kernel signals */
char p_name[P_NAME_LEN]; /* name of the process, including \0 */
endpoint_t p_endpoint; /* endpoint number, generation-aware */
message p_sendmsg; /* Message from this process if SENDING */
message p_delivermsg; /* Message for this process if MF_DELIVERMSG */
vir_bytes p_delivermsg_vir; /* Virtual addr this proc wants message at */
vir_bytes p_delivermsg_lin; /* Linear addr this proc wants message at */
/* A few more members to save state if a handler detects it has to continue later */
};
For some system calls we need to get data located in a server (or in the kernel) that is larger than any message structure. It is easy to transfer small amounts of information between the process manager and a user process through a system call without directly copying the data: just include the data in the message that is sent/received. The implementation of message passing has to do the copying between the process manager's memory and the user's memory. But that copying was not explicit in the code we wrote for adding a mygetpid or getnc system call.
But the amount of data we need for the profiling cannot be
handled the same way. The message structs are too small to hold the
profile data.
However, one or more of the message variants has pointer
members.
So a user system call can insert the address of a
user struct into the message being passed to the process manager
and the process manager can insert this same pointer into a
message sent to the system task.
But this pointer value is a virtual address in the
user's address space, not a physical address.
Similarly, the address of the collected data is a virtual address in the system task's address space.
The kernel task has the privilege to copy data from any part of
physical memory to any other part. To copy from virtual address A
in process 1's address space to a virtual address B in process 2's
address space, the kernel task needs to know
the virtual address A and the process (1)
the virtual address B and the process (2)
the number of bytes to copy
The system task could then copy the data as follows:
translate virtual address A to a physical address
translate virtual address B to a physical address
copy the specified number of bytes from the source physical address to the destination physical address
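In code, the kernel-side sequence might look like this sketch, where umap stands in for the kernel's address-translation routine (umap_local, discussed later):

phys_bytes src_phys, dst_phys;
src_phys = umap(proc1, seg1, vir_a, count);   /* step 1: translate A */
dst_phys = umap(proc2, seg2, vir_b, count);   /* step 2: translate B */
if (src_phys != 0 && dst_phys != 0)           /* 0 signals an invalid address */
  phys_copy(src_phys, dst_phys, (phys_bytes) count);  /* step 3: raw copy */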
Minix 3 uses segments without paging. A process has 3
local segments:
text
data
stack
Memory allocation in Minix 3 is not dynamic. A process gets a
fixed amount of physical memory. E.g., the illustration shows the
segments and gap for a process that has been allocated a block of
size 10K from physical address 200K up to 210K.
200K (0x32000)  +---------+
                |  Text   |
203K (0x32C00)  +---------+
                |  Data   |
207K (0x33C00)  +---------+
                |   Gap   |
208K (0x34000)  +---------+
                |  Stack  |
210K (0x34800)  +---------+
The corresponding mem_map (all values in clicks; this example works out with 1K clicks, so 0xC8 clicks = 200K):

Segment   Virtual   Physical   Length
Text      0         0xC8       3
Data      0         0xCB       4
Stack     0x5       0xD0       1
In Minix 3 a contiguous block of physical memory is allocated for a
process. This includes the three segments and the gap. During
execution the stack will "grow" into the gap reducing the gap
size during function calls, but no new physical memory is
allocated. Similarly in Minix 3, if a process invokes
malloc no new physical memory is
actually allocated to the process. Minix 3 just adjusts the size
of the data segment within the physical block of memory that is already
allocated to the process.
For each process, its entry in the process manager's process table and also its entry in the kernel's process table contain a data structure (the mem_map array shown earlier) to keep track of the segments for that process.
Addresses and segment lengths are both expressed in clicks rather than bytes. Check the definition in the source code to determine the number of bytes in a click. (I found it to be different than stated in the text.) Memory is allocated in integral multiples of this unit. Given access to a process's mem_map structure, a segment, and a virtual address in that segment, it is straightforward to translate the virtual address to a physical address.
For a virtual address in a segment, its physical address (in bytes)
is:
phys_addr = segment_base_phys_addr + byte_offset
The segment_base_phys_addr is in the mem_map, but it
is in clicks instead of bytes.
The byte_offset is equal to the difference between the byte virtual address and the
segment_base_virtual_addr (in bytes). The segment_base_virtual_addr is also
in the mem_map, but also in clicks instead of bytes.
Putting everything in terms of clicks and the data in the mem_map:

va:    virtual address in bytes
csize: click size in bytes
bva:   segment base virtual address in clicks (mem_vir)
bpa:   segment base physical address in clicks (mem_phys)
pa:    physical address in bytes

pa = bpa * csize + (va - bva * csize)
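The same computation as a C sketch, shifting by CLICK_SHIFT (the log2 of CLICK_SIZE in minix/const.h) instead of multiplying by csize; the helper name seg_vir2phys is ours:

/* Hypothetical helper: translate a byte virtual address va within one
 * segment to a byte physical address using that segment's mem_map entry. */
phys_bytes seg_vir2phys(struct mem_map *mm, vir_bytes va)
{
  vir_bytes base_va = mm->mem_vir << CLICK_SHIFT;    /* clicks -> bytes */
  phys_bytes base_pa = (phys_bytes) mm->mem_phys << CLICK_SHIFT;
  return base_pa + (va - base_va);
}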
Copying data from any physical memory location to any other
physical memory location is possible when the protection level of
the processor (2 bits in the PSW register in the Intel IA32 processor) is at its most
privileged of 4 possible levels.
The function:
PUBLIC void phys_copy(phys_bytes source,
phys_bytes destination,
phys_bytes bytecount);
is compiled into the kernel image.
User level processes and server processes can only use this
function indirectly through system/task calls.
For example, the process manager can make a task call to a kernel routine that uses this function. That is, there is a function in the system library that server processes can call to request that the kernel copy bytes from one virtual memory location to another:
PUBLIC int sys_vircopy(src_proc, src_seg, src_vir,
dst_proc, dst_seg, dst_vir, bytes)
int src_proc; /* source process */
int src_seg; /* source memory segment */
vir_bytes src_vir; /* source virtual address */
int dst_proc; /* destination process */
int dst_seg; /* destination memory segment */
vir_bytes dst_vir; /* destination virtual address */
phys_bytes bytes; /* how many bytes */
The source and destination process parameters are needed so that the kernel routine this system library function sends a message to can translate the virtual addresses to physical addresses and then use the phys_copy function to implement the task call.
The process manager manages its own view of the process table and so can provide process endpoint values, process number values, etc. to identify the source and destination processes in the message sent to the system task by sys_vircopy. It is endpoint values (or SELF) that sys_vircopy expects for the source and destination processes.
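For example, a process manager handler could copy an array out of its own data segment into the caller's space along these lines (a sketch; the seg_copy variable is hypothetical, while who_e and m_in are the PM globals for the caller's endpoint and the incoming message):

struct mem_map seg_copy[NR_LOCAL_SEGS];
int r;

/* ... fill seg_copy from mp->mp_seg ... */
r = sys_vircopy(SELF, D, (vir_bytes) seg_copy,     /* source: PM itself */
                who_e, D, (vir_bytes) m_in.m1_p1,  /* destination: the caller */
                (phys_bytes) sizeof(seg_copy));
if (r != OK) return(r);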
Copying data from one process's address space to another
process's address space presents some challenges and seems like a
general feature that may be needed by the operating system.
But more concretely, a user may want data, say from the process manager's process table, and the data is too big or has too many parts to fit into a message. So in this case we would need to be able to copy data from the process manager (from its process table) into the awaiting variable or struct in the user's process.
We would need to implement a system call from the user to the
process manager passing the address of the user's struct in a
message.
The do_xxxx function in the process manager that handles the system call can in most circumstances do the copying from its own address space into the user process by using the sys_vircopy function, rather than duplicating the work of translating virtual to physical addresses that sys_vircopy does.
The system task's do_copy function handles the task call
initiated by sys_vircopy.
The sys_vircopy function just hides the task call from a
server process (e.g., process manager) to the
system task (the kernel). If the kernel needs to copy data to a
user process, the sys_vircopy function is of no further use. The
kernel already has privileges and access to functions to do the copying.
The function the system task already uses to handle sys_vircopy is:
PUBLIC int virtual_copy(src_addr, dst_addr, bytes)
struct vir_addr *src_addr; /* source virtual address */
struct vir_addr *dst_addr; /* destination virtual address */
vir_bytes bytes; /* # of bytes to copy */
This function can be used for the system task handler for any
new task call that copies data from one process to another.
To call virtual_copy you have to first construct the
parameters. First, the two struct vir_addr parameters specifying
source virtual address and the destination virtual address need to be
built. Here is that struct's definition:
struct vir_addr {
int proc_nr_e;
int segment;
vir_bytes offset;
};
proc_nr_e: endpoint value of the process
segment: This should be T, D, or S (0, 1, or 2)
T for Text, D for Data, S for Stack
offset: Byte offset from the beginning of the segment.
These must of course be set correctly in order for virtual_copy
to work. Several tests are made in that function and the most
likely result is an error code return value if the parameters are
not set up as expected.
To illustrate some points to consider when using
virtual_copy suppose that you are writing a system call for
a user process to get a copy of the kernel's mem_map struct for
the process. That mem_map is in the process table at the
entry for the user process.
You will need to write one system call from the user process to
the process manager and then a task call from the process manager
to the system task.
The user process cannot send its own endpoint value, but the process manager can get this value from the pm's process table. The process manager can then insert the user's endpoint value in a message used in the task call to the system task.
The user also needs to insert information in the message used
in the system call for a struct mem_map declared in the
user's address space to receive the kernel's copy. The virtual
address of this struct is specified by the segment
and by the offset within the segment. So this requires a
bit of reflection. The user can clearly pass the address of its
struct mem_map in the system call (i.e. it can be inserted
into the message sent in the system call to the process
manager). Is this the offset? Which segment is it? Data,
Stack, or possibly even the Text segment?
The user's struct mem_map is data, not text, but that
doesn't completely nail down which segment it is in. The user
could declare this struct locally in a function or globally at
file scope. For the first case, it would be in the stack
segment and for the second, the data segment.
As we saw in the example calculation of the physical address,
the stack segment may not begin at virtual address 0. The
offset from the beginning of the segment is the
difference between the virtual address and the beginning virtual
address of the segment. So a non-zero beginning segment address
would need to be subtracted to get that offset.
Consider a process with this mem_map (values in clicks):

Segment   Virtual   Physical   Length
Text      0         0x151A     2
Data      0         0x151C     1
Stack     0x20      0x153C     1
For virtual address 0x00020DE8, for a process with this mem_map, in which segment is this address located? We first have to convert the byte address to clicks. Assuming the click size is 4096 = 2^12 = 0x1000, dividing by 4096 is the same as right shifting by 12 bits. So 0x00020DE8 is in click 0x0020, i.e. 0x20.
So this address is in the Stack segment, which starts at
virtual address 0x20 * click_size = 0x20000.
The offset in the Stack segment is then clearly
0x20DE8 - 0x20000 = 0xDE8.
The Stack segment begins at physical address 0x153C *
click_size = 0x153C000, so the physical address of 0x20DE8 is
0x153C000 + 0xDE8 = 0x153CDE8.
This we saw before. But determining which segment to use in this way depends on how Minix 3 manages memory. That is, there are Minix 3 segments and there are Intel segments with hardware protection. As far as Intel is concerned, the entire memory for a Minix 3 process is just one Intel segment. The Minix 3 data segment is followed by the gap, which is followed by the stack. Because of this, the virtual addresses of adjacent Minix 3 segments are consecutive.
Suppose we made a mistake and said that 0x00020DE8 was in the
Data segment. If we proceed with the translation anyway, we would
use 0x0 for the beginning virtual address of the data segment and
calculate the offset in the Data segment as
0x20DE8 - 0x0 = 0x20DE8. But the Data segment begins at
physical address 0x151C * click_size = 0x151C000. So the
calculation would give the physical address of 0x20DE8 as
0x151C000 + 0x20DE8 = 0x153CDE8 - the same as
before.
The virtual_copy function calls umap_local to
translate a virtual address for a process to a physical
address.
The equivalence between using the Data segment and the Stack segment just noted ignores the fact that an address could fall in the gap between the two segments. Since the Intel processor hardware is only being used by Minix 3 to protect the whole memory allocated to a process, the processor considers the gap to be just part of the single Intel segment. The umap_local function is careful if the segment number passed in indicates D or S. It assumes the consecutive-address (Intel) segment and checks whether the address is really in the data, the stack, or the gap, returning invalid (0) if it is in the gap. Otherwise, umap_local uses the correct segment, D or S, and makes the usual protection check on the virtual address va, assumed to reference multibyte data that extends through virtual address vc (note the shift is by CLICK_SHIFT, the log2 of the click size, which converts a byte address to a click number):

va >> CLICK_SHIFT < mem_vir + mem_len
and
vc >> CLICK_SHIFT < mem_vir + mem_len
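A paraphrased sketch of that check from the kernel's umap_local (rp points at the process's proc entry; this is not the verbatim source):

vir_clicks vc = (vir_addr + bytes - 1) >> CLICK_SHIFT;   /* last click of the data */
if ((vir_addr >> CLICK_SHIFT) >= rp->p_memmap[seg].mem_vir +
    rp->p_memmap[seg].mem_len)
  return (phys_bytes) 0;    /* starts beyond the segment */
if (vc >= rp->p_memmap[seg].mem_vir + rp->p_memmap[seg].mem_len)
  return (phys_bytes) 0;    /* extends beyond the segment */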
The example is to implement the handler for the task call that
will use virtual_copy to copy a large struct from kernel
memory to a user process's memory.
We have to specify a struct vir_addr to describe the
virtual address of the kernel's source data and another struct
vir_addr to describe the virtual address of the destination
user's variable to receive the copy.
struct vir_addr {
int proc_nr_e;
int segment;
vir_bytes offset;
};
proc_nr_e: endpoint value of the process
segment: This should be T, D, or S (0, 1, or 2)
T for Text, D for Data, S for Stack
offset: Byte offset from the beginning of the segment.
From the discussion above, for copying data you can set the segment to D or S and specify the byte offset as the address of the data item (using a C expression to calculate this address). This should work if the data is in either the data segment or the stack segment. The umap_local function will change the segment if it is not the correct one before calculating the physical address.
Setting Up the Source Data Virtual Address
The source data to copy "from" is the process table p_memmap entry for the user process. To set up the source vir_addr we first need to get the endpoint value of the system task. The process number of the system task is easy. It is SYSTEM. This is a macro defined in /usr/src/include/minix/com.h as (-2). So it is not quite the index into the process table. To get the index into the kernel's version of the process table you have to add NR_TASKS (which is 4). The kernel's process table has 4 more entries than the process manager's process table. So you could use the system task's process number to get its endpoint value, remembering to translate the index into the proc table by NR_TASKS:

src.proc_nr_e = proc[SYSTEM + NR_TASKS].p_endpoint;
The "+ NR_TASKS" is necessary but easy to forget. Another macro
is defined, proc_addr(nr), that returns a pointer to the
proc entry of the process with process number nr. So using
this macro we can write alternatively:
struct vir_addr src;
src.proc_nr_e = proc_addr(SYSTEM)->p_endpoint;
src.segment = D; /* D is a macro equal to 1 for Data segment */
src.offset = ??
The offset should be the virtual address in bytes of the mem_map for the user process. The process manager made this task call and must send something in the message identifying the user process. It could be the user process's pid, but the pid is not in the kernel's proc; it is in the process manager's mproc. The process manager could send the process number or the process endpoint of the user process. The process number is not stored in mproc; the endpoint is. Since the process manager's system call handler will get a pointer to the user's mproc entry, it is easy to get the user's endpoint. Besides, it will be needed anyway to fill in the destination vir_addr for the call to virtual_copy. So... to get the user's entry in the proc table, the message will contain the user's endpoint value. Suppose for a minute that we can translate this endpoint to a proc_nr for the user process. Then the src.offset can be set:

src.offset = (vir_bytes) proc_addr(proc_nr)->p_memmap;
Translating an endpoint to a process number
Another macro (or three):

int isokendpt(int endpt, int *proc_nr);

requires: endpt be a valid endpoint value of a process
postcondition: returns 1 and sets *proc_nr to the process number of the process with the input endpoint, OR returns 0 if an error occurred
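Putting the pieces together for the source (a sketch; user_endpt stands for the endpoint extracted from the request message, and EINVAL is our choice of error code):

int proc_nr;
struct vir_addr src;

if (!isokendpt(user_endpt, &proc_nr)) return(EINVAL);
src.proc_nr_e = proc_addr(SYSTEM)->p_endpoint;
src.segment = D;
src.offset = (vir_bytes) proc_addr(proc_nr)->p_memmap;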
Setting Up the Destination Data Virtual Address
The destination is a bit easier since all the appropriate
values will be sent in the incoming message from the process
manager's task call:
struct vir_addr dst;
dst.proc_nr_e = (extract this from the message)
dst.segment = D;
dst.offset = (vir_bytes) (extract this from the message)
Note: vir_bytes is defined as unsigned int in my Minix 3.
PUBLIC int virtual_copy(src_addr, dst_addr, bytes)
struct vir_addr *src_addr; /* source virtual address */
struct vir_addr *dst_addr; /* destination virtual address */
vir_bytes bytes; /* # of bytes to copy */
After setting up the first two parameters, the last parameter
to pass is the number of bytes to copy. In the example, this
should be the size of the p_memmap array of mem_map
structures. Since this array has 3 elements, this size should be
the same as the sizeof(struct mem_map) multiplied by the
number of local segments, NR_LOCAL_SEGS (3).
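So the call might look like this (a sketch, with src and dst set up as above):

r = virtual_copy(&src, &dst,
                 (vir_bytes) (NR_LOCAL_SEGS * sizeof(struct mem_map)));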
This seems as though it should be the same as copying from kernel space to a user address space. However, the process manager does not have the privilege to call virtual_copy or phys_copy directly. It must "ask" the system task. The system library function sys_vircopy is provided in Minix 3 for that purpose.
PUBLIC int sys_vircopy(src_proc, src_seg, src_vir,
dst_proc, dst_seg, dst_vir, bytes)
int src_proc; /* source process */
int src_seg; /* source memory segment */
vir_bytes src_vir; /* source virtual address */
int dst_proc; /* destination process */
int dst_seg; /* destination memory segment */
vir_bytes dst_vir; /* destination virtual address */
phys_bytes bytes; /* how many bytes */
Note: src_proc and dst_proc should be endpoint values
There are no struct parameters here, but they are comparable to
the values we needed in using virtual_copy in the system task.
We can use this as a sort of check on the task call for a user's mem_maps. Both process tables, mproc and proc, have an array of mem_map for the 3 local segments of each process. These should have the same information. As a check, a user process could declare two such arrays, get one from the process manager's mproc and the other from the kernel's proc table, and print the results. They should be the same. All that is needed is to pass the pid of the user whose mem_maps we want (not necessarily the calling user) and pointers to the two user arrays, as in the sketch below.
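A user-side sketch of this check (the getmemmaps system-call wrapper is hypothetical; it stands for the new call that fills both arrays):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <minix/const.h>   /* NR_LOCAL_SEGS */
#include <minix/type.h>    /* struct mem_map, vir_clicks, phys_clicks */

/* hypothetical wrapper for the new system call */
int getmemmaps(pid_t pid, struct mem_map *pm_copy, struct mem_map *kern_copy);

int main(void)
{
  struct mem_map from_pm[NR_LOCAL_SEGS], from_kernel[NR_LOCAL_SEGS];
  int i;

  if (getmemmaps(getpid(), from_pm, from_kernel) < 0) exit(1);
  for (i = 0; i < NR_LOCAL_SEGS; i++)
    printf("seg %d: pm(vir %u phys %u len %u) kernel(vir %u phys %u len %u)\n",
           i, from_pm[i].mem_vir, from_pm[i].mem_phys, from_pm[i].mem_len,
           from_kernel[i].mem_vir, from_kernel[i].mem_phys,
           from_kernel[i].mem_len);
  return 0;
}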
The endpoint of the calling user is available in several
ways:
mp->mp_endpoint
m_in.m_source
who_e
How do you get the endpoint for a different user given only
that user's pid?
The handler for the system call in the process manager will need to copy some of its data (in the mproc table), so the source process is the process manager. You either need to determine the endpoint value of the process manager or you can use the macro SELF. This is defined as:
/* These may not be any valid endpoint (see <minix/endpoint.h>). */
#define ANY 0x7ace /* used to indicate 'any process' */
#define NONE 0x6ace /* used to indicate 'no process at all' */
#define SELF 0x8ace /* used to indicate 'own process' */
#define _MAX_MAGIC_PROC (SELF) /* used by <minix/endpoint.h>
to determine generation size */
Notice that SELF is not a valid endpoint! So how can it be used where one is expected, e.g., by sys_vircopy? The code in sys_vircopy checks for the SELF value in the source or destination. If so, sys_vircopy uses the message source for the endpoint value. In particular, SELF is never translated to a process number. This isn't a big deal for our problem. It isn't hard to find out the endpoint value of, say, the process manager (it's 0).
Internally, the endpoint-to-process-number translation is

*proc_nr = _ENDPOINT_P(endpt)

where _ENDPOINT_P is defined by

#define _ENDPOINT_P(e) \
  ((((e) + NR_TASKS) % _ENDPOINT_GENERATION_SIZE) - NR_TASKS)

With NR_TASKS = 4 and _ENDPOINT_GENERATION_SIZE = 4 + SELF + 1:

_ENDPOINT_P(e) = ((e + 4) % (SELF + 5)) - 4
Some examples:

Process                endpoint   _ENDPOINT_P(endpoint)
Process manager        0          0
File System server     1          1
Reincarnation server   2          2
init                   7          7
emacs                  71089      (71093 % 35539) - 4 = 11
SELF *                 SELF       ((SELF + 4) % (SELF + 5)) - 4 = SELF

So the emacs process is at entry 11 of the mproc table and at entry 15 (11 + NR_TASKS) of the proc table.

* The formula applied to SELF yields SELF, but SELF is neither a valid endpoint nor a valid process number.
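A quick arithmetic check of the emacs row as a standalone C program (prints 11):

#include <stdio.h>

#define NR_TASKS 4
#define SELF 0x8ace
#define _ENDPOINT_GENERATION_SIZE (NR_TASKS + SELF + 1)
#define _ENDPOINT_P(e) ((((e) + NR_TASKS) % _ENDPOINT_GENERATION_SIZE) - NR_TASKS)

int main(void)
{
  printf("%d\n", _ENDPOINT_P(71089));   /* emacs: (71093 % 35539) - 4 = 11 */
  return 0;
}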