Write
Read
DEBUG
Address Spaces and Address Translation
Page Tables (versus TLB)
User Program Context Switches (Again)
Nachos Object File Format (Noff)
Loading User Programs into Mips Memory
It seems to me a mistake that the nachos Write system call returns void.
The usual return value would be an int. A value >= 0 would be the number of bytes actually written; a value < 0 would indicate an error.
How do we change this system call to return an int?
The prototype for Write is declared in userprog/syscall.h.
The assembler code to invoke Write and all the other system calls is in test/start.s.
The start.s file header comment informs us that the calling convention for nachos system calls and ordinary C calls on the Mips is:
-------------------------------------
System call number goes in register 2.
Then invoke syscall instruction.
-------------------------------------
The other arguments and return value for System calls follow the standard C calling conventions on the Mips:
---------------------------------
First argument goes in register 4
Second argument goes in register 5
Third argument goes in register 6
Fourth argument goes in register 7
Return value goes in register 2
---------------------------------
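So we need to change the prototype of Write in userprog/syscall.h to return an int, and have the kernel code that handles the Write system call put the return value in register 2.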
Both the Read and the Write nachos system calls must transfer data between user space and nachos kernel space.
As alluded to before, one has to be careful about this because of the way that addresses are translated.
Problems can be avoided, now and later, by writing your additions to the nachos kernel so that they use the machine->ReadMem and machine->WriteMem functions to transfer data between the nachos kernel address space and the current user program's address space.
Address translation and its relation to Read and Write will be discussed more fully today.
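As a rough illustration of the advice above, here is a minimal sketch of copying a user buffer into kernel space one byte at a time through a ReadMem-style call. MockMachine and CopyFromUser are hypothetical names invented for this sketch; only the shape of ReadMem (address, size, pointer to the result, boolean success) is meant to mirror the nachos function, and the mock uses an identity translation.

```cpp
#include <cassert>
#include <cstring>

const int kMemorySize = 256;   // size of the mock user memory (assumption)

// Mock stand-in for the nachos Machine class (an assumption for this
// sketch): a flat "user memory" with identity translation.
struct MockMachine {
    char mainMemory[kMemorySize];

    // Same shape as Machine::ReadMem: read `size` bytes at user virtual
    // address `addr` into *value; return false if the address is bad.
    // The mock only supports 1-byte reads.
    bool ReadMem(int addr, int size, int *value) {
        if (size != 1 || addr < 0 || addr >= kMemorySize) return false;
        *value = (unsigned char)mainMemory[addr];
        return true;
    }
};

// Hypothetical helper: copy `len` bytes from the user buffer at
// `userAddr` into `kernelBuf`, one translated byte at a time.
// Returns the number of bytes copied, or -1 if a translation fails.
int CopyFromUser(MockMachine *machine, int userAddr, char *kernelBuf, int len) {
    assert(machine != nullptr && kernelBuf != nullptr);
    for (int i = 0; i < len; i++) {
        int byte;
        if (!machine->ReadMem(userAddr + i, 1, &byte)) return -1;
        kernelBuf[i] = (char)byte;
    }
    return len;
}
```

Going through ReadMem for every byte is slow, but it means each address is translated for the current user program, which is the point of the advice above.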
The Read system call presents the same issues as the Write system call, except that Read already returns an int.
So the remarks above about where to put the return value apply to Read as well. Namely, the return value should be put in register 2.
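The register convention described above can be sketched as follows. This is not the nachos exception handler: MockMachine, DoWrite and HandleWriteSyscall are hypothetical names, and the mock only models the register file. The argument order (buffer, size, file id in registers 4, 5, 6) follows the calling convention quoted earlier.

```cpp
#include <cassert>

const int kNumRegs = 40;   // size of the mock register file (assumption)

// Mock of the register interface of the nachos Machine class.
struct MockMachine {
    int registers[kNumRegs];
    int ReadRegister(int num) {
        assert(num >= 0 && num < kNumRegs);
        return registers[num];
    }
    void WriteRegister(int num, int value) {
        assert(num >= 0 && num < kNumRegs);
        registers[num] = value;
    }
};

// Hypothetical kernel-side worker: pretends to write `size` bytes and
// reports how many were written, or a negative value on error.
int DoWrite(int userBufAddr, int size, int fileId) {
    (void)userBufAddr;   // the mock does not touch user memory
    if (size < 0 || fileId < 0) return -1;
    return size;
}

// Sketch of the Write branch of a system call handler.
void HandleWriteSyscall(MockMachine *machine) {
    int bufAddr = machine->ReadRegister(4);   // first argument
    int size    = machine->ReadRegister(5);   // second argument
    int fileId  = machine->ReadRegister(6);   // third argument
    int result  = DoWrite(bufAddr, size, fileId);
    machine->WriteRegister(2, result);        // return value goes in register 2
}
```

The same pattern would apply to Read: whatever count or error code the kernel computes is written to register 2 before control returns to the user program.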
void DEBUG (char flag, char* format, ...);
Example:
DEBUG('t', "Entering SimpleTest");
DEBUG('e', "SC_Exec not yet implemented, halting...\n");
The first DEBUG call above (flag 't') will print if nachos is invoked like this:
hawk% nachos -d t ...
or
hawk% nachos -d tm ...
Predefined 'flag' characters and their meaning:
'+' -- turn on all debug messages
't' -- thread system
's' -- semaphores, locks, and conditions
'i' -- interrupt emulation
'm' -- machine emulation (USER_PROGRAM)
'd' -- disk emulation (FILESYS)
'f' -- file system (FILESYS)
'a' -- address spaces (USER_PROGRAM)
'n' -- network emulation (NETWORK)
You can add DEBUG calls and new flags without having to change the DEBUG implementation.
For example, suppose you want to print debugging information for exceptions.
Pick an unused character, say 'e'. Insert DEBUG calls where you like:
DEBUG('e', "SC_Exec not yet implemented, halting...\n");
Run nachos passing the 'e' flag to the -d option:
hawk% nachos -d e -x ../test/simpleExec
(Turns on DEBUG('e', ...) printing.)
or
hawk% nachos -d etm -x ../test/simpleExec
(Turns on DEBUG('e',...), DEBUG('t',...), and DEBUG('m',...)
printing.)
threads/utility.h
- Predefined debug 'flag' characters
threads/utility.cc
- Implementation of DEBUG, etc.
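To see why new flags need no change to the DEBUG implementation, here is a minimal sketch of a flag-filtered debug print. It is not the code in threads/utility.cc; the SketchDebug* names are hypothetical, and the sketch only assumes that the string given after -d is remembered and searched for the flag character, with '+' enabling everything.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <cstring>

// The flag string given after -d, or nullptr if debugging is off.
static const char *enabledFlags = nullptr;

// Remember which flags were requested on the command line.
void SketchDebugInit(const char *flagList) { enabledFlags = flagList; }

// A flag is enabled if it appears in the list, or if '+' does.
bool SketchDebugIsEnabled(char flag) {
    if (enabledFlags == nullptr) return false;
    return std::strchr(enabledFlags, flag) != nullptr
        || std::strchr(enabledFlags, '+') != nullptr;
}

// printf-style output that is produced only when `flag` is enabled.
void SketchDebug(char flag, const char *format, ...) {
    if (!SketchDebugIsEnabled(flag)) return;
    va_list ap;
    va_start(ap, format);
    std::vfprintf(stdout, format, ap);
    va_end(ap);
    std::fflush(stdout);
}
```

Because the flag is just a character looked up in a string, picking an unused character such as 'e' works without touching the lookup code.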
In a real operating system, kernel code and data structures are stored in physical memory, and user programs and data are also stored in physical memory.
In a virtual memory OS, however, each user program refers to its code and data through virtual, not physical addresses.
So user (virtual) addresses must be translated to physical addresses.
The kernel can either bypass address translation and use absolute (physical) addresses, or use translated addresses.
In nachos, as we know, the user code and data are stored in the Mips physical memory, but the kernel code and data structures are not. They are in the separate address space represented by the C++ nachos code and data structures.
In a real OS, exchanging data between kernel and user address spaces must still be done carefully.
This is because the user addresses will have to be translated.
The kernel has the ability to do this for any user, but can only do it efficiently for the current user.
This is because address translation is done at least partly through hardware, and the hardware can only translate for one address space at a time.
Virtual memory uses:
When a user process is executing, the hardware context may include a special register which points to the memory location for the page table for that process.
The memory management unit of the processor will usually have a TLB, which can contain only the most recently used page table entries for the currently executing process.
So when a context switch occurs, the hardware context must also be switched to refer to the new process.
E.g., the special register must be changed to point to the new process's page table.
The TLB must be flushed of all the translations for the old process.
So this hardware context that is used for address translation allows the kernel to translate user addresses only for the current user.
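The two steps just described (repoint the page table register, flush the TLB) can be sketched with a mock of the translation hardware. MockMmu, TlbEntry and SwitchTranslationContext are hypothetical names for illustration, and the TLB size is an arbitrary assumption.

```cpp
#include <cassert>

const int kTlbSize = 4;   // assumed TLB size for the sketch

struct TlbEntry {
    int virtualPage;
    int physicalPage;
    bool valid;
};

// Mock of the hardware context used for address translation.
struct MockMmu {
    const void *pageTableBase;   // the "special register"
    TlbEntry tlb[kTlbSize];      // cached translations for the current process
};

// On a context switch: point at the new process's page table and
// invalidate every cached translation that belonged to the old process.
void SwitchTranslationContext(MockMmu *mmu, const void *newPageTable) {
    assert(mmu != nullptr);
    mmu->pageTableBase = newPageTable;
    for (int i = 0; i < kTlbSize; i++) {
        mmu->tlb[i].valid = false;
    }
}
```

After the switch, the first references made by the new process miss in the TLB and are refilled from its own page table.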
This makes it tricky when the kernel needs to exchange information between two different user programs. E.g., between a parent and a child process.
Essentially, two copy operations are necessary for the kernel to transfer data between user A and user B.
When the kernel executes for current user A, it can copy data from A's address space into a kernel structure since it can translate A's addresses.
Later, when the kernel executes for current user B, the data in the kernel structure can be copied into B's address space since it can now translate B's addresses.
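The two copy operations can be sketched with a mock kernel that can only reach the memory of whichever user is current. MockSpace, MockKernel, CopyIn and CopyOut are hypothetical names, and plain memcpy stands in for translated access.

```cpp
#include <cassert>
#include <cstring>

const int kSpaceSize = 64;   // size of each mock address space (assumption)

// Mock user address space: private memory indexed by virtual address.
struct MockSpace {
    char mem[kSpaceSize];
};

// Mock kernel state: translation is available only for `current`.
struct MockKernel {
    MockSpace *current;        // the currently running user
    char buffer[kSpaceSize];   // kernel-side staging buffer
    int buffered;              // number of valid bytes in `buffer`

    // First copy: from the current user's space into the kernel buffer.
    bool CopyIn(int vaddr, int len) {
        if (vaddr < 0 || len < 0 || vaddr + len > kSpaceSize) return false;
        std::memcpy(buffer, current->mem + vaddr, len);
        buffered = len;
        return true;
    }

    // Second copy: from the kernel buffer into the current user's space.
    bool CopyOut(int vaddr) {
        if (vaddr < 0 || vaddr + buffered > kSpaceSize) return false;
        std::memcpy(current->mem + vaddr, buffer, buffered);
        return true;
    }
};
```

CopyIn would run while A is the current user and CopyOut after a context switch to B; at no point does the kernel need to translate addresses for both users at once.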
In nachos, the two functions:
machine->ReadMem machine->WriteMem
play the role of the hardware context for the currentThread.
That is, these two functions translate virtual addresses, but not for an arbitrary user program stored in memory.
They translate addresses only for the current user process, currentThread.
The nachos code is set up for you to implement virtual memory either through page tables alone, or by using a TLB.
We will only use the page tables.
The nachos data structure for page table (and also for TLB) is defined as a C++ class:
class TranslationEntry {
  public:
    int virtualPage;    // The page number in virtual memory.
    int physicalPage;   // The page number in real memory (relative to the
                        // start of "mainMemory").
    bool valid;         // If this bit is not set, the translation is ignored.
                        // (In other words, the entry hasn't been initialized.)
    bool readOnly;      // If this bit is set, the user program is not allowed
                        // to modify the contents of the page.
    bool use;           // This bit is set by the hardware every time the
                        // page is referenced or modified.
    bool dirty;         // This bit is set by the hardware every time the
                        // page is modified.
};
The Thread class contains the following entry that specifies the page table for a given nachos user thread (in threads/thread.h):
public:
void SaveUserState(); // save user-level register state
void RestoreUserState(); // restore user-level register state
AddrSpace *space;
and AddrSpace is the class defined in userprog/addrspace.h:
class AddrSpace {
  public:
    AddrSpace(OpenFile *executable);    // Create an address space,
                                        // initializing it with the program
                                        // stored in the file "executable"
    ~AddrSpace();                       // De-allocate an address space
    void InitRegisters();               // Initialize user-level CPU registers,
                                        // before jumping to user code
    void SaveState();                   // Save/restore address space-specific
    void RestoreState();                // info on a context switch
  private:
    TranslationEntry *pageTable;        // Assume linear page table translation
                                        // for now!
    unsigned int numPages;              // Number of pages in the virtual
                                        // address space
};
The pageTable member will be initialized as an array with the correct number of TranslationEntry objects, as determined from the Mips user executable file. This number will also be stored in the member numPages.
It seems to me that you will need to provide additional accessor member functions to the AddrSpace class so that the nachos kernel code you will write will be able to manipulate the pageTable for a process.
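One possible shape for such accessors is sketched below. SketchAddrSpace, GetNumPages and GetEntry are hypothetical names, not part of the nachos AddrSpace class, and TranslationEntry is repeated here as a plain struct only to keep the sketch self-contained.

```cpp
#include <cassert>

// Reduced copy of the TranslationEntry fields shown above.
struct TranslationEntry {
    int virtualPage;
    int physicalPage;
    bool valid;
    bool readOnly;
    bool use;
    bool dirty;
};

class SketchAddrSpace {
  public:
    // Allocate a zero-initialized page table with `n` entries.
    explicit SketchAddrSpace(unsigned int n)
        : pageTable(new TranslationEntry[n]()), numPages(n) {}
    ~SketchAddrSpace() { delete [] pageTable; }
    SketchAddrSpace(const SketchAddrSpace &) = delete;
    SketchAddrSpace &operator=(const SketchAddrSpace &) = delete;

    // Hypothetical accessor: size of the virtual address space in pages.
    unsigned int GetNumPages() const { return numPages; }

    // Hypothetical accessor: entry for virtual page `vpn`, or nullptr
    // if `vpn` is outside the address space.
    TranslationEntry *GetEntry(unsigned int vpn) {
        return vpn < numPages ? &pageTable[vpn] : nullptr;
    }

  private:
    TranslationEntry *pageTable;
    unsigned int numPages;
};
```

Returning a single bounds-checked entry instead of the raw array keeps the table private while still letting kernel code such as a page fault handler update one translation.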
It is true that the hardware in a real OS will detect page faults and such based on the page table. But the page fault handler is OS code, and this code will need to be able to access the user's page table.
Context switches in nachos change the value of currentThread.
A user Thread object has an AddrSpace object, which has a pageTable of TranslationEntry's.
So a context switch also switches which page table is used by the Mips machine. This is analogous to a machine that has a special register to point to the current user's page table. A context switch that switches all the registers can switch this register too.
The actual translation using the page table of TranslationEntry objects is done in the function Machine::Translate in the file machine/translate.cc.
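A minimal sketch of both ideas, switching the page table and translating through it, is given below. It is a simplification under stated assumptions, not the code of Machine::Translate: MockMachine, SketchRestoreState and SketchTranslate are hypothetical names, errors are reported as -1 instead of raising exceptions, and the page size of 128 bytes is an assumption.

```cpp
#include <cassert>

const int PageSize = 128;   // assumed page size for the sketch

struct TranslationEntry {
    int virtualPage;
    int physicalPage;
    bool valid;
    bool readOnly;
    bool use;
    bool dirty;
};

// Mock machine: holds the page table of the current user only.
struct MockMachine {
    TranslationEntry *pageTable;
    unsigned int pageTableSize;
};

// What a context switch does for translation: install the page table
// of the thread that is about to run.
void SketchRestoreState(MockMachine *m, TranslationEntry *table,
                        unsigned int numPages) {
    m->pageTable = table;
    m->pageTableSize = numPages;
}

// Linear page table lookup. Returns the physical address, or -1 if the
// address is out of range, the entry is invalid, or a write is attempted
// on a read-only page.
int SketchTranslate(MockMachine *m, int virtAddr, bool writing) {
    if (virtAddr < 0) return -1;
    unsigned int vpn = (unsigned int)virtAddr / PageSize;
    unsigned int offset = (unsigned int)virtAddr % PageSize;
    if (vpn >= m->pageTableSize) return -1;
    TranslationEntry *entry = &m->pageTable[vpn];
    if (!entry->valid) return -1;
    if (writing && entry->readOnly) return -1;
    entry->use = true;                  // page was referenced
    if (writing) entry->dirty = true;   // page was modified
    return entry->physicalPage * PageSize + (int)offset;
}
```

The same virtual address yields different physical addresses before and after SketchRestoreState installs another table, which is the sense in which a context switch changes the page table used by the machine.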
The nachos object file format (noff) is described using the nachos (C style) kernel structures in the file code/bin/noff.h:
#define NOFFMAGIC 0xbadfad      /* magic number denoting Nachos
                                 * object code file
                                 */
typedef struct segment {
    int virtualAddr;            /* location of segment in virt addr space */
    int inFileAddr;             /* location of segment in this file */
    int size;                   /* size of segment */
} Segment;
typedef struct noffHeader {
    int noffMagic;              /* should be NOFFMAGIC */
    Segment code;               /* executable code segment */
    Segment initData;           /* initialized data segment */
    Segment uninitData;         /* uninitialized data segment --
                                 * should be zero'ed before use
                                 */
} NoffHeader;
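To make the role of the magic number concrete, here is a sketch of validating a header read from a file, including the case where the file was produced with the opposite byte order. ParseNoffHeader and SwapWord are hypothetical helpers written for this sketch, and fixed-width 32-bit fields are assumed in place of int.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#define NOFFMAGIC 0xbadfad

// Same layout as the structures above, with 32-bit fields assumed.
struct Segment {
    std::int32_t virtualAddr;
    std::int32_t inFileAddr;
    std::int32_t size;
};
struct NoffHeader {
    std::int32_t noffMagic;
    Segment code;
    Segment initData;
    Segment uninitData;
};

// Reverse the byte order of a 32-bit word.
std::uint32_t SwapWord(std::uint32_t w) {
    return (w >> 24) | ((w >> 8) & 0xff00u) | ((w << 8) & 0xff0000u) | (w << 24);
}

// Copy a header out of raw file bytes. If the magic number only matches
// after byte swapping, swap every field. Returns false if the bytes do
// not look like a noff header in either byte order.
bool ParseNoffHeader(const unsigned char *bytes, NoffHeader *out) {
    std::memcpy(out, bytes, sizeof(NoffHeader));
    if ((std::uint32_t)out->noffMagic == NOFFMAGIC) return true;
    if (SwapWord((std::uint32_t)out->noffMagic) != NOFFMAGIC) return false;
    std::uint32_t words[sizeof(NoffHeader) / sizeof(std::uint32_t)];
    std::memcpy(words, out, sizeof(words));
    for (unsigned int i = 0; i < sizeof(words) / sizeof(words[0]); i++) {
        words[i] = SwapWord(words[i]);
    }
    std::memcpy(out, words, sizeof(words));
    return true;
}
```

Every field of the header is a 32-bit word, so swapping word by word is enough to convert the whole structure.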
How is a Mips user program loaded into Mips memory and the associated page table for the user set up?
The function StartProcess in the file userprog/progtest.cc is a place to begin understanding how this works.
StartProcess(filename) is called from main in the nachos executable when the -x filename option is used:
hawk% nachos -x filename
Here is the nachos StartProcess function:
void
StartProcess(char *filename)
{
    OpenFile *executable = fileSystem->Open(filename);
    AddrSpace *space;

    if (executable == NULL) {
        printf("Unable to open file %s\n", filename);
        return;
    }
    space = new AddrSpace(executable);
    currentThread->space = space;

    delete executable;          // close file

    space->InitRegisters();     // set the initial register values
    space->RestoreState();      // load page table register

    machine->Run();             // jump to the user program
    ASSERT(FALSE);              // machine->Run never returns;
                                // the address space exits
                                // by doing the syscall "exit"
}
As you can see, most of the work is done in the initial nachos code by the AddrSpace constructor.
AddrSpace::AddrSpace(OpenFile *executable)
{
    NoffHeader noffH;
    unsigned int i, size;

    executable->ReadAt((char *)&noffH, sizeof(noffH), 0);
    if ((noffH.noffMagic != NOFFMAGIC) &&
        (WordToHost(noffH.noffMagic) == NOFFMAGIC))
        SwapHeader(&noffH);
    ASSERT(noffH.noffMagic == NOFFMAGIC);

    // how big is address space?
    size = noffH.code.size + noffH.initData.size + noffH.uninitData.size
           + UserStackSize;     // we need to increase the size
                                // to leave room for the stack
    numPages = divRoundUp(size, PageSize);
    size = numPages * PageSize;

    ASSERT(numPages <= NumPhysPages);   // check we're not trying
                                        // to run anything too big --
                                        // at least until we have
                                        // virtual memory

    DEBUG('a', "Initializing address space, num pages %d, size %d\n",
          numPages, size);

    // first, set up the translation
    pageTable = new TranslationEntry[numPages];
    for (i = 0; i < numPages; i++) {
        pageTable[i].virtualPage = i;
        // for now, virtual page # = phys page #
        pageTable[i].physicalPage = i;
        pageTable[i].valid = TRUE;
        pageTable[i].use = FALSE;
        pageTable[i].dirty = FALSE;
        pageTable[i].readOnly = FALSE;  // if the code segment was entirely on
                                        // a separate page, we could set its
                                        // pages to be read-only
    }

    // zero out the entire address space, to zero the
    // uninitialized data segment and the stack segment
    memset(machine->mainMemory, 0, size);

    // then, copy in the code and data segments into memory
    if (noffH.code.size > 0) {
        DEBUG('a', "Initializing code segment, at 0x%x, size %d\n",
              noffH.code.virtualAddr, noffH.code.size);
        executable->ReadAt(&(machine->mainMemory[noffH.code.virtualAddr]),
                           noffH.code.size, noffH.code.inFileAddr);
    }
    if (noffH.initData.size > 0) {
        DEBUG('a', "Initializing data segment, at 0x%x, size %d\n",
              noffH.initData.virtualAddr, noffH.initData.size);
        executable->ReadAt(&(machine->mainMemory[noffH.initData.virtualAddr]),
                           noffH.initData.size, noffH.initData.inFileAddr);
    }
}
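The size calculation in the constructor can be tried out in isolation. The sketch below repeats the arithmetic with assumed constants (PageSize 128, UserStackSize 1024, NumPhysPages 32); check machine/machine.h and userprog/addrspace.h for the values in your copy of nachos. PagesNeeded is a hypothetical helper, and it returns 0 where the constructor would fail its ASSERT.

```cpp
#include <cassert>

// Assumed values for the sketch; the real ones come from the nachos headers.
const unsigned int PageSize = 128;
const unsigned int UserStackSize = 1024;
const unsigned int NumPhysPages = 32;

// Integer division that rounds up instead of down.
unsigned int DivRoundUp(unsigned int n, unsigned int s) {
    return (n + s - 1) / s;
}

// Number of pages needed for the three segments plus the user stack,
// or 0 if the program would not fit in physical memory.
unsigned int PagesNeeded(unsigned int codeSize, unsigned int initDataSize,
                         unsigned int uninitDataSize) {
    unsigned int size = codeSize + initDataSize + uninitDataSize + UserStackSize;
    unsigned int numPages = DivRoundUp(size, PageSize);
    return numPages <= NumPhysPages ? numPages : 0;
}
```

For example, with these constants a program with 300 bytes of code, 100 bytes of initialized data and 50 bytes of uninitialized data needs 1474 bytes in total, which rounds up to 12 pages (1536 bytes). Note also that the constructor copies each segment to mainMemory[virtualAddr], which is correct only because every virtual page is mapped to the physical page with the same number.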