CSC374 Feb03

Contents

  1. Memory Layout of a Linux Process
  2. Virtual Addresses
  3. Page Tables
  4. Virtual Address Translation Requirements
  5. MMU Requirements
  6. Trace of a Virtual Memory Reference
  13. L1 and L2 Caches
  14. L1 and L2 Cache Memory Organization
  15. Cache Memory Parameters
  16. Cache Memory Lookup
  17. Example
  18. Virtual Address Translation
  19. Notation for Virtual Address References
  20. Example: Address Translation
  21. Example: Address Formats
  22. VPN and VPO
  23. TLB Lookup
  24. VPN is the index into Page Table
  25. Physical Address
  26. Data at Physical Address
  27. L1 Cache Lookup for Physical Address

Memory Layout of a Linux Process[1]

On 32-bit Linux, the virtual memory layout of a process is:

Kernel Virtual Memory Address Range: 0xc0000000 - 0xffffffff

Process Virtual Memory Address Range: 0x00000000 - 0xbfffffff

The layout runs from the highest addresses (top) to the lowest (bottom). Each labeled address is the lowest address of the region on its row:

                 Process-specific data structures (different for each process)
                 Physical memory (identical for each process)
0xc0000000 ___   Kernel code and data (identical for each process)
                 User stack
                 (Unmapped)
                 Memory-mapped region for shared libraries
brk ___          (Unmapped)
                 Run-time heap (via malloc)
                 Uninitialized data (.bss)
                 Initialized data (.data)
0x08048000 ___   Program text (.text)
0x00000000 ___   (Unmapped)

How can multiple processes ALL exist with their program text, stack, etc. at the same virtual addresses?

 

Virtual Addresses[2]

Answer: Virtual addresses are NOT the same as physical addresses.

Each logical unit (program text, stack, heap, etc.) is divided into address ranges of a fixed size (such as 4K bytes), called virtual pages.

Physical memory is also divided into ranges of the same fixed size (called physical pages).

Then when a virtual page is loaded into physical memory, it can be loaded into any physical page.

Note that the physical address where a virtual page is loaded is typically NOT the same as the virtual address.

Virtual pages, process 1
  vir addr: 0x08048000
  vir addr: 0x08049000
  vir addr: 0x0804a000
  ...

Virtual pages, process 2
  vir addr: 0x08048000
  vir addr: 0x08049000
  vir addr: 0x0804a000
  ...

Physical address   Virtual page loaded there
0x06000000 ___     proc 1: 0x08048000
0x06001000 ___     proc 1: 0x08049000
0x06002000 ___     proc 2: 0x08049000
0x06003000 ___     free
0x06004000 ___     proc 2: 0x08048000
0x06005000 ___     free
0x06006000 ___     proc 1: 0x0804a000
0x06007000 ___     proc 2: 0x0804a000
...        ___     ...

 

Page Tables[3]

The operating system's virtual memory management routines are responsible for loading virtual pages into physical pages.

For each virtual page, it must be possible to find which physical page contains that virtual page.

Associated with each process is a page table in which the virtual memory manager records this information.

For the previous example, part of the page tables for process 1 and process 2:

Process 1
Virtual page   Physical page
0x08048000     0x06000000
0x08049000     0x06001000
0x0804a000     0x06006000

Process 2
Virtual page   Physical page
0x08048000     0x06004000
0x08049000     0x06002000
0x0804a000     0x06007000
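
The per-process lookup can be sketched in a few lines of Python. This is only an illustration, not how an operating system stores page tables: the names PAGE_TABLES and translate are made up here, the page size is assumed to be 4K bytes, and the base addresses are the ones from the Virtual Addresses example.

```python
PAGE_SIZE = 0x1000  # assumed 4K-byte pages

# One page table per process: virtual page base -> physical page base.
PAGE_TABLES = {
    1: {0x08048000: 0x06000000, 0x08049000: 0x06001000, 0x0804A000: 0x06006000},
    2: {0x08048000: 0x06004000, 0x08049000: 0x06002000, 0x0804A000: 0x06007000},
}

def translate(pid, vaddr):
    """Return the physical address for virtual address vaddr in process pid."""
    offset = vaddr % PAGE_SIZE           # position inside the page
    vpage = vaddr - offset               # base address of the virtual page
    ppage = PAGE_TABLES[pid][vpage]      # KeyError if the page is not loaded
    return ppage + offset

# The same virtual address lands in a different physical page per process.
print(hex(translate(1, 0x08049010)))  # 0x6001010
print(hex(translate(2, 0x08049010)))  # 0x6002010
```

Only the page base is translated; the offset within the page is carried over unchanged.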

 

Virtual Address Translation Requirements[4]

Using virtual addresses in this way means translating a virtual address to a physical address on every instruction fetch and every data operand fetch.

For this to be viable, the translation can't be done solely in software: that would take additional processor instructions, and fetching those instructions would itself require translation.

Processors therefore have a memory management unit (MMU) that translates virtual addresses to physical addresses in hardware.

So the processor effectively executes machine programs whose instructions and data are compiled to sit at virtual addresses.

MMU Requirements[5]

The MMU on the processor chip has to translate virtual addresses for every process, and it needs the information in each process's page table to do so.

There isn't enough room on the chip to store the page table of every process.

This is a typical problem that calls for a cache.

The cache for the MMU contains a portion of the page table for the currently executing process.

This cache is traditionally called the TLB (translation lookaside buffer).
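
As a rough illustration of why a small cache helps, here is a toy Python model of an MMU with a tiny TLB in front of one process's page table. Everything in it is an assumption made for the sketch: the class name ToyMMU, the 2-entry TLB, the 4K-byte page size, and the oldest-entry-first replacement rule. Real TLBs are hardware and are organized differently.

```python
PAGE_SIZE = 0x1000  # assumed 4K-byte pages

class ToyMMU:
    """Toy model: a small TLB in front of one process's page table."""

    def __init__(self, page_table, tlb_entries=2):
        self.page_table = page_table   # full table (would live in main memory)
        self.tlb = {}                  # cached subset (would live on the chip)
        self.tlb_entries = tlb_entries
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.tlb:
            self.hits += 1
        else:
            self.misses += 1
            if len(self.tlb) >= self.tlb_entries:
                # Evict the oldest entry (dicts keep insertion order).
                del self.tlb[next(iter(self.tlb))]
            self.tlb[vpn] = self.page_table[vpn]   # the slow memory access
        return self.tlb[vpn] * PAGE_SIZE + offset

# Page numbers (address without the low 12 bits) for three pages.
mmu = ToyMMU({0x08048: 0x06000, 0x08049: 0x06001, 0x0804A: 0x06006})
for addr in (0x08048000, 0x08048004, 0x08048008, 0x08049000):
    mmu.translate(addr)
print(mmu.hits, mmu.misses)  # 2 2
```

Consecutive references to the same page hit in the TLB, so only the first reference to each page pays for a page table access.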

Trace of a Virtual Memory Reference (1)[6]

                
                  +----------------------------------------+
                  |  prog counter                          |  
                  |  (virtual address)                     |
                  |   |                                    |  
       processor  |  MMU ---------->MAR (physical address) |  
                  |   |\            |\                     |  
                  |   | \           | \                    |  
                  |  TLB \         L1  \                   |  
                  |       \             \                  |  
      ------------+--------\-----------+-\-----------------+
                  |         \          |  \                |  
 physical memory  |         |          |  |                | 
                  |     page tables    |  instr/data       |
                  |         |          |                   |
                  +-- kernel memory ---+-- process pages --+
    

L1 and L2 Caches[13]

Level 1 and level 2 caches (L1 and L2) give a very fast CPU quick access to a copy of the portion of main memory currently being accessed.

L1 and L2 Cache Memory Organization[14]

Cache Memory Parameters[15]

Lookup in a cache for contents of a given address depends on:

Notation   Item
S          Number of sets
E          Number of lines per cache set
B          Block size of each cache line (in bytes)

Given S and B, the number of lines per set determines (and is determined by) the total data capacity of the cache, C:

        C = S * E * B  (sets * lines per set * data bytes per line)
      

Cache Memory Lookup[16]

To check if the cache holds the contents of an address, the address is partitioned into three parts.

For example, for a direct mapped cache (E = 1) with 512 sets, a block size of 32 bytes, and 32 bit addresses, the cache data capacity is 512 * 1 * 32 = 16384 = 16K bytes, and the address bits are divided up as follows:

Parameter   Size   Number of address bits
S           512    9
B           32     5
Tag         -      18  (= 32 - 9 - 5)

The address is partitioned into

Tag            Set           Block Offset
bits 31 - 14   bits 13 - 5   bits 4 - 0
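
The capacity and field widths follow mechanically from S, E, B and the address width. A minimal Python sketch (the function name cache_geometry is made up for illustration, and S and B are assumed to be powers of two):

```python
def cache_geometry(S, E, B, address_bits=32):
    """Return capacity and address-field widths; S and B must be powers of two."""
    s = S.bit_length() - 1        # set index bits: log2(S)
    b = B.bit_length() - 1        # block offset bits: log2(B)
    t = address_bits - s - b      # the remaining high-order bits form the tag
    return {"C": S * E * B, "tag_bits": t, "set_bits": s, "offset_bits": b}

print(cache_geometry(S=512, E=1, B=32))
# {'C': 16384, 'tag_bits': 18, 'set_bits': 9, 'offset_bits': 5}
```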

Example[17]

Lookup a 4 byte integer at address 0x08048088. Assume the machine is little endian.

	 hex > 0x08048088
      binary > 0000 1000 0000 0100 1000 0000 1000 1000
      T:S:B  > 0000 1000 0000 0100 10:00 0000 100:0 1000
	  T  > 00 0010 0000 0001 0010 = 0x02012
	  S  > 0 0000 0100 = 0x004 = 4
	  B  > 0 1000 = 0x08
      

So look in set S = 4 and compare the tag in the (only) line of set 4 with 0x02012.

  1. Suppose the first 16 bytes of the data block of the line in set 4 of the cache are

    Tag: 0x2012 Block data: 01 02 00 00 FF FF FF FF EF BE AD DE 01 02 03 04

    The tags match, so address 0x08048088 is a cache hit.

    Since the block offset is 8, the integer data starts at byte offset 8 in the block data: EF BE AD DE.

    Integers are 4 bytes and the bytes are stored in reverse order (least significant byte first) on little endian machines.

    So the integer value is 0xDEADBEEF.

  2. If the cache line had been:

    Tag: 0x2080 Block data: 01 02 00 00 FF FF FF FF EF BE AD DE 01 02 03 04

    The tag would not match and address 0x08048088 would be a cache miss.
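
The two cases can be checked with a short Python sketch of a direct mapped lookup. The helper names (split_address, read_int) and the dictionary layout are assumptions made for illustration. Only the first 16 bytes of the 32-byte block are given above, so the rest is zero-filled here.

```python
S, B = 512, 32                   # sets and block size from the example
SET_BITS, OFFSET_BITS = 9, 5     # log2(S) and log2(B)

def split_address(addr):
    """Split a 32-bit address into (tag, set index, block offset)."""
    offset = addr & (B - 1)
    set_index = (addr >> OFFSET_BITS) & (S - 1)
    tag = addr >> (OFFSET_BITS + SET_BITS)
    return tag, set_index, offset

def read_int(cache, addr):
    """Return the 4-byte little-endian integer at addr, or None on a miss."""
    tag, set_index, offset = split_address(addr)
    line = cache.get(set_index)          # direct mapped: one line per set
    if line is None or line["tag"] != tag:
        return None                      # cache miss
    return int.from_bytes(line["data"][offset:offset + 4], "little")

# First 16 bytes as listed above; the last 16 bytes are unknown (zero-filled).
data = bytes.fromhex("01020000FFFFFFFFEFBEADDE01020304") + bytes(16)
hit_cache = {4: {"tag": 0x2012, "data": data}}
miss_cache = {4: {"tag": 0x2080, "data": data}}

print([hex(x) for x in split_address(0x08048088)])   # ['0x2012', '0x4', '0x8']
print(hex(read_int(hit_cache, 0x08048088)))          # 0xdeadbeef
print(read_int(miss_cache, 0x08048088))              # None
```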

Virtual Address Translation[18]

The key points that make this scheme possible are:

1. Compilers still generate code as if the program's segments are to
   be loaded into contiguous blocks of storage.
2. This means the PC will still work as before - after fetching an
   instruction, the PC is incremented.
3. The executable program is also thought of as being split into pages
   of the same size as used for physical memory (e.g., 1K byte
   pages). These are called virtual pages as opposed to the
   physical pages of memory.
4. A virtual page is loaded into any free physical page. (A
   data structure, the page table, must record for each virtual
   page number the physical page number where it was stored.)
5. The virtual address in the PC is not simply copied into the MAR,
   however. It is translated by the hardware memory management unit
   (MMU) in the CPU to the corresponding physical address where the
   instruction is actually located. The MMU must have the page
   table information in order to do this.
      

Notation for Virtual Address References[19]

The following notation is used in connection with translation from virtual to physical addresses:

Address: virtual   Cache: TLB

Notation   Meaning
VPN        Virtual page number
VPO        Virtual page offset (in bytes)
TLBI       TLB index
TLBT       TLB tag

After translating the virtual address to a physical address, another cache is checked to see if it contains the memory contents of the physical address. This notation is used:

Address: physical   Cache: L1 (physical addresses)

Notation   Meaning
PPN        Physical page number
PPO        Physical page offset (PPO = VPO)
CI         Cache index
CO         Byte offset in cache block
CT         Cache tag

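Example: Address Translation[20]

This example uses the following assumptions (See practice problem 10.4):

  1. Virtual addresses are 14 bits; physical addresses are 12 bits.
  2. The page offset (VPO and PPO) is 6 bits, so pages are 64 bytes.
  3. The TLB is 4-way set associative with 4 sets.
  4. The L1 cache is direct mapped with 16 sets and 4-byte blocks.

Problem: Translate virtual address 0x03d7 to a physical address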

Example: Address Formats[21]

Virtual Address Format

13 12 11 10  9  8  7  6 |  5  4  3  2  1  0
          VPN           |        VPO

Physical Address Format

11 10  9  8  7  6 |  5  4  3  2  1  0
       PPN        |        PPO

VPN and VPO[22]

First write 0x03d7 in binary:

      0000 0011 1101 0111 (but this is 16 bits, so discard left 2 bits)
    

and partition the bits into the VPN and VPO parts:

         VPN      VPO
       00001111 010111
    

Now convert VPN and VPO back to hex

      VPN = 0000 1111 = 0x0F
      VPO =   01 0111 = 0x17
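
The same split can be done with a shift and a mask. A small Python sketch, assuming the 14-bit virtual address and 6-bit page offset used in this example (the name split_virtual is made up for illustration):

```python
VA_BITS = 14       # virtual address width in this example
OFFSET_BITS = 6    # VPO width

def split_virtual(vaddr):
    """Return (VPN, VPO) for a 14-bit virtual address."""
    vaddr &= (1 << VA_BITS) - 1                   # discard bits above bit 13
    vpn = vaddr >> OFFSET_BITS                    # high 8 bits
    vpo = vaddr & ((1 << OFFSET_BITS) - 1)        # low 6 bits
    return vpn, vpo

vpn, vpo = split_virtual(0x03D7)
print(f"VPN = {vpn:08b} = 0x{vpn:02X}")   # VPN = 00001111 = 0x0F
print(f"VPO = {vpo:06b} = 0x{vpo:02X}")   # VPO = 010111 = 0x17
```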
    

TLB Lookup[23]

The VPN is the index into the page table for the current process.

However, the page table is in memory.

We do not want to have to access memory just to translate the virtual address!

So first see if the page table entry we need is in the TLB cache in the CPU.

      VPN = 00001111 = TLBT : TLBI = 000011 : 11
    

There are 4 sets: 0, 1, 2, 3. The low-order 2 bits of the VPN form the TLB index, which is the same as the set number.

So the TLBI = 11 (binary) = 3 (in decimal).

The TLB tag must be compared with the 4 entries in set 3.

(Remember that the TLB is a 4-way set associative cache of page table entries.)

      TLBT = 000011 = 00 0011 = 0x03
    

The four entries in set 3 are

      Tag PPN Valid
      07   -    0
      03   0D   1
      0A   34   1
      02   -    0
    

The entry with tag 03 is valid and matches the TLBT, so this is a TLB hit and the physical page number is PPN = 0x0D.

This information is also in the page table in memory, but if we have a TLB hit, we avoid having to access memory for the PTE.
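
A set associative lookup like this one can be sketched in Python as follows. The data layout and the name tlb_lookup are assumptions made for illustration, and only set 3 is filled in because it is the only set listed in the example.

```python
TLB_SETS = 4   # sets 0..3, so the TLB index is the low 2 bits of the VPN

# Only set 3 is given in the example; the other sets are omitted here.
TLB = {
    3: [
        {"tag": 0x07, "ppn": None, "valid": False},
        {"tag": 0x03, "ppn": 0x0D, "valid": True},
        {"tag": 0x0A, "ppn": 0x34, "valid": True},
        {"tag": 0x02, "ppn": None, "valid": False},
    ],
}

def tlb_lookup(vpn):
    """Return the PPN on a TLB hit, or None on a miss."""
    tlbi = vpn % TLB_SETS      # low-order 2 bits select the set
    tlbt = vpn // TLB_SETS     # remaining high-order bits are the tag
    for entry in TLB.get(tlbi, []):
        if entry["valid"] and entry["tag"] == tlbt:
            return entry["ppn"]
    return None

print(hex(tlb_lookup(0x0F)))   # 0xd
```

An entry only counts as a hit if it is valid: VPN 0x1F also maps to set 3 with tag 07, but that entry is marked invalid, so the lookup misses.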

VPN is the index into Page Table [24]

VPN = 0x0F, VPO = 0x17

Entry at index 0x0F is valid, so PPN = 0x0D

PPO always = VPO, so the physical address is PPN:PPO.

VPN PPN Valid
00 28 1
01 - 0
02 33 1
03 02 1
04 - 0
05 16 1
06 - 0
07 - 0
08 13 1
09 17 1
0A 09 1
0B - 0
0C - 0
0D 2D 1
0E 11 1
0F 0D 1
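
On a TLB miss the same answer comes from the page table. A Python sketch with the table above stored as a list indexed by VPN; the name page_table_lookup and the use of None for invalid entries are choices made for this illustration, and raising an error for an invalid entry is only a stand-in for whatever the system does when the page is not in memory.

```python
# The 16 entries listed above: index = VPN, value = PPN (None = not valid).
PAGE_TABLE = [0x28, None, 0x33, 0x02, None, 0x16, None, None,
              0x13, 0x17, 0x09, None, None, 0x2D, 0x11, 0x0D]

def page_table_lookup(vpn):
    """Return the PPN stored at index vpn, or raise if the entry is invalid."""
    ppn = PAGE_TABLE[vpn]
    if ppn is None:
        raise LookupError(f"no valid entry for VPN 0x{vpn:02X}")
    return ppn

print(hex(page_table_lookup(0x0F)))   # 0xd
```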

Physical Address[25]

PPN = 0x0D, PPO = VPO = 0x17.

To get the physical address, concatenate the bits as PPN:PPO, remembering that the PPN is 6 bits and the PPO is 6 bits:

      PPN = 0x0D = 0000 1101 (but discard left 2 bits) = 00 1101
      PPO = 0x17 = 0001 0111 (but discard left 2 bits) = 01 0111

      Physical address = PPN:PPO = 00 1101 01 0111 =  0011 0101 0111 = 0x357
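
Concatenating PPN:PPO is a shift followed by an OR. A minimal Python sketch, assuming the 6-bit page offset of this example (the name physical_address is made up for illustration):

```python
OFFSET_BITS = 6   # PPO width in this example

def physical_address(ppn, ppo):
    """Concatenate PPN:PPO into a physical address."""
    return (ppn << OFFSET_BITS) | ppo

pa = physical_address(0x0D, 0x17)
print(f"{pa:012b} = 0x{pa:03X}")   # 001101010111 = 0x357
```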
    

Data at Physical Address[26]

The physical address 0x357 was determined by the MMU hardware in the CPU from the virtual address. No memory access was needed, since there was a hit in the TLB for the page table entry.

Now a lookup in the L1 cache checks whether the contents of the physical address are available (a hit in the L1 cache).

Summary:

If a miss occurs in either cache (the TLB or L1), memory must be accessed. (In that case the corresponding cache is updated.)

L1 Cache Lookup for Physical Address[27]

Problem: Look in the L1 cache for the byte contents of the physical address just found: 0x357

Physical address: 0x357 = 0011 0101 0111

CT = 0011 01 = 0x0D
CI = 01 01 = 0x5
CO = 11 = 0x3

L1 cache:

Idx  Tag  Valid  Blk 0  Blk 1  Blk 2  Blk 3
0    19   1      99     11     12     11
1    15   0      -      -      -      -
2    1B   1      00     02     04     08
3    36   0      -      -      -      -
4    32   1      43     6D     8F     09
5    0D   1      36     72     F0     1D
6    31   0      -      -      -      -
7    16   1      11     C2     DF     03
8    24   1      3A     00     51     89
9    2D   0      -      -      -      -
A    2D   1      93     15     DA     3B
B    0B   0      -      -      -      -
C    12   0      -      -      -      -
D    16   1      04     96     34     15
E    13   1      83     77     1B     D3
F    14   0      -      -      -      -

The line at index 5 is valid and its tag 0D matches CT, so this is a cache hit. The byte at offset CO = 3 is Blk 3 = 0x1D.
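
The lookup can be replayed in Python against the table above. The dictionary layout and the name l1_lookup are assumptions made for this sketch; the field widths (6-bit CT, 4-bit CI, 2-bit CO) are the ones used in the example.

```python
# L1 cache table: index -> (tag, valid, block bytes). None = no valid data.
L1 = {
    0x0: (0x19, True,  [0x99, 0x11, 0x12, 0x11]),
    0x1: (0x15, False, None),
    0x2: (0x1B, True,  [0x00, 0x02, 0x04, 0x08]),
    0x3: (0x36, False, None),
    0x4: (0x32, True,  [0x43, 0x6D, 0x8F, 0x09]),
    0x5: (0x0D, True,  [0x36, 0x72, 0xF0, 0x1D]),
    0x6: (0x31, False, None),
    0x7: (0x16, True,  [0x11, 0xC2, 0xDF, 0x03]),
    0x8: (0x24, True,  [0x3A, 0x00, 0x51, 0x89]),
    0x9: (0x2D, False, None),
    0xA: (0x2D, True,  [0x93, 0x15, 0xDA, 0x3B]),
    0xB: (0x0B, False, None),
    0xC: (0x12, False, None),
    0xD: (0x16, True,  [0x04, 0x96, 0x34, 0x15]),
    0xE: (0x13, True,  [0x83, 0x77, 0x1B, 0xD3]),
    0xF: (0x14, False, None),
}

def l1_lookup(paddr):
    """Return the byte at 12-bit physical address paddr, or None on a miss."""
    co = paddr & 0x3            # low 2 bits: byte offset in the block
    ci = (paddr >> 2) & 0xF     # next 4 bits: cache index
    ct = paddr >> 6             # high 6 bits: cache tag
    tag, valid, block = L1[ci]
    if valid and tag == ct:
        return block[co]
    return None

print(hex(l1_lookup(0x357)))   # 0x1d
```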