CIS 307: TOPIC: Virtual Memory
Storage Management and Virtual Memory are treated in Chapter 3 of Tanenbaum.
Introduction
A Virtual Memory system is a system where the addresses used in executing
images, Virtual Addresses, are different from the addresses, Physical
Addresses, used at the hardware level, on the bus between processor
and main memory. There is a map from the Virtual Address Space to the Physical
Address Space. This map is usually different in different processes. The map
is set and evaluated repeatedly during program execution.
Virtual Memory was invented in the early Sixties. It is now almost universally
used for the following reasons:
- Size Independence:
- It allows us to have programs whose size is limited by the size of the
virtual memory, not by the size of the physical memory. Since virtual memory
is usually (NOT always) larger than physical memory, large programs can run
even on machines with small main memory.
- Protection:
- When my array index computation goes out of range, it is bad because it
aborts my program. But it would be much worse if it aborted your program.
Protection controls the access rights of a process to various portions of its
virtual address space. With appropriate operating system support it becomes
possible to achieve the desired protection policy, with sharing and
separation between processes as needed. This property is so important that
many want virtual memory even in diskless machines, or in machines where the
physical address space is larger than the virtual address space.
- Relocatability:
- If location A stores the address of location B and the program (or the
portion containing B) is relocated, then we need to change the content of A.
Alternatively, as seen in machines with Base and Bound registers,
and in Virtual Memory, we can use addresses relative to the base, or virtual
addresses, and avoid all recomputation.
The performance of Virtual Memories, on average, is good. The hardware required
is complex but, these days, it is cheap. Overall, a great idea.
[When considering machines with Virtual Memory always look at what they do to
reduce the main memory required to accommodate page tables.]
Fragmentation
- Internal Fragmentation:
- Suppose you are in a paged Virtual Memory, with pages, say, of size p.
Then, if we are given a program of size S, we will require Ceiling(S/p) pages.
That is, we will waste
p*Ceiling(S/p) - S bytes
This is called Internal Fragmentation.
- External Fragmentation:
- Suppose you are in a Segmented Virtual Memory. Then storage is allocated in
variable size blocks. This results in the creation of holes of unallocated
memory that are smaller than requests for new segments. These holes are
essentially wasted storage (unless we compact storage). This we call
External Fragmentation. Tanenbaum (page 88) discusses the Fifty Percent Rule
and the Unused Memory Rule. Using those results it is easy to see that in
most situations external fragmentation is much more wasteful than internal
fragmentation.
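
The internal fragmentation formula above, p*Ceiling(S/p) - S, can be checked
with a few lines of Python. This is only an illustrative sketch: the function
name and the sample sizes (a 10,000 byte program, 4,096 byte pages) are
assumptions chosen for the example, not values from the text.

```python
def internal_fragmentation(program_size: int, page_size: int) -> int:
    """Bytes wasted in the last page: p*Ceiling(S/p) - S."""
    # Integer ceiling division avoids floating point rounding for large sizes.
    pages_needed = (program_size + page_size - 1) // page_size
    return page_size * pages_needed - program_size


if __name__ == "__main__":
    # Hypothetical example: a 10,000 byte program with 4,096 byte pages
    # needs 3 pages and wastes part of the third one.
    print(internal_fragmentation(10_000, 4_096))
```

Note that the waste is always less than one page, whatever the program size.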
Performance
We will now try to determine, given a map from a Paged Virtual Memory to
Physical Memory and the access time to Physical Memory, the expected
access time to Virtual Memory. We will assume that the translation
cost is zero (we can always include translation cost as a percentage of the
memory access). We define some terms.
- Hit Cost = H :
- Time it takes to access virtual memory in the case that there is no page
fault. Under the assumption that translation cost is 0, then H is equal to the
access time of physical memory.
- Hit Ratio = R :
- Ratio between the number of times virtual memory is accessed without
causing a page fault and the number of times it is accessed causing
a page fault.
- Miss Penalty = M :
- Time it takes to access virtual memory in the case that there is a page
fault.
It becomes easy to see that the expected access time to virtual memory becomes:
E(access time) = (R*H + M)/(R+1) ~= (R*H + M)/R = H + M/R
For example, if H is 0.1 microseconds, M is 15 milliseconds, and R is 100,000,
then the expected access time to virtual memory is 0.1 + 15,000/100,000 =
0.25 microseconds, that is, it is 2.5 times the access time to physical memory.
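
The computation in this example can be reproduced with a short Python sketch.
The function name is an assumption made for illustration; H, M, and R are the
quantities defined above, with all times expressed in microseconds.

```python
def expected_access_time(hit_cost, miss_penalty, hit_ratio):
    """Return (exact, approximate) expected access time to virtual memory.

    hit_ratio is R as defined in these notes: hits per page fault.
    Out of every R+1 accesses, R cost hit_cost and one costs miss_penalty.
    """
    exact = (hit_ratio * hit_cost + miss_penalty) / (hit_ratio + 1)
    approx = hit_cost + miss_penalty / hit_ratio
    return exact, approx


if __name__ == "__main__":
    H = 0.1          # microseconds
    M = 15_000.0     # 15 milliseconds, in microseconds
    R = 100_000      # hits per fault
    exact, approx = expected_access_time(H, M, R)
    print(f"exact = {exact:.6f} us, approx = {approx:.6f} us")
    print(f"slowdown vs physical memory = {approx / H:.2f}x")
```

For large R the exact and approximate forms agree to several digits, which is
why the simpler H + M/R is normally used.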
The beauty of virtual memory is that it is like a game that can be played
more than once, not only between main memory and secondary storage, but also
between cache memory and main memory. In all cases things work well for us
because programs show "Locality" in their behavior. Let's consider the
combination of cache and main memory virtual address translation.
Let Hc, Rc, and Mc and Hm, Rm, and Mm be respectively the hit cost, hit ratio,
and miss penalty in the case of cache to main memory, and main memory to disk
virtual addresses.
We can consider two cases: (1) the case where the cache virtual address space
operates on top of the main memory virtual address space, and (2) the converse
case. Clearly case 2 is less efficient than case 1. But it is the way that is
usually implemented in hardware. In order to take full advantage of the cache
it is necessary to increase the hit ratio for main memory, for example, from
100,000 to 600,000 (in this case the total access time becomes 0.51 microseconds).
You may have heard that if in a car you improve the power of the engine by a
lot, you have also to improve brakes, suspensions, steering, body rigidity,
etc. The same is true in our case, if we add cache, we should also increase
the size of main memory so as to decrease main memory fault rate and bring the
performance of the whole memory system close to the one of the cache.
Of course the memory management hardware unit will support translation from
virtual to physical address. But it does much more. Here is a partial list of
other services it may provide:
- Vectored Interrupt in case of page fault with information for identifying
what caused the fault (for example, access violation versus true page fault on
which address)
- Support for replacement policy (use bit and dirty bit)
- Support for choosing when to have the VM active
- Support for debugging when using VM (selecting which addresses to translate
and which not to translate)
- Translation of addresses across distinct virtual address spaces
Suppose you have a page fault. If there is a free frame we can immediately
load the needed page into that frame. But if no frame is available (and
no replaceable page is available which has not been written to) then we have
first to write to disk the replaced page, then read in the needed page.
This means that for one page fault we have more than one IO operation: double
jeopardy. I know that other processes may be able to use the processor while
I wait for my page, but still it is bad for my program.
A possibility is to keep available a pool of "free" frames.
The content of these frames is copied to buffers and written to disk before
replacement is required and the frames are marked available for replacement.
Then in the case of a page fault: if it is for one of the pages that were
occupying the "free" frames, then we have a Soft Page Fault
since we can still find in the frame the original page; if it is for another
page, then we have a Hard Page Fault since the page has to
be brought in from disk. Now for a Hard page fault we will have
to wait for a single IO operation. Of course Soft page faults will be handled
much more quickly than Hard page faults.
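
The distinction between Soft and Hard page faults can be illustrated with a
small simulation. This is a minimal sketch under stated assumptions, not a
description of any real operating system: the class name FramePool, the FIFO
choice of which page to retire into the "free" pool, and the sample reference
string are all hypothetical. Retired pages are assumed to have already been
written to disk, as described above.

```python
from collections import OrderedDict, deque


class FramePool:
    """Toy model of a pool of "free" frames that still hold their old pages."""

    def __init__(self, total_frames, pool_size):
        assert 1 <= pool_size <= total_frames
        self.total_frames = total_frames
        self.pool_size = pool_size
        self.active = OrderedDict()  # pages in use, in load order (FIFO)
        self.free_pool = deque()     # pages whose frames are marked "free"

    def _refill_pool(self):
        # Once every frame is occupied, retire the oldest active pages into
        # the pool (assumption: their content is written to disk right here).
        while (len(self.active) + len(self.free_pool) >= self.total_frames
               and len(self.free_pool) < self.pool_size
               and self.active):
            page, _ = self.active.popitem(last=False)
            self.free_pool.append(page)

    def access(self, page):
        """Return "hit", "soft" or "hard" for one reference to page."""
        if page in self.active:
            return "hit"
        if page in self.free_pool:
            # The page is still in its frame: reclaim it without any IO.
            self.free_pool.remove(page)
            kind = "soft"
        else:
            # The page must be read from disk: a single IO operation, since
            # the frame being reused was already cleaned.
            if len(self.active) + len(self.free_pool) >= self.total_frames:
                self.free_pool.popleft()
            kind = "hard"
        self.active[page] = None
        self._refill_pool()
        return kind


if __name__ == "__main__":
    pool = FramePool(total_frames=3, pool_size=1)
    for p in [1, 2, 3, 1, 4]:
        print(p, pool.access(p))
```

In this run the second reference to page 1 is a Soft fault because page 1 was
retired to the pool but its frame had not yet been reused.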
If you go over the various replacement policies you find that they assume that
a program, say with n pages, executes with m frames. Nothing is said about how
m was chosen. The replacement policy says only how to use those frames. Here
we discuss how to choose m.
Suppose that we are in a system where pages are backed to a single disk and
this disk has an access time that is 10ms. Then we know that the maximum
number of page faults that our system can handle is 100 per second. If we run
my program on this system and give to it a lot of page frames, we may get, say,
20 page faults per second. If the number of frames available to my program is
reduced we may get 40 page faults per second. By reducing further the number
of frames, we may get to 100 faults per second. But beyond that point, no
matter how few frames my program is given, we will not get more than 100
faults per second. If I want my program to keep the cpu busy, then, if f is
the number of faults it generates per second, M is the page fault penalty, and
c is its compute time per second, we should have c >= f*M [in our case, with
c in milliseconds, c >= 10*f] so that the program runs at least as much as it
waits for faults and the cpu does not have to wait for the io channels.
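
The two bounds in this paragraph (the disk limit on faults per second, and
the condition c >= f*M) can be written down directly. The sketch below is
illustrative only: the function names and the sample values of c and f are
assumptions, while the 10 ms access time is the one used in the text.

```python
def max_fault_rate(disk_access_ms):
    """Maximum page faults per second a single disk can service."""
    return 1000.0 / disk_access_ms


def keeps_cpu_busy(compute_ms_per_s, faults_per_s, fault_penalty_ms):
    """True when c >= f*M, i.e. the program computes at least as long
    as it waits for page faults."""
    return compute_ms_per_s >= faults_per_s * fault_penalty_ms


if __name__ == "__main__":
    M = 10.0  # ms, the disk access time assumed in the text
    print(max_fault_rate(M))            # faults per second the disk allows
    # Hypothetical program: 600 ms of computing and 40 faults each second.
    print(keeps_cpu_busy(600.0, 40, M))
    # Same program with fewer frames: 200 ms of computing, 80 faults.
    print(keeps_cpu_busy(200.0, 80, M))
```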
If instead of thinking only of my program, I think of all the programs
currently executing, then if cumulatively they create 60 page faults per
second, the disk is not fully utilized; if the faults are close to 100 per
second, then probably programs are waiting on page faults more than they
should. If a program in a second spends more time waiting for page faults
than using the cpu, then that program has more page faults than it should.
Clearly, if programs have a mean time between faults that is no less than
the fault penalty, then, on average, the cpu will not be idle while the disk
is handling faults. If we make them equal, then both cpu and IO are equally
busy. Going back to our analysis of the performance of virtual memory, a mean
time between faults equal to the fault penalty means R*H = M, so the expected
access time H + M/R becomes 2*H: in this "ideal" case, the access time to
virtual memory is 2 times the access time to physical memory.
We call Working Set of a program the set of distinct pages used by that
program in a specified time interval; its size is the number of frames the
program needs over that interval. We (the operating system) try to allocate
to a program enough frames to accommodate a desired working set. A desirable
working set is one, as discussed above, where for the chosen process the cpu
time is not less than the fault handling delay time. Since the operating
system may be unable to give to each process the desired number of frames,
the OS has two choices: to let programs run even if they have too many page
faults (they are thrashing), or to swap out the programs whose desired
working set cannot be satisfied. The OS follows the second policy.
This is the Working Set Principle: Let a program run only if we can
accommodate its desired working set.
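
As a rough illustration of these definitions, the sketch below computes the
working set of a page reference string over a sliding window and applies the
Working Set Principle as a simple admission test. The function names, the
window measured in number of references (rather than in time), and the sample
reference string are assumptions made for the example.

```python
def working_set(references, t, window):
    """Distinct pages referenced in the last `window` references up to and
    including position t (0-based) of the reference string."""
    start = max(0, t - window + 1)
    return set(references[start:t + 1])


def may_run(desired_working_set_size, available_frames):
    """Working Set Principle: run a program only if its desired working
    set fits in the frames we can give it."""
    return desired_working_set_size <= available_frames


if __name__ == "__main__":
    refs = [1, 2, 1, 3, 4, 4, 3, 4, 2]   # hypothetical page references
    ws = working_set(refs, t=7, window=4)
    print(sorted(ws), len(ws))
    print(may_run(len(ws), available_frames=3))
```

A program whose working set does not fit would, under the policy described
above, be swapped out rather than left to thrash.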
ingargiola.cis.temple.edu