Let's discuss some of the trade-offs that virtual memory systems entail. First, let us look at page size.
Remember that in most systems, pages and frames are the same size, typically between 512 and 8192 bytes, and always a power of 2. Page size is perhaps the most critical factor in the design and tuning of a virtual memory system. If pages are too small, page tables become too large, and the associative memory (TLB) must hold more entries to achieve the same hit ratio. If pages are too large, internal fragmentation becomes unacceptably high, and each page fault forces many more bytes to be transferred between disk and memory.
Let's go through a calculation. Suppose that addresses are 24 bits wide, meaning there are 16 megabytes of virtual memory. If pages were 8K (8192 bytes), the low-order 13 bits of an address would be the offset within a page, leaving 11 bits for the page number, and a program's page table would have 2048 entries, since 16M ÷ 8K = 2048. Also suppose there are only 2 megabytes of real memory. 2,097,152 (2M) divided by 8192 gives 256, which is the actual number of frames. Thus, an 11-bit page number must be mapped into an 8-bit frame number.
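The arithmetic above can be checked with a short sketch (the variable names are mine, not part of the example):

```python
# Figures from the example: 24-bit addresses, 8 KB pages, 2 MB of real memory.
ADDRESS_BITS = 24
PAGE_SIZE = 8192                  # bytes; always a power of 2
REAL_MEMORY = 2 * 1024 * 1024     # 2 MB

offset_bits = PAGE_SIZE.bit_length() - 1         # 13 bits of offset within a page
page_number_bits = ADDRESS_BITS - offset_bits    # 11 bits of page number
num_pages = 2 ** page_number_bits                # 2048 virtual pages
num_frames = REAL_MEMORY // PAGE_SIZE            # 256 frames
frame_number_bits = num_frames.bit_length() - 1  # 8 bits of frame number

print(offset_bits, page_number_bits, num_pages, num_frames, frame_number_bits)
# 13 11 2048 256 8
```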
Assuming two additional bits are needed in the page table for each entry (present/absent and dirty bits), each page table entry, without disk addresses, needs 2+8=10 bits. Since it is difficult to allocate part of a byte, this would have to be rounded up to 16 bits per entry, or 2 bytes. Thus the page table would be 2048*2=4096 bytes long, which is 4K. This is how much each individual process or program would need just for its page table.
But there is usually more than one program running at any given time, so the total amount of memory devoted to page tables is 4K times the number of jobs allowed at once. Supposing there can be 50 jobs, then 200K would have to be set aside for page tables. This is only the "skinny" version, too, without disk addresses, so more than 200K would have to be set aside once those are included. Adding 40 bytes for a disk address to the 2 bytes per page table entry, we get 42 bytes times 2048 entries, or 86,016 bytes, which is 84K for one job's complete page table. Fifty jobs, each needing 84K, would eat up an astonishing 4.2 megabytes for page tables, which is twice as much memory as we assumed the entire computer had! Further, our assumption of 8K pages is generous; more typically, pages are 4096 bytes, which would double all of these numbers.
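Continuing the same sketch, here is the page-table memory budget for one job and for 50 jobs, with and without the 40-byte disk addresses (all figures taken from the text):

```python
NUM_PAGES = 2048                        # entries per page table (from above)
# 8 frame bits + present bit + dirty bit = 10 bits, rounded up to a 16-bit word:
entry_bytes = 2
table_bytes = NUM_PAGES * entry_bytes   # 4096 bytes = 4 KB per job ("skinny")

JOBS = 50
print(JOBS * table_bytes // 1024)       # 200 KB for all skinny page tables

full_entry_bytes = entry_bytes + 40     # add a 40-byte disk address per entry
full_table_bytes = NUM_PAGES * full_entry_bytes   # 86,016 bytes = 84 KB per job
print(JOBS * full_table_bytes)          # 4,300,800 bytes, about 4.2 MB total
```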
Most jobs are neither allowed to use all of their address space nor allowed to reserve a page of disk swap space for every byte of their virtual address space. For a 32-bit machine, the latter would require about 4 billion bytes (4 gigabytes) per job. The more typical arrangement is to assign each job a small amount of memory, such as 8 megabytes; if the job grows into unused regions of its stack or heap, additional pages are allocated, up to a limit determined by the system administrator.
Another factor in determining the size of pages is the size of the working set and of typical loops inside programs. If a loop fits within one page, then once that page is resident the loop causes no further page faults, and every dynamic address translation will find the page number in the TLB. If the page size is too small, however, the loop's code will frequently spill outside a single page, possibly causing page faults. Of course, this can happen even when the loop is no larger than a page, if the loop happens to span a page boundary. In that case, both pages would soon become resident in frames and would stay there until the loop finished.
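To make the page-boundary point concrete, here is a small illustration of my own (the addresses are invented, not from the text): counting how many pages a loop's code range touches under different page sizes.

```python
def pages_touched(start, length, page_size):
    """Number of distinct pages covered by the range [start, start+length)."""
    first = start // page_size
    last = (start + length - 1) // page_size
    return last - first + 1

# A 1 KB loop that happens to begin 512 bytes before a 4 KB page boundary:
print(pages_touched(0x0E00, 1024, 4096))   # 2 pages: both frames must stay resident
print(pages_touched(0x0E00, 1024, 8192))   # 1 page with the larger page size
```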
The solution to these woes is to increase the number of frames a job is allowed to hold at any given time. Yet if the page size is too large, we must either run fewer programs at once or increase the total amount of memory.
It is worthwhile to ask whether virtual memory solves or exacerbates the protection problems mentioned in the last chapter. Recall that we do not want malicious or errant programs writing into any region of memory that does not belong to them, and the OS must handcuff any that do. Some systems, like the CDC machines, use a separate field length register to compare each address against the range of addresses allowed for the currently running program. Other machines, such as IBM mainframes, attach a short key to each block of memory and compare it for equality against the key of the currently running program.
Surprisingly, virtual memory systems do not need any extra protection mechanisms, because a program does not even "know" about addresses that are not currently mapped into its virtual address space. The only way a program can get at real memory location X is if X lies in one of the frames currently allocated to the program via its page table. And since frames are allocated to jobs on an all-or-nothing basis, with no jobs sharing memory inside a single frame, it is impossible for one program to mess up another's memory.
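This property can be sketched in a few lines. The following is a hypothetical, simplified address translator (not modeled on any real OS): a job can only reach frames that appear in its own page table, and any other address simply fails to translate.

```python
PAGE_SIZE = 8192

def translate(page_table, virtual_addr):
    """Map a virtual address to a physical address, or fault if unmapped."""
    page = virtual_addr // PAGE_SIZE
    offset = virtual_addr % PAGE_SIZE
    frame = page_table.get(page)          # page table maps page -> frame
    if frame is None:
        raise MemoryError(f"fault: page {page} not mapped for this job")
    return frame * PAGE_SIZE + offset

job_a = {0: 5, 1: 9}                      # job A owns frames 5 and 9 only
print(translate(job_a, 100))              # 5*8192 + 100 = 41060
try:
    translate(job_a, 3 * PAGE_SIZE)       # page 3 belongs to some other job
except MemoryError as e:
    print(e)                              # the access never reaches real memory
```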
There are many other issues involved in the design of virtual memory systems and a huge amount of effort has been expended to get them to work efficiently. We have only touched on the major points.
However, virtual memory has been enormously successful, and today it is considered as essential as the adder or the disk drive! Even personal computers use virtual memory, though they are not always truly multiprogrammed. Since personal computers seem to lag behind mainframes by about 20 years, anything that appeared in the mainframes of the past should be showing up on the shelves of your local computer store any day now.