Every Windows administrator has to field user complaints about client performance. Client-system performance can be affected by factors such as memory, CPU, disk and the network. Of these factors, the most confusing is memory management, which admins need to understand to make informed decisions and troubleshoot problems.
Users typically equate adding memory to resolving performance bottlenecks, and it's relatively cheap and easy to add memory. But does adding memory really improve performance? This article boils down memory management to simple terms. Part 2 in this series will tie these concepts to the information shown in Windows 7 Task Manager features and Resource Monitor and identify common memory-related issues.
It's important to know the difference between physical and virtual memory. Physical memory is the amount of physical RAM available in the computer. Physical memory can be visualized as a table shown in Figure 2, where data is stored. Each cell shown in the table is a unique "address" where data is stored.
Virtual memory essentially allows each process -- applications, dynamic link libraries (DLLs), etc. -- to operate in a protected environment where it thinks it has its own private address space. Figure 1 shows the virtual memory table for a process on a computer with 2 GB of RAM. The CPU translates or maps the virtual addresses into physical addresses in RAM using page table entries (PTEs).
Virtual memory limits
The virtual address space for 32-bit architecture has a physical limit of about 4 GB, regardless of the amount of RAM in the computer. Windows divides this into two sections, as shown in Figure 2: user space and kernel space. The addresses in the kernel space are reserved for system processes. Only those in the user space are accessible for applications. So, each application has a virtual memory limit of 2 GB. Again, this is regardless of physical RAM. That means that no process can ever address more than 2 GB of virtual address space by default. Exceeding this limit produces an "out of virtual memory" error and can occur even when plenty of physical memory is available.
In a simple example, consider having several applications open at the same time on a 32-bit platform. Each application or process has 2 GB of virtual address space and has no idea that it is sharing physical RAM with other processes. Memory management will then use PTEs to translate the virtual address space for each process to physical RAM.
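The translation step can be pictured with a toy model. The sketch below is only a conceptual illustration, not how Windows implements its page tables: it assumes 4 KB pages and one flat lookup table per process, and the names `PAGE_SIZE`, `translate`, `process_a` and `process_b`, along with the frame numbers, are hypothetical.

```python
PAGE_SIZE = 4096  # assumed 4 KB page size for this illustration


def translate(page_table, virtual_address):
    """Map a virtual address to a physical address through a toy page table.

    page_table maps virtual page numbers to physical frame numbers.
    A missing entry raises KeyError, standing in for a page fault.
    """
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    frame_number = page_table[page_number]
    return frame_number * PAGE_SIZE + offset


# Two processes use the same virtual address but land in different RAM frames.
process_a = {0: 7, 1: 3}   # hypothetical mappings: virtual page -> physical frame
process_b = {0: 12}

physical_a = translate(process_a, 0x0010)
physical_b = translate(process_b, 0x0010)
```

Each process sees address 0x0010 as its own, yet the two lookups resolve to different locations in RAM, which is the isolation that virtual memory provides.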
Note that, as shown in Figure 2, the use of virtual memory allows the three applications, each with 2 GB of virtual address space, to share the 2 GB RAM in the computer. This is accomplished by paging infrequently used data to disk, then paging it back to RAM when needed.
Processes run faster if they reside in memory than if the memory manager has to page their data in from disk and put it back in memory. Thus, more memory in the system allows more processes to reside in memory and reduces paging from disk.
However, if x86 processors can address only 4 GB of address space and the computer has 12 GB of RAM, you may wonder whether 8 GB of that RAM is wasted. The answer is the Physical Address Extension (PAE). This is a processor feature -- supported by Intel and AMD -- that extends the physical address space to about 64 GB. PAE is enabled by default, but it requires chipsets capable of PAE and applications written to take advantage of it.
We can also steal 1 GB from kernel space and add it to user space by using the /3GB switch in Boot.ini or the equivalent setting in Boot Configuration Data (BCD). This is commonly done for server applications such as 32-bit Microsoft Exchange Server, which uses almost 3 GB of virtual address space. It is potentially dangerous, though, because we're reducing the memory that the kernel may need.
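The 32-bit limits above come down to simple arithmetic. The following sketch just works through the numbers; the variable names are made up for illustration, and the 36-bit figure for PAE is an assumption based on the roughly 64 GB limit quoted above.

```python
GB = 2**30  # binary gigabyte

address_space_32bit = 2**32                      # 32 address bits -> 4 GB
default_user, default_kernel = 2 * GB, 2 * GB    # default 2 GB / 2 GB split
user_with_3gb, kernel_with_3gb = 3 * GB, 1 * GB  # split after the /3GB switch
pae_physical_limit = 2**36                       # assumed 36 physical address bits

print(address_space_32bit // GB)   # total virtual address space in GB
print(pae_physical_limit // GB)    # physical RAM addressable with PAE in GB
```

Note that the /3GB switch only moves the boundary inside the same 4 GB virtual address space, while PAE raises the amount of physical RAM that can be addressed.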
Figure 1 conceptually shows how multiple processes map their virtual address space to physical RAM. Note that in order to do this above 4 GB, you must use the PAE, as shown in Figure 3. The 64-bit architecture (x64) permits 2^64 bytes, or about 16 EB, of address space. Because of limitations in the first x64 processors, however, Windows' current implementation is limited to 16 TB -- 8 TB in user space and 8 TB in kernel space, as shown in Figure 4.
This makes a huge difference in memory addressing. Rather than having to use the PAE and /3GB options and another level of complexity to address more virtual memory, x64 addresses it in a flat model. This allows up to 1 TB of RAM to be addressed, allowing more processes to run in memory without paging, significantly increasing performance for memory-intensive processes.
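To put the x64 figures in perspective, the short sketch below works through the arithmetic. It assumes binary units (1 TB = 2^40 bytes, 1 EB = 2^60 bytes), and the variable names are illustrative only.

```python
GB = 2**30
TB = 2**40
EB = 2**60

theoretical_x64 = 2**64            # full 64-bit address space
windows_x64_limit = 16 * TB        # limit cited for the current implementation
user_space = kernel_space = 8 * TB

# Number of address bits actually needed to cover 16 TB.
bits_used = windows_x64_limit.bit_length() - 1

# How many 32-bit (4 GB) address spaces fit into 16 TB.
times_larger_than_32bit = windows_x64_limit // (4 * GB)
```

Even the reduced 16 TB limit is thousands of times larger than the 4 GB available to a 32-bit system, which is why the PAE and /3GB workarounds are no longer needed.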
When a process starts up, it reserves a certain amount of memory -- as dictated by the developer in the code -- but it requires few physical resources. This is analogous to making a hotel reservation. Making the reservation just takes some time to enter it in the system and talk to the customer. It does not block off the room until you arrive.
Committed memory occurs when the process requests the memory manager to back the reservation with memory (RAM and page file). In the hotel example, this would be checking into the hotel.
You have now blocked off a room, but you haven't used any resources -- no power, no water, etc. Similarly, at this point, nothing has been written to memory. When the memory is written to, virtual memory pages will then be mapped into RAM, and memory will be consumed. In the hotel, this is where you enter the room and consume water, electricity, housekeeping services, etc. Each process performs these actions as it runs.
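The three stages (reserve, commit, write) can be modeled in a few lines. This is a toy sketch for illustration, not the Windows API: the `Region` class and its methods are hypothetical, and the sizes are arbitrary example values in megabytes.

```python
class Region:
    """Toy model of one virtual memory region; all sizes are in MB."""

    def __init__(self, reserved_mb):
        self.reserved = reserved_mb  # address space set aside (the reservation)
        self.committed = 0           # backed by RAM + page file (checked in)
        self.touched = 0             # actually written, so consuming RAM

    def commit(self, mb):
        # A process can only commit memory inside its reservation.
        if self.committed + mb > self.reserved:
            raise ValueError("cannot commit more than was reserved")
        self.committed += mb

    def write(self, mb):
        # Only committed memory can be written to.
        if self.touched + mb > self.committed:
            raise ValueError("cannot write to uncommitted memory")
        self.touched += mb


region = Region(reserved_mb=2048)  # make the reservation: 2 GB of address space
region.commit(500)                 # check in: 500 MB now counts against the commit limit
region.write(50)                   # use the room: only 50 MB consumes physical memory
```

The point of the model is that the three numbers diverge: a large reservation costs almost nothing, committed memory counts against the system commit limit, and only written pages occupy RAM.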
Committing to memory
The sum of all the memory used by active processes in the system is referred to as the system commit charge and is displayed in Task Manager along with the system commit limit. In Figure 5, the Windows 7 Task Manager shows the committed memory values in the "Commit (MB)" line.
The first number (3,528 in the figure) is the commit charge or currently used memory, while the second number (7,990) is the system commit limit (RAM + page file). In the Physical Memory section, the Total (RAM) is 3,996 (4 GB), so we have about a 4 GB page file. XP also displayed the Commit Peak, which is the peak amount used in a given time period and is useful for planning.
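The page file size can be derived from the Task Manager values quoted above. The sketch below simply repeats that arithmetic; the variable names are illustrative, and it assumes the commit limit is exactly RAM plus page file, as described in the text.

```python
commit_charge_mb = 3528   # first number on the "Commit (MB)" line
commit_limit_mb = 7990    # second number on the "Commit (MB)" line
total_ram_mb = 3996       # Total in the Physical Memory section

# Commit limit = RAM + page file, so the page file is the difference.
page_file_mb = commit_limit_mb - total_ram_mb

# Memory that can still be committed before the limit is reached.
headroom_mb = commit_limit_mb - commit_charge_mb

percent_used = 100 * commit_charge_mb / commit_limit_mb
print(page_file_mb, headroom_mb, round(percent_used, 1))
```

A commit charge that approaches the commit limit is the condition to watch for, since new commit requests fail once the limit is reached.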
Figure 6 demonstrates the memory management process described above. Here we see the 8 GB virtual address space (4 GB RAM, 4 GB page file) for a process that has reserved 2 GB and committed 500 MB but has only written 50 MB to memory. The PTEs map this virtual address space to physical space. These pages show up in the Working Set and may or may not be mapped contiguously in physical RAM.
Figure 7 shows a graphical representation of the Windows page frames, which will help explain how this works. On system boot, all memory resides in the Free Page List. The zero page thread places pages in the Zero Page List.
When a process starts, it takes only the pages it needs from the Zero Page List and creates a working set. Each process has its own working set, and pages are added as needed. Note that a program such as a DLL or executable file does not completely "load" into memory. Only the portion of the program that is being used is loaded into memory, with the remainder staying on disk. So if only 50 KB of a 500 KB .exe is required, only 50 KB will be loaded into memory.
As the demand grows, additional pages will be added to the working set. At some point, Windows Memory Management (WMM) decides that the process's working set is large enough and will start trimming the pages that have been idle the longest. Rather than clearing them and then having to create and write them again when needed, WMM moves them to the standby list or the modified page list.
Pages that are modified -- such as text entered in a Word document -- will be moved to the modified page list. Periodically, the modified page writer takes a bunch of pages from the modified page list, writes the data to disk and then sends the pages to the standby list. Pages on the standby list can be reused because the WMM knows where the data is (on disk in the page file).
Pages that are not modified are sent directly to the standby list since they are already on disk. The advantage of the standby list is that if the page is required again -- such as when you start working on a document that has been idle for a long time -- it can easily be retrieved from the modified page list or the standby list and written back into the working set very quickly. Note that a modified page would have to be written to disk and put on the standby list before it can be faulted back to the working set.
Windows will only load shareable data such as programs or DLLs into memory once. For instance, I opened Word three times to edit three documents. Figure 8 shows the Task Manager view, listing Winword.exe once. If the working set becomes large enough or if I leave the Word docs idle long enough, the pages holding Winword.exe will be moved to the standby list and may be reused.
The pages holding my documents were modified, so they would go to the modified list and eventually be written to the page file (on disk) and then to the standby list, unless the document was saved (in which case, it is on disk). Thus, the standby list is a file cache because it is a copy of something already on disk.
The process of paging between the process working sets and the modified and standby lists is called "soft page faulting" and is fairly inexpensive because it happens in physical memory. Retrieving data from disk is called "hard page faulting" and is far more expensive because it requires disk I/O.
When a program terminates or a process willingly releases private memory, its working set pages are released back to the Free List, which gets zeroed and moved to the Zero Page List. The purpose of zeroing pages is security -- to ensure that no process can reuse and expose sensitive data such as a password.
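The life cycle of a page across these lists can be summarized as a small state machine. The sketch below is a simplified model built only from the description above, not a representation of Windows internals. The `Page` class, its method names and the `PageState` values are hypothetical, and the model assumes a page on either the modified or the standby list can be soft-faulted back into the working set.

```python
from enum import Enum


class PageState(Enum):
    FREE = "free list"
    ZEROED = "zero page list"
    WORKING_SET = "working set"
    MODIFIED = "modified page list"
    STANDBY = "standby list"


class Page:
    def __init__(self):
        self.state = PageState.FREE  # at boot, every page is on the free list
        self.dirty = False

    def _expect(self, *allowed):
        if self.state not in allowed:
            raise ValueError(f"invalid transition from {self.state.value}")

    def zero(self):        # zero page thread: free -> zeroed
        self._expect(PageState.FREE)
        self.state = PageState.ZEROED

    def allocate(self):    # process takes a zeroed page into its working set
        self._expect(PageState.ZEROED)
        self.state = PageState.WORKING_SET

    def write(self):       # process modifies the page
        self._expect(PageState.WORKING_SET)
        self.dirty = True

    def trim(self):        # idle page leaves the working set
        self._expect(PageState.WORKING_SET)
        self.state = PageState.MODIFIED if self.dirty else PageState.STANDBY

    def flush(self):       # modified page writer saves the page to disk
        self._expect(PageState.MODIFIED)
        self.dirty = False
        self.state = PageState.STANDBY

    def soft_fault(self):  # page is needed again and is still in RAM
        self._expect(PageState.MODIFIED, PageState.STANDBY)
        self.state = PageState.WORKING_SET

    def release(self):     # process exits or frees the memory
        self._expect(PageState.WORKING_SET)
        self.state = PageState.FREE
        self.dirty = False


page = Page()
history = [page.state]
for step in (page.zero, page.allocate, page.write, page.trim,
             page.flush, page.soft_fault, page.release):
    step()
    history.append(page.state)

print(" -> ".join(state.name for state in history))
```

Every transition in this model stays in physical memory except `flush`, which is the only step that touches the disk. A hard fault, where the data must be read back from disk, is deliberately left out of the sketch.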
Once you understand memory management, you can diagnose memory-related performance problems. Part 2 of this series will explain how to interpret the information provided in Windows 7 Task Manager features and Resource Monitor, along with several examples. Remember, though, just adding memory won't necessarily solve a problem.
Thanks to Clint Huffman at Microsoft for his technical contribution to this article and content for Figure 6. Figure 7 was created using concepts and ideas from Clint Huffman and Windows Internals 5th Edition by David A. Solomon and Mark Russinovich.
ABOUT THE AUTHOR:
Gary Olsen is a solution architect in Hewlett-Packard's Technology Services organization and lives in Roswell, Ga. He has worked in the IT industry since 1981 and holds an M.S. in computer-aided manufacturing from Brigham Young University. Olsen has authored numerous technical articles for TechTarget, Redmond Magazine and TechNet magazine, and he has presented numerous times at the HP Technology Forum. He is a Microsoft MVP for Directory Services and is the founder and president of the Atlanta Active Directory Users Group.