Simplify Programming

From PEL Wiki

Dan Olsen mentioned that our large memory device could simplify programming. Programs could allocate what he called hunks and link them together, which would help the memory subsystem know which hunks are persistent and which are related.

He also commented that, while the disk is idle, the head could be sent to empty portions of the disk to write out dirty blocks. Wherever a block is written, the mapping would be updated to point to its new location. This would reduce disk thrashing for writes.
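To make the idea concrete for myself, here is a minimal sketch of writing dirty blocks to whatever free spot is closest to the head and then updating the mapping. Everything in it (the `RemappingDisk` class, its fields, the "nearest free block" policy) is a hypothetical toy model, not something Dan described in detail.

```python
class RemappingDisk:
    """Toy model: dirty blocks go to the free physical block nearest the head."""

    def __init__(self, num_physical):
        self.physical = [None] * num_physical  # contents of each physical block
        self.free = set(range(num_physical))   # physical blocks with no live data
        self.mapping = {}                      # logical block -> physical block
        self.dirty = {}                        # logical block -> data not yet on disk
        self.head = 0                          # current head position

    def write(self, logical, data):
        # A program write only marks the block dirty in memory.
        self.dirty[logical] = data

    def flush_when_idle(self):
        # Assumed policy: pick the free block closest to the head
        # (ties go to the lower block number), then repoint the mapping.
        for logical, data in list(self.dirty.items()):
            if not self.free:
                raise RuntimeError("no free physical blocks")
            target = min(self.free, key=lambda p: (abs(p - self.head), p))
            self.physical[target] = data
            self.free.discard(target)
            old = self.mapping.get(logical)
            if old is not None:
                # The previous copy is now stale, so its block becomes free.
                self.physical[old] = None
                self.free.add(old)
            self.mapping[logical] = target
            self.head = target
            del self.dirty[logical]

    def read(self, logical):
        # Dirty data in memory is newer than anything on disk.
        if logical in self.dirty:
            return self.dirty[logical]
        return self.physical[self.mapping[logical]]
```

The point of the sketch is that a logical block has no fixed home: rewriting it moves it to wherever the head already is, and only the mapping has to change.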

I searched for flat address spaces on Google and found the following. I think the interesting concept is that it makes programming simpler. Wouldn’t our approach have the same impact?

Almost all popular processors have a flat address space, but the Intel x86 family has a segmented address space. A flat address space greatly simplifies programming because of the simple correspondence between addresses (pointers) and integers.
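A small illustration of the pointer/integer correspondence, assuming CPython and its standard `ctypes` module (the buffer contents are just an example). The address of a buffer is an ordinary integer, and ordinary integer arithmetic on it reaches neighbouring bytes.

```python
import ctypes

buf = ctypes.create_string_buffer(b"hunk")  # 5 bytes: 'h', 'u', 'n', 'k', NUL
base = ctypes.addressof(buf)                # the buffer's address as a plain int

# Integer arithmetic on the address reaches the third byte directly.
third = ctypes.c_char.from_address(base + 2).value

# Writing through a computed address changes the original buffer.
ctypes.c_char.from_address(base).value = b"j"
```

After this runs, `third` holds the byte `n` and the buffer reads `junk`. With a segmented address space, `base + 2` would not be enough on its own, since a segment would also have to be named.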


How about the following from Microsoft? Notice the two advantages outlined below. Wouldn’t our technique be the extreme case of this approach? We would map all of every file. The more the better, right?


File Mapping

File mapping is the association of a file's contents with a portion of the virtual address space of a process. The system creates a file mapping object to maintain this association. A file view is the portion of virtual address space that the process uses to access the file's contents. Processes read from and write to the file view using pointers, just as they would with dynamically allocated memory. Processes can also manipulate the file view with the VirtualProtect function. File mapping provides two major advantages:

• Faster and easier file access

• Shared memory between two or more applications

File mapping allows a process to access files more quickly and easily by using a pointer to a file view. Using a pointer improves efficiency because the file resides on disk, but the file view resides in memory. File mapping allows the process to use both random input and output (I/O) and sequential I/O. It also allows the process to efficiently work with a large data file, such as a database, without having to map the whole file into memory. When the process needs data from a portion of the file other than what is in the current file view, it can unmap the current file view, then create a new file view. The file mapping functions allow a process to create file mapping objects and file views to easily access and share data. The following illustration shows the relationship between the file on disk, a file mapping object, and a file view.

The file on disk can be any file that you want to map into memory, or it can be the system page file. The file mapping object can consist of all or only part of the file. It is backed by the file on disk. This means that when the system swaps out pages of the file mapping object, any changes made to the file mapping object are written to the file. When the pages of the file mapping object are swapped back in, they are restored from the file. A file view can consist of all or only part of the file mapping object. A process manipulates the file through the file views. A process can create multiple views for a file mapping object. The file views created by each process reside in the virtual address space of that process.

Windows Me/98/95: All file views reside in the shared address space. The shared address space exists in the range between 2 and 3 gigabytes in the virtual address space for each process. It contains the 16-bit heap and shared system DLLs, as well as file views.

When multiple processes use the same file mapping object to create views for a local file, the data is coherent. That is, the views contain identical copies of the file on disk. The file cannot reside on a remote computer if you want to share memory between multiple processes.
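To try out the view mechanics described above without writing Win32 code, here is a rough sketch using Python's standard `mmap` module, which wraps the platform's mapping calls. The file name, the three-granule layout, and the variable names are all made up for the example. It maps a view over part of a file, writes through it, checks that a second view sees the change, then drops the view and creates a new one over a different part of the file.

```python
import mmap
import os
import tempfile

GRAN = mmap.ALLOCATIONGRANULARITY  # view offsets must be a multiple of this

# Hypothetical data file: three granules, each filled with a distinct byte.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(b"A" * GRAN + b"B" * GRAN + b"C" * GRAN)

with open(path, "r+b") as f:
    # A view over only the second granule of the file.
    view = mmap.mmap(f.fileno(), GRAN, offset=GRAN)
    first_in_view = view[0:1]
    view[0:4] = b"bbbb"  # write through the view as if it were memory

    # A second view over the whole file observes the same bytes.
    whole = mmap.mmap(f.fileno(), 0)
    seen_by_whole = whole[GRAN:GRAN + 4]

    # Unmap the current view, then create a new view over the third granule.
    view.flush()
    view.close()
    view = mmap.mmap(f.fileno(), GRAN, offset=2 * GRAN)
    first_in_new_view = view[0:1]

    view.close()
    whole.close()

# Ordinary file I/O sees what was written through the mapping.
with open(path, "rb") as f:
    f.seek(GRAN)
    on_disk = f.read(4)
```

Both views here live in one process, which is the simplest case. Sharing between processes works the same way in principle, but I have not exercised that here.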

More stuff

Memory mapped files are seductive because they offer the lure of reading and writing data on disk using only a memory pointer. Advance the pointer to a new address, and presto! the data magically appears there. The system takes care of reading the data from disk on demand, using the memory page protection architecture of the x86 virtual memory controller. If you refer to an address that has not yet been loaded into RAM, a page fault occurs behind the scenes and reads the data into RAM for you. Your program doesn't notice this activity because your thread is suspended while the page fault is processed.
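For comparison, here is a sketch of fetching one record from a file both ways: with explicit seek and read calls, and by indexing into a mapping. The record format (4-byte little-endian integers), the file name, and the two helper functions are assumptions made for the example. It says nothing about which approach is faster.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<I")  # hypothetical format: 4-byte little-endian integer

# Build a sample file of 1000 records, where record i holds i * i.
path = os.path.join(tempfile.mkdtemp(), "records.bin")
with open(path, "wb") as f:
    for i in range(1000):
        f.write(RECORD.pack(i * i))

def read_with_file_io(path, index):
    # Explicit I/O: position the file pointer, then read into a buffer.
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))[0]

def read_with_mmap(path, index):
    # Mapped access: compute an offset and decode in place; the system
    # pages the data in on demand when the bytes are touched.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return RECORD.unpack_from(m, index * RECORD.size)[0]
```

Both functions return the same value for the same index. The difference is only in who does the bookkeeping: the program, or the virtual memory system.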

MMFs give you simple access to data on disk without all the source code overhead of file I/O and buffering. It's simple and it's fast. So it must be better than the old way of doing things, right? Not necessarily.


Memory mapped files are not always faster than custom data loading algorithms. You have no control over how much of the MMF is kept in memory or for how long. This means that using an MMF may push other things out of RAM, such as code or data pages that you will need back "soon".

Also, page faults are not free. A page fault can take a lot longer for the system to process than a simple file I/O call. The additional system overhead of using page faults is hidden by the fact that fault processing is performed in the system kernel on a different thread, not in your process.

More: http://whitepapers.zdnet.co.uk/0,39025945,60025031p-39000493q,00.htm