CSR CISE Outline
From PEL Wiki
Proposal
We are unifying disk and memory in personal computers, making main memory a cache for secondary storage. This combination (DiskRAM) presents a cleaner abstraction to the programmer, enables hardware-controlled run-time optimization of data locations, decouples file systems from disk technology, and enables persistent memory.
Combining disk and memory presents storage to the programmer at a more logical level of encapsulation and allows the location of data to be optimized at run time. Why should a programmer worry about how much memory is free on the system where the program runs? That cannot be known a priori on today's multitasking operating systems. Rather than guess, the programmer should only have to reason about the locality of the data the program accesses (the algorithm). A cleaner abstraction means more efficient and more portable code.
File systems have been optimized for disks.
Persistent memory allows the programmer to skip the overhead of flushes to disk to make sure that data is persistent.
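As a point of comparison, the closest analog on current systems is a memory-mapped file, where stores look like ordinary memory writes but the programmer must still request the flush explicitly. The sketch below is only an illustration of that flush step; the helper name `update_record`, the 4 KiB temporary file, and the offset are assumptions made for the example, not part of the proposed system.

```python
import mmap
import os
import tempfile

def update_record(path, offset, payload):
    """Write `payload` into a file through a memory mapping, then flush it."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mm:
            mm[offset:offset + len(payload)] = payload  # looks like a memory store
            mm.flush()  # explicit flush: the step persistent memory aims to remove

# Hypothetical 4 KiB backing file standing in for a region of secondary storage.
fd, path = tempfile.mkstemp()
os.write(fd, b"\0" * 4096)
os.close(fd)

update_record(path, 128, b"DiskRAM")

# Read the bytes back through the normal file interface.
with open(path, "rb") as f:
    f.seek(128)
    stored = f.read(7)
print(stored)
os.remove(path)
```

With persistent memory as described above, the store itself would be enough and the `flush()` call would have no counterpart.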
Background
Memory management is suboptimal due to a lack of hit data. When memory management is implemented in software, the only information it can collect on page usage is page misses. This means that a page accessed once during a time quantum appears just as heavily used as a page that had millions of hits during the same period. To collect usage data, the paging algorithm periodically marks all pages as unused and flushes the translation lookaside buffer (TLB). Flushing more often increases the information available for making swapping decisions, but it also increases the overhead of memory management. Pandey et al., in their ASPLOS '04 paper, make the case for better tracking of page misses [1]. They show that even a small increase in the amount of memory usage information available for decision making improves the response time of interactive applications, even though it adds about 7% in overhead. They also propose a hardware implementation for collecting memory usage information that eliminates that overhead.
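The loss of information described above can be illustrated with a toy model. This is a simplified sketch, not the algorithm of any real kernel or of [1]: it assumes one "used" mark per page that is cleared every `quantum` accesses, and compares that with a hypothetical per-page hit counter. The function names and the access trace are invented for the example.

```python
from collections import Counter

def reference_bit_view(accesses, quantum):
    """Software-visible view: one 'used' mark per page per quantum.

    `accesses` is a list of page numbers in access order; `quantum` is the
    assumed number of accesses between two clearings of the marks.
    Returns, per page, the number of quanta in which it was touched.
    """
    quanta_touched = Counter()
    for start in range(0, len(accesses), quantum):
        # Marks are cleared at the start of each quantum, so only
        # "touched at least once" survives, not how often.
        for page in set(accesses[start:start + quantum]):
            quanta_touched[page] += 1
    return quanta_touched

def hit_count_view(accesses):
    """Hypothetical hardware view: an exact access counter per page."""
    return Counter(accesses)

# Page 0 is hot (1000 accesses per quantum); page 1 is touched once per quantum.
trace = ([0] * 1000 + [1]) * 4
bits = reference_bit_view(trace, quantum=1001)
hits = hit_count_view(trace)

print("marks:", bits[0], bits[1])  # the two pages are indistinguishable
print("hits: ", hits[0], hits[1])  # the counters separate hot from cold
```

Shrinking `quantum` in this model gives the mark-based view more resolution, which mirrors the trade-off in the text: more frequent clearing yields more information at the cost of more overhead.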
Another interesting paper is from Ekman and Stenstrom [3]. They make a case for multi-level main memory by observing that much of the working set of a program can be kept in a memory which is an order of magnitude slower with negligible performance degradation.
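A back-of-the-envelope calculation shows why a slower level can be tolerable when most accesses are served by the fast level. The latencies and fractions below are illustrative assumptions only; they are not measurements from [3], and the model ignores processor caches and overlap between accesses.

```python
def average_access_time(fast_fraction, fast_ns, slow_ns):
    """Mean access time when `fast_fraction` of accesses hit the fast level."""
    if not 0.0 <= fast_fraction <= 1.0:
        raise ValueError("fast_fraction must be between 0 and 1")
    return fast_fraction * fast_ns + (1.0 - fast_fraction) * slow_ns

# Assumed latencies: the slow level is an order of magnitude slower.
fast_ns, slow_ns = 100.0, 1000.0

for fraction in (0.90, 0.99, 0.999):
    t = average_access_time(fraction, fast_ns, slow_ns)
    print(f"{fraction:.3f} of accesses fast: {t:.1f} ns ({t / fast_ns:.3f}x)")
```

Under these assumptions the slowdown falls quickly as the fast level captures a larger share of accesses, so the key design question is how well the hot part of the working set can be identified and placed.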
Some of the earliest computers, as well as the IBM System/38 and AS/400, included this notion of mapping the contents of secondary storage into the memory space, but physical address size limitations forced them to manage the mappings in software [2].
Mainstream processors are available now with terabyte physical address spaces.
Disk gets used as storage, not input/output. The differences between disk and memory lie not in what information is stored there, but in access time, cost, and permanence.
Memory and disk overlap in file caches and swap. (Illustration still to be decided, possibly a Venn diagram; most illustrations of system memory do not show the types of data being stored.)
Research Plan
We are creating a system to enable us to study the interactions between memory and disk usage patterns in a running system.
Our metrics for improvement include bandwidth, latency, power consumption, and programmability (ease of use). We will further demonstrate that the unification of disk drives with memory can provide improvement in each of these areas, by utilizing the strengths of each technology. Our goal is to approach the programmability, latency, and bandwidth of memory while nearing the cost and power consumption of disk.
- Implement memory controller
- Implement disk controller
- Collect data for interesting benchmarks
  - Streaming media
  - Databases
  - Formal verification
Status
We have acquired a reconfigurable system which is based on a dual-processor server workstation. One of the processors is replaced by a Field Programmable Gate Array (FPGA), which allows us to use it as a reconfigurable chipset. It has access to the memory in the machine, and the PCI address space through the link to the remaining processor.
Other Research Lines
- Persistent storage (Only a small portion is dirty and in RAM at any one time)
- Reduce database overhead
- Reduce power consumption in laptops
- Rethink file systems and process state
- Smart disks/Intelligent RAM (The FPGA could process data before returning it if so instructed)
References (The complete citations are embedded in the first citations at the top of the page)
- [1] Vivek Pandey, Jagadeesan Sundaresan, Anand Raghuraman, Yuanyuan Zhou, and Sanjeev Kumar. Dynamic Tracking of Page Miss Ratio Curve for Memory Management. Proceedings of the Eleventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'04), October 2004.
- [2] Frank G. Soltis. Inside the AS/400: Featuring the AS/400e Series. 29th Street Press/NEWS/400 Books, 1997.
- [3] Magnus Ekman and Per Stenstrom. A Case For Multi-level Main Memory. Proceedings of the 3rd Workshop on Memory Performance Issues, in conjunction with ISCA'04, 2004.
