We propose to develop a series of low-latency file system solutions
based on a programmable disk. We achieve low latency by combining
knowledge of both the file system semantics and the disk mechanism,
and by avoiding storage system bottlenecks and interference with host
operations. Unlike the ``Active Disk'' approach, which offloads
processing onto the disk processors, our approach obeys existing file
system interfaces and so transparently improves the performance of
legacy software without excluding sophisticated customizations.
The proposal covers a full spectrum of options, ranging from
single-disk solutions to multiple-disk ones, from logical disks to
more closely integrated solutions, and from single-host systems to
clustered systems.

Local File Systems for Programmable Disks

Our initial goal is exceptional transaction latency. We pursue it with two key technologies. The first is eager writing, the strategy of writing to the free sectors closest to the current disk head location. The second is a virtual log, a transactional log whose entries are not physically contiguous. We then generalize this approach to multiple disks (called a MimdRAID) by aggressively exploiting parallelism. Our preliminary study has demonstrated an order of magnitude improvement in transaction latency. We believe it is possible to widen the gain to three orders of magnitude by further attacking the weaknesses uncovered in the current study and by riding the rapid growth in disk density. This approach also provides a powerful mechanism for addressing real-time constraints.

A Network File System for a Cluster of Intelligent Disks

We will develop a low-latency cluster file system that centers on the SHRIMP multicomputer interconnect. The system has three interesting features. First, it exploits the distributed device intelligence and allows a rich array of communication patterns, including disk-to-disk and disk-to-device. Second, unlike existing network-attached disks, this closely coupled system can take advantage of ``virtual memory-mapped communication'' to perform protected low-latency communication. We plan to investigate the construction of a reliable and coherent global file cache using this powerful mechanism. Third, we plan to explore the MimdRAID concepts described above in the cluster context. By combining these features, we believe we can exceed the performance gain seen on the local file system.

Application Demonstrations and I/O Programming Models

The Princeton Display Wall Project presents a challenging I/O application. We will orchestrate an end-to-end I/O solution that includes new I/O programming models to allow explicit application control. These programming models can elegantly and efficiently support multi-user and multi-process workloads in a way that is impossible with traditional stream-based approaches. Innovative security mechanisms will allow us to safely run injected code without sacrificing performance.
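
As a rough illustration of the eager writing and virtual log ideas described under the local file system work, the Python sketch below models a disk as a single circular track. Each write goes to the free sector that will next pass under the head, and each log entry records the sector of the previous entry, so the log can be replayed even though its entries are scattered. The single-track geometry, the forward-only rotational distance, and the names ToyDisk, eager_write, and replay are simplifying assumptions made for illustration only; they do not describe the proposed implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class ToyDisk:
    """Hypothetical one-track disk model; real disks have many tracks and seek costs."""
    num_sectors: int
    head: int = 0                      # sector currently under the head
    free: Optional[Set[int]] = None    # free sectors; defaults to all sectors
    # sector -> (payload, sector of the previous log entry or None)
    data: Dict[int, Tuple[bytes, Optional[int]]] = field(default_factory=dict)
    log_tail: Optional[int] = None     # sector holding the newest log entry

    def __post_init__(self) -> None:
        if self.free is None:
            self.free = set(range(self.num_sectors))

    def rotational_distance(self, sector: int) -> int:
        # Assumption: sectors pass under the head in increasing order,
        # so distance is measured forward around the track only.
        return (sector - self.head) % self.num_sectors

    def eager_write(self, payload: bytes) -> int:
        """Write to the free sector closest to the head and return its number."""
        if not self.free:
            raise RuntimeError("no free sectors")
        target = min(self.free, key=self.rotational_distance)
        self.free.remove(target)
        # Virtual log: chain this entry to the previous one via a back pointer.
        self.data[target] = (payload, self.log_tail)
        self.log_tail = target
        # After the write, the head sits just past the written sector.
        self.head = (target + 1) % self.num_sectors
        return target

    def replay(self) -> List[bytes]:
        """Follow back pointers from the tail and return entries oldest first."""
        entries: List[bytes] = []
        cursor = self.log_tail
        while cursor is not None:
            payload, previous = self.data[cursor]
            entries.append(payload)
            cursor = previous
        entries.reverse()
        return entries


if __name__ == "__main__":
    disk = ToyDisk(num_sectors=8, head=5, free={1, 3, 6})
    placed = [disk.eager_write(p) for p in (b"a", b"b", b"c")]
    print(placed)         # sectors chosen for the three writes
    print(disk.replay())  # log entries recovered in write order
```

In this toy run the three writes land on non-adjacent sectors, yet replay recovers them in order by following the back pointers. A fuller model would also account for seeks across tracks and for recovering the log tail after a crash, both of which this sketch leaves out.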