The Collective

A Virtual Appliance Computing Infrastructure

Objective

To develop a new computing system architecture that is secure, reliable, easy to administer, and provides ubiquitous access to users' computing environments.

Research Overview

Computers today run billions of operations per second, dazzle us with video and sound, store libraries of data, and quickly exchange data with millions of other computers. Yet computers and software are still difficult to deploy and maintain. Applications and user data are tied to individual computers, making it harder to deal with hardware failures. When users move around, they must remember to bring their computer with them. Keeping software up to date on each computer is also a challenge; too many computers suffer security compromises because people fail to apply available updates or to configure protections correctly.

Virtual Appliances

We propose restructuring our software and services as collections of virtual appliances. Computer appliances like TiVos and NetApp filers come pre-configured with the software needed to perform their task. In the case of TiVo, the software is kept up-to-date by TiVo rather than requiring the user to install patches. A virtual appliance is the software and hardware of a real computer appliance, but hosted on a virtual machine monitor. A virtual appliance contains the software needed to perform its task along with a description of the hardware resources it needs. To take advantage of the existing software base, our Collective prototype operates on x86 virtual machines, allowing us to support all applications that run on Windows and Linux at the same time.

In our model, each user would have multiple virtual appliances. They may have a desktop appliance for communication and editing documents, a firewall appliance, a video editing appliance, and a movie viewing appliance. In general, users do not install software into virtual appliances; this allows makers to do updates with more confidence. Instead, users acquire additional appliances to gain more features. Using the network, each appliance is kept up-to-date by its maker. The appliances can use the network to communicate with each other and provide a more seamless experience to the user.

Virtual appliances have the following properties:

Useful software bundles. An appliance is a useful, working bundle of applications, not a single application that requires others to be installed to work correctly.
Reliable software updates. Due to the hardware-level isolation of virtual appliances, a maker can automatically update a user's appliance without worrying about conflicts with user-installed programs or other appliances.
Limiting effects of security compromise. Root privileges in one virtual appliance do not necessarily imply access to any other virtual appliance.
Cross-computer mobility. The virtual machine monitor can suspend a running appliance on one computer and resume it on another. This can be used to move appliances between computers.
Reduced hardware expense. The virtual machine monitor (VMM) can run multiple appliances on a single computer, even simultaneously.

System Architecture

In the Collective system architecture, virtual appliances, and their updated versions, are deposited in repositories. Individual computers run a universal appliance receiver that retrieves the latest copies of virtual machines from repositories upon request. In other words, the computers operate as a cache of appliances. The system uses a number of optimizations to minimize the cost of the storage and transfer of appliances. This approach allows a small number of professional staff to create fully tested, integrated environments that are made available quickly to all users anywhere on the network.
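
The "computers operate as a cache of appliances" idea can be illustrated with a minimal sketch. It is not the Collective implementation: the `Repository` and `Receiver` classes, their methods, and the integer version numbers are all hypothetical names chosen for illustration, and an appliance image is modeled as a plain byte string.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical repository: maps an appliance name to its published versions."""
    versions: dict = field(default_factory=dict)

    def publish(self, name: str, image: bytes) -> int:
        # Append a new version; version numbers start at 1.
        self.versions.setdefault(name, []).append(image)
        return len(self.versions[name])

    def latest_version(self, name: str) -> int:
        return len(self.versions[name])

    def fetch(self, name: str, version: int) -> bytes:
        return self.versions[name][version - 1]


@dataclass
class Receiver:
    """Hypothetical appliance receiver that treats local storage as a cache."""
    repo: Repository
    cache: dict = field(default_factory=dict)  # name -> (version, image)
    fetches: int = 0                           # number of full transfers

    def get(self, name: str) -> bytes:
        latest = self.repo.latest_version(name)
        cached = self.cache.get(name)
        # Transfer the image only when the cache is empty or stale.
        if cached is None or cached[0] != latest:
            self.cache[name] = (latest, self.repo.fetch(name, latest))
            self.fetches += 1
        return self.cache[name][1]


repo = Repository()
repo.publish("desktop", b"desktop image v1")
receiver = Receiver(repo)
receiver.get("desktop")   # first request: fetched from the repository
receiver.get("desktop")   # second request: served from the local cache
repo.publish("desktop", b"desktop image v2")
receiver.get("desktop")   # maker published an update: fetched again
```

In this sketch a stale entry triggers a full re-fetch; a real system would combine this with optimizations like those listed under Major Research Findings so that only changed data crosses the network.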

Major Research Findings

Efficient migration of appliances. x86 appliances, complete with operating systems, application programs, and possibly user data, can be very large. We found that the storage and transfer of appliances can be effectively optimized using caching, demand paging, memory ballooning to reduce the memory state, and copy-on-write disks to capture changes. The time to transfer an appliance on a DSL link (384 kbps) is typically less than 20 minutes.
Virtual appliance networks. Generalizing the concept of virtual appliances to include a virtual network enables the encapsulation of network management knowledge and sets of related services.
The CVL (Collective Virtual appliance Language). We have developed a language for describing the composition of virtual appliances to create virtual networks of appliances; the language uses the concept of inheritance to allow appliances to be individually configured and customized while retaining the ability to be upgraded automatically.
Livewire: An intrusion detection system for virtual machines. Through the Livewire prototype, we demonstrated that virtual machine technology can be used to build an intrusion detection system (IDS) that is both difficult to evade and difficult to attack. Like a host-based IDS, it has excellent visibility since it can access all the state of the computer being watched. Like a network-based IDS, it is not vulnerable to being disabled by the attacker.
Terra: A virtual machine-based platform for trusted computing. We have developed a flexible architecture for trusted computing, called Terra. Terra allows applications to run in an "open box" VM with the semantics of a modern open platform, or in a "closed box" VM with those of dedicated, tamper-resistant hardware. We have developed attestation primitives to cryptographically identify the contents of closed-box VMs to remote parties and showed how to implement them efficiently.
Remote timing attacks. We demonstrated the first remote timing attack where a private key can be extracted from a web server. Patches to eliminate such a vulnerability were developed and applied to the OpenSSL library. Our paper on the topic won the best paper award at the 2003 Usenix Security conference.
CRED: a dynamic buffer overrun detector. We have developed a practical detector called CRED (C Range Error Detector) that detects all buffer overrun attacks because it directly checks the bounds of memory accesses. Unlike the original referent-object based bounds-checking technique, CRED does not break existing code because it uses a novel solution to support program manipulation of out-of-bounds addresses. Finally, by restricting the bounds checks to strings in a program, CRED's overhead is greatly reduced without sacrificing protection in the experiments we performed. CRED is implemented as an extension of the GNU C compiler version 3.3.1, and has been tested on over 20 open-source programs, comprising over 1.2 million lines of C code. The software is publicly available at http://sourceforge.net/projects/boundschecking/
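
The referent-object idea behind CRED can be illustrated with a toy model. This is a sketch under stated assumptions, not CRED's actual mechanism, which lives inside the C compiler: here it is assumed that pointer arithmetic is always permitted and remembers the buffer it started from, and that only a dereference is checked against that buffer's bounds. The `Buffer` and `Pointer` classes and their methods are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Buffer:
    """A referent object: a buffer with a base address and a size in bytes."""
    base: int
    size: int

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


@dataclass(frozen=True)
class Pointer:
    """A pointer value that remembers the buffer it was derived from."""
    referent: Buffer
    addr: int

    def offset(self, delta: int) -> "Pointer":
        # Arithmetic is always allowed; the result keeps its referent even
        # when the address temporarily leaves the buffer.
        return Pointer(self.referent, self.addr + delta)

    def deref(self) -> int:
        # Only a dereference is checked against the referent's bounds.
        if not self.referent.contains(self.addr):
            raise IndexError(f"out-of-bounds access at address {self.addr}")
        return self.addr - self.referent.base  # index into the buffer


buf = Buffer(base=1000, size=16)
p = Pointer(buf, buf.base)
past_end = p.offset(20)           # out of bounds, but merely computing it is fine
back_in = past_end.offset(-8)     # arithmetic brings the address back in bounds
index = back_in.deref()           # allowed: address 1012 lies inside the buffer
try:
    past_end.deref()              # rejected: address 1020 lies outside the buffer
    overrun_detected = False
except IndexError:
    overrun_detected = True
```

Under this assumption, existing code that computes an out-of-bounds address and later adjusts it back into range keeps working, while any actual access outside the buffer is caught.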

People

Monica Lam (PI), Mendel Rosenblum (co-PI), Dan Boneh (co-PI), Ramesh Chandra, Jim Chow, Tal Garfinkel, Jim Norris, Ben Pfaff, Joel Sandin, Constantine Sapuntzakis, Hovav Shacham, Nickolai Zeldovich

Publications

Software

MetaVNC
MetaVNC seamlessly mixes windows from multiple operating systems into one desktop through a straightforward extension to the VNC protocol. Satoshi UCHINO, who visited us from August 2002 to February 2004, created and maintains the MetaVNC protocol and reference implementation.
Dynamic Bounds Checking
The dynamic bounds-checking techniques we have developed for CRED have been integrated into a GCC release maintained by Herman ten Brugge.

This research is supported in part by the National Science Foundation under Grant No. 0121481, NSF student fellowships, and Stanford Graduate Fellowships. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.