Efficient Communication I/O in Virtual Machines

Although originally focused on resource sharing, current virtual machine (VM) technologies provide a wide range of benefits, such as ease of management, system security, performance isolation, checkpoint/restart, and live migration. Cluster-based High Performance Computing (HPC) can take advantage of these features. This is especially important now that ultra-scale clusters pose additional challenges for performance, scalability, system management, and administration.

In spite of these advantages, VM technologies have not yet been widely adopted in performance-critical areas, including HPC. This is mainly due to:

This project, in collaboration with Dr. Jiuxing Liu and Dr. Bulent Abali from the IBM T. J. Watson Research Center, focuses on low-overhead I/O operations in virtual machine environments. The current objectives of this project are twofold:


Publications:

Conferences

Technical Reports


Results:

Our testbed is an eight-node InfiniBand cluster. Each node is equipped with dual Intel Xeon 3.0 GHz CPUs, 2 GB of memory, and a Mellanox MT23108 PCI-X InfiniBand HCA. For the VM-based environment, we use Xen 3.0. Xen domain0 hosts RedHat AS4 with kernel 2.6.12 and 256 MB of memory. Each user domain (DomU) runs with a single virtual CPU and 896 MB of memory, which allows two DomUs per physical machine. The guest OS in each DomU uses the 2.6.12 kernel and is derived from ttylinux, with minimal changes made in order to host MPI applications. In the native environment, we use RedHat AS4 in SMP mode.
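The memory split described above can be checked with a short back-of-the-envelope calculation. The sketch below is illustrative only: the constant and function names are hypothetical, 2 GB is taken to be 2048 MB, and any memory reserved by the hypervisor itself is ignored.

```python
# Back-of-the-envelope memory budget for one testbed node.
# Assumptions (not from the original page): 2 GB == 2048 MB, and the
# hypervisor's own memory reservation is ignored.
NODE_MEMORY_MB = 2 * 1024   # physical memory per node
DOM0_MEMORY_MB = 256        # memory given to Xen domain0
DOMU_MEMORY_MB = 896        # memory given to each user domain (DomU)


def max_domus(node_mb, dom0_mb, domu_mb):
    """Return how many DomUs of size domu_mb fit next to domain0."""
    return (node_mb - dom0_mb) // domu_mb


domus_per_node = max_domus(NODE_MEMORY_MB, DOM0_MEMORY_MB, DOMU_MEMORY_MB)
leftover_mb = NODE_MEMORY_MB - DOM0_MEMORY_MB - domus_per_node * DOMU_MEMORY_MB

print(domus_per_node, leftover_mb)  # prints "2 0"
```

Under these assumptions, 256 MB for domain0 plus two 896 MB DomUs accounts for exactly 2048 MB, which matches the stated limit of two DomUs per physical machine.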

   

The figures above show the basic latency and bandwidth achieved in the Xen and native environments. With VMM-bypass I/O, the two are virtually identical.

The figure above shows the execution time of the NAS Parallel Benchmarks, normalized to the native environment. The Xen-based environment performs comparably with the native environment.

The figure above shows the Gflops achieved in HPL on 2, 4, 8, and 16 processes. Performance is very close in both environments, with the native case ahead by at most 1%.


Graduate Student Researcher(s):

Wei Huang


D. K. Panda
Last modified: Fri July 20 18:51:18 EDT 2007