Design of Scalable Data-Centers over Emerging Technologies
Network-based Computing Laboratory
Department of Computer Science and Engineering
Ohio State University


Current data-centers lack efficient support for intelligent services that are becoming common requirements today, such as document caching and cooperation among caching servers, efficient monitoring and management of limited physical resources, load balancing, and overload control. At the same time, System Area Network (SAN) technology has advanced rapidly in recent years. Besides high performance, these modern interconnects provide a range of novel features with hardware support (e.g., RDMA, atomic operations, IOAT, etc.). We propose a framework comprising three layers (communication protocol support, data-center service primitives and advanced data-center services) that work together to tackle the issues associated with existing data-centers. The following figure shows the main components of our framework.




Main Objectives:


Publications

Conferences/Workshops:

Technical Reports:

PhD Thesis:

MS Thesis:

Current Graduate Student Researchers:

Sundeep Narravula, Karthikeyan Vaidyanathan, Ping Lai and Hari Subramoni

Past Graduate Student Researchers:

Pavan Balaji, Savitha Krishnamoorthy and Jiesheng Wu

Project Sponsors:

This research is supported in part by NSF grants #CNS-0403342 and #CNS-050942; an NSF-STTR grant with RNet; and equipment donations from Intel, Mellanox and Dell.

Sample Results

Our testbed consists of a cluster of 64 nodes (8 cores each) with proxy servers using Apache or Squid, web servers using Apache 2.0, application servers using PHP and database servers using MySQL/DB2. For our evaluation, we use several traces such as TPC-W, SPECweb, auction benchmarks from Rice University such as RUBiS and RuBBoS, World Cup traces, etc. As mentioned earlier, we work on several research directions:

Advanced Communication Protocols and Subsystems for Data-Centers

Existing data-center components such as Apache, PHP and MySQL are typically written using the sockets interface over the TCP/IP communication protocol. The advanced communication protocols layer aims at transparently improving the communication performance of such applications by taking advantage of the mechanisms and features provided by modern networks such as IBA and 10GigE. These advanced protocols and subsystems are designed to maintain sockets semantics so that existing data-center components do not need to be modified. In particular, we evaluate the Sockets Direct Protocol (SDP) and propose several design alternatives, such as Zero-Copy SDP (ZSDP) and Asynchronous Zero-Copy SDP (AZ-SDP), to improve the performance of data-center applications. Our micro-benchmark results [ISPASS'04] show that SDP provides up to 2.7 times better bandwidth than the native sockets implementation over InfiniBand (IPoIB) and significantly better latency for large message sizes. Further, our evaluations with the AZ-SDP stack [CAC'06] show up to 35% improvement over the ZSDP stack and up to a factor of two bandwidth improvement over the SDP stack.
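To make the "unmodified sockets semantics" point concrete, here is a minimal sketch of an application written purely against the stream-sockets API. It runs over ordinary TCP on the loopback interface; the helper names (`echo_once`, `round_trip`) are hypothetical, and how an SDP stack would actually be selected on a given system is deliberately left out. The only transport-specific knob in the sketch is the `family` argument, which is an illustrative assumption rather than a description of the SDP implementations evaluated here.

```python
import socket
import threading


def echo_once(listener: socket.socket) -> None:
    """Accept one connection, read until EOF, then send everything back."""
    conn, _ = listener.accept()
    with conn:
        received = bytearray()
        while True:
            chunk = conn.recv(65536)
            if not chunk:          # peer finished sending
                break
            received.extend(chunk)
        conn.sendall(received)


def round_trip(payload: bytes, family: int = socket.AF_INET) -> bytes:
    """Send payload to a local echo server and return what comes back.

    Everything here is the ordinary stream-sockets API; `family` is the
    only place where a different transport could be plugged in.
    """
    with socket.socket(family, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        listener.listen(1)
        server = threading.Thread(target=echo_once, args=(listener,))
        server.start()
        with socket.socket(family, socket.SOCK_STREAM) as client:
            client.connect(listener.getsockname())
            client.sendall(payload)
            client.shutdown(socket.SHUT_WR)   # signal EOF to the server
            reply = bytearray()
            while True:
                chunk = client.recv(65536)
                if not chunk:
                    break
                reply.extend(chunk)
        server.join()
    return bytes(reply)


if __name__ == "__main__":
    data = b"x" * 200_000
    print(round_trip(data) == data)
```

Because the application logic never touches anything below the sockets calls, a protocol layer that preserves these semantics can change the transport underneath without changes to code like this.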

Results:

The figures above show the client response times and the breakdown of the response time seen in a data-center environment using IPoIB and SDP. SDP clearly performs better than IPoIB for large file transfers above 128K. To understand the lack of performance benefits for small file sizes, we took a similar breakdown of the response time perceived by the client. Though the "web-server time" reduces significantly, the time taken at the proxy is higher for SDP than for IPoIB. Comparing this breakdown for SDP and IPoIB showed a significant difference in the time for the proxy to connect to the back-end server. This high connection time of the current SDP implementation, about 500 usecs higher than IPoIB, makes the data-transfer benefits of SDP imperceptible for small file transfers.
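The trade-off between connection cost and transfer speed can be illustrated with a back-of-the-envelope model. The sketch below assumes, purely for illustration, that response time is a fixed connection time plus size divided by bandwidth; the 500 usec extra connection time comes from the text, while the baseline bandwidth and the use of the 2.7x micro-benchmark peak as a sustained speedup are hypothetical inputs, not measured values.

```python
def transfer_time(size_bytes: float, connect_s: float, bandwidth_Bps: float) -> float:
    """Simplified model: fixed connection cost plus size / bandwidth."""
    return connect_s + size_bytes / bandwidth_Bps


def break_even_size(extra_connect_s: float, base_bandwidth_Bps: float, speedup: float) -> float:
    """File size at which the faster transport's extra connection time is paid back.

    Solves  s / B  =  extra + s / (speedup * B)  for s.
    """
    if speedup <= 1.0:
        raise ValueError("speedup must exceed 1 for a break-even point to exist")
    return extra_connect_s * base_bandwidth_Bps * speedup / (speedup - 1.0)


if __name__ == "__main__":
    EXTRA_CONNECT_S = 500e-6   # from the text: about 500 usecs higher than IPoIB
    BASE_BW = 200e6            # hypothetical baseline bandwidth in bytes/s
    SPEEDUP = 2.7              # assumption: micro-benchmark peak used as sustained ratio
    size = break_even_size(EXTRA_CONNECT_S, BASE_BW, SPEEDUP)
    print(f"break-even size under these assumptions: {size / 1024:.0f} KiB")
```

Under this model, transfers smaller than the break-even size are dominated by the connection cost, which is consistent with the qualitative observation above; the exact crossover depends entirely on the assumed inputs.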

Advanced System Services for Emerging Data-Centers

The advanced data-center services are intelligent services that are critical for the efficient functioning of data-centers. For example, these services handle document caching, management of limited physical resources, admission control, prioritization and QoS mechanisms. In our proposed design, we utilize the novel features of emerging technologies to provide efficient data-center services that can lead to higher data-center throughput and lower response times. Specifically, the dynamic content caching service deals with efficient and load-resilient caching techniques for dynamically generated content, while the active resource adaptation (used interchangeably with resource reconfiguration) service deals with on-the-fly and scalable management and adaptation of various system resources.

Results:

The figure on the left shows the benefits of our RDMA-based services [SAN'04] in a data-center environment. As the number of compute threads increases, we see a considerable degradation in performance in the no-cache case as well as in the sockets-based implementations using IPoIB and SDP. However, the client-polling architecture using VAPI shows no degradation in performance, thanks to the one-sided semantics of RDMA. The figure on the right shows the benefits of the RDMA-based resource monitoring mechanism [RAIT'06]. Due to the highly efficient, synchronous and accurate resource monitoring mechanism (RDMA-Sync and e-RDMA-Sync), we observe close to 35% improvement in comparison with the traditional sockets-based implementations (Socket-Sync and Socket-Async).

This figure shows the benefits of the admission control service using the RDMA-based resource monitoring mechanism [CCGrid'08]. Due to the accurate and efficient resource monitoring mechanism (AC(RDMA)), we see close to 17% and 36% improvement in response time with the World Cup trace as compared to admission control using TCP/IP (AC(TCP/IP)) and a system with no admission control mechanism (No AC), respectively.
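Why monitoring accuracy matters for admission control can be sketched with a toy threshold controller. This is not the mechanism from [CCGrid'08]; the class, the threshold value and the replay helper are all hypothetical, and the load probe simply stands in for whatever monitoring channel supplies the load value. The replay counts how often a controller acting on a stale sample decides differently from one that sees the true load.

```python
from typing import Callable, List


class AdmissionController:
    """Admit a request only while the monitored load is below a threshold."""

    def __init__(self, read_load: Callable[[], float], threshold: float = 0.8) -> None:
        self.read_load = read_load    # hypothetical probe returning load in [0, 1]
        self.threshold = threshold    # hypothetical cutoff, not from the source

    def admit(self) -> bool:
        return self.read_load() < self.threshold


def count_wrong_decisions(loads: List[float], refresh_every: int,
                          threshold: float = 0.8) -> int:
    """Replay a load trace, refreshing the monitored sample every N requests.

    Returns how many admit/reject decisions differ from the decision that
    the true, current load would have produced.
    """
    if refresh_every < 1:
        raise ValueError("refresh_every must be at least 1")
    state = {"sample": 0.0}
    controller = AdmissionController(lambda: state["sample"], threshold)
    wrong = 0
    for i, true_load in enumerate(loads):
        if i % refresh_every == 0:
            state["sample"] = true_load       # monitoring catches up with reality
        if controller.admit() != (true_load < threshold):
            wrong += 1
    return wrong


if __name__ == "__main__":
    trace = [0.5, 0.9, 0.9, 0.5, 0.7, 0.95]
    for interval in (1, 2, 4):
        print(interval, count_wrong_decisions(trace, interval))
```

With a refresh on every request the controller never errs in this model; as the sample grows staler, it admits requests into an overloaded back-end or rejects requests it could have served.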

Lower-Level Service Primitives for Emerging Data-Centers

The data-center service primitives and advanced data-center services layers aim at supporting intelligent services for current data-centers. Specifically, the data-center service primitives take advantage of the advanced communication protocols as well as the mechanisms and features of modern networks to provide higher-level utilities that can be utilized by applications as well as the advanced data-center services. For the most efficient design of the upper-level data-center services, several primitives such as soft shared state, enhanced point-to-point communication, distributed lock manager, and global memory aggregator are necessary.
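As an illustration of what one such primitive offers to the layers above it, the following sketch models the semantics of a lock manager with shared and exclusive modes. It is an in-process, single-threaded model only: there is no networking, no RDMA and no queuing of waiters, and the `LockTable` class and its method names are hypothetical rather than the interface of the primitives described here.

```python
from typing import Dict, Set, Tuple


class LockTable:
    """Toy model of shared/exclusive lock semantics (non-reentrant)."""

    def __init__(self) -> None:
        # lock name -> (held exclusively?, owners currently holding it)
        self._held: Dict[str, Tuple[bool, Set[str]]] = {}

    def try_acquire(self, name: str, owner: str, exclusive: bool) -> bool:
        """Grant the lock if compatible with current holders, else refuse."""
        entry = self._held.get(name)
        if entry is None:
            self._held[name] = (exclusive, {owner})
            return True
        held_exclusive, owners = entry
        if not exclusive and not held_exclusive:
            owners.add(owner)      # any number of shared holders may coexist
            return True
        return False               # exclusive conflicts with every other holder

    def release(self, name: str, owner: str) -> None:
        """Drop one owner's hold; the lock is freed when no owners remain."""
        entry = self._held.get(name)
        if entry is None or owner not in entry[1]:
            raise KeyError(f"{owner!r} does not hold lock {name!r}")
        entry[1].discard(owner)
        if not entry[1]:
            del self._held[name]


if __name__ == "__main__":
    table = LockTable()
    print(table.try_acquire("doc", "node-a", exclusive=False))
    print(table.try_acquire("doc", "node-b", exclusive=False))
    print(table.try_acquire("doc", "node-c", exclusive=True))
```

A distributed implementation has to provide the same compatibility rules across nodes, which is where the cost of the underlying communication mechanism becomes the deciding factor.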

Results:

The graph on the left shows the application improvement seen using our shared state primitive, namely the distributed data sharing substrate (DDSS) [HiPC'06]. We observe that the performance of STORM improves by around 19% for 1K, 10K and 100K record dataset sizes using DDSS in comparison with the traditional implementation. The graph on the right presents the basic performance improvement that our scheme (N-CoShED) [CCGrid'07] shows over existing schemes: (i) basic Distributed Queue based Non-shared Locking (DQNL) and (ii) traditional Send/Receive-based Server Locking (SRSL). The N-CoShED scheme shows a 39% improvement over the SRSL scheme. We also observe a significant (up to 317% for 16 nodes) improvement over the DQNL scheme.

On-going Work

Currently, we are also extending our research focus in several directions:


D. K. Panda
Last modified: Thu Apr 10 12:47:00 EDT 2008