Efficient Shared/Remote Memory over High Speed Interconnects
Software Distributed Shared Memory (DSM) systems have traditionally performed
poorly because of the combined effects of increased communication, slow
networks, and the large overhead of processing the coherence protocol. Modern
interconnects such as Myrinet, Quadrics, and InfiniBand offer reliable,
low-latency, high-bandwidth communication. These networks also support
efficient memory-based communication primitives such as Remote Direct Memory
Access (RDMA). These features can be leveraged to reduce overhead in a
software DSM system.
To implement a cache consistency protocol such as the Home-based Lazy Release
Consistency (HLRC) protocol efficiently, we have employed RDMA and atomic
operations and demonstrated a significant performance improvement. We have
also taken on the challenge of developing a communication substrate over GM
and VIA so that applications using the TreadMarks DSM package can take
advantage of the enhanced communication performance of user-level protocols.
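The home-based idea can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the implementation described above: `HomeNode`, `Writer`, `rdma_write`, and `rdma_read` are hypothetical names, the "RDMA" calls are ordinary in-process memory copies, and the atomic operations are not modeled. It shows the twin/diff step commonly associated with HLRC-style protocols: a writer keeps a clean copy (twin) of a page, and at release time pushes only the modified byte runs to the page's home.

```python
PAGE_SIZE = 16  # tiny page size, chosen only for illustration


class HomeNode:
    """Holds the master copy of every page (stand-in for RDMA-registered memory)."""

    def __init__(self, num_pages):
        self.pages = [bytearray(PAGE_SIZE) for _ in range(num_pages)]

    def rdma_write(self, page_id, offset, data):
        # Hypothetical stand-in for a one-sided RDMA write into home memory.
        self.pages[page_id][offset:offset + len(data)] = data

    def rdma_read(self, page_id):
        # Hypothetical stand-in for a one-sided RDMA read of a whole page.
        return bytes(self.pages[page_id])


def make_diff(twin, current):
    """Return (offset, bytes) runs where `current` differs from `twin`."""
    diff, start = [], None
    for i in range(len(current)):
        if current[i] != twin[i]:
            if start is None:
                start = i  # a modified run begins here
        elif start is not None:
            diff.append((start, bytes(current[start:i])))
            start = None
    if start is not None:  # run extends to the end of the page
        diff.append((start, bytes(current[start:])))
    return diff


class Writer:
    """A node that caches pages, twins them on first write, and flushes diffs."""

    def __init__(self, home):
        self.home = home
        self.cache = {}  # page_id -> local working copy
        self.twins = {}  # page_id -> clean copy taken before the first write

    def write(self, page_id, offset, data):
        if page_id not in self.cache:
            self.cache[page_id] = bytearray(self.home.rdma_read(page_id))
        if page_id not in self.twins:
            self.twins[page_id] = bytes(self.cache[page_id])
        self.cache[page_id][offset:offset + len(data)] = data

    def release(self):
        # At release time, push only the modified runs to the home copy.
        for page_id, twin in self.twins.items():
            for offset, run in make_diff(twin, self.cache[page_id]):
                self.home.rdma_write(page_id, offset, run)
        self.twins.clear()
```

In this sketch the home copy stays unchanged until `release()` is called, which mirrors the lazy propagation of updates; a real system would additionally need write notices, page invalidation, and synchronization.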
In addition to the shared memory based programming model, Remote Memory
Access (RMA) operations facilitate an intermediate programming model
between message passing and shared memory. This model combines some
advantages of shared memory, such as direct access to shared/global data,
with those of the message passing model, namely control over locality and
data distribution. In the context of this model, we study latency-hiding
techniques: overlapping communication with computation and coalescing small
put/get messages in the Aggregate Remote Memory Copy Interface (ARMCI).
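Coalescing can be sketched as follows. This is an illustrative simulation, not ARMCI code: `CoalescingPutQueue`, its `send` callback, and the `max_bytes` threshold are all hypothetical names introduced here. Small puts to the same destination are buffered and shipped as one aggregate message once a size threshold is reached or when the caller flushes explicitly. Overlapping communication with computation is not shown.

```python
class CoalescingPutQueue:
    """Buffers small puts per destination and sends them as one batch."""

    def __init__(self, send, max_bytes=64):
        # `send(dest, batch)` is a hypothetical transport callback; `batch`
        # is a list of (remote_address, data) pairs.
        self.send = send
        self.max_bytes = max_bytes
        self.pending = {}        # dest -> list of (addr, data)
        self.pending_bytes = {}  # dest -> payload bytes currently buffered

    def put(self, dest, addr, data):
        self.pending.setdefault(dest, []).append((addr, bytes(data)))
        self.pending_bytes[dest] = self.pending_bytes.get(dest, 0) + len(data)
        if self.pending_bytes[dest] >= self.max_bytes:
            self.flush(dest)  # threshold reached: ship the aggregate now

    def flush(self, dest=None):
        # Flush one destination, or every destination when dest is None.
        dests = [dest] if dest is not None else list(self.pending)
        for d in dests:
            batch = self.pending.pop(d, [])
            self.pending_bytes.pop(d, None)
            if batch:
                self.send(d, batch)
```

A caller would issue many small `put` calls and then call `flush()` before any point where the remote data must be visible. The threshold trades latency of individual puts against the per-message overhead that coalescing is meant to amortize.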
Modern interconnects also give individual nodes the opportunity to exploit
remote resources efficiently in a cluster environment. In particular, using
remote memory as a swap area can improve application performance significantly,
especially for memory-intensive applications. We study the issues involved in
using remote memory to extend the local memory hierarchy with efficient user-level protocols.
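The remote-swap idea can be pictured with a toy model. The sketch below is an assumption-laden illustration, not the system studied here: `RemoteSwapCache` is a hypothetical class, the "remote" store is a plain dictionary standing in for memory on another node, and least-recently-used eviction is chosen only for simplicity. Pages that do not fit in the local frames are evicted to the remote store and fetched back on access.

```python
from collections import OrderedDict


class RemoteSwapCache:
    """Keeps up to `local_frames` pages locally; overflow goes to a remote store."""

    def __init__(self, local_frames):
        self.local_frames = local_frames
        self.local = OrderedDict()  # page_id -> data, least recently used first
        self.remote = {}            # stand-in for memory on another node
        self.remote_reads = 0
        self.remote_writes = 0

    def _evict_if_full(self):
        # Make room for one incoming page by swapping out LRU victims.
        while len(self.local) >= self.local_frames:
            victim, data = self.local.popitem(last=False)
            self.remote[victim] = data
            self.remote_writes += 1

    def write(self, page_id, data):
        if page_id in self.local:
            self.local.move_to_end(page_id)  # mark as most recently used
        else:
            self.remote.pop(page_id, None)   # discard any stale swapped-out copy
            self._evict_if_full()
        self.local[page_id] = data

    def read(self, page_id):
        if page_id in self.local:
            self.local.move_to_end(page_id)
            return self.local[page_id]
        data = self.remote.pop(page_id)      # raises KeyError if never written
        self.remote_reads += 1
        self._evict_if_full()
        self.local[page_id] = data
        return data
```

The counters `remote_reads` and `remote_writes` make the swap traffic visible, which is the cost that a low-latency interconnect is expected to reduce relative to disk-based swap.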
M. Banikazemi, D. K. Panda, and P. Sadayappan,
Implementing TreadMarks on Virtual Interface Architecture (VIA):
Design Issues and Alternatives, Ninth Workshop
on Scalable Shared Memory Multiprocessors, held in conjunction
with ISCA, June 2000.
Workshop Presentation Slides
R. Noronha and D. K. Panda,
Designing High Performance DSM Systems using InfiniBand: Opportunities, Challenges and Experiences,
Oct. 2003.
Technical Report