9:00 am - 1:00 pm
TUTORIAL I
Title: Designing Next Generation HPC Clusters, Storage/File Systems, and Datacenters with InfiniBand: Opportunities and Challenges
Dhabaleswar K. (DK) Panda
The Ohio State University, Columbus, Ohio, USA
Audience: This tutorial is targeted for scientists, engineers, managers, developers, educators, and students working in the areas of high performance interconnect, communication, I/O, storage, networking, middleware, and applications related to next generation high-end systems (such as clusters for high performance computing, cluster-based file systems, cluster-based servers (web, database), multi-tier data centers, etc.).
Course Description: The emerging InfiniBand Architecture (IBA) is generating a lot of excitement as an open interconnect standard for building next generation high-end systems (as indicated above) in a radically different manner. This is leading to the following common questions among many scientists, engineers, managers, developers, and users of these high-end systems:
1) What is InfiniBand Architecture?
2) How is it different from other on-going developments and standardization efforts, including Virtual Interface Architecture (VIA), PCI-X, 10.0 GigE, TCP Off-load Engines, Direct Data Placement (DDP), RDMA over IP, RapidIO, HyperTransport, PCI Express, etc.?
3) How does it perform compared to other proprietary cluster interconnects (such as Myrinet and Quadrics)?
4) What unique features and benefits does IBA bring to designing next generation high-end systems (clusters for high performance computing, cluster-based file systems for scientific and enterprise applications, cluster-based servers (web and database), and multi-tier datacenters)?
5) How can the novel features of InfiniBand be exploited to build such systems with high performance, scalability, and reliability?
This tutorial is designed to provide answers to the above questions. We will start with the background behind the origin of the IBA standard. Then we will quickly make the attendees familiar with the novel features of IBA. We will compare and contrast the features and performance of IBA with those of other interconnects (as mentioned above). The emerging software interfaces and standards (such as Sockets Direct Protocol (SDP), uDAPL, IB Access Layer (IBAL)) on top of IBA will be discussed. An overview of IBA products and their capabilities will be presented. Research challenges in designing clusters for high-performance computing (with different programming models such as MPI, DSM, Get/Put), cluster-based file systems, cluster-based servers (web and database) and multi-tier data centers will be outlined. Case studies in designing such systems with IBA (different components of the systems, their interfaces, benefits, and bottlenecks) will be presented. Performance comparisons across different interconnects while designing such systems will be presented. The tutorial will conclude with an overview of on-going IBA related research projects.
Lecturer(s): Dhabaleswar K. Panda is a Professor of Computer Science at the Ohio State University. He obtained his Ph.D. in computer engineering from the University of Southern California. His research interests include parallel computer architecture, high performance computing, user-level communication protocols, interprocessor communication and synchronization, network-based computing, and quality of service. He has published over 140 papers in major journals and international conferences related to these research areas. Dr. Panda and his research group members have been doing extensive research on InfiniBand. His research group is currently collaborating with Sandia National Laboratories, IBM T.J. Watson, and leading InfiniBand companies on designing various subsystems of next generation High Performance Computing systems with InfiniBand. The MVAPICH (MPI over VAPI for IBA) package developed by his research group (http://nowlab.cis.ohio-state.edu/projects/mpi-iba/) is being used by more than 110 organizations world-wide to extract the potential of IBA-based clusters for HPC applications. This software has enabled clusters to achieve the 3rd, 111th, and 116th ranks in the Nov. '03 ranking of the TOP500 list.
Dr. Panda has served on Program Committees and Organizing Committees of several parallel processing and high performance computing conferences and on editorial boards for several parallel processing journals. He was General Co-Chair for the 2001 Int'l Conference on Parallel Processing; Program Co-Chair of the 1999 Int'l Conference on Parallel Processing and of the 1997 and 1998 Workshops on Communication and Architectural Support for Network-Based Parallel Computing (CANPC); Program Co-Chair of the Int'l Workshop on Communication Architecture for Clusters (CAC '01 - CAC '04); an Associate Editor of the IEEE Transactions on Parallel and Distributed Computing; Co-Guest-Editor for two special issue volumes of Journal of Parallel and Distributed Computing on "Workstation Clusters and Network-based Computing"; an IEEE Distinguished Visitor Speaker; and an IEEE Chapters Tutorials Program Speaker. Dr. Panda is a recipient of the NSF Faculty Early CAREER Development Award, the Lumley Research Award (1997 and 2001) at the Ohio State University, and an Ameritech Faculty Fellow Award. Dr. Panda is listed as a distinguished scientist in "Who's Who in America" and in "American Men & Women of Science".