Design for High Reliability, Availability and Serviceability
Dhiraj K. Pradhan
University of Bristol

Audience: Engineers and Researchers

Course Description: This tutorial discusses the factors that cause system failure (hardware faults, noise, software bugs, etc.) and then presents the wide range of hardware and software techniques that have been developed to protect systems from these threats. It also covers design techniques that enhance fault tolerance in multiprocessor distributed systems. The tutorial concludes with a discussion of models for evaluating the effectiveness of these techniques in terms of reliability and availability improvements versus the hardware, software, and/or performance overhead.

Lecturer: Dhiraj K. Pradhan is currently a Chair Professor in the Department of Computer Science at the University of Bristol (U.K.). Recently, he was a Professor in the Electrical & Computer Engineering Department of Oregon State University, in Corvallis. Previously, Dr. Pradhan held the COE Endowed Chair Professorship in Computer Science at Texas A&M University, in College Station, and served as Visiting Professor at Stanford University, in California. Before that, Professor Pradhan was Professor/Coordinator of Computer Engineering at the University of Massachusetts, in Amherst, along with other positions. Dr. Pradhan's contributions include two patents and several books as co-author/editor, including Fault-Tolerant Computing: Theory & Techniques, Vol. I & II (Prentice-Hall, '86), Fault-Tolerant Computer Systems Design (Prentice-Hall, '98), and IC Manufacturability: The Art of Process and Design Integration (IEEE Press, '99). Prof. Pradhan's honors include the '96 IEEE Transactions on Computer Aided Design Best Paper Award; Fellow, ACM; the Humboldt Distinguished Senior Scientist Award (Germany); and the '97-'98 Fulbright Flad Chair in Computer Science.