This book presents the most important fault-tolerant distributed programming abstractions and their associated distributed algorithms, in particular in terms of reliable communication and agreement, which lie at the heart of nearly all distributed applications. These programming abstractions, distributed objects or services, allow software designers and programmers to cope with asynchrony and the most important types of failures such as process crashes, message losses, and malicious behaviors of computing entities, widely known under the term "Byzantine fault-tolerance". The author introduces these notions in an incremental manner, starting from a clear specification, followed by algorithms which are first described intuitively and then proved correct. The book also presents impossibility results in classic distributed computing models, along with strategies, mainly failure detectors and randomization, that allow us to enrich these models. In this sense, the book constitutes an introduction to the science of distributed computing, with applications in all domains of distributed systems, such as cloud computing and blockchains. Each chapter comes with exercises and bibliographic notes to help the reader approach, understand, and master the fascinating field of fault-tolerant distributed computing.
About the author
Prof. Michel Raynal is among the top researchers in the world on the topic of distributed algorithms. He is a full professor at IRISA (Université de Rennes, France), where in 1984 he founded one of the very first research groups on distributed algorithms. He has been the principal investigator in numerous related national and international research projects, and he has been invited by many universities around the world to give lectures on distributed algorithms and distributed computing. He has over 400 academic publications on this topic, has authored twelve books on related topics, and has been involved in all the key conferences in distributed computing. His current research interests include distributed algorithms, distributed computing systems, distributed computability and dependability, and the fundamental principles that underlie the design and construction of distributed computing systems. Michel Raynal is also Distinguished Chair Professor at the Polytechnic University of Hong Kong.
Summary
Author among the world's leading researchers in distributed computing
Useful for graduate students and researchers in distributed systems
Content supplemented throughout with exercises, summaries, and bibliographic notes
Includes supplementary material: [...]
Table of contents
Part I: Introductory Chapter.- A Few Definitions and Two Examples.
Part II: The Reliable Broadcast Communication Abstraction.- Reliable Broadcast in the Presence of Process Crash Failures.- Reliable Broadcast in the Presence of Process Crashes and Unreliable Channels.- Reliable Broadcast in the Presence of Byzantine Processes.
Part III: The Read/Write Register Communication Abstraction.- The Read/Write Register Abstraction.- Building Read/Write Registers Despite Asynchrony and Less Than Half of Processes Crash (t < n/2).- Circumventing the t < n/2 Read/Write Register Impossibility: the Failure Detector Approach.- A Broadcast Abstraction Suited to the Family of Read/Write Implementable Objects.- Atomic Read/Write Registers in the Presence of Byzantine Processes.
Part IV: Agreement in Synchronous Systems.- Consensus and Interactive Consistency in Synchronous Systems Prone to Process Crash Failures.- Expedite Decision in Synchronous Systems with Process Crash Failures.- Consensus Variants: Simultaneous Consensus and k-Set Agreement.- Non-blocking Atomic Commit in Synchronous Systems with Process Crash Failures.- Consensus in Synchronous Systems Prone to Byzantine Process Failures.
Part V: Agreement in Asynchronous Systems.- Implementable Agreement Abstractions Despite Asynchrony and a Minority of Process Crashes.- Consensus: Power and Implementability Limit in Crash-Prone Asynchronous Systems.- Implementing Consensus in Enriched Crash-Prone Asynchronous Systems.- Implementing Oracles in Asynchronous Systems with Process Crash Failures.- Implementing Consensus in Enriched Byzantine Asynchronous Systems.
Part VI: Appendix - Bibliography.- Index.