
Distributed Systems
Description
A comprehensive textbook resource on distributed systems that integrates foundational topics with advanced topics of contemporary importance in the field
Distributed Systems: Theory and Applications is organized around three layers of abstractions: networks, middleware tools, and application framework. It presents data consistency models suited for requirements of innovative distributed shared memory applications. The book also focuses on distributed processing of big data, representation of distributed knowledge and management of distributed intelligence via distributed agents. To aid in understanding how these concepts apply to real-world situations, the work presents a case study on building a P2P Integrated E-Learning system. Downloadable lecture slides are included to help professors and instructors convey key concepts to their students.
Additional topics discussed in Distributed Systems: Theory and Applications include:
* Network issues and high-level communication tools
* Software tools for implementing distributed middleware
* Data sharing across distributed components through publish-and-subscribe message diffusion, gossip protocols, P2P architecture, and distributed shared memory
* Consensus, distributed coordination, and advanced middleware for building large distributed applications
* Distributed data and knowledge management
* Autonomy in distributed systems and multi-agent architecture
* Trust in distributed systems, distributed ledgers, blockchain, and related technologies
Researchers, industry professionals, and students in the fields of science, technology, and medicine will be able to use Distributed Systems: Theory and Applications as a comprehensive textbook resource for understanding distributed systems, the specifics behind the modern elements which relate to them, and their practical applications.
People
Hiranmay Ghosh, PhD, is a former Adviser and Principal Scientist of TCS Research. He received his PhD degree from IIT-Delhi and his B.Tech. degree from Calcutta University. He is a Senior Member of IEEE, Life Member of IUPRAI, and a Member of ACM. He authored the Wiley title Computational Models for Cognitive Vision (2020) and co-authored the CRC Press title Multimedia Ontology: Representation and Applications (2015).
Contents
1 Introduction
A distributed system consists of many independent units, each performing a different function. The units work in coordination with each other to realize the system's goals. We find many examples of distributed systems in nature. For instance, a human body consists of several autonomous components such as eyes and ears, hands and legs, and other internal organs. Yet, coordinated by the brain, it behaves as a single coherent entity. Some distributed systems may have hierarchic organizations. For example, the coordinated interaction among human beings performing various roles realizes the goals of human society. We find such well-orchestrated activities in lower forms of animals too. For example, in a beehive an ensemble of bees exhibits coordinated and consistent social behaviors that fulfill their goal of foraging.
Inspired by nature, researchers have developed a distributed systems paradigm for solving complex multi-dimensional computation problems. This book aims to provide a narrative for the various aspects of distributed systems and the computational models for interactions at multiple levels of abstractions. We also describe the application of such models in realizing practical distributed systems. In our journey through the book, we begin with the low-level interaction of the system components to achieve performance through parallelism and concurrency. We progressively ascend to higher levels of abstractions to address the issues of knowledge, autonomy, and trust, which are essential for large distributed systems spanning multiple administrative domains.
1.1 Advantages of Distributed Systems
A distributed system offers many advantages. Let us illustrate them with a simple example. Figure 1.1 depicts a distributed system for evaluating simple arithmetic expressions. The expression-evaluator divides the problem into smaller tasks of multiplication and addition and engages other modules, namely a set of adders and multipliers, to solve them. The modules can be hosted on different computers connected over a network. The expression-evaluator schedules the activities of those modules and communicates the final result to the user. Even this trivial example reveals several advantages of distributed computing:
Figure 1.1 Illustrating distributed computing.
- Performance enhancement: The system may engage multiple components to perform subtasks, e.g., multiplications, in parallel, resulting in performance improvement. However, distributing the components over multiple hardware elements increases communication overheads. So, it is necessary to analyze the trade-off between parallel computation and communication.
- Specialization and autonomy: Each module may be designed independently for performing a specific task, e.g., addition or multiplication. A component can implement any specific algorithm irrespective of the type of algorithms deployed in the other modules. So, localization of task-dependent knowledge and the local optimization of the modules for performance enhancements are possible. It simplifies the design of the system. The modules can even be implemented on disparate hardware and in different programming environments by various developers. A change in one module does not affect others, so long as the interfaces remain unchanged.
- Geographic distribution and transparency: It is possible to locate the components on machines at various geographical locations and in different administrative domains. The geographical distribution of the components is generally transparent to the applications, which allows dynamic redistribution. For example, a piece of computation can be scheduled on the computing node that has the least load at a given point in time, and can be shifted to another node in case of a failure. This results in reuse and optimal utilization of the resources. As another example, the replicas of a storage system can be distributed across multiple geographical locations to guard against accidental data loss.
- Dynamic binding and optimization: A distributed system can have a pool of similar computational resources, such as adders and multipliers. These resources may be dynamically associated with different computing problems at different points in time. Further, even similar resources, like the multipliers, may have different performance metrics, like speed and accuracy. The system can choose an optimal set of modules in a specific problem context. Such optimum and dynamic binding of the resources leads to improvement of overall system performance.
- Fault tolerance: The availability of a pool of similar resources aids in fault tolerance in the system. If one of the system components fails, then the task can migrate to another component. The system can experience a graceful performance degradation in such cases, rather than a system failure.
- Openness, scalability, and dynamic reconfigurability: A distributed system can be designed as an open system, where individual components interact through a set of standard protocols. This facilitates the independent design of the components. Loose coupling between the system components helps in scalability. Further, we can replace deprecated components with new components without shutting down the system.
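The expression-evaluator example and two of the advantages listed above (parallel subtasks and fault tolerance through a pool of similar resources) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "modules" are local functions standing in for remote adders and multipliers, and every name here (`multiplier`, `faulty_multiplier`, `call_with_failover`, `evaluate_sum_of_products`) is hypothetical rather than taken from the book.

```python
from concurrent.futures import ThreadPoolExecutor


def multiplier(a, b):
    """Stand-in for a remote multiplier module (assumption: local function)."""
    return a * b


def faulty_multiplier(a, b):
    """A multiplier that is always unreachable, used to show failover."""
    raise ConnectionError("multiplier unreachable")


def adder(values):
    """Stand-in for a remote adder module."""
    return sum(values)


def call_with_failover(pool, a, b):
    """Try each multiplier in the pool until one returns a result."""
    for module in pool:
        try:
            return module(a, b)
        except ConnectionError:
            continue  # the task migrates to the next module in the pool
    raise RuntimeError("no multiplier available")


def evaluate_sum_of_products(pairs, pool):
    """Evaluate a1*b1 + a2*b2 + ... with the products computed in parallel."""
    with ThreadPoolExecutor(max_workers=max(1, len(pairs))) as executor:
        futures = [executor.submit(call_with_failover, pool, a, b)
                   for a, b in pairs]
        products = [future.result() for future in futures]
    return adder(products)


if __name__ == "__main__":
    # 2*3 + 4*5, with the first multiplier in the pool failing every time.
    print(evaluate_sum_of_products([(2, 3), (4, 5)],
                                   [faulty_multiplier, multiplier]))
```

With one working multiplier in the pool the evaluation still completes, only more slowly, which mirrors the graceful degradation described above. When every multiplier fails, the sketch raises an error instead of returning a wrong result.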
1.2 Defining Distributed Systems
Leslie Lamport's seminal work [Lamport 2019] laid down the theoretical foundations of time, clocks, and event ordering in a distributed system. Lamport realized that the concepts of sequential time and system state do not work in distributed systems. Failure is one of the toughest problems to understand in a distributed system, because it is meaningful only in the context of time: a failed computer or link is indistinguishable from one that is merely unusually slow to respond. Lamport captured the importance of failure detection and recovery in a distributed system in the following famous quip [Malkh 2013]:
"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable."
Understandably, fault tolerance [Neiger and Toueg 1988, Xiong et al. 2009], which includes detection of failures and recovery from faults, is a dominant area of research in distributed systems.
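The observation that a failure cannot be told apart from an unusually late response can be made concrete with a timeout-based check. The following is a minimal sketch, not a method from the book: the nodes are simulated by local functions, and the names `ping`, `fast_node`, `slow_node`, and `crashed_node` are hypothetical.

```python
import queue
import threading
import time


def ping(node, timeout):
    """Return 'alive' if the node replies within `timeout` seconds,
    otherwise 'suspected'. A timeout cannot prove that the node failed."""
    replies = queue.Queue()
    threading.Thread(target=lambda: replies.put(node()), daemon=True).start()
    try:
        replies.get(timeout=timeout)
        return "alive"
    except queue.Empty:
        return "suspected"


def fast_node():
    """Replies immediately."""
    return "pong"


def slow_node():
    """Healthy, but takes half a second to reply."""
    time.sleep(0.5)
    return "pong"


def crashed_node():
    """Simulates a crash: no reply within any timeout used in this sketch."""
    time.sleep(60)
    return "pong"


if __name__ == "__main__":
    for node in (fast_node, slow_node, crashed_node):
        print(node.__name__, ping(node, timeout=0.1))
```

With a 0.1-second timeout the observer reaches the same verdict for the slow node and the crashed node. Raising the timeout lets the slow node pass, but it delays detection of the real crash, so the choice of timeout is a trade-off rather than a solution.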
There are many technical-sounding definitions, but all seem to converge on the importance of fault tolerance in distributed systems. We discuss fault tolerance later in this book. However, to get a flavor of the different ways of defining a distributed system, let us examine a few of those found in the literature [Kshemkalyani and Singhal 2011].
Definition 1.1 (Collection and coordination): A distributed system is a collection of computers that do not share a common memory or a common physical clock, that communicate by messages over a communication network, and where each computer has its own memory and runs its own OS. Typically, the computers are semi-autonomous and loosely coupled while they cooperate to address a problem collectively.
Definition 1.2 (Single system view): A collection of independent computers that appear to the users of the system as a single coherent computer.
Definition 1.3 (Collection): A term used to describe a wide range of computer systems, from weakly coupled systems such as a wide area network, to strongly coupled systems such as a local area network, to very strongly coupled multiprocessor systems.
The common idea behind the three definitions stated above is to capture certain basic characteristics of a distributed system, namely:
- There is no common clock in a distributed system.
- It consists of several networked autonomous computers, each having its own clock, memory, and OS.
- It does not have a shared memory.
- The computers of a distributed system communicate and coordinate through message passing over network links.
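One well-known way to order events without a common clock, using only message passing, is the logical clock from Lamport's work on time and event ordering. The excerpt cites that work but does not present the algorithm, so the following is a minimal sketch of the standard rules; the class and method names are illustrative assumptions.

```python
class Process:
    """A process with its own logical clock and no shared memory."""

    def __init__(self, name):
        self.name = name
        self.clock = 0

    def local_event(self):
        """Rule 1: tick the clock on every local event."""
        self.clock += 1
        return self.clock

    def send(self):
        """Sending is an event; the message carries the new timestamp."""
        self.clock += 1
        return self.clock

    def receive(self, timestamp):
        """Rule 2: on receipt, move past both the local clock and the
        timestamp carried by the message."""
        self.clock = max(self.clock, timestamp) + 1
        return self.clock


if __name__ == "__main__":
    p, q = Process("p"), Process("q")
    p.local_event()          # p's clock: 1
    stamp = p.send()         # p's clock: 2, message stamped 2
    q.receive(stamp)         # q's clock: max(0, 2) + 1 = 3
    print(p.clock, q.clock)  # prints "2 3"
```

The receive event always gets a larger timestamp than the matching send event, so the timestamps respect causality even though the two processes never consult a shared clock.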
However, we feel that these definitions are still inadequate, as they miss two key aspects of Lamport's observation about a distributed system. We propose the following new definition.
Definition 1.4 (Proposed definition): A distributed system consists of several independent, geographically dispersed, and networked computing elements such as computers, smartphones, sensors, actuators, and embedded electronic devices. These devices communicate among themselves through message passing to coordinate and cooperate in satisfying common computing goals, notwithstanding the occasional failures of a few links or devices.
The proposed definition covers the basic characteristics of a collection of networked computing devices. It describes a distributed system as a collection of independent components integrated into a unified system. The definition:
- Subsumes Definitions 1.2 and 1.3,
- Covers the coordination aspect as in Definition 1.1,
- Includes the fault tolerance and message passing aspects of Lamport's observation.
1.3 Challenges of a Distributed System
Some of the well-understood bottlenecks for implementing a distributed system are the following:
- Centralized algorithms: A single computer is responsible for program control decisions. These algorithms are suitable for the client-server model of computation, where a server may be overwhelmed by many...
System Requirements
File format: ePUB
Copy protection: Adobe DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free Adobe Digital Editions software before downloading (see E-Book Help).
- Tablet/smartphone (Android; iOS): Install the free Adobe Digital Editions app or the PocketBook app before downloading (see E-Book Help).
- E-book reader: Bookeen, Kobo, Pocketbook, Sony, Tolino, and many others (not Kindle)
The ePUB file format is well suited to novels and non-fiction, that is, to "flowing" text without complex layout. On e-readers and smartphones, line and page breaks adapt automatically to the small displays.
Adobe DRM is a "hard" form of copy protection. If the necessary requirements are not met, you will unfortunately not be able to open the e-book. You therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend authorizing the reading software with your personal Adobe ID after installing it.
Further information is available in our E-Book Help.