
Practical System Reliability
Wiley (Publisher)
1st Edition
Published on 29 April 2009
Book
Hardback
288 pages
978-0-470-40860-5 (ISBN)
Description
This book explains how system availability and software reliability relate to real-world telecommunications systems. Readers will learn how to understand, model, predict, and manage system availability throughout the development cycle. The methods and concepts discussed are practical, and the modeling and prediction techniques and tools are customer-focused, data-driven, and aligned with industry standards. It is a valuable resource for system and software architects, engineers, testers, and product managers working on software in the industrial, IT, telecommunications, aerospace, military, and medical fields.
More details
Edition
1st edition
Language
English
Place of publication
Hoboken
United Kingdom
Publishing group
John Wiley and Sons Ltd
Target group
Professional and scholarly
Dimensions
Height: 23.9 cm
Width: 16.4 cm
Thickness: 21 mm
Weight
592 g
ISBN-13
978-0-470-40860-5 (9780470408605)
Other editions

Eric Bauer | Xuemei Zhang | Douglas A. Kimber
Practical System Reliability
E-Book
03/2009
Wiley-IEEE Press
€63.99
Available for download
Persons
Eric Bauer is a manager of reliability engineering in Alcatel-Lucent's wireline business in Murray Hill, New Jersey. He has designed, modeled, and analyzed reliability for many different products and solutions, and architected and developed software for a variety of communications devices, platforms, and products.
Xuemei Zhang, PhD, is a principal member of the technical staff in the Network Design and Performance Analysis Department at AT&T Labs. She has been working on reliability and performance analysis of wireline and wireless communications systems and networks. Her major work and research areas are in system and architectural reliability and performance, product and solution reliability and performance modeling, and software reliability.
Douglas A. Kimber retired from Alcatel-Lucent as a staff reliability engineer. Throughout his career at Bell Labs, Lucent Technologies, and Alcatel-Lucent, he developed high-reliability hardware and software platforms, applications, and systems, and then moved into reliability engineering, where he performed reliability modeling and analysis.
Content
Preface.
Acknowledgments.
1 Introduction.
2 System Availability. 2.1 Availability, Service and Elements. 2.2 Classical View. 2.3 Customers' View. 2.4 Standards View.
3 Conceptual Model of Reliability and Availability. 3.1 Concept of Highly Available Systems. 3.2 Conceptual Model of System Availability. 3.3 Failures. 3.4 Outage Resolution. 3.5 Downtime Budgets.
4 Why Availability Varies Between Customers. 4.1 Causes of Variation in Outage Event Reporting. 4.2 Causes of Variation in Outage Duration.
5 Modeling Availability. 5.1 Overview of Modeling Techniques. 5.2 Modeling Definitions. 5.3 Practical Modeling. 5.4 Widget Example. 5.5 Alignment with Industry Standards.
6 Estimating Parameters and Availability from Field Data. 6.1 Self-Maintaining Customers. 6.2 Analyzing Field Outage Data. 6.3 Analyzing Performance and Alarm Data. 6.4 Coverage Factor and Failure Rate. 6.5 Uncovered Failure Recovery Time. 6.6 Covered Failure Detection and Recovery Time.
7 Estimating Input Parameters from Lab Data. 7.1 Hardware Failure Rate. 7.2 Software Failure Rate. 7.3 Coverage Factors. 7.4 Timing Parameters. 7.5 System-Level Parameters.
8 Estimating Input Parameters in the Architecture/Design Stage. 8.1 Hardware Parameters. 8.2 System-Level Parameters. 8.3 Sensitivity Analysis.
9 Prediction Accuracy. 9.1 How Much Field Data Is Enough? 9.2 How Does One Measure Sampling and Prediction Errors? 9.3 What Causes Prediction Errors?
10 Connecting the Dots. 10.1 Set Availability Requirements. 10.2 Incorporate Architectural and Design Techniques. 10.3 Modeling to Verify Feasibility. 10.4 Testing. 10.5 Update Availability Prediction. 10.6 Periodic Field Validation and Model Update. 10.7 Building an Availability Roadmap. 10.8 Reliability Report.
11 Summary.
Appendix A System Reliability Report Outline. 1 Executive Summary. 2 Reliability Requirements. 3 Unplanned Downtime Model and Results. Annex A Reliability Definitions. Annex B References. Annex C Markov Model State-Transition Diagrams.
Appendix B Reliability and Availability Theory. 1 Reliability and Availability Definitions. 2 Probability Distributions in Reliability Evaluation. 3 Estimation of Confidence Intervals.
Appendix C Software Reliability Growth Models. 1 Software Characteristic Models. 2 Nonhomogeneous Poisson Process Models.
Appendix D Acronyms and Abbreviations.
Appendix E Bibliography.
Index.
About the Authors.