The document discusses quantitative reliability evaluation and fault tolerance frameworks. It analyzes several systems - ESS NO.1A, SIFT, and FTMP - and identifies their main components like CPUs, memory, and I/O modules. It also categorizes reliability strategies and estimates the expected downtime from factors like hardware failures, software bugs, procedural faults, and fault tolerance deficiencies. Fault tolerance is implemented through techniques like redundancy, exception handling, and synchronized replication.
Quantitative Reliability Evaluation
► Hardware component failures
► Quantitative measures concerning physical faults are based on three assumptions:
  ► Component failures occur independently in independent replicated units.
  ► The behavior of a physical component can be predicted from data gathered from observations of other components that are assumed to be similar.
  ► The design of the system is free from faults.
Framework for implementing FT
► Redundancy
► Exceptions: principles
► Exception handling
► Interface exceptions
► Signaling exceptions
► Failure exceptions
Fault Tolerant systems
ESS NO.1A System
► CPU
► Program stores
► Call stores
► Auxiliary units
Reliability Strategies
The four categories and their expected contributions to system down-time are:
► Hardware unreliability: 0.4 min/yr
► Software unreliability: 0.3 min/yr
► Procedural faults: 0.6 min/yr
► Fault tolerance deficiencies: 0.7 min/yr
► Tolerance of software faults is limited to attempts at maintaining the consistency of the database.
► The programs that perform these error detection and recovery actions are referred to as the Audit programs.
SIFT System
► Main processing modules
► I/O processing modules
► Buses
Software system
► Application
► Executive
Reliability Strategies
► The processing modules and the I/O processing modules do not contain special hardware provisions for implementing fault tolerance.
FTMP System
► Number of processing modules
► Global memory modules
► I/O modules
► Buses
► BGU
► BIU
Reliability Strategies
► The replicated executions of a program occur in tight synchronization
► It supports TMR replication
► Implementation of the replication
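The TMR replication mentioned above can be illustrated with a minimal Python sketch. It is not FTMP's actual implementation; the function names `majority_vote` and `tmr_reliability` are hypothetical and introduced only for illustration. The reliability formula assumes a perfect voter and relies on the first assumption listed earlier, namely that replicated units fail independently.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Hashable:
    """Return the value produced by a strict majority of replicas.

    Raises ValueError when the input is empty or no value has a strict
    majority; a real system would treat that as an unmasked failure.
    """
    if not outputs:
        raise ValueError("no replica outputs to vote on")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no majority among replica outputs")
    return value


def tmr_reliability(r: float) -> float:
    """Probability that at least 2 of 3 replicas work correctly.

    r is the reliability of a single replica. Assumes independent
    failures and a perfect voter (both are modelling assumptions):
    P(all three work) + P(exactly two work) = r^3 + 3 r^2 (1 - r)
                                            = 3 r^2 - 2 r^3
    """
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    # One faulty replica (9) is outvoted by the two agreeing replicas.
    print(majority_vote([7, 7, 9]))
    # With r = 0.9 per replica, the triplicated unit is more reliable.
    print(round(tmr_reliability(0.9), 3))
```

Under these assumptions the triplicated unit is more reliable than a single replica only when r exceeds 0.5. At r = 0.5 the two are equal, and below that replication makes matters worse, which is one reason the independence and component-quality assumptions matter.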