Unit 11: Dependability and Security


Why Dependability Is Important

 System failures affect a large number of people.


 Users often reject systems that are unreliable,
unsafe, or insecure.
 System failure costs may be enormous.
 Undependable systems may cause information
loss
Remember the following types of failure while developing a
dependable system:
◦ Hardware Failure
◦ Software Failure
◦ Operational Failure
Dependability Properties
 The dependability of a computer system is a
property of the system that reflects its
trustworthiness.
 Trustworthiness here essentially means the
degree of confidence a user has that the system
will operate as they expect, and that the system
will not ‘fail’ in normal use.
 It is not meaningful to express dependability
numerically.
 Programs running on computers may not
operate as expected and occasionally may
corrupt the data that is managed by the system.
Principles of Dependability
Four principal dimensions of dependability:
 Availability Informally, the availability of a system is
the probability that it will be up and running and
able to deliver useful services to users at any given
time.
 Reliability Informally, the reliability of a system is the
probability, over a given period of time, that the
system will correctly deliver services as expected by
the user.
 Safety Informally, the safety of a system is a
judgment of how likely it is that the system will
cause damage to people or its environment.
 Security Informally, the security of a system is a
judgment of how likely it is that the system can
resist accidental or deliberate intrusions.
Four further dependability properties
 Repairability System failures are inevitable, but the
disruption caused by failure can be minimized if the
system can be repaired quickly.
 Maintainability As systems are used, new
requirements emerge and it is important to maintain
the usefulness of a system by changing it to
accommodate these new requirements.
 Survivability A very important attribute for Internet-
based systems is survivability. Survivability is the
ability of a system to continue to deliver service whilst
under attack and, potentially, whilst part of the system
is disabled.
 Error Tolerance This property can be considered as
part of usability and reflects the extent to which the
system has been designed so that user input errors
are avoided and tolerated.
To develop dependable software, ensure that:
 You avoid the introduction of accidental errors into
the system during software specification and
development.
 You design verification and validation processes
that are effective in discovering residual errors that
affect the dependability of the system.
 You design protection mechanisms that guard
against external attacks that can compromise the
availability or security of the system.
 You configure the deployed system and its
supporting software correctly for its operating
environment.
Availability and reliability
 System availability and reliability are closely related
properties that can both be expressed as numerical
probabilities.
 The availability of a system is the probability that
the system will be up and running to deliver these
services to users on request. The reliability of a
system is the probability that the system’s services
will be delivered as defined in the system
specification.
 Reliability and availability are closely related but
sometimes one is more important than the other. If
users expect continuous service from a system then
the system has a high availability requirement.
Availability and reliability
 The definition of reliability states that the
environment in which the system is used and the
purpose that it is used for must be taken into account.
 If you measure system reliability in one
environment, you can’t assume that the reliability
will be the same if the system is used in a different
way.
 Reliability The probability of failure-free operation
over a specified time, in a given environment, for a
specific purpose.
 Availability The probability that a system, at a point
in time, will be operational and able to deliver the
requested services.
Availability and reliability (continued)
 A strict definition of reliability relates the system
implementation to its specification.
 Availability and reliability are obviously linked as
system failures may crash the system.
 Availability does not just depend on the number
of system crashes, but also on the time needed
to repair the faults that have caused the failure.
 System reliability and availability problems are
mostly caused by system failures.
 Some of these failures are a consequence of
specification errors or failures in other related
systems such as a communications system.
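
The link between failure rate, repair time, and availability can be
made concrete with the standard steady-state formulas. Below is a
minimal sketch in Python, assuming an exponential (constant failure
rate) model; the numbers are invented for illustration.

```python
import math

def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the system is up.
    It depends on the time to repair (MTTR), not just on how often the
    system fails, as noted above."""
    return mttf_hours / (mttf_hours + mttr_hours)

def reliability(mission_hours: float, mttf_hours: float) -> float:
    """Probability of failure-free operation over a given period,
    under an exponential failure model: R(t) = exp(-t / MTTF)."""
    return math.exp(-mission_hours / mttf_hours)

# A system that fails on average every 1,000 hours and takes 2 hours
# to repair is ~99.8% available, yet has only a ~79% chance of running
# for 240 hours (10 days) without any failure: availability and
# reliability are related but distinct.
print(availability(1000, 2))   # 0.998...
print(reliability(240, 1000))  # 0.786...
```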
Reliability Terminology
 Human error or mistake
 Human behavior that results in the introduction of faults
into a system.
 System fault
 A characteristic of a software system that can lead to a
system error. For example, in a weather monitoring system,
the fault might be code that adds 1 hour to the time of the
last transmission without checking whether the time is
greater than or equal to 23.00.
 System error
 An erroneous system state that can lead to system
behavior that is unexpected by system users.
 System failure
 An event that occurs at some point in time when the
system does not deliver a service as expected by its
users. For example, no weather data is transmitted
because the computed transmission time is invalid.
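
The fault/error/failure chain in the example above can be written out
as code. This is a hypothetical reconstruction in Python: the buggy
function contains the fault (no wrap-around check at 23.00), which
creates an erroneous state (an invalid hour of 24) that finally
surfaces as a failure (no weather data transmitted).

```python
def next_transmission_time_buggy(last_hour: int) -> int:
    # FAULT: adds 1 hour with no check for the end of the day, so
    # last_hour == 23 yields 24, an erroneous (invalid) system state.
    return last_hour + 1

def next_transmission_time_fixed(last_hour: int) -> int:
    # Corrected code: wrap around at 23.00 so the hour stays in 0..23.
    return (last_hour + 1) % 24

def transmit_weather_data(hour: int) -> None:
    if not 0 <= hour <= 23:
        # FAILURE: because the time is invalid, no data is transmitted.
        raise ValueError(f"invalid transmission time: {hour}")
    print(f"transmitting weather data at {hour:02d}.00")

transmit_weather_data(next_transmission_time_fixed(23))  # 00.00: OK
try:
    transmit_weather_data(next_transmission_time_buggy(23))
except ValueError as failure:
    print(failure)  # the fault has led, via an error, to a failure
```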
System Reliability and Availability
 When an input or a sequence of inputs causes
faulty code in a system to be executed, an
erroneous state is created that may lead to a
software failure.
 Most inputs do not lead to system failure. However,
some inputs or input combinations, forming a set Ie,
cause system failures or erroneous outputs to be
generated.
 If inputs in the set Ie are executed by frequently
used parts of the system, then failures will be
frequent.
 However, if the inputs in Ie are executed by code
that is rarely used, then users will hardly ever see
failures.
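
The effect of where the failure-inducing inputs Ie sit can be shown
with a small simulation. This sketch is illustrative only; the input
set and the usage-profile weights are invented.

```python
import random

FAILING_INPUTS = {7, 13, 42}       # the set Ie: inputs that execute faulty code
ALL_INPUTS = list(range(100))

def observed_failure_rate(weights, trials=100_000):
    """Estimate how often users see a failure under a given usage profile."""
    drawn = random.choices(ALL_INPUTS, weights=weights, k=trials)
    return sum(x in FAILING_INPUTS for x in drawn) / trials

# Profile A: the Ie inputs lie in frequently used parts of the system.
heavy_use = [10 if i in FAILING_INPUTS else 1 for i in ALL_INPUTS]
# Profile B: the Ie inputs lie in rarely used code.
rare_use = [0.01 if i in FAILING_INPUTS else 1 for i in ALL_INPUTS]

print(observed_failure_rate(heavy_use))  # roughly 0.24: failures are frequent
print(observed_failure_rate(rare_use))   # roughly 0.0003: failures hardly seen
```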
Why system errors do not always lead to system failures
 Not all code in a program is executed. The code
that includes a fault (e.g., the failure to initialize
a variable) may never be executed because of
the way that the software is used.
 Errors are transient. A state variable may have
an incorrect value caused by the execution of
faulty code. However, before this is accessed
and causes a system failure, some other system
input may be processed that resets the state to
a valid value.
 The system may include fault detection and
protection mechanisms. These ensure that the
erroneous behavior is discovered and corrected
before the system services are affected.
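
The second and third points can be sketched in code: faulty code
corrupts a state variable, but a fault detection and protection
mechanism discovers and resets the erroneous state before any service
depends on it. The class and the valid range below are hypothetical.

```python
class SensorState:
    """Holds a reading that must stay within a valid range (0..100)."""
    def __init__(self):
        self.reading = 50

    def faulty_update(self):
        # Faulty code: produces an erroneous state (out-of-range value).
        self.reading = 250

    def check_and_repair(self):
        # Fault detection and protection: discover and correct the
        # erroneous state before it causes a visible system failure.
        if not 0 <= self.reading <= 100:
            self.reading = 50  # reset to a safe default

state = SensorState()
state.faulty_update()             # an error now exists, but no failure yet
state.check_and_repair()          # the error is corrected...
assert 0 <= state.reading <= 100  # ...so system services are unaffected
```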
Complementary approaches used to improve the
reliability of a system:
 Fault avoidance Development techniques are used that
either minimize the possibility of human errors and/or
that trap mistakes before they result in the introduction
of system faults. Examples of such techniques include
avoiding error-prone programming language constructs
such as pointers and the use of static analysis to detect
program anomalies.
 Fault detection and removal The use of verification and
validation techniques that increase the chances that
faults will be detected and removed before the system is
used. Systematic testing and debugging is an example
of a fault detection technique.
 Fault tolerance These are techniques that ensure that
faults in a system do not result in system errors or that
system errors do not result in system failures.
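
One common way to realize fault tolerance (not named on the slide) is
majority voting over redundant versions of a computation, as in triple
modular redundancy or N-version programming. A minimal sketch, with an
invented off-by-one fault in one version:

```python
from collections import Counter

def majority_vote(results):
    """Return the value produced by a majority of redundant versions.
    A single faulty version is outvoted, so its fault does not become
    a system failure."""
    value, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError("no majority: the fault cannot be masked")
    return value

def version_a(x): return x * x
def version_b(x): return x * x + 1   # contains an (invented) fault
def version_c(x): return x * x

results = [v(9) for v in (version_a, version_b, version_c)]
print(majority_vote(results))   # 81: the faulty version is outvoted
```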
Safety
 Safety-critical systems are systems where it is
essential that system operation is always safe; that
is, the system should never damage people or the
system’s environment even if the system fails.
 Examples of safety-critical systems include control
and monitoring systems in aircraft, process control
systems in chemical and pharmaceutical plants,
and automobile control systems.
 Hardware control of safety-critical systems is simpler
to implement and analyze than software control.
 However, we now build systems of such complexity
that they cannot be controlled by hardware alone. Software
control is essential because of the need to manage
large numbers of sensors and actuators with
complex control laws.
Two classes of safety-critical software
 Primary safety-critical software This is software
that is embedded as a controller in a system.
Malfunctioning of such software can cause a
hardware malfunction, which results in human
injury or environmental damage.
 Secondary safety-critical software This is
software that can indirectly result in an injury. An
example of such software is a computer-aided
engineering design system whose
malfunctioning might result in a design fault in
the object being designed. This fault may cause
injury to people if the designed system
malfunctions.
Why reliable systems are not necessarily safe
 We can never be 100% certain that a software
system is fault-free and fault tolerant.
Undetected faults can be dormant for a long
time and software failures can occur after many
years of reliable operation.
 The specification may be incomplete in that it
does not describe the required behavior of the
system in some critical situations.
 Hardware malfunctions may cause the system
to behave in an unpredictable way, and present
the software with an unanticipated environment.
 The system operators may generate inputs that
are not individually incorrect but which, in some
situations, can lead to a system malfunction.
Safety Terminology
 Accident
 An unplanned event or sequence of events
which results in human death or injury, damage
to property, or to the environment.
 Hazard
 A condition with the potential for causing or
contributing to an accident.
 Damage
 A measure of the loss resulting from a mishap.
Damage can range from many people being
killed as a result of an accident to minor injury or
property damage.
Safety Terminology
 Hazard Severity
 An assessment of the worst possible damage
that could result from a particular hazard.
 Hazard Probability
 The probability of the events occurring which
create a hazard. Probability values tend to be
arbitrary but range from ‘probable’ (say 1/100
chance of a hazard occurring) to ‘implausible’
(no conceivable situations are likely in which the
hazard could occur).
 Risk
 This is a measure of the probability that the
system will cause an accident.
Ways of assuring safety
 Hazard avoidance The system is designed so that
hazards are avoided. For example, a cutting system
that requires an operator to use both hands to press
separate buttons simultaneously avoids the hazard of
the operator’s hands being in the blade pathway (see
the sketch after this list).
 Hazard detection and removal The system is
designed so that hazards are detected and removed
before they result in an accident. For example, a chemical
plant system may detect excessive pressure and
open a relief valve to reduce these pressures before
an explosion occurs.
 Damage limitation The system may include protection
features that minimize the damage that may result
from an accident. For example, an aircraft engine
normally includes automatic fire extinguishers. If a fire
occurs, it can often be controlled before it poses a
threat to the aircraft.
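
The two-button cutting-system interlock referenced in the first item
can be sketched directly. This is a simplified illustration; real
safety interlocks also involve hardware-level checks, and the
0.5-second press window is an invented parameter.

```python
import time

PRESS_WINDOW_SECONDS = 0.5   # invented: buttons must be pressed together

def blade_may_operate(left_press_time, right_press_time):
    """Hazard avoidance: the blade runs only if both buttons are pressed
    (near-)simultaneously, which keeps both operator hands away from
    the blade pathway."""
    return abs(left_press_time - right_press_time) <= PRESS_WINDOW_SECONDS

now = time.monotonic()
print(blade_may_operate(now, now + 0.1))   # True: both hands on buttons
print(blade_may_operate(now, now + 5.0))   # False: hazard condition avoided
```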
Security
 Security reflects the ability of a system to protect
itself against external attacks. Security failures may
lead to loss of availability, damage to the system or
its data, or the leakage of information to
unauthorized people.
 Security is a system attribute that reflects the ability
of the system to protect itself from external attacks,
which may be accidental or deliberate.
 If you really want a secure system, it is best not to
connect it to the Internet.
 Military systems, systems for electronic commerce,
and systems that involve the processing and
interchange of confidential information must be
designed so that they achieve a high level of
security.
Security Terminology
(The examples below are drawn from a hypothetical patient
records system for a clinic.)
 Asset The records of each patient that is receiving or
has received treatment.
 Exposure Potential financial loss from future patients
who do not seek treatment because they do not trust
the clinic to maintain their data; financial loss from
legal action by an affected patient (in this scenario, a
sports star); loss of reputation.
 Vulnerability A weak password system which makes
it easy for users to set guessable passwords. User ids
that are the same as names.
 Attack An impersonation of an authorized user.
 Threat An unauthorized user will gain access to the
system by guessing the credentials (login name and
password) of an authorized user.
 Control A password checking system that disallows
user passwords that are proper names or words that
are normally included in a dictionary.
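
The control in the last item, a password checker that disallows proper
names and dictionary words, might look like the sketch below. The word
lists are tiny stand-ins for real dictionaries, and the minimum length
rule is an added assumption.

```python
# Tiny stand-ins for a real dictionary and a list of proper names.
DICTIONARY_WORDS = {"password", "secret", "welcome", "dragon"}
PROPER_NAMES = {"alice", "bob", "london", "smith"}

def password_allowed(candidate: str, user_id: str) -> bool:
    """Control: reject guessable passwords (dictionary words, proper
    names, or the user's own id), reducing the vulnerability above."""
    lowered = candidate.lower()
    if lowered in DICTIONARY_WORDS or lowered in PROPER_NAMES:
        return False
    if lowered == user_id.lower():   # user ids equal to names are weak
        return False
    return len(candidate) >= 8       # assumed minimum-length rule

print(password_allowed("dragon", "msmith"))      # False: dictionary word
print(password_allowed("x7#Kq2!vZ", "msmith"))   # True
```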
Types of security threats
 Threats to the confidentiality of the system
and its data These can disclose information to
people or programs that are not authorized to
have access to that information.
 Threats to the integrity of the system and its
data These threats can damage or corrupt the
software or its data.
 Threats to the availability of the system and
its data These threats can restrict access to the
software or its data for authorized users.
Approaches to assuring security are comparable to those
for reliability and safety:
 Vulnerability avoidance Controls that are intended
to ensure that attacks are unsuccessful. The
strategy here is to design the system so that
security problems are avoided.
 Attack detection and neutralization Controls that
are intended to detect and repel attacks. These
controls involve including functionality in a system
that monitors its operation and checks for unusual
patterns of activity (see the sketch after this list).
 Exposure limitation and recovery Controls that
support recovery from problems. These can range
from automated backup strategies and information
‘mirroring’ to insurance policies that cover the costs
associated with a successful attack on the system.
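
A minimal form of the attack detection control mentioned above is
monitoring for unusual activity patterns, such as a burst of failed
logins. The sliding-window threshold below is an invented illustration,
not a prescribed mechanism.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_FAILURES = 5   # illustrative threshold for "unusual activity"

failed_logins = defaultdict(deque)   # user id -> timestamps of failures

def record_failed_login(user_id, now=None):
    """Record a failed login; return True if the recent failure rate
    looks like a password-guessing attack on this account."""
    now = time.monotonic() if now is None else now
    window = failed_logins[user_id]
    window.append(now)
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()   # drop failures outside the sliding window
    return len(window) > MAX_FAILURES

# Seven failures within a few seconds trip the detector.
for i in range(7):
    suspicious = record_failed_login("alice", now=float(i))
print(suspicious)   # True: repel the attack, e.g. lock or delay logins
```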
Success Criteria
 Generally, complex sociotechnical systems are
developed to tackle what are sometimes called
‘wicked problems’.
 A wicked problem is a problem that is so complex
and which involves so many related entities that
there is no definitive problem specification.
 Different stakeholders see the problem in different
ways and no one has a full understanding of the
problem as a whole.
 The nature of security and dependability attributes
sometimes makes it even more difficult to decide if
a system is successful.
 The intention of a new system may be to improve
security by replacing an existing system with a
more secure data environment.
System Engineering
Systems engineering encompasses all of the
activities involved in procuring, specifying,
designing, implementing, validating, deploying,
operating, and maintaining sociotechnical systems.
Three overlapping stages of sociotechnical systems engineering
 Procurement or acquisition During this stage,
the purpose of a system is decided; high-level
system requirements are established; decisions
are made on how functionality will be distributed
across hardware, software, and people; and the
components that will make up the system are
purchased.
System Engineering
 Development During this stage, the system is
developed. Development processes include all of
the activities involved in system development such
as requirements definition, system design,
hardware and software engineering, system
integration, and testing. Operational processes are
defined and the training courses for system users
are designed.
 Operation At this stage, the system is deployed,
users are trained, and the system is brought into
use. The planned operational processes usually
then have to change to reflect the real working
environment where the system is used.
 Over time, the system evolves as new requirements
are identified. Eventually, the system declines in value
and is decommissioned.
Stages of System Engineering (diagram not shown)
System Procurement
 The initial phase of systems engineering is
system procurement (sometimes called system
acquisition).
 At this stage, decisions are made on the scope
of a system that is to be purchased, system
budgets and timescales, and the high-level
system requirements.
 Using this information, further decisions
are then made on whether to procure a system,
the type of system required, and the supplier or
suppliers of the system.
Drivers of System Procurement Decisions
 The state of other organizational systems
 The need to comply with external regulations
 External competition
 Business reorganization
 Available budget
Procurement Process for COTS (Commercial Off-The-Shelf)
Systems (diagram not shown)
Important points of Procurement Process
 Off-the-shelf components do not usually match
requirements exactly, unless the requirements have
been written with these components in mind.
 When a system is to be built specially, the
specification of requirements is part of the contract
for the system being acquired. It is therefore a legal
as well as a technical document.
 After a contractor has been selected to build a
system, there is a contract negotiation period
where you may have to negotiate further changes
to the requirements and discuss issues such as the
cost of changes to the system.
 Once a COTS system has been selected, you may
negotiate with the supplier on costs, license
conditions, possible changes to the system, etc.
System Development
 The goals of the system development process are
to develop or acquire all of the components of a
system and then to integrate these components to
create the final system.
 During procurement, business and high-level
functional and nonfunctional system requirements
are defined.
 This systems engineering process was an
important influence on the ‘waterfall’ model of the
software process.
 Although it is now accepted that the ‘waterfall’
model is not usually appropriate for software
development, most systems development
processes are plan-driven processes that still follow
this model.
System Development Process (diagram not shown)
Fundamental Activities of System Development
 Requirements development The high-level and
business requirements identified during the
procurement process have to be developed in more
detail. Requirements may have to be allocated to
hardware, software, or processes and prioritized for
implementation.
 System design This process overlaps significantly
with the requirements development process. It
involves establishing the overall architecture of the
system, identifying the different system components
and understanding the relationships between them.
 Subsystem engineering This stage involves
developing the software components of the system;
configuring off-the-shelf hardware and software;
designing, if necessary, special-purpose hardware; and
defining the operational processes for the system.
Fundamental Activities of System Development
 System integration The components are put
together to create a new system. Only then do the
emergent system properties become apparent.
 System testing This is usually an extensive,
prolonged activity where problems are discovered.
The subsystem engineering and system integration
phases are reentered to repair these problems,
tune the performance of the system, and implement
new requirements. System testing may involve both
testing by the system developer and acceptance or
user testing by the organization that has procured
the system.
 System deployment This is the process of making
the system available to its users, transferring data
from existing systems, & establishing communication
with other systems in the environment.
System Operation
 Operational processes are the processes that are
involved in using the system for its defined purpose.
 For example, operators of an air traffic control
system follow specific processes when aircraft enter
and leave airspace, when they have to change
height or speed, when an emergency occurs, and
so on.
 The key benefit of having system operators is that
people have a unique capability of being able to
respond effectively to unexpected situations, even
when they have never had direct experience of
these situations.
 A problem that may only emerge after the system
goes into operation is the operation of the new
system alongside existing systems.
Reason’s Swiss Cheese Model
 In this model, the defenses built into a system are
compared to slices of Swiss cheese.
 Some types of Swiss cheese, such as Emmental,
have holes and so the analogy is that the latent
conditions are comparable to the holes in cheese
slices.
 The position of these holes is not static but changes
depending on the state of the overall sociotechnical
system.
 If each slice represents a barrier, failures can occur
when the holes line up at the same time as a
human operational error. An active failure of system
operation gets through the holes and leads to an
overall system failure.
Reason’s Swiss Cheese Model
 To reduce the probability that system failure will
result from human error, designers should:
◦ Design a system so that different types of barriers
are included. This means that the ‘holes’ will
probably be in different places and so there is
less chance of the holes lining up and failing to
trap an error.
◦ Minimize the number of latent conditions in a
system. Effectively, this means reducing the
number and size of system ‘holes’.
 Human errors are inevitable and systems should
include barriers to detect these errors before they
lead to system failure. Reason’s Swiss cheese
model explains how human error plus latent defects
in the barriers can lead to system failure.
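
The model's two recommendations can be made quantitative with a small
Monte Carlo sketch: with independent barriers, the probability that
all the 'holes' line up falls multiplicatively as barriers are added
or holes are shrunk. All probabilities here are invented for
illustration.

```python
import random

def p_system_failure(hole_probs, trials=200_000):
    """Estimate the probability that an error passes through every
    barrier, i.e. that all the 'holes' line up at the same time."""
    hits = sum(all(random.random() < p for p in hole_probs)
               for _ in range(trials))
    return hits / trials

# Three barriers, each with a 10% chance of a latent 'hole' at any moment.
print(p_system_failure([0.1, 0.1, 0.1]))        # ~0.001
# Adding a fourth, different type of barrier cuts this by another 10x.
print(p_system_failure([0.1, 0.1, 0.1, 0.1]))   # ~0.0001
# Shrinking the holes (fewer latent conditions) also helps.
print(p_system_failure([0.05, 0.05, 0.05]))     # ~0.000125
```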
