
Techniques of Parameter Estimation

Dr Ompal Singh
Associate Professor
Department of Operational Research
University of Delhi
[email protected]
SOFTWARE RELIABILITY
Software Reliability is the probability of failure-free operation of a system over a specified time, within a specified environment, for a specified purpose.
Software Reliability is probably the most important of the characteristics inherent in the concept of "Software Quality".
Software Reliability concerns itself with how well the software functions to meet the requirements of the customer.
CHARACTERISTICS OF
SOFTWARE RELIABILITY
Failures are primarily due to design faults.
Repairs are made by modifying the design to make it robust against conditions that can trigger a failure.
There is no wear-out phenomenon.
Software errors occur without warning.
Old code can exhibit an increasing failure rate as a function of errors introduced while making updates.
External environmental conditions do not affect software reliability.
Internal environmental factors such as insufficient memory and inappropriate clock speeds do affect software reliability.
CHARACTERISTICS OF SOFTWARE RELIABILITY (CONTD.)
Reliability is not time dependent.
Failure occurs when the logic path that contains an error is executed.
Reliability growth is observed as errors are detected and corrected.
A system without faults is considered highly reliable.
Constructing a correct system is a difficult task.
Even an incorrect system may be considered to be reliable if the frequency of failure is "acceptable."
ERROR, FAULT, FAILURE
ERROR: A discrepancy between a computed, observed, or measured value or condition and the true, specified, or theoretically correct value or condition.
FAULT: An incorrect step, process, or data definition in a computer program which causes the program to perform in an unintended or unanticipated manner. It is an inherent weakness of the design or implementation which might result in a failure.
FAILURE: The inability of a system or component to perform its required functions within specified performance requirements.
FAILURE INTENSITY
Failure Intensity: the number of failures per unit time.
Failure intensity is a way of expressing reliability.
The failure intensity is given by

λ(μ) = λ₀ [1 − μ/ν₀]

where
μ : number of failures experienced
ν₀ : number of failures that would occur in infinite time
λ₀ : initial failure intensity, at the start of execution
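A minimal sketch of this formula in Python (the function and argument names are our choices, not from the slides):

```python
def failure_intensity(mu, lambda0, nu0):
    """Failure intensity after mu failures have been experienced:
    lambda(mu) = lambda0 * (1 - mu/nu0)."""
    if not 0 <= mu <= nu0:
        raise ValueError("mu must lie between 0 and nu0")
    return lambda0 * (1.0 - mu / nu0)

# Example: initial intensity 10 failures per unit time, 100 total expected
# failures.  After 25 failures the intensity has dropped to 7.5.
```

As μ approaches ν₀ the intensity falls toward zero, which is the reliability-growth behaviour the slides describe.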
MEAN TIME TO FAILURE (MTTF)
Mean Time To Failure (MTTF) is a basic
measure of reliability for non-repairable
systems.
Mean Time To Failure is the mean time
expected until the first failure of a piece
of equipment.
Mean Time To Failure is a statistical
value and is meant to be the mean over
a long period of time and a large
number of units.
MEAN TIME BETWEEN FAILURES (MTBF)
Mean Time Between Failures (MTBF) is the predicted elapsed time between inherent failures of a system during operation.
MTBF = MTTF + MTTR (mean time to repair)
Reliability = MTBF / (1 + MTBF)
HAZARD RATE
Failure rate is the frequency with which
an engineered system or component fails,
expressed, for example, in failures per hour.
It is often denoted by the Greek letter
λ(lambda) and is important in reliability
engineering.
λ(t) = f(t)/R(t)
where f(t) is the failure density function,
R(t) is the reliability function,
and R(t) = 1 − F(t), where F(t) is the failure distribution function.
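The definition λ(t) = f(t)/R(t) can be checked on the exponential distribution, the standard constant-hazard example (the names below are illustrative):

```python
import math

def hazard_rate(f, F, t):
    """Hazard rate lambda(t) = f(t) / R(t), with R(t) = 1 - F(t)."""
    return f(t) / (1.0 - F(t))

# For an exponential failure law the hazard rate is constant and equal
# to the rate parameter:
lam = 0.5
f = lambda t: lam * math.exp(-lam * t)    # failure density f(t)
F = lambda t: 1.0 - math.exp(-lam * t)    # failure distribution F(t)
# hazard_rate(f, F, t) == lam for every t >= 0
```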
AVAILABILITY
Availability is the measure of how likely a system is available for use, taking into account repairs and other down-time.
E.g., an availability of 0.998 means that the system is available 998 out of 1000 time units.
Relevant for continuously running systems.
Thus, availability is the ability of an item to be in a state to perform a required function at a given instant of time, or at any instant of time within a given time interval, assuming that the external resources, if required, are provided.
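The 0.998 example can be reproduced with the standard steady-state availability formula MTBF/(MTBF + MTTR), a relation not spelled out on the slide but consistent with it (a sketch; names are ours):

```python
def availability(mtbf, mttr):
    """Steady-state availability: the long-run fraction of time the system
    is up, given mean time between failures and mean time to repair."""
    return mtbf / (mtbf + mttr)

# A system that runs 998 hours between failures and takes 2 hours to
# repair is available 998 out of every 1000 hours, i.e. 0.998.
```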
DIFFERENCE BETWEEN
RELIABILITY AND
AVAILABILITY
An important difference between
reliability and availability is that
reliability refers to failure-free operation
during an interval, while availability
refers to failure-free operation at a given
instant of time, usually the time when a
device or system is first accessed to
provide a required function or service.
MAINTAINABILITY
Maintainability is the ease with which a
product can be maintained in order to:
isolate defects or their cause,
correct defects or their cause,
repair or replace faulty or worn-out components without having to replace still-working parts,
prevent unexpected breakdowns,
maximize a product's useful life,
maximize efficiency, reliability, and safety,
meet new requirements,
make future maintenance easier, or
cope with a changed environment.
SURVIVABILITY
 Survivability is the capability of a system
to fulfill its mission, in a timely manner, in
the presence of attacks, failures, or
accidents.
POISSON PROCESS
 Poisson Process is a stochastic process which counts the
number of events and the time that these events occur in a
given time interval.
 The basic form of Poisson process, is a continuous-
time counting process {N(t), t ≥ 0} that possesses the
following properties:
N(0) = 0
Independent increments (the numbers of occurrences counted in disjoint intervals are independent of each other)
Stationary increments (the probability distribution of the number of occurrences counted in any time interval depends only on the length of the interval)
The probability distribution of N(t) is a Poisson distribution.
No counted occurrences are simultaneous.
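These properties can be illustrated by simulation. A homogeneous Poisson process is generated here by summing independent exponential inter-arrival times (a sketch; the function name and seed are illustrative):

```python
import random

def simulate_poisson_process(rate, horizon, rng=None):
    """Simulate the event times of a homogeneous Poisson process on
    (0, horizon]: exponential gaps between events give Poisson counts."""
    rng = rng or random.Random()
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)   # exponential inter-arrival time
        if t > horizon:
            return times
        times.append(t)

events = simulate_poisson_process(rate=2.0, horizon=1000.0,
                                  rng=random.Random(42))
# E[N(t)] = rate * t, so len(events)/1000 should be close to 2.0
```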
PARAMETER ESTIMATION
PARAMETER ESTIMATION
Parameter estimation is the process of calculating model parameters from a dataset.
It takes the sample data as given and estimates the model most likely to fit that data set; that is, it finds the most likely values for the parameters of the model.
This dataset can be the result of time-course experiments, steady-state experiments, or both.
Need of Parameter Estimation
The task of mathematical model building is incomplete until the unknown parameters of the model are estimated and validated on an actual software failure data set.
Also, to compare the performance of the model with other existing models and draw valid conclusions, we should have knowledge of all the unknown parameters.
The estimation of unknown population parameters can be done in two ways:
METHODS OF ESTIMATION
Point Estimation
Interval Estimation
Interval Estimation
A point estimator may not coincide with the actual value of the parameter.
In this situation it is preferable to determine an interval of possible or probable values of the unknown population parameter.
This is called confidence interval estimation, of the form [α, β], where α is the lower bound and β is the upper bound on the parameter value.
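As an illustration of an interval estimate [α, β], here is a sketch of a normal-approximation confidence interval for a population mean (the helper name, the sample data, and the 95% z-value are our choices, not from the slides):

```python
import math

def mean_confidence_interval(sample, z=1.96):
    """Approximate confidence interval [alpha, beta] for the population
    mean, using the normal approximation (z = 1.96 gives ~95% coverage)."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
    half = z * math.sqrt(var / n)
    return mean - half, mean + half

lo, hi = mean_confidence_interval([12, 15, 11, 14, 13, 16, 12, 15])
# [lo, hi] is an interval estimate that covers the unknown population
# mean with roughly 95% confidence.
```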
Point Estimation
It deals with the use of sample data to calculate a value for the unknown parameters of the model, which can be called a "best guess".
"Best guess" here means that the estimated value of the parameter satisfies the following properties:
Unbiasedness
Consistency
Efficiency
Sufficiency
Methods of obtaining point estimators
Method of Moments
Method of minimum chi-squares
Method of least squares
Method of maximum likelihood

Since most of the NHPP-based SRGMs are non-linear functions, the Method of Maximum Likelihood Estimation (MLE) is a widely used estimation technique, along with the Method of Least Squares.
Parameter Estimation Techniques
The success of the mathematical modeling approach to reliability evaluation depends heavily upon the quality of the failure data collected. The parameters of the SRGM are estimated based upon these data. Hence efforts should be made to make the data collection more explicit and scientific. Usually data is collected in one of the following two ways.

In the first case the times between successive failures are recorded. Though this type of data collection is more desirable, it may not be simple.

The other, easier and more commonly collected data type is known as grouped data. Here testing intervals are specified and the number of failures experienced during each such interval is noted.

For both these data types the method of least squares and the method of maximum likelihood have been suggested and are widely used for estimation of the parameters of SRGMs.
Method of Least Squares
Least squares estimation evaluates the set of parameters with the highest probability of being correct for a given set of experimental data.
In this method, to estimate parameters, the squared discrepancies between the observed data and their expected values are minimized.
That is, the best estimate is the one for which the "difference" between the data and the function fitting the data is minimum, where the difference is defined as the sum of the squared errors.
Method of Least Squares
If the expected value of the response variable is given by m̂(t) (which can be the mean value function of an SRGM), then the least squares estimators of the parameters of the model may be obtained from n pairs of sample values (t1, y1), (t2, y2), …, (tn, yn) by minimizing J, given by

J = Σᵢ₌₁ⁿ [yᵢ − m̂(tᵢ)]²

where tᵢ and yᵢ are the observed values of the explanatory and dependent variables respectively.
Non-Linear Least Squares Method for the G-O Model
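One way such a nonlinear least-squares fit can be carried out is sketched below (the function name and the coarse grid search are our illustrative choices, not the procedure prescribed in the slides). For the G-O mean value function m(t) = a(1 − e^(−bt)), the optimal a for a fixed b has a closed form, so only b needs a one-dimensional search:

```python
import math

def fit_go_least_squares(t, y, b_grid=None):
    """Least-squares fit of the Goel-Okumoto mean value function
    m(t) = a * (1 - exp(-b*t)) to cumulative fault counts y at times t.
    For fixed b the optimal a is closed-form, so we grid-search over b."""
    if b_grid is None:
        b_grid = [i / 1000.0 for i in range(1, 1001)]   # b in (0, 1]
    best = None
    for b in b_grid:
        u = [1.0 - math.exp(-b * ti) for ti in t]
        # Closed-form optimal a for this b: minimizes sum (y_i - a*u_i)^2
        a = sum(yi * ui for yi, ui in zip(y, u)) / sum(ui * ui for ui in u)
        sse = sum((yi - a * ui) ** 2 for yi, ui in zip(y, u))
        if best is None or sse < best[2]:
            best = (a, b, sse)
    return best   # (a_hat, b_hat, sum of squared errors J)
```

On data generated exactly from m(t) with a = 100 and b = 0.1, the search recovers those values with essentially zero residual.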
Advantages of Least Square
Method
It can be used to predict number of
detected faults in future rather than the
estimation of the quantitative reliability
function, in order to control the software
testing process.
General technique applicable in most
practical situations for small to medium
sample sizes.
 It has an advantage of saving both test-
time and resources as it is easy to use.
Drawbacks of Least Square
Method
Considered to have less desirable
optimality properties than maximum
likelihood.
Can be quite sensitive to the choice of
starting values.
May not generate accurate results for sufficiently large samples.
Method of Maximum Likelihood
It is the most popular and useful method, given by Prof. R. A. Fisher, for deriving point estimates.
The maximum likelihood technique consists of solving a set of simultaneous equations for parameter values. The equations define parameter values that maximize the likelihood function.
Maximum Likelihood Estimation (MLE) has been extensively adopted for the estimation of parameters of SRGMs based upon NHPPs.
Method of MLE
MLE for the G-O Model
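As a sketch of how the G-O likelihood equations can be solved numerically (assuming failure-time, i.e. non-grouped, data; the function name and bisection bracket are our choices): for failure times observed in (0, T], setting the partial derivatives of the log-likelihood to zero gives a = n / (1 − e^(−bT)) and a single equation in b, solved here by bisection.

```python
import math

def go_mle(times, T):
    """Maximum likelihood estimates (a_hat, b_hat) for the Goel-Okumoto
    model with intensity lambda(t) = a*b*exp(-b*t), given observed
    failure times in (0, T].  A sketch without safeguards for
    pathological data sets."""
    n, s = len(times), sum(times)

    def g(b):   # derivative of the log-likelihood with respect to b
        e = math.exp(-b * T)
        return n / b - s - n * T * e / (1.0 - e)

    lo, hi = 1e-8, 10.0        # bracket for b; g decreases across the root
    for _ in range(200):       # bisection on g(b) = 0
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    b = 0.5 * (lo + hi)
    a = n / (1.0 - math.exp(-b * T))
    return a, b
```

Note that â exceeds n: the estimated total fault content is larger than the number of faults already detected.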
Advantages of MLE
Maximum likelihood provides a consistent
approach to parameter estimation
problems. This means that MLEs can be
developed for a large variety of estimation
situations.

MLE has many optimal properties in


estimation i.e. sufficiency, consistency and
efficiency.
Drawbacks of MLE
With small numbers of failures, MLEs can be heavily biased and the large-sample optimality properties do not apply.
Sometimes the likelihood equations may be complicated and difficult to solve explicitly.
It doesn't give a basis for testing hypotheses or constructing models.
The MLE of the initial number of software faults is unreliable.
For instance, when n software faults are detected in testing, it has been shown that the MLE of the initial number of software faults tends to take a value close to n in many cases, even if the testing time is relatively short.
COMPARISON CRITERIA
Comparison Criteria
Once some model have been selected for
an application, their performance can be
judged by their ability
 to fit the observed data (Goodness of fit )
 to predict satisfactorily the future
behavior of the process (predictive
validity )
A set of comparison criteria is used to
compare models quantitatively.
Predictive Validity Criterion
Predictive validity is defined as the ability of the model to determine the future failure behavior from the present and past failure behavior of a process. This criterion was proposed by Musa, Iannino, and Okumoto.
Goodness of Fit
The term goodness of fit is used in two different contexts.
In one context, it answers the question of whether a sample of data came from a population with a specific distribution.
In another context, it answers the question "How well does a mathematical model (for example, a linear regression model) fit the data?"
Goodness of Fit
The literature refers to twelve common criteria of goodness of fit for SRGMs, including:
Bias
Mean Square Error (MSE)
Mean Absolute Error (MAE)
The Akaike Information Criterion (AIC)
Accuracy of Estimation (AE)
Predictive Ratio Risk (PRR)
Goodness of Fit Criteria
Case Study
The intense global competition in a dynamic environment has led to upgradations of software products in the market. Software developers are trying very hard to project themselves as organizations that provide better value to their customers.
One major way to increase market presence is by offering new functionalities in the software periodically.
To get a competitive edge it is critical to know the appropriate release of the software.
Here we have a data set for 4 releases (upgradations) of a software product. The company recorded time (in weeks) and faults observed each week for each release.
Now they wish to analyze which release is best when the observed data is fitted as per the G-O Model.
The data observed is listed in the following table.
Time (weeks)   Release 1   Release 2   Release 3   Release 4
     1             16          13           6           1
     2             24          18           9           3
     3             27          26          13           8
     4             33          34          20           9
     5             41          40          28          11
     6             49          48          40          16
     7             54          61          48          19
     8             58          75          54          25
     9             69          84          57          27
    10             75          89          59          29
    11             81          95          60          32
    12             86         100          61          32
    13             90         104          --          36
    14             93         110          --          38
    15             96         112          --          39
    16             98         114          --          39
    17             99         117          --          41
    18            100         118          --          42
    19            100         120          --          42
    20            100          --          --          --
 Total            100         120          61          42

Faults observed are listed as cumulative data here.
On estimating parameters a and b for the 4 releases using SPSS, we have the following results:

            Release 1       Release 2       Release 3       Release 4
            Est.    Obs.    Est.    Obs.    Est.    Obs.    Est.    Obs.
a           103.2   100     118.25  120     59      62      48.9    42
b           0.104   --      0.184   --      0.252   --      0.029   --

Goodness of Fit Curve for Release 1
Goodness of Fit Curve for Release 2
Goodness of Fit Curve for Release 3
Goodness of Fit Curve for Release 4
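The fitted curves can be reproduced in outline. The sketch below evaluates the G-O mean value function for Release 1 using the reported SPSS estimates (a = 103.2, b = 0.104), assuming the Release 1 column and those estimates were extracted correctly from the slides:

```python
import math

# Release 1: cumulative faults observed over 20 weeks (from the case study).
weeks = list(range(1, 21))
faults = [16, 24, 27, 33, 41, 49, 54, 58, 69, 75,
          81, 86, 90, 93, 96, 98, 99, 100, 100, 100]

# Reported SPSS estimates for Release 1.
a, b = 103.2, 0.104

# Fitted G-O mean value function m(t) = a * (1 - exp(-b*t)) at each week.
fitted = [a * (1.0 - math.exp(-b * t)) for t in weeks]

# Mean squared deviation of the fitted curve from the observed counts.
mse = sum((f - y) ** 2 for f, y in zip(fitted, faults)) / len(weeks)
print(mse)
```

The same computation, repeated per release, is what underlies the goodness-of-fit curves and the comparison criterion table that follow.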
Comparison Criterion Table
Conclusions
