JoeSipperDemystifyingWeibull PDF
Joe Sipper
9/13/2016
b) Scope
Minitab Insights This document does not contain technology or Technical Data controlled under either the
Slide 3
9/13/2016 U.S. International Traffic in Arms Regulations or U.S. Export Administration Regulations
1a1. The Big Picture: Today’s Presentation
• There are lots of great Weibull applications to predict system life,
reliability, success rates, preventive maintenance intervals,
warranty structures, repair strategies, etc
• The same source data can be used for everything we’ll discuss;
however, it must be arranged in a format appropriate for each
assessment
• Today’s outputs can feed into some Weibull applications but that’s
for another day
• Today is about structuring failure data, analyzing it in Minitab, and
making first-tier conclusions about where in the bathtub a
performance falls
1a2. The Big Picture: Characterize Performances
• Understanding the problem is often the hardest part of the
problem; getting and arranging good data is much easier for a
problem that is well understood than for one that is not
1a3. The Big Picture: Use Weibull for Systems?
• A lot depends on the question, i.e., the needs of the assessment
• Some believe that Weibull can be used only for an individual failure
mode, and multiple failure modes offset performance characteristics
and muddy the analysis
• Others believe a system is simply a series of components
• System fails when one (random) component fails
• System time-to-failure = smallest of the component failure times
• This is equivalent to Weibull distribution
• Underlying premise to, but not explored in, today’s presentation
• Ideally, run a separate Weibull assessment for each failure mode;
however, the method given here addresses the needs of the assessment
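The series-system view above can be checked numerically. The sketch below is illustrative only: it assumes independent components that all share one Weibull shape and scale (the values are made up), in which case the smallest component failure time is again Weibull with the same shape and a reduced scale.

```python
import math

def weibull_survival(t, shape, scale):
    """R(t) = exp(-(t/scale)**shape) for a 2-parameter Weibull."""
    return math.exp(-((t / scale) ** shape))

def series_system_survival(t, components):
    """A series system survives to t only if every component does.
    Assuming independence, R_sys(t) is the product of the component R_i(t)."""
    r = 1.0
    for shape, scale in components:
        r *= weibull_survival(t, shape, scale)
    return r

def equivalent_scale(n, shape, scale):
    """Scale of the system-level Weibull for n identical, independent components."""
    return scale * n ** (-1.0 / shape)

# Hypothetical values, not taken from the presentation
n, shape, scale = 5, 1.5, 1000.0
components = [(shape, scale)] * n
t = 300.0
r_system = series_system_survival(t, components)
r_equiv = weibull_survival(t, shape, equivalent_scale(n, shape, scale))
print(r_system, r_equiv)  # the two values agree up to rounding
```

With mixed shapes across components (i.e., mixed failure modes) the product is generally no longer a single Weibull, which is one way to read the caution about multiple failure modes.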
1b1. Distribution Plotting for this Presentation
• Scope focuses on expectations of performance when the system or
element is repaired, and the corresponding data structures
System type                                 Typical characterization                   Data analyzed
Non-Repairable (System or Element)          Single failure mode based on sample data   Element fails vs. element usage
Repairable (System or Element)              Individual system based on field data      System failures vs. system usage
Repairable (System or Element),             Individual system based on field data      System failures vs. system usage
  analyze patterns b/w failures
Repairable (Fleet)                          Fleet based on field data                  Fleet failures vs. fleet usage
1b2. Repair Type and Data Structures
Timelines shown on this slide are in terms of each system studied
Replace
• Brand new hardware restarts the clock
• Data structure: ordered usage from T0 = 0
“Like New” Repair
• Expectation is that the next failure is random (independent) because
history is reset, i.e., the repair returned a system that was as good
as new
• Data structure: ordered usage from T0 = 0
“Minimal” Repair
• Next failure depends on current usage, i.e., the system’s age prior
to the repair
• System Reliability is the same after the repair as it was before the
repair
• Data structure: superimposed
Note the differences in cumulative timelines:
[Figure: timelines for each repair type. For Replace and “Like New” Repair, each interval t1, t2, t3, t4, ..., tn is measured from T0 = 0; for “Minimal” Repair, failures at T1, T2, T3, T4, ..., Tn fall on one timeline that starts at T0 = 0]
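The two data structures above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the system names and failure ages are hypothetical, "ordered usage from T0 = 0" is read as the usage between consecutive failures with the clock reset at each repair, and "superimposed" is read as keeping every failure at the system's cumulative age.

```python
def like_new_intervals(failure_ages):
    """Replace / "Like New" repair: usage accumulated between consecutive
    failures, each measured from a clock reset to T0 = 0."""
    previous = 0.0
    intervals = []
    for age in sorted(failure_ages):
        intervals.append(age - previous)
        previous = age
    return intervals

def superimpose(systems):
    """"Minimal" repair: keep each failure at the system's cumulative age
    and merge all systems onto one common usage axis."""
    return sorted(age for ages in systems.values() for age in ages)

# Hypothetical cumulative ages at failure for two systems
systems = {"S1": [120.0, 450.0, 500.0], "S2": [300.0, 310.0]}

ordered_usage = sorted(
    interval for ages in systems.values() for interval in like_new_intervals(ages)
)
superimposed = superimpose(systems)
print(ordered_usage)
print(superimposed)
```

The same source records feed both structures; only the arrangement changes.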
1b3a. Fleet vs System
• Fleet failures and fleet time can stay random over time
because of the mix of old and new systems in the fleet
• Failures for each individual system are tracked and composed
additively on a cumulative timeline
• Cumulative timeline combines failures from different systems
onto a single timeline
1b3b. Fleet vs Repairable Systems
Repairable Systems – Systems age in patterns; characteristics are evident
when viewing clustering of superimposed failures
NOTE: In repairable systems, failures for non-repairable elements are
s.i.i.d.; however, failures for the system or element are dependent
[Figure: failure timelines for System 1, System 2, System 3, ..., System n, shown superimposed and on a cumulative timeline]
1c. Overview: Analytical Flow
[Flowchart; the recoverable content is listed below]
• Data structure: censored data?
  • Yes: analyze as censored data
  • No: analyze as complete data
• Repair structure?
  • Replaceable = New = Non-repairable (MTTF): analyze as MTTF
  • Repairable (MTBF): depends on the expectations of repair
    • New or Like New: identical conditions, ordered usage from T=0
    • Minimal: superimpose each system
    • Fleet: cumulative timeline
• (Test) and (Data) type censoring / suspensions considerations
  • Right Censored – lower bound known, upper bound not fully known for all data
  • Left Censored – only have the upper bound, not sure of exact time of failure
  • Interval Censored – all fail, have ideas of times but not exact measures
  • (Test) Type I: Time Censoring – test time fixed
  • (Test) Type II: Failure Censoring – number failures fixed
2. Data Censoring and Test Type
a) Complete data
b) Left censored data also called Interval data
c) Right censored data also called Suspended data
d) Singly censored data
e) Multiply censored data
f) Interval, or grouped, data
g) Test Type I also called Type I Censoring
h) Test Type II also called Failure Censoring
2a. Complete data
• All units run to failure, have exact usage times for each
• Another way to say this is that each unit has run through
its entire lifetime and is characterized exactly
• ... lab testing
• Fully accessible field data with high ...
[Figure: one timeline per sample (Sample 2 of n, Sample 3 of n, Sample 4 of n, Sample 5 of n, ...), each ending in "Fail"]
2b1. Left Censored data
• Check performance at some time, t
• Examples include:
• Testing or field data where data ...
[Figure: Left Censored Data, one timeline per sample (Sample 1 of n, Sample 4 of n, ..., Sample n of n), each marked "Fail"]
2b2. Interval Censored data
• Different from left-censored in that all intervals do not
start with zero usage
• Check performance at some time, t
• Observe the units have failed, do not know exactly when
• Examples include:
• Testing or field data where data is not always accessible
[Figure: Interval Censored Data, one timeline per sample (Sample 1 of n, ..., Sample n of n), each marked "[ Fail in interval ]"; x-axis: Usage (Time, Cycles, Distance, etc.)]
2c. Right Censored (or Suspended) data
• Important to include usage of survivors in timelines
...
• Examples include:
• Typical Type I or Type II testing in 2g and 2h
2e. Multiply (muhl-tuh-plee) Censored data
• Different run times and different number of interventions
for each system
2g. Test Type I (or Type I Censoring)
• Test n samples, terminate after some predetermined usage
regardless of the number of failures
• Some samples may, and likely do, survive the test (test conditions
may be too harsh if all fail) and are coded as suspensions
• Test time is fixed, number of failures is variable
• Record all usage – impacts cumulative percent fail
• Examples include:
• Reliability growth testing – accelerations impose 10 years of accelerated life
on samples in 600 hours, then terminate test
• Able to obtain usage for each element or system in the population, cross-
reference with those systems or elements that have failed at the time of
the data collection
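A minimal sketch of how Type I results could be coded as failures and suspensions before analysis. The 600-hour limit comes from the example above; the lifetimes, the function name, and the "F"/"S" codes are hypothetical and chosen only for illustration.

```python
def type1_censor(lifetimes, test_limit):
    """Code a time-terminated (Type I) test.

    Each sample is recorded as (usage, status): status "F" for a failure
    observed at or before the limit, "S" for a survivor suspended at the limit.
    """
    records = []
    for t in lifetimes:
        if t <= test_limit:
            records.append((t, "F"))
        else:
            # Survivor: its true lifetime is unknown, only usage >= test_limit
            records.append((test_limit, "S"))
    return records

# Hypothetical lifetimes in hours; in a real test the values beyond the
# limit would never be observed
type1_records = type1_censor([150, 420, 610, 900, 580], test_limit=600)
n_type1_failures = sum(1 for _, status in type1_records if status == "F")
print(type1_records, n_type1_failures)
```

The test time is fixed by `test_limit`, while the number of failures depends on the data.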
2h. Test Type II (or Failure Censoring)
• Test n samples, terminate after some predetermined number of
failures regardless of the usage
• Apply when time is of the essence, guaranteed enough data to
carry out assessments with statistical significance
• Number failures is fixed, code survivors as suspensions
• Record all usage – impacts cumulative percent fail
• Examples include:
• Reliability growth testing – run test until value-added information is
available for, say, n = 30 failures
• Able to obtain usage for each system in the population, keep selecting
random systems from field population, viewing pass/fail status, until say n =
30 failures are identified
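A comparable sketch for Type II testing, assuming all samples start together so that survivors are suspended at the usage of the last counted failure. The lifetimes, the function name, and the "F"/"S" codes are hypothetical.

```python
def type2_censor(lifetimes, n_failures):
    """Code a failure-terminated (Type II) test.

    The test stops at the n_failures-th failure; every remaining sample is
    recorded as a suspension at that stopping usage.
    """
    ordered = sorted(lifetimes)
    if not 1 <= n_failures <= len(ordered):
        raise ValueError("n_failures must be between 1 and the sample count")
    stop_usage = ordered[n_failures - 1]
    records = [(t, "F") for t in ordered[:n_failures]]
    records += [(stop_usage, "S") for _ in ordered[n_failures:]]
    return records

# Hypothetical lifetimes; the test is stopped at the third failure
type2_records = type2_censor([150, 420, 610, 900, 580], n_failures=3)
print(type2_records)
```

Here the number of failures is fixed in advance and the total test usage is whatever it takes to reach it.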
2i. Thought on Censoring (also called Suspensions)
• Implementation also depends on the question asked
b) Repairable system
i. Repair to “like-new” condition “reset” system history
c) Repairable fleet
i. Treat individual systems as a single population
3a. Non-Repairable System or Element
• Must be replaced; impractical or unsafe to return to
reusable condition
• Hazardous materials
• Disposables
• Inexpensive products
• Examples include:
• Most personal services, e.g., car repair, dentist, etc
4a1. Shape, Scale, and Threshold Parameters
• Shape parameter (used in 2- and 3-parameter Weibull)
• In simple words: an influential value, defined by the underlying
data, that drives the look (shape) of the Weibull plot
• More technically: Measures the rate of change in the unreliability
function over the usage, where both the unreliability and usage
are linearized using logarithms.
• Use median ranks to estimate unreliability for each failure
• Median Rank (Benard’s approximation):
$$\text{Median Rank}_i = \frac{\text{rank in the failure order}_i - 0.3}{\text{number of failures} + 0.4}$$
• Linearized coordinates:
$$x_i = \ln(\text{time to fail}_i), \qquad y_i = \ln\!\left(\ln\frac{1}{1 - \text{median rank}_i}\right)$$
• Build a timeline valid for the application, either pull failures to T=0,
superimpose failures on a common timeline, or system
performances as appropriate
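The median-rank and linearization steps above can be sketched for complete data as follows. This is an illustration, not Minitab's algorithm: it fits an ordinary least-squares line of y on x, the function names are hypothetical, and the failure times are made up. The slope of the line estimates the shape parameter, and the scale follows from the intercept.

```python
import math

def median_ranks(n):
    """Median rank approximation: (rank - 0.3) / (n + 0.4) for rank = 1..n."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]

def weibull_rank_regression(failure_times):
    """Estimate (shape, scale) from complete data by fitting
    y = shape * x - shape * ln(scale), with x = ln(t) and
    y = ln(ln(1 / (1 - median rank)))."""
    times = sorted(failure_times)
    n = len(times)
    xs = [math.log(t) for t in times]
    ys = [math.log(math.log(1.0 / (1.0 - mr))) for mr in median_ranks(n)]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    shape = sxy / sxx                      # slope of the fitted line
    intercept = y_bar - shape * x_bar
    scale = math.exp(-intercept / shape)   # intercept = -shape * ln(scale)
    return shape, scale

# Hypothetical complete data (every unit ran to failure)
example_times = [105.0, 230.0, 310.0, 480.0, 720.0]
est_shape, est_scale = weibull_rank_regression(example_times)
print(est_shape, est_scale)
```

Suspensions change the rank assigned to each failure, so this sketch should not be applied to censored data as written.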
4b0b2. Prepare as much Relevant Data as Practical
Sample of fictitious
data for illustration only
4b0b3. Prepare as much Relevant Data as Practical
Excel formulas to drive fields added to pivot shown on prior slide
• Good Record? (cell N8 and drag through column N)
=IF(AND(A8<>"",A8<>"(blank)",B8<>"",B8<>"(blank)",C8<>"",C8<>"(blank)",E8="Yes",F8<>"",F8<>"(blank)",J8="Closed"),"yes","no")
4b1a. Prepare as much Relevant Data as Practical
Example 4b1, System Repair State: New and Like New
4b1b. Prepare as much Relevant Data as Practical
Excel formulas for complete data for “New” or “Like New”
• Bin Grouped by 100_1 (cell Q8 and drag through column Q)
=IF(S8<=0,"",IFERROR(IF(S8<>"",ROUNDDOWN(S8/100,0)+1,""),""))
• Cum Pct Bin Size 100_1 (cell W9 and drag through column W)
Since this is cumulative, initialize W8 with a different formula
W8: =IF(T8<>"",IFERROR(V8/SUM(V:V),""),"")
W9: =IF(T9<>"",IFERROR(V9/SUM(V:V)+W8,W8),"")
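For readers working outside Excel, the binning and cumulative-percent logic above can be sketched in Python. This is an assumed equivalent, not a translation of the full workbook: the function names and sample usages are hypothetical, and the occurrence count per bin is taken as a simple tally of records in each bin.

```python
def bin_index(usage, bin_size=100):
    """Mirror ROUNDDOWN(usage / bin_size, 0) + 1; blank or non-positive
    usage gets no bin (None)."""
    if usage is None or usage <= 0:
        return None
    return int(usage // bin_size) + 1

def occurrences_and_cum_pct(usages, bin_size=100):
    """Return (bin order, occurrences, cumulative fraction) for bins 1..max."""
    bins = [b for b in (bin_index(u, bin_size) for u in usages) if b is not None]
    if not bins:
        return []
    total = len(bins)
    rows = []
    running = 0
    for order in range(1, max(bins) + 1):
        count = bins.count(order)
        running += count
        rows.append((order, count, running / total))
    return rows

# Hypothetical failure-free usages; the negative value is dropped
bin_rows = occurrences_and_cum_pct([35, 99, 100, 250, 260, -5])
for row in bin_rows:
    print(row)
```

The cumulative column always ends at 1.0 (100%) because every binned record is counted exactly once.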
4b2b. Prepare as much Relevant Data as Practical
Excel formulas for “Minimal Repair”
• Bin Grouped by 100_2 (cell X8 and drag through column X)
=IF(Z8<=0,"",IFERROR(IF(Z8<>"",ROUNDDOWN(Z8/100,0)+1,""),""))
• Order Bin Size 100_2 (cell AA8 and drag through column AA)
=IF(ROW()-7<=MAX(X:X),ROW()-7,"")
• Occurrences Bin Size 100_2 (cell AC8 and drag through column AC)
=IF(AA8<>"",COUNTIF(X:X,"="&AA8),"")
• Cum Pct Bin Size 100_2 (cell AD9 and drag through column AD)
Since this is cumulative, initialize AD8 with a different formula
AD8: =IF(AA8<>"",IFERROR(AC8/SUM(AC:AC),""),"")
AD9: =IF(AA9<>"",IFERROR(AC9/SUM(AC:AC)+AD8,AD8),"")
• Order Bin Size 1000_3 (cell AH8 and drag through column AH)
=IF(ROW()-7<=MAX(AE:AE),ROW()-7,"")
• Occurrences Bin Size 1000_3 (cell AJ8 and drag through column AJ)
=IF(AH8<>"",COUNTIF(AE:AE,"="&AH8),"")
• Cum Pct Bin Size 1000_3 (cell AK9 and drag through column AK)
Since this is cumulative, initialize AK8 with a different formula
AK8: =IF(AH8<>"",IFERROR(AJ8/SUM(AJ:AJ),""),"")
AK9: =IF(AH9<>"",IFERROR(AJ9/SUM(AJ:AJ)+AK8,AK8),"")
• For the purposes of the plotting here, each system will eventually fail (either
permanently or will be repaired) and that information will be added to future
assessments
5a. Weibull Interpretations and the Bathtub Curve
5b. Weibull Interpretations and the Bathtub Curve
• The bathtub curve is a compilation of 3 distributions; it is not
continuous
• Weibull can indeed be made to fit all 3 distributions of the
bathtub curve as three separate plots
• Shape parameter governs how the failure rate changes over time:
below 1 the rate falls, at 1 it is constant, above 1 it rises
• Weibull characterization arranges data, plots a representation of
each failure, and determines the rate of change of the failure rate
over usage along with the spread of failures via characteristic life
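The link between the shape parameter and the bathtub regions can be illustrated with the standard 2-parameter Weibull hazard (failure rate) function, h(t) = (shape/scale) * (t/scale)^(shape - 1). The sketch below uses hypothetical function names; the shape and scale pairs are the ones reported later in this presentation. How close to 1 counts as "approximately 1" (steady state) remains a judgment call that the code does not make.

```python
def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate of a 2-parameter Weibull at usage t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

def hazard_trend(shape):
    """Direction of the failure rate over usage implied by the shape."""
    if shape < 1.0:
        return "decreasing"   # infant mortality side of the bathtub
    if shape > 1.0:
        return "increasing"   # toward wearout
    return "constant"         # steady state

# (shape, scale) pairs as reported on the later slides
for label, shape, scale in [
    ("New or Like New", 0.854, 974.3),
    ("Minimal Repair", 1.164, 3143.0),
    ("Wearout illustration", 4.33205, 11777.56298),
]:
    early = weibull_hazard(100.0, shape, scale)
    late = weibull_hazard(5000.0, shape, scale)
    print(label, hazard_trend(shape), early, late)
```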
6. Process for Weibull Plots
a) Start with the appropriate data
[Figure: Pareto-style chart of Occurrences Bin Size 100_1 (axis: Occurrences) and Cum Pct Bin Size 100_1 (axis: Percent) vs Grouped Cycles_1 - "New" or "Like New"]
Use Minitab Assistant > Graphical Analysis > Pareto Chart
Scatterplot of Occurrences Bin, Cum Pct Bin Size vs Grouped Cycles_1: ... > Secondary > Assign secondary scale for Cum_Pct
Surviving callout fragments: "... chart here look similar to the Weibull distribution plot in 7a of this presentation but they are different."; "... in the scatterplot make the ... first bar in the Pareto."
6b2. Produce Frequency Plots for Failure Times
Compress Occurrences into a Cumulative Timeline (Excel)
• Sort from smallest to largest
• Create a pair for each data point by adding an adjacent
column filled with zeros (value = x is now = x, 0)
[Figure: probability plots with 95% CI, panels include Weibull and Gamma; y-axis: Percent; x-axis: Failure Free Hours at each retu]
Goodness-of-fit values shown on the plots:
Distribution   AD       P-Value
Exponential    15.622   < 0.003
Weibull        1.514    < 0.010
Gamma          3.374    < 0.005
6d. Use Parameters to Create a Model for the Data
[Figure: Histogram of Failure Free Cycles_1 with Weibull fit; y-axis: Frequency; x-axis: Failure Free Cycles_1 - "New" or "Like New"]
Weibull fit: Shape 0.8540, Scale 974.3, N 1215
Graph > Histogram > Simple
Graph variables: ‘Failure Free Cycles_1’
Multiple Graphs > Multiple Variables > Overlaid on the same graph > OK
6e. Determine Times-to-Failure based on the Model
“New” or “Like New”
Explanation:
STAT > Reliability/Survival > Distribution Analysis (Right Censoring) > Distribution ID Plot
Variables: ‘Failure Free Cycles_1’ Specify Weibull
There are options available to draw further distinctions on censoring that are beyond today’s scope.
6f. Final Validity Check of Distribution Selected
[Figure: Probability Plot for Failure Free Cycles_1, Weibull, LSXY Estimates-Complete Data; Correlation Coefficient: Weibull 0.995; y-axis: Percent; x-axis: Failure Free Cycles_1 - "New" or "Like New"]
Explanation:
For this particular set of data, the Weibull model is quite solid
between 50-5,000 cycles but is a little shaky outside of this range.
We need to pay close attention to processes impacting early life
failures, e.g.,
• Testing
• Burn-in
• Shipping
STAT > Reliability/Survival > Distribution Analysis (Right Censoring) > Distribution ID Plot
7a. Using Weibull Plots to Gain Performance Insight
Per the selected model, e.g., 50% fail at ≈ 634.3 cycles
“New” or “Like New” (same source data, manipulated as appropriate, used for 7a – 7d)
[Figure: Distribution Plot, Weibull, Shape=0.854, Scale=974.3, Thresh=0; y-axis: Probability Density; x-axis: Cycles to Failure_1 - "New" or "Like New"; shaded right-tail area 0.5 beginning at 634.3]
Graph > Probability Distribution Plots > View Probability
Distribution: Weibull
‘Shape’ and ‘Scale’ parameters from “Distribution ID Plot”
Shaded Area: Right Tail with Probability = 0.5
Shape parameter < 1 implies performance is in Infant Mortality
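The 50% point quoted on this slide can be reproduced from the fitted parameters with the Weibull quantile (inverse CDF) formula, t_p = threshold + scale * (-ln(1 - p))^(1/shape). The sketch below is a cross-check only; the function name is hypothetical and the parameters are the ones stated on the slide.

```python
import math

def weibull_quantile(p, shape, scale, threshold=0.0):
    """Usage by which a fraction p of units is expected to have failed."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return threshold + scale * (-math.log(1.0 - p)) ** (1.0 / shape)

# "New" or "Like New" model from the slide: Shape=0.854, Scale=974.3, Thresh=0
median_new = weibull_quantile(0.5, shape=0.854, scale=974.3)
print(round(median_new, 1))  # the slide reports approximately 634.3 cycles
```

Other percentiles (for example p = 0.10 for a B10 life) follow from the same function.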
7b. Using Weibull Plots to Gain Performance Insight
Per the selected model, e.g., ≈ 13.3% fail by 100 cycles
“New” or “Like New” (same source data, manipulated as appropriate, used for 7a – 7d)
[Figure: Distribution Plot, Weibull, Shape=0.854, Scale=974.3, Thresh=0; y-axis: Probability Density; x-axis: Cycles to Failure_1 - "New" or "Like New"; shaded left-tail area 0.1333 up to 100]
Graph > Probability Distribution Plots > View Probability
Distribution: Weibull
‘Shape’ and ‘Scale’ parameters from “Distribution ID Plot”
Shaded Area: X Value, Left Tail, Value: 100
Shape parameter < 1 implies performance is in Infant Mortality
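The left-tail value on this slide is the Weibull cumulative distribution function evaluated at 100 cycles, F(t) = 1 - exp(-((t - threshold)/scale)^shape). A minimal cross-check, with a hypothetical function name and the parameters stated on the slide:

```python
import math

def weibull_cdf(t, shape, scale, threshold=0.0):
    """Fraction of units expected to have failed by usage t."""
    if t <= threshold:
        return 0.0
    return 1.0 - math.exp(-(((t - threshold) / scale) ** shape))

# "New" or "Like New" model from the slide: Shape=0.854, Scale=974.3, Thresh=0
fraction_failed_by_100 = weibull_cdf(100.0, shape=0.854, scale=974.3)
print(round(fraction_failed_by_100, 4))  # the slide reports 0.1333
```

The complement, 1 - F(t), gives the fraction expected to survive past t, which is the quantity read from a right-tail shaded area.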
7c. Using Weibull Plots to Gain Performance Insight
Per the selected model, e.g., 50% fail at ≈ 2300 cycles
Minimal Repair (same source data, manipulated as appropriate, used for 7a – 7d)
[Figure: Distribution Plot, Weibull, Shape=1.164, Scale=3143, Thresh=0; y-axis: Probability Density]
Shape parameter ≈ 1 implies performance is in Steady-State
(Data from “New” or “Like New” ...
[Figure residue: charts of Occurrences_Bin_Size_100_SS with Cum Pct_Bin_Size_100_SS, and of Occurrences_Bin_Size_100_WO with Cum Pct_Bin_Size_100_WO; a shaded-area value of 0.02217 survives without its context]
• Shape parameter = 3.4-3.6 closest to normal distribution
Example of Wearout
Results for illustration of Wearout: data not reviewed in this presentation
[Figure: distribution plot; y-axis: Probability Density; x-axis: Wearout Cycles to Failure; shaded right-tail area 0.6113 beginning at 10000]
>61% fail after 10,000 hrs
Model: Weibull
• Shape 4.33205
• Scale 11777.56298