Rejection of Data: Rule of The Huge Error
Outliers
Rule of the huge error
Dixon test (Q test)
Grubbs test
Rejection of data
Sometimes we suspect that a data point is bad (an outlier). We can't just throw it out; there must be a basis for rejecting data.
Outliers
Outliers are values that do not belong to a population. Rejection can be based on knowing that the value is truly different, or on demonstrating that it falls outside a specified probability. When rejecting data from replicate measurements, you need to use an established statistical method.
M = |suspect - mean| / s
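The ratio M (the distance of the suspect value from the mean, in units of the standard deviation s) is simple to compute. The following is a minimal sketch, not a definitive implementation: the function name and the data are made up for illustration, and it assumes the mean and s are calculated from all replicates, suspect included, which the slide does not specify.

```python
import statistics

def huge_error_ratio(values, suspect):
    """Return M = |suspect - mean| / s.

    Assumption: mean and s are computed from all values,
    including the suspect point.
    """
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return abs(suspect - mean) / s

# Made-up replicate measurements; 12.5 is the suspect value.
data = [10.1, 10.2, 10.0, 10.3, 12.5]
m = huge_error_ratio(data, suspect=12.5)
print(f"M = {m:.2f}")
```

If the mean and s are meant to exclude the suspect point, pass the remaining values as `values` instead.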
Dixon test
Assumes: mean and standard deviation are unknown; data are normally distributed.
Steps
1. Rank the data: x1 < x2 < ... < xn
2. Choose a confidence level.
3. Calculate the ratio (the formula depends on n).
4. Look up the table value.
5. If ratio > table value, reject the point.
Also called the Q test.
The ratio used depends on the number of data points and on whether you are evaluating the highest or the lowest value.

# of points   Ratio   Test low value               Test high value
3 - 7         r10     (x2 - x1) / (xn - x1)        (xn - x(n-1)) / (xn - x1)
8 - 10        r11     (x2 - x1) / (x(n-1) - x1)    (xn - x(n-1)) / (xn - x2)
11 - 13       r21     (x3 - x1) / (x(n-1) - x1)    (xn - x(n-2)) / (xn - x2)
14 - 25       r22     (x3 - x1) / (x(n-2) - x1)    (xn - x(n-2)) / (xn - x3)
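The choice of ratio can be sketched in code. This is an illustrative sketch only: the function name `dixon_ratio` and the `test_high` flag are hypothetical, and the formulas follow the usual Dixon ratios for 3-7, 8-10, 11-13 and 14-25 points, with x1 the smallest and xn the largest value after ranking.

```python
def dixon_ratio(values, test_high=True):
    """Return the Dixon ratio for the highest (or lowest) ranked value."""
    x = sorted(values)  # rank the data: x[0] is x1, x[-1] is xn
    n = len(x)
    if not 3 <= n <= 25:
        raise ValueError("the ratio table covers 3 to 25 points")

    # i: how far the numerator reaches in from the suspect end
    # j: how many points are skipped at the opposite end
    if n <= 7:
        i, j = 1, 0
    elif n <= 10:
        i, j = 1, 1
    elif n <= 13:
        i, j = 2, 1
    else:
        i, j = 2, 2

    if test_high:
        # (xn - x(n-i)) / (xn - x(1+j))
        return (x[-1] - x[-1 - i]) / (x[-1] - x[j])
    # (x(1+i) - x1) / (x(n-j) - x1)
    return (x[i] - x[0]) / (x[-1 - j] - x[0])

# Five points, so the 3-7 ratio applies: (10 - 4) / (10 - 1)
print(dixon_ratio([1, 2, 3, 4, 10]))
```

The sketch does not guard against all values being identical, in which case the denominator is zero.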
Risk of false rejection
n     0.5%    1%      5%      10%
3     .994    .988    .941    .886
4     .926    .889    .765    .679
5     .821    .780    .642    .557
6     .740    .698    .560    .482
7     .680    .637    .507    .434
8     .725    .683
9     .677    .635
10    .639
11    .713    .679
12    .675    .642
13    .649    .615
14    .674
15    .647
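Putting the steps together for small data sets, the following sketch looks up the critical value and makes the reject decision. It is a sketch under stated assumptions: the names `Q_CRIT` and `q_test` are hypothetical, the data are made up, and only the n = 3 to 7 rows of the table are used, since those are the rows with all four risk levels present.

```python
# Critical values for n = 3..7, keyed by risk of false rejection.
Q_CRIT = {
    0.005: {3: 0.994, 4: 0.926, 5: 0.821, 6: 0.740, 7: 0.680},
    0.01:  {3: 0.988, 4: 0.889, 5: 0.780, 6: 0.698, 7: 0.637},
    0.05:  {3: 0.941, 4: 0.765, 5: 0.642, 6: 0.560, 7: 0.507},
    0.10:  {3: 0.886, 4: 0.679, 5: 0.557, 6: 0.482, 7: 0.434},
}

def q_test(values, risk=0.05, test_high=True):
    """Return (ratio, reject) for the highest or lowest value, n = 3..7."""
    x = sorted(values)  # step 1: rank the data
    n = len(x)
    if risk not in Q_CRIT or n not in Q_CRIT[risk]:
        raise ValueError("no table value for this n and risk level")
    # Step 3: for 3-7 points the ratio is gap / range.
    gap = x[-1] - x[-2] if test_high else x[1] - x[0]
    ratio = gap / (x[-1] - x[0])
    # Steps 4 and 5: compare with the table value.
    return ratio, ratio > Q_CRIT[risk][n]

# Made-up replicate measurements; 12.5 is the suspect high value.
data = [10.1, 10.2, 10.0, 10.3, 12.5]
q, reject = q_test(data, risk=0.05)
print(f"Q = {q:.3f}, reject = {reject}")
```

Here the ratio is 2.2 / 2.5 = 0.88, which exceeds the 5% table value of .642 for n = 5, so the point would be rejected.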
Example
Grubbs test
This approach requires calculation of the mean and standard deviation.
1. Rank the points.
2. Pick the suspect point.
3. Calculate the mean and standard deviation (sx) using all points.
4. Calculate T = |mean - suspect| / sx
5. Look up T on the table.
6. If T > table value, the point can be rejected.
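The Grubbs calculation can be sketched as follows. The function names and the `table_value` parameter are hypothetical; no Grubbs table is given here, so the critical value must come from a published table for the chosen n and confidence level.

```python
import statistics

def grubbs_t(values, suspect):
    """Return T = |mean - suspect| / sx, using all points."""
    mean = statistics.mean(values)
    sx = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return abs(mean - suspect) / sx

def grubbs_reject(values, suspect, table_value):
    """True if T exceeds the table value supplied by the caller."""
    return grubbs_t(values, suspect) > table_value

# Made-up replicate measurements; 12.5 is the suspect value.
data = [10.1, 10.2, 10.0, 10.3, 12.5]
t = grubbs_t(data, suspect=12.5)
print(f"T = {t:.2f}")
```

Passing the table value in as an argument keeps the sketch independent of any particular table or confidence level.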
Example
Grubbs example