Determination of Precision and Bias of Methods of Committee D22
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.
D 3670
3.2.9 ruggedness test—a factorial test designed to explore the sensitivity of the method to variations in the procedure (see Youden and Steiner, 1975; footnote 6).
Footnote 6: Youden, W. J. and Steiner, G. H., Statistical Manual of the Association of Official Analytical Chemists, AOAC International, 481 North Frederick Ave., Suite 500, Gaithersburg, MD 20877-2417, 1975.
3.2.10 single-operator precision—a measure of the replication of repeated measurements obtained by a single operator on a given sample.
3.2.10.1 Discussion—Other classifications of precision which are useful in evaluating a method, a measurement, or performance within a single laboratory are: multioperator precision, single or multi-apparatus precision, and single or multi-day precision.
3.2.10.2 Discussion—The terms “repeatability” and “reproducibility” are not standardized, but have generally come to mean “single-laboratory-operator-material precision” and “multi-laboratory-multi-operator-single material precision,” respectively. Such usage is maintained in the text of this practice.
3.2.10.3 Discussion—Further classifications of bias which are useful in evaluating performance are: operator bias, apparatus bias, and day bias.
4. Summary of Guide
4.1 Data supporting a statement of single-operator repeatability are the entrance requirement for any candidate method to be considered for standardization by Committee D22. The task group to which a candidate method is assigned will review it for adequacy in this respect, and conduct further tests as necessary to evaluate its precision and bias, as technically feasible. A method may be accepted as a proposed method, provided the repeatability is known or has been ascertained and provided all other criteria for acceptance have been met. Independent tests by at least three laboratories shall be required to substantiate the repeatability of a method before it attains the status of a standard method. Collaborative testing by at least five laboratories to estimate the interlaboratory bias and, if applicable, to evaluate the method’s inherent bias with respect to the “true” value is needed for all standard methods and must be accomplished within 5 years of a method’s initial issuance as a standard, if such testing has not already been done. Failure to subject such methods to appropriate collaborative testing constitutes valid grounds for disallowing their reapproval as standards.
4.2 Procedures that may be used in collecting the required data are given with particular emphasis upon the applicability to analysis of atmospheres. Documentation requirements are established. Terms that are useful in expressing statements of precision and bias are presented.
5. Significance and Use
5.1 The objective of this standard is to provide guidelines to Committee D22 for the evaluation of the precision and bias, or both, of ASTM standard methods and practices at the time of their development. Such an evaluation is necessary to assure that a cross section of interested laboratories could perform the test and achieve satisfactory results, using the method as written. It also provides guidance to the user as to what levels of precision and accuracy may be expected in such usage.
5.2 The write-up of the method describes the media for which the test method is believed to be appropriate. The collaborative test corroborates the write-up within the limitations of the test design. A collaborative test can only use representative media so that universal applicability cannot be implied from the results.
5.3 The fundamental assumption of the collaborative test is that the media tested, the concentrations used, and the participating laboratories are representative and provide a fair evaluation of the scope and applicability of the test method as written.
6. General Policy
6.1 This section describes the general policy to be followed by Committee D22, its subcommittees, and task groups in the development of ASTM standard methods and practices. The objective of Committee D22 is to develop fully evaluated standard methods and practices as far as possible. In cases where this is not expedient, proposed methods, as defined in 6.2, may be developed. In each case, an appropriate task group shall have the responsibility to critically examine the method or practice, to conduct evaluation tests by round robins or other techniques including ruggedness tests, and to recommend it, if meritorious, for subcommittee balloting. No method or practice shall be released and recommended for balloting unless the precision or accuracy requirements, or both, as set forth in the following, have been satisfied.
6.1.1 Collaborative testing by D22 is the preferred method of validation. Data obtained by collaborative testing by others may be used in lieu of D22 testing, provided that such testing was equivalent to ASTM approved procedures. In either case, a copy of the test procedures and data must be filed in a research file maintained at ASTM for such purposes.
6.2 Proposed Method—A proposed method is one that has found favorable usage in a specific laboratory, or has been used by several laboratories, but has not yet been standardized. In each case, the test method is submitted by its proponents to Committee D22 for standardization.
6.2.1 The minimum requirement for balloting of a proposed method shall be the inclusion in it of a single laboratory’s statement of single-operator precision, together with supporting experimental data. Test methods meeting this requirement will be referred to a Task Group, following procedures established by Committee D22.
6.2.2 The experimental data needed to support a proposal must reflect a test of the method as a whole, that is, sampling, apparatus, reagents, and calibration, and must use a procedure that is essentially identical to that described in the proposal. Any significant deviations between the procedure used to gather the data and the proposed procedure shall be clearly identified.
6.2.3 If such data are missing or inadequate, but the method itself is considered by consensus of Committee D22 to be worthy of further study, a task group may be assigned to conduct experimental studies or enlist the services of at least
one competent laboratory to obtain the data upon which to base a statement of single-operator precision.
6.3 Standard Method–Initial Acceptance—A method that has found favorable acceptance and for which the within-laboratory repeatability has been verified by a multilaboratory test program shall be examined by the task group for compliance with the following requirements.
6.3.1 An initial minimum requirement for establishing a standard method is a statement of within-laboratory precision based on data from three laboratories similar to that described in 6.2.1-6.2.3.
6.3.2 If the method purports to measure the concentration of a substance, an investigation of the bias of the method by comparison with a standard must be made by at least one laboratory and the results included in an accuracy statement.
6.3.3 A standard can only be carried under the provisions of 6.3 for five years. Conditions for reapproval are specified in 6.4.
6.4 Standard Method–Reapproval—A standard method may be retained if it has found extensive use and between-laboratory precision data have been obtained. Before doing a collaborative study, a ruggedness test should be performed by at least one laboratory.
6.4.1 The minimum requirement for retaining a standard method shall be a statement of the between-laboratory precision of the method as established in a collaborative test including at least five participants.
6.4.2 If a bias statement is appropriate for the method, the data supporting the statement should be obtained by at least two laboratories. At least one such test shall include the introduction of potential interferences.
6.5 In all testing, the minimum number of participants should be exceeded to the extent possible. The statistical power of collaborative testing is greatly enhanced as such numbers are increased. The possibility of invalidation of a test by outliers or missing data is also minimized.
7. Sample Requirements
7.1 The precision and bias of test methods are typically evaluated by the data obtained in the measurement of test samples. The extent to which such measurements can be made is dependent upon the availability of test samples of adequate stability and homogeneity. The scope of interest of Committee D22 is wide, ranging from contaminants at the parts-per-billion level up to several percent. Particulate concentrations exist at similar concentration ranges and measurements of radioactivity extend the level even lower. The variety of substances of interest ranges from simple inorganic constituents to complex organic molecules. Accordingly, it is not possible to set forth rigid sample specifications, but only to delineate guidelines for test sample preparation. Each method should be tested with actual samples for which it is applicable, or as close a simulation as possible. The degree of evaluation will, of course, depend on the simulation achieved, and the statements of precision and accuracy must define the test conditions.
7.2 The ideal test sample is the actual atmosphere for which the method is intended. However, the use of such offers complications because the composition may not be known at the moment of test and furthermore may undergo change during the tests. Because actual atmospheric samples cannot be collected and stabilized for long periods of time, two procedures are acceptable. Reproducibility and repeatability may be evaluated by simultaneous measurement by participating laboratories sampling the same atmosphere at substantially the same time. Alternatively, comparison of a candidate method with a standard method of known precision and bias will constitute an acceptable technique for evaluation of precision and accuracy. Such measurements made by several laboratories may be statistically treated to evaluate the reproducibility of the candidate method. In this latter case, the measurements need not be made at the same place and time by the collaborating laboratories.
7.3 A test sample or series of test samples that are stable during the period required to perform a limited series of measurements are adequate for evaluation of single-operator precision to satisfy the requirements for consideration as a proposed method. Three levels of concentration are recommended, with such levels sufficiently well established to determine whether, and to what extent, the repeatability is dependent on or independent of concentration level.
7.4 A series of test samples of at least three concentration levels, and available in sufficient number, is required for use by collaborating laboratories to evaluate the repeatability and reproducibility of a candidate method. The samples should be stable during the entire test period, which should include a reasonable time following the collaborative test to permit resolution of any discrepancies encountered during the evaluation procedures. The compositions of the test samples do not need to be known accurately, but the samples furnished to each collaborator must be sufficiently similar to permit evaluation of measurement errors in excess of compositional inhomogeneity. The test samples for repeatability and reproducibility should closely simulate actual source or atmospheric air compositions, including the presence of any known interferents. The statistical statements must reflect the type of test sample for which the precision or bias, or both, are specified. The statement should include the concentration levels studied and the number of laboratories participating.
7.5 Accuracy tests to determine the inherent bias of an analytical method are preferably made under rigorously controlled laboratory conditions utilizing standards of known composition.
7.6 In the absence of samples of known composition, the use of the spiking technique in which standard additions of known constituents are made by established techniques will be acceptable for evaluating the bias of candidate methods. In such a case, the bias statement will consist of an accuracy of recovery of the spike.
8. Planning the Collaborative Test
8.1 Because of the technical diversity of test methods and practices within the scope of responsibility of Committee D22, it is not possible to establish a rigid protocol for collaborative testing. Accordingly, the responsibility for planning and conducting an adequate collaborative test is delegated to the corresponding task group. All aspects including initial planning, conducting the test program, and analyzing and interpreting the test results shall be consistent with the guidelines given
in Practice E 691. Specialized standards such as Practices D 2777 and E 180 may be useful in some cases.
8.2 The results of ruggedness testing should be incorporated into the method so as to properly inform participating laboratories about precautions needed to be taken.
8.3 A written protocol describing the proposed experimental design and statistical analysis shall be submitted to the Chairman of Committee D22 or his designee for approval, prior to collaborative testing. Wherever possible, analysis of variance or Youden pairs shall be utilized. This protocol shall describe how estimates of precision will be made, including single-operator precision, within-laboratory precision, and between-laboratory precision. Provisions for handling missing data and outliers shall be described.
NOTE 1—In order to provide estimates for within-laboratory precision, it will be necessary to have more than one operator per laboratory for at least two laboratories.
9. Test Data
9.1 Before publication of the standard, the statements of bias, or precision, or both, together with the raw data (including outliers) on which they were based shall be submitted by the task group to ASTM Headquarters. The supporting data and test results shall be placed in the Research Report file of ASTM. The method shall carry a footnote indicating where the supporting data can be found, as in the following example:
NOTE 2—Supporting data for the statements of precision and accuracy have been filed at ASTM Headquarters.
10. Conducting the Collaborative Test
10.1 The task group shall have the overall responsibility for developing methods, including the preparation of appropriate test samples. In consequence of the expertise of the task group in the measurement area involved, it will direct and coordinate all aspects of the collaborative test.
10.2 The task group shall verify as part of its review of a candidate method that it is ready for collaborative test before such an exercise is attempted. This ordinarily means that the candidate method meets the specification for a tentative method as given in 6.3, and that a ruggedness test to identify critical variables has been carried out. Only after a candidate method has been tried, proven, and reduced to unequivocal language should a determination of its bias, or precision, or both, be attempted.
10.3 The instructions for collaborative testing must require preliminary work by potential collaborators to familiarize them with the procedure, prior to the test measurements. This is necessary to ensure that the collaborative tests are made by peer groups and that a “learning experience” is not included in the statistics of the collaborative test. The task group may also develop procedures to qualify prospective collaborators, and this approach is strongly recommended.
10.4 The task group has the responsibility to review and statistically evaluate the test data and to prepare the statements of accuracy and precision on the basis of established statistical procedures.
10.5 Task groups will be assisted in the selection and recruitment of laboratories for collaborative testing of candidate methods through agencies established for this purpose by the Chairman of Committee D22 and the ASTM Staff.
11. Statement Format
11.1 The statement shall report the statistical values as those obtained as the result of a collaborative test of the method. The following disclaimer shall be added.
11.1.1 The results reported are believed to be typical and representative of what would be expected from future tests or use of the method. However, they cannot be extended to any future application for the same or other materials. Each laboratory using the method must validate its applicability to a specific application and must evaluate its own statistics based on its own use of the method.
11.2 Precision Statement—Statements of single-operator precision and between-laboratory precision shall be developed and included as appropriate. Such precision values shall be considered with respect to concentration levels, and the overall statement should reflect whether the values are independent of concentration level, or vary linearly or curvilinearly with concentration. Established statistical techniques will form the basis of such a determination. The statements should point out the number of laboratories, concentrations, replicates, and other significant aspects of the test.
11.3 Bias Statement—The statement of method bias will depend upon the method used for its evaluation. This will include the bias found on recoveries of known amounts of prepared standards or spikes as appropriate. The amount found in comparison with that obtained by a comparison method is another basis for expression of bias.
11.3.1 When bias is found as a result of collaborative testing, every reasonable effort should be made to identify and to state its source, whether due to laboratory bias (an artifact of the test exercise) or to method bias (inherent in the methodology). Research by a peer laboratory is often the best way to investigate method bias.
12. Applicability
12.1 This guide is mandatory for use by task groups of Committee D22 in the processing of test methods in any stage of approval or submitted for consideration after the date of its adoption.
12.2 This standard is applicable to all test methods already approved by Committee D22 for which statements of precision and bias do not exist. It will become mandatory for use in the reconsideration of all existing methods at the time of their periodic review.
13. Keywords
13.1 accuracy; bias; candidate method; collaborative test; laboratory bias; method bias; over-all precision; relative over-all uncertainty; ruggedness test; single-operator precision
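Illustrative sketch (not part of the guide): the ruggedness test defined in 3.2.9, and required in 6.4 and 10.2 before a collaborative study, can be pictured with a small calculation. The layout below is an assumption: a saturated two-level design for seven factors in eight runs, in which factors D to G are generated as products of A, B, and C. This is one common way to arrange such a test; the guide itself does not prescribe a design. The function names and the measurement values are hypothetical.

```python
from itertools import product


def ruggedness_design():
    """Return an 8-run, 7-factor two-level design.

    Each row is one run; +1 means the nominal setting of a factor and
    -1 the deliberately varied setting. Columns are factors A..G, with
    D = A*B, E = A*C, F = B*C and G = A*B*C (an assumed layout).
    """
    runs = []
    for a, b, c in product((1, -1), repeat=3):
        runs.append((a, b, c, a * b, a * c, b * c, a * b * c))
    return runs


def factor_effects(design, results):
    """Effect of each factor: mean result at +1 minus mean result at -1."""
    if len(design) != len(results):
        raise ValueError("one result is needed for every run")
    effects = []
    for j in range(len(design[0])):
        high = [y for row, y in zip(design, results) if row[j] == 1]
        low = [y for row, y in zip(design, results) if row[j] == -1]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects


if __name__ == "__main__":
    design = ruggedness_design()
    # Hypothetical results for the eight runs, in design order.
    results = [10.6, 10.4, 10.5, 10.5, 9.6, 9.4, 9.5, 9.5]
    for name, effect in zip("ABCDEFG", factor_effects(design, results)):
        print(f"factor {name}: effect = {effect:+.3f}")
```

A factor whose effect is large compared with the others is a candidate critical variable, which 8.2 asks to be written into the method as a precaution for participating laboratories.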
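Illustrative sketch (not part of the guide): 8.3 calls for analysis of variance wherever possible, and 11.2 calls for statements of single-operator and between-laboratory precision. The code below shows one way such estimates might be computed for a single material, in the spirit of Practice E 691. It assumes a balanced design (every laboratory reports the same number of replicates), and the function name and data are hypothetical. It does not cover outlier handling or missing data, which 8.3 requires the protocol to address.

```python
from math import sqrt
from statistics import mean, variance


def precision_estimates(lab_results):
    """Return (s_r, s_R) for one material from a balanced collaborative test.

    lab_results: list of lists, one list of replicate results per
    laboratory, all of the same length n >= 2.
    s_r is the repeatability standard deviation (within laboratory);
    s_R is the reproducibility standard deviation (between laboratories).
    """
    n = len(lab_results[0])
    if len(lab_results) < 2 or n < 2 or any(len(r) != n for r in lab_results):
        raise ValueError("need >= 2 laboratories with equal replicate counts >= 2")

    # Repeatability variance: average of the within-laboratory variances.
    s_r2 = mean([variance(r) for r in lab_results])

    # Variance of the laboratory means.
    s_xbar2 = variance([mean(r) for r in lab_results])

    # Between-laboratory component; floored at zero when the laboratory
    # means scatter less than repeatability alone would predict.
    s_l2 = max(s_xbar2 - s_r2 / n, 0.0)

    return sqrt(s_r2), sqrt(s_l2 + s_r2)


if __name__ == "__main__":
    # Hypothetical duplicate results from five laboratories.
    data = [
        [10.0, 10.2],
        [10.4, 10.6],
        [9.8, 10.0],
        [10.1, 10.1],
        [10.3, 10.0],
    ]
    s_r, s_R = precision_estimates(data)
    print(f"repeatability s_r = {s_r:.3f}")
    print(f"reproducibility s_R = {s_R:.3f}")
```

Repeating the calculation at each concentration level of 7.4 gives the values needed to judge whether precision is independent of concentration or varies with it, as 11.2 requires.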
ASTM International takes no position respecting the validity of any patent rights asserted in connection with any item mentioned
in this standard. Users of this standard are expressly advised that determination of the validity of any such patent rights, and the risk
of infringement of such rights, are entirely their own responsibility.
This standard is subject to revision at any time by the responsible technical committee and must be reviewed every five years and
if not revised, either reapproved or withdrawn. Your comments are invited either for revision of this standard or for additional standards
and should be addressed to ASTM International Headquarters. Your comments will receive careful consideration at a meeting of the
responsible technical committee, which you may attend. If you feel that your comments have not received a fair hearing you should
make your views known to the ASTM Committee on Standards, at the address shown below.
This standard is copyrighted by ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959,
United States. Individual reprints (single or multiple copies) of this standard may be obtained by contacting ASTM at the above
address or at 610-832-9585 (phone), 610-832-9555 (fax), or [email protected] (e-mail); or through the ASTM website
(www.astm.org).