2023 - Clinical Decision Limits As Criteria For Setting Analytical Performance Specifications For Laboratory Tests
ARTICLE INFO

Keywords: Analytical performance specification; Biological variation; Clinical performance; Clinically significant difference; Diagnostic variation

ABSTRACT

Background: The biological (CVI), preanalytical (CVPRE), and analytical variation (CVA) are inherent to clinical laboratory testing and consequently, interpretation of clinical test results.
Methods: The sum of the CVI, CVPRE, and CVA, called diagnostic variation (CVD), was used to derive clinically acceptable analytical performance specifications (CAAPS) for clinical chemistry measurands. The reference change concept was applied to clinically significant differences (CD) between two measurements, with the formula CD = z*√2*CVD. CD for six measurands were sought from international guidelines. The CAAPS were calculated by subtracting variances of CVI and CVPRE from CVD. Modified formulae were applied to consider statistical power (1-β) and repeated measurements.
Results: The obtained CAAPS were 44.9% for urine albumin, 0.6% for plasma sodium, 22.9% for plasma pancreatic amylase, and 8.0% for plasma creatinine (z = 3, α = 2.5%, 1-β = 85%). For blood HbA1c and plasma low-density lipoprotein cholesterol, replicate measurements were necessary to reach CAAPS for patient monitoring. The derived CAAPS were compared with analytical performance specifications, APS, based on biological variation.
Conclusions: The CAAPS models pose a new tool for assessing APS in a clinical laboratory. Their usability depends on the relevance of CD limits, required statistical power and the feasibility of repeated measurements.
1. Introduction

Laboratories need both clinical and analytical performance specifications to ensure that their measurements are fit for the intended use in patient care. Manufacturers of measuring systems, reagents, and calibrators need these specifications when optimizing and evaluating whether their products are fit for purpose. Governmental regulators around the globe also need them when evaluating IVD devices. The earliest published analytical performance specifications were based on the opinions of “various interested pathologists, other physicians, and other medical laboratory scientists” [1], on biological variation [2], and on the “state of the art” [3]. A hierarchical classification of the criteria with the effects on clinical outcomes at the pinnacle was proposed [4] and agreed on at the IFCC-IUPAC conference in Stockholm in 1999 [5]. The hierarchical structure was confirmed and simplified at the EFLM strategic conference in Milan in 2014, leaving out opinion criteria [6]. Clinical outcomes (Model 1) and biological variation (Model 2) were preferred strategies depending on the metrological and diagnostic properties of each measurand, leaving the state-of-the-art, Model 3, and combinations of the three models, also as options [6]. Model 1a refers to
Abbreviations: Alb, albumin (in urine); AmylP, pancreatic amylase (in plasma); APS, analytical performance specification(s); CAAPS, clinically acceptable analytical performance specification(s); CKD-EPI, chronic kidney disease, epidemiological formula for estimation of GFR; Crea, creatinine (in plasma); CD, clinically significant difference; CVA, (allowable/acceptable) analytical coefficient of variation; CVD, (allowable) diagnostic variation; CVI, intra-individual biological coefficient of variation; CVPRE, preanalytical technical coefficient of variation; CVREP, repeated tests' coefficient of variation; EFLM, European Federation of Clinical Chemistry and Laboratory Medicine; HbA1c, hemoglobin A1c (in blood); IVD, In vitro diagnostic medical device; IVDR, In Vitro Diagnostic Medical Devices Regulation (EU) 2017/746; LDL-C, low-density lipoprotein cholesterol (in plasma); MU, measurement uncertainty; Na, sodium (in plasma); RCV, reference change value; TE, total allowable error; α, probability of type I error, false positives; β, probability of type II error, false negatives; 1-β, statistical power; z, Gaussian statistic.
* Corresponding author at: Haikaranportti 4 B 22, FIN-02620 Espoo, Finland.
E-mail address: [email protected] (T.T. Kouri).
1
Present address: Fimlab Laboratoriot Oy Ltd, FIN-33520 Tampere, Finland.
https://doi.org/10.1016/j.cca.2023.117233
Received 2 September 2022; Received in revised form 17 January 2023; Accepted 18 January 2023
Available online 21 January 2023
0009-8981/© 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
E. Rotgers et al. Clinica Chimica Acta 540 (2023) 117233
direct link between testing and health outcomes and Model 1b refers to impact of laboratory testing on medical decisions and classifications, i.e., an indirect link between testing and health outcomes [7]. The EFLM Task and Finish Group on the allocation of laboratory tests to Different Models, EFLM TFG-DM, has proposed example measurands for the different APS models [8]. In practice, setting a universal APS even for blood hemoglobin A1c remains challenging after comparing various models [9].

Measurement uncertainty (MU), and related analytical performance specifications combine uncertainty of the assigned values of reference materials, uncertainty in the assignment of calibrator values, and imprecision of the reproducibility of results [10–12]. Variation in test results also includes biological and preanalytical variation in addition to the analytical variation [13–15]. It remains to be seen in the future how regulatory and accreditation bodies will assess the clinical performance of laboratory tests in their intended use, against the requirements of the new IVDR regulation [16] and the ISO 15189:2022 standard, Chapter 7.3 [17].

This study was initiated to verify clinical performance of the measurement procedures on a novel automated platform at HUS Diagnostic Center, following the requirements of the ISO 15189 standard. The described approach was used to model clinically acceptable performance specifications (CAAPS) for six measurands, i.e., blood hemoglobin A1c (HbA1c), urine albumin (Alb), plasma sodium (Na), plasma pancreatic amylase (AmylP), plasma low-density lipoprotein cholesterol (LDL-C), and plasma creatinine (Crea).

2. Materials and methods

2.1. Criteria for clinically acceptable analytical performance specifications (CAAPS)

Clinical guidelines were used to calculate CAAPS by converting them to clinically significant differences (CD) for six common clinical chemistry measurands with variable characteristics, as listed in Table 1.

Blood HbA1c is used both for the diagnosis of diabetes mellitus type 2 and for treatment follow-up. The decision limit in the diagnosis of diabetes is 48 mmol/mol or 6.5 Hgb%. In comparison, the upper health-related reference limit is 42 mmol/mol or 6.0 Hgb%, expressed either with IFCC (International Federation of Clinical Chemistry and Laboratory Medicine) reference measurement units, or with NGSP (National Glycohemoglobin Standardization Program) conventional units [18]. In evaluating glycemic control, the general target for treatment is < 53 mmol/mol / < 7 Hgb%, or < 64 mmol/mol / < 8 Hgb% in cases where less stringent goals are necessary. In the assessment of change, 5 mmol/mol (IFCC unit) or 0.5 Hgb% (NGSP unit) is interpreted as significant both by U.S. and European specialists, indicating a relative change of 9% at 53 mmol/mol or 7% at 7 Hgb% [19,20]. In a survey for general practitioners in six European countries, a decrease in blood HbA1c between 7 and 20% and an increase of 6–10% were deemed relevant when using NGSP units [21]. Furthermore, in the UK Prospective Diabetes Study for the effect of glycemic control on clinical outcomes in patients with type 2 diabetes, the patients were grouped into intensive and standard glycemic control using a difference of 11% in blood HbA1c (NGSP unit) [22]. From these data, we deduced two different levels of clinical need: the significant difference in diagnostic testing is 14%, i.e., (48–42)/42 mmol/mol and 9% corresponding to (6.5–6.0)/6.0 Hgb%, while in monitoring glycemic control it would be about 10% in mmol/mol, and respectively 7% in Hgb%.

Urine albumin (Alb) is used to screen diabetic and hypertensive nephropathy. The recommended assay for albuminuria screening is the urine albumin/creatinine-ratio (ACR), adjusting Alb concentration against volume rate (diuresis). According to the KDIGO guideline, normal albuminuria is below 3 mg/mmol, moderately increased albu

Table 1
Clinically significant differences for laboratory analytes. a,b
Measurand | Clinically significant difference, CD c | CD, in (%) | Chosen clinical setting | Source a [Reference used]
Blood HbA1c, diagnostic testing | 6 (42 -> 48 mmol/mol) or 0.5 (6.0 -> 6.5 Hgb%) | 14% (IFCC) or 9% (NGSP) | Diagnosis of diabetes, diagnostic interval in IFCC and NGSP units | WHO Classification of diabetes mellitus 2019 [18]
Blood HbA1c, monitoring | 5 mmol/mol or 0.5 Hgb% change | 10% (IFCC) or 7% (NGSP) | Worsening of diabetes (most stringent limits) at 53 mmol/mol (IFCC) or 7.0 Hgb% (NGSP) | Little RR et al 2011 [19], Skeie S et al 2005 [21], Turner RC et al 1998 [22]
Urine Alb | 70 (30 -> 100 mg/L) | 230% | Initial detection of moderate albuminuria (30–300 mg/L), a limit at the logarithmic midpoint 100 mg/L of moderate albuminuria range, corresponding to ACR range 3–30 mg/mmol at an average urine Crea concentration of 10 mmol/L | KDIGO Guideline 2012 [23]
Plasma Na | 5 (125 -> 130 mmol/L) | 4% | Treatment of profound hyponatremia, differentiation of 5 mmol/L | Spasovski G et al, European hyponatraemia guideline 2014 [25]
Plasma AmylP | Upper reference limit (URL -> 2 × URL); e.g., 65 (65 -> 130 U/L) | 100% | In alerting for a possibility of acute pancreatitis, a cut-off of 2xURL from the midpoint of a 4-fold change compared to URL | Tenner S et al, American College of Gastroenterology Guideline 2013 [26]
Plasma LDL-C | 0.4 (1.8 -> 1.4 mmol/L); or 0.4 (2.6 -> 2.2 mmol/L) | 20% (22–15%) | Treatment target of dyslipidemia in the very high-risk group for CVD; treatment target for the moderate risk group | Mach F et al, ESC/EAS Guideline 2020 [27]
Plasma Crea | 29 (72 -> 101 µmol/L) | 40% | Differentiation between limits of normal (eGFR 90 mL/min/1.73 m2; P-Crea 72 µmol/L) and mildly impaired kidney function (eGFR 60 mL/min/1.73 m2; P-Crea 101 µmol/L) in females at 40 years of age d | KDIGO Guideline 2012 [23]
a Detailed explanations on selection criteria and their references are given in the text, Chapter 2.1. Criteria for clinically acceptable analytical performance specifications (CAAPS).
b Abbreviations used: ACR, albumin/creatinine ratio; Blood HbA1c, blood hemoglobin A1c; CD, clinically significant difference; CKD-EPI, the Chronic Kidney Disease Epidemiological formula; CVD, cardiovascular disease; EAS, European Atherosclerosis Society; ESC, European Society of Cardiology; eGFR, glomerular filtration rate, estimated; IFCC, International Federation of Clinical Chemistry and Laboratory Medicine, recommended unit for HbA1c measurements; KDIGO, Kidney Disease - Improving Global Outcomes; NGSP, National Glycohemoglobin Standardization Program, conventional unit for HbA1c measurements; Plasma AmylP, plasma pancreatic amylase; Plasma Crea, plasma creatinine; Plasma LDL-C, plasma LDL cholesterol; Plasma Na, plasma sodium; Urine Alb, urine albumin; URL, upper reference limit; WHO, World Health Organization.
c The first value of each interval is the denominator of the relative difference.
d eGFR was estimated by using the CKD-EPI formula.

kidneys is the critical change, requiring differentiation between normal and repeatedly demonstrated moderate albuminuria. Within the exponential interval of 3–30 mg/mmol ACR, an incipient nephropathy must be established, and treatment initiated, already at the logarithmic midpoint of the moderate albuminuria range, not at the upper limit of 30 mg/mmol several years later. Thus, a significant difference was chosen to be a change of 3 → 10 mg/mmol, corresponding to a change of 30 to 100 mg/L urine Alb divided by an average urine creatinine concentration of 10 mmol/L. This prognostically significant detection of kidney disease corresponds to a difference of 230%. Adjusting urine albumin concentration to that of urine creatinine is important because it reduces the CVI of ACR to about 30% in random spot specimens, corresponding to the CVI of Alb concentration in first-morning urine [24].

Plasma sodium (Na) is used to assess water and electrolyte balance. Hyponatremia is classified as mild at 130–135 mmol/L, moderate at 125–129 mmol/L, and profound at < 125 mmol/L. During the correction of hyponatremia with intravenous infusion of hypertonic NaCl solution, the treatment goal for plasma Na increase is 5 mmol/L per 24 h, and it should not exceed 8 mmol/L per 24 h prior to reaching 130 mmol/L [25]. The goal of detecting a 5 mmol/L difference in plasma Na concentration is clearly desirable, indicating a significant difference of 5/125 = 4% at concentrations < 130 mmol/L. Detecting an increase of 8 mmol/L was an absolute minimum specification. In the treatment of hypernatremia, it is significant to detect a decrease of 10 mmol/L in 24 h using repeated measurements, yielding a significant difference of 7% for Na concentrations above the upper reference limit (URL).

Plasma pancreatic amylase (AmylP) is used to diagnose acute pancreatitis in patients with acute abdominal pain. AmylP activity > 3–5 times URL (65 U/L at HUS Diagnostic Center) is diagnostic for acute pancreatitis if observed together with abdominal pain or imaging that is consistent with pancreatitis according to the American College of Gastroenterology Guideline [26]. The average significant difference for plasma AmylP activity for diagnosis of acute pancreatitis would be 300% to reach a high diagnostic specificity (Sp) of the result (1xURL -> 4xURL; Sp > 95%). In clinical practice, plasma AmylP activity has a reduced sensitivity (Sn) in detection of alcoholic pancreatitis or severe necrotic pancreatitis, and in patients with symptoms that have lasted for several days (Sn < 50% after 4 days of disease onset). Because of these risks, a lower decision limit was modelled to alert for the possibility of pancreatitis, using a significant difference of 2xURL or 100% increase in plasma AmylP activity (65 to 130 U/L).

Plasma low-density lipoprotein cholesterol (LDL-C) is used to assess hyperlipidemia, a risk factor for cardiovascular disease (CVD). In the recent 2019 ESC/EAS Guidelines for the Management of Dyslipidemias, treatment targets for plasma LDL-C were redefined into concentrations of 1.4, 1.8, 2.6, and 3.0 mmol/L depending on the level of CVD risk [27]. Differentiating between these categories requires the detection of differences of about 0.4 mmol/L in plasma LDL-C concentrations. A clinically significant difference is then 22% between the highest risk limits (1.8 mmol/L and 1.4 mmol/L) or 15% starting from the upper limit of the moderate risk individuals (0.4/2.6 mmol/L = 15%). An average estimate of 20% was used in further calculations.

Plasma creatinine (Crea) is used mainly to estimate glomerular filtration rate (eGFR) for the evaluation of kidney function. Calculated from the CKD-EPI formula for eGFR, the limits of a mildly impaired kidney function (from 90 to 60 mL/min/1.73 m2) were used to estimate the clinically significant difference, below which active diagnostics and treatment are indicated, since eGFR < 60 mL/min/1.73 m2 suggests moderately impaired kidney function [23]. These were modelled in 40-year-old women with the smallest changes in plasma Crea needed to differentiate the given GFR estimates. A corresponding change in plasma Crea concentration of 40% (from 72 to 101 µmol/L) was used.

2.2. Calculations of CAAPS

2.2.1. Calculating diagnostic variation, CVD

Results of quantitative laboratory measurements represent point estimates with uncertainty distributions around the obtained value. Clinical laboratory tests are typically used to compare two consecutive results (monitoring of a patient), or to differentiate diseases from a healthy state or to establish prognostic categories (diagnostic testing). Then, the combined variation, defined as diagnostic variation, CVD, includes analytical variation, CVA, preanalytical variation, CVPRE, and intra-individual biological variation, CVI [15,28]. Two overlapping Gaussian distributions were used to model clinically significant differences, CD [29]. The CD between two measurement values was used to calculate the clinically acceptable CVD of a laboratory result using the conventional formula of reference change value, RCV [21,28]:

$$CD = z \cdot \sqrt{2} \cdot CV_D \qquad (1)$$

where z is the Gaussian statistic, CVD = diagnostic variation, and √2 assumes two identical uncertainty distributions in the compared measurements. By converting the formula (1), the acceptable CVD was obtained as follows:

$$CV_D = \frac{CD}{z \cdot \sqrt{2}} \qquad (2)$$

2.2.2. Sources of variation in the pre-examination processes

Data for biological intra-individual variation, CVI, were mostly obtained from the database provided by the Task and Finish Group of the European Federation of Clinical Chemistry and Laboratory Medicine, EFLM [30]. The provided median was used as the best estimate of intra-individual biological variation, CVI, as reported by the EFLM Working Group on Biological Variation.

Estimates for preanalytical technical variation are usually ignored, since standardized sample collection and sample handling procedures are assumed to minimize it [28,31]. However, we included this factor to allow estimation of possible effects of regional storage and transportation on sample quality in practical assessments of acceptability of samples [32,33]. An acceptable preanalytical variation, CVPRE, was estimated based on our previous study [14], testing the effect of regional transportation: a 1% variation was estimated for high-concentration, stable measurands, such as blood HbA1c, plasma LDL-C and plasma Crea. Four per cent was allowed for plasma AmylP activity due to a possible inactivation during storage, and 5% for urine Alb, due to its tendency to adhere to the walls of specimen containers. For plasma Na, 0.5% preanalytical variation was used.

2.2.3. Determining acceptable analytical variation CVA from CVD

The square of combined diagnostic variation, $CV_D^2$, was summarized from its variance components [34], as follows:

$$CV_D^2 = CV_I^2 + CV_{PRE}^2 + CV_A^2 \qquad (3)$$

From this equation, the acceptable analytical variation, CVA was left
over after subtracting variances of the other components from the variance of CVD (Fig. 1), adopting an earlier example on postanalytical quality assessment [21]:

$$CV_A^2 \le CV_D^2 - \left( CV_I^2 + CV_{PRE}^2 \right) \qquad (4)$$

The resulting clinically acceptable analytical variation, now called clinically acceptable analytical performance specification, CAAPS = $\sqrt{CV_A^2}$, is then required to detect the clinically significant difference CD used in the modeling.

[Figure 1: flowchart. Panel A: clinically significant difference CD between a normal (1st) result and a diseased (2nd) result. Panel B: allowable diagnostic variation, CVD < CD / (z * √2). Panel C: CVD divided into CVI (biological variation database), CVPRE (empirical studies) and CVA (CAAPS derived from CD); CAAPS = acceptable CVA < [CVD² - (CVI² + CVPRE²)]^0.5.]

Fig. 1. The flowchart on estimation of clinically acceptable analytical performance specification (CAAPS) from clinically significant difference (CD). Step A: Two results represent a CD for a measurand in a defined clinical situation. Step B: The budget of diagnostic variation, CVD, is obtained by dividing CD with z*√2, where z is the chosen Gaussian statistic. Step C: Biological intra-individual variation (CVI) and preanalytical technical variation (CVPRE), are subtracted from CVD as squared terms (variances) to obtain the variation left for analytical performance (CVA) that represents the CAAPS.

2.2.4. Statistical power and budget for diagnostic variation, CVD

A difference that is considered clinically significant must be detected with acceptable sensitivity and specificity to justify a decision on a clinical measure (further investigation or treatment). In addition to providing a statistical estimate of still a stable situation (false positives, probability α of type I error), as applied in the classical RCV [28,29], another estimate, the alternative probability β of type II error (false negatives) was used. It describes the sensitivity, or statistical power of an observed difference, 1-β [35]. Detection of a CD between two measurement values depends on the statistical sensitivity, resulting in different uncertainty budgets for CVD according to the used z statistic in the equation (2). The CVD budgets were first calculated with the commonly used borderline statistical sensitivity 1-β = 50% (at z = 1.96). To reach a sufficient statistical power to detect a clinically significant change, the z statistic needs to be increased from the conventional 1.96 (bidirectional change at 2α = 5%; or α = +/- 2.5% for a false positive detection), since at z = 1.96 a sensitivity of just 1-β = 50% is obtained (Fig. 2). A change of z = +3 has a sensitivity of 85%, and a change up to z = +4 a sensitivity of 98% in detecting a unidirectional change, when keeping the unidirectional α = 2.5% [29]. Originally, statistical power functions were developed by James O. Westgard and coworkers for error detection in statistical process control in the 1970s [36].

2.2.5. Need for repeated measurements

When it is impossible to detect a change between two measured values with sufficient statistical power due to a large biological variation or analytical variation, the required CVD may be achieved using n replicate samples. This reduces the variation by division with a factor of √n [37]. The acceptable diagnostic variation from repeated measurements, CVREP, is calculated as follows:

$$CV_D = \frac{CV_{REP}}{\sqrt{n}}, \quad \text{where } n = \text{number of repeats} \qquad (5)$$

Consequently, clinically significant difference CD = z * √2 * CVD = z * √2 * CVREP / √n. Inversely, an acceptable CVREP = CVD * √n.

3. Results

Prognostic (urine Alb; plasma LDL-C), or diagnostic groups (plasma AmylP; plasma Crea) were associated with wide CD estimates using either limits or midpoint values of the neighboring categories, while measurands used mostly for monitoring (blood HbA1c; plasma Na) showed narrow CD estimates (Table 1).

The CAAPS estimates were initially calculated with the most used RCV at p < 5% (2α, z = 1.96), using equations (2)–(4) (Table 2). For plasma LDL-C, the intra-individual variation CVI was already larger than the obtained allowable CVD calculated with the borderline statistical power 1-β = 50% (Fig. 2). For blood HbA1c, the obtained CAAPS estimates were notably narrow as well, and even narrower for NGSP units than for IFCC units in both diagnostic testing and monitoring. The calculated CAAPS was very stringent for plasma sodium as well (Table 2).

To improve the statistical power, we calculated acceptable CAAPS estimates at sensitivities of 85% (z = +3), and 98% (z = +4) at unidirectional α = 2.5% [29] (Table 3). When increasing the z statistic to +4 and keeping a given CD, the respective budget for CVD was decreased according to equation (2). The impact of increasing 1-β on the calculated CVD is shown in Table 3. The corresponding CAAPS tightened even more than the CVD suggested, because biological and preanalytical technical variations remained the same. Diagnostic detection of diabetes with HbA1c using a single measurement seemed to be possible at a sensitivity of 85% (z = +3) with IFCC unit reporting with a performance of CAAPS 1.9%, but not for reporting with NGSP units (CAAPS 0.8%).

CAAPS of urine Alb, plasma AmylP and plasma Crea seemed attainable even at a 98% sensitivity (z = +4), because of their large CD estimates from diagnostic use (Table 3). On the other hand, the calculated CAAPS was tight for plasma Na already at z = 1.96. For plasma LDL-C, detection of the ascribed CD was not at all possible, as shown already in Table 2. Using z = 1.64 with p < 10% for false positives, instead of z = 1.96, would improve the sensitivity 1-β from 50% to 64%, but still remain insensitive in detecting clinically significant changes.

Repeating a laboratory measurement using a new specimen is commonly applied to confirm the detection of a change. To model this, equation (5) was used. The effect of repeated sampling and measurements on CVD and subsequent CAAPS was calculated by multiplying CVD with √n, to obtain allowable CVREP, when using n = 1 to 4 (Table 4). With a statistical power 1-β = 85% (z = 3) and a unidirectional α = 2.5%, a reasonable CAAPS was reached using duplicate measurements for monitoring blood HbA1c (IFCC units), triplicates for plasma Na and monitoring of blood HbA1c (NGSP units), and four replicate measurements to assess a change in plasma LDL-C (Table 4). Detailed calculations of Table 4 are shown for clarity in the Supplementary Table.

The obtained CAAPS were compared with APS for allowable MU, expressed as allowable analytical variation after elimination of bias with calibration, desirably CVA ≤ 0.5*CVI [38] (Table 5). Also, the conventional APS expressed as total allowable error, TE, were calculated, with separate bias and imprecision estimates from biological variation [4], despite becoming easily too wide [30]. A comparison of the CAAPS and
Fig. 2. Effect of size of difference between two results on sensitivity of detection. Statistical power (1-β) = sensitivity of detection increases from 0.50 to 0.98, when
the difference between measured means increases from z = +1.96 to z = +4 of Gaussian distribution, using a fixed probability of false positives α = 0.025 (shown
with a line at z = +1.96 of original distribution). Modified from N Iglesias Canadell et al. Clin. Chem. Lab. Med. 42 (2004) 415–422 [29].
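
The relationship shown in Fig. 2 can be checked numerically. The following is a minimal sketch, assuming two Gaussian distributions with equal spread and a fixed one-sided cut-off at z = 1.96 (α = 2.5%); the function name `detection_power` is hypothetical and not part of the original study.

```python
from statistics import NormalDist

def detection_power(z_shift, z_alpha=1.96):
    """Probability that a result from a distribution shifted by z_shift
    standard deviations exceeds the one-sided cut-off z_alpha."""
    return NormalDist().cdf(z_shift - z_alpha)

# Power (1 - beta) for the three shifts discussed in the text
for z in (1.96, 3.0, 4.0):
    print(f"z = {z:.2f}: 1 - beta = {detection_power(z):.2f}")
```

Under these assumptions the sketch gives a power of 0.50 at z = 1.96, 0.85 at z = 3 and 0.98 at z = 4, in line with the values quoted from [29].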
Table 2
Clinically acceptable analytical performance specification from clinically significant difference (at z = 1.96).
Measurand | Significant difference for medical decision | Clinically significant difference CD, % | Clinically acceptable diagnostic variation a, CVD = CD / (1.96*√2) | Former squared, CV²D | Biological intra-individual variation b, CVI | Former squared, CV²I | Preanalytical variation, estimated, CVPRE | Former squared, CV²PRE | Variance remaining for analytical variation a, CV²A = CV²D - CV²I - CV²PRE | CAAPS c based on clinical difference, %, CAAPS = √CV²A (α = 2.5%, 1-β = 50%)
Blood HbA1c, diagnostic testing | 42 -> 48 mmol/mol (IFCC) | 14% | 5.1% | 0.00255 | 2.5% | 0.00063 | 1% | 0.00010 | 0.00183 | 4.3% (IFCC)
Blood HbA1c, diagnostic testing | 6.0 -> 6.5 Hgb% (NGSP) | 9% | 3.2% | 0.00106 | 1.7% [41] | 0.00029 | 1% | 0.00010 | 0.00067 | 2.6% (NGSP)
Blood HbA1c, monitoring | at 53 mmol/mol (IFCC) | 10% | 3.6% | 0.00130 | 2.5% | 0.00063 | 1% | 0.00010 | 0.00058 | 2.4% (IFCC)
Blood HbA1c, monitoring | at 7.0 Hgb% (NGSP) | 7% | 2.5% | 0.00064 | 1.7% [41] | 0.00029 | 1% | 0.00010 | 0.00025 | 1.6% (NGSP)
Urine Alb | 30 -> 100 mg/L | 230% | 83% | 0.68944 | 30% [24] | 0.09000 | 5% | 0.00250 | 0.59694 | 77%
Plasma Na | 125 -> 130 mmol/L | 4% | 1.4% | 0.00021 | 0.5% [30] | 0.00003 | 0.5% | 0.00003 | 0.00016 | 1.3%
Plasma AmylP | URL -> 2 × URL | 100% | 36% | 0.13033 | 4.0% [30] | 0.00160 | 4% | 0.00160 | 0.12713 | 36%
Plasma LDL-C a | 1.8 -> 1.4 mmol/L | 20% | 7.2% | 0.00521 | 8.0% [30] | 0.00640 | 1% | 0.00010 | −0.00129 | (<0%)
Plasma Crea | 72 -> 101 µmol/L | 40% | 14.4% | 0.02085 | 4.9% [30] | 0.00240 | 1% | 0.00010 | 0.01835 | 13.5%
a The allowable diagnostic variation CVD was calculated using the formula of reference change value for a Gaussian distribution: CVD = CD / (z*√2), equation (2). The variance remaining for analytical variation was calculated with equation (4). For plasma LDL-C, detection of a change between two individual measurements was not possible at z = 1.96, due to a high CVI (marked bold). Replicate testing is shown in Table 4.
b References used for estimates of intra-individual biological variation were the following: [41] Biological variation of diabetics, given NGSP units converted also to IFCC units from S. Carlsen, et al, Clin. Chem. Lab. Med. 2011; [24] S.S. Waikar et al, Am. J. Kidney Dis. 2018; and [30] A.K. Aarsand, et al, EFLM Biological Variation Database, 2022.
c Abbreviations used: Blood HbA1c, blood hemoglobin A1c; CAAPS, clinically acceptable analytical performance specification; CD, clinically significant difference; CVA = (acceptable) analytical variation; CVD = (total) diagnostic variation; CVI, biological intra-individual variation; CVPRE, preanalytical (technical) variation; IFCC, International Federation of Clinical Chemistry and Laboratory Medicine, mmol/mol unit; NGSP, National Glycohemoglobin Standardization Program, Hgb% unit; Plasma AmylP, plasma pancreatic amylase; Plasma Crea, plasma creatinine; Plasma LDL-C, plasma LDL cholesterol; Plasma Na, plasma sodium; Urine Alb, urine albumin; z, Gaussian statistic; α, type I error in statistical testing (false positives); β, type II error (false negatives); 1-β, statistical power, sensitivity to detect a change (opposite probability to β).
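
The calculation behind Table 2 can be sketched in a few lines. This is an illustrative example only, assuming all quantities are entered as fractions (14% = 0.14); the function names `allowable_cvd` and `caaps` are hypothetical, and the input values are copied from the CD, CVI and CVPRE columns of Table 2.

```python
import math

def allowable_cvd(cd, z=1.96):
    """Equation (2): diagnostic variation budget CVD = CD / (z * sqrt(2))."""
    return cd / (z * math.sqrt(2))

def caaps(cd, cvi, cvpre, z=1.96):
    """Equation (4): CAAPS = sqrt(CVD^2 - CVI^2 - CVPRE^2).
    Returns None when no variance is left for the analytical component."""
    var_a = allowable_cvd(cd, z) ** 2 - cvi ** 2 - cvpre ** 2
    return math.sqrt(var_a) if var_a > 0 else None

# (CD, CVI, CVPRE) as fractions, taken from Table 2
examples = {
    "Blood HbA1c (IFCC, diagnostic)": (0.14, 0.025, 0.01),
    "Plasma LDL-C": (0.20, 0.080, 0.01),
    "Plasma Crea": (0.40, 0.049, 0.01),
}
for name, (cd, cvi, cvpre) in examples.items():
    result = caaps(cd, cvi, cvpre)
    shown = "(<0%)" if result is None else f"{100 * result:.1f}%"
    print(f"{name}: CVD {100 * allowable_cvd(cd):.1f}%, CAAPS {shown}")
```

For these three rows the sketch returns 4.3%, no remaining variance, and 13.5%, matching the tabulated CAAPS values.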
Table 3
Impact of statistical power (1-β) on clinical performance specification.
Z score and associated statistical power (1-β) a: mean at z = 1.96, (1-β) = 50%; mean at z = 3, (1-β) = 85%; mean at z = 4, (1-β) = 98%.
Measurand | Clinically significant difference, CD % | Acceptable CVD b (z = 1.96) | CAAPS c (z = 1.96) | Acceptable CVD b (z = 3) | CAAPS c (z = 3) | Acceptable CVD b (z = 4) | CAAPS c,d (z = 4)
Blood HbA1c (IFCC, DT) | 14% | 5.1% | 4.3% | 3.3% | 1.9% | 2.5% | (<0%)
Blood HbA1c (NGSP, DT) | 9% | 3.2% | 2.6% | 2.1% | 0.8% | 1.6% | (<0%)
Blood HbA1c (IFCC, Mon) | 10% | 3.6% | 2.4% | 2.4% | (<0%) | 1.8% | (<0%)
Blood HbA1c (NGSP, Mon) | 7% | 2.5% | 1.6% | 1.6% | (<0%) | 1.2% | (<0%)
Urine Alb | 230% | 83.0% | 77.2% | 54.2% | 44.9% | 40.7% | 27.0%
Plasma Na | 4% | 1.4% | 1.3% | 0.9% | 0.6% | 0.7% | 0.0%
Plasma AmylP | 100% | 36.1% | 35.6% | 23.6% | 22.9% | 17.7% | 16.7%
Plasma LDL-C | 20% | 7.2% | (<0%) | 4.7% | (<0%) | 3.5% | (<0%)
Plasma Crea | 40% | 14.4% | 13.5% | 9.4% | 8.0% | 7.1% | 5.0%
a The statistical power (1-β) of testing was taken at the unidirectional Gaussian probability of false positives α = +2.5% (z = 1.96), while increasing the difference of the mean of changed values (Fig. 2), according to N. Iglesias Canadell, P. Hyltoft Petersen, E. Jensen, C. Ricós, E. Jørgensen, Reference change values and power functions, Clin. Chem. Lab. Med. 42 (2004) 415–422 [29].
b The diagnostic variations CVD at different z scores were calculated with the equation (2): CVD = CD / (z * √2).
c CAAPS were calculated as shown in Table 2. Limits of achievable CAAPS ranges based on our experience are marked bold.
d Abbreviations used: Blood HbA1c, blood hemoglobin A1c; CAAPS, clinically acceptable analytical performance specification; CD, clinically significant difference; CVD, diagnostic variation; DT, diagnostic testing; IFCC, International Federation of Clinical Chemistry and Laboratory Medicine, mmol/mol unit; Mon, monitoring; NGSP, National Glycohemoglobin Standardization Program, Hgb% unit; Plasma AmylP, plasma pancreatic amylase; Plasma Crea, plasma creatinine; Plasma LDL-C, plasma LDL cholesterol; Plasma Na, plasma sodium; Urine Alb, urine albumin; z, Gaussian statistic; α, type I error (false positives); β, type II error (false negatives); 1-β, statistical power, sensitivity to detect a change (opposite probability to β).
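
How the budget in Table 3 shrinks as z grows can be illustrated with a short sketch. It assumes fractional inputs and the same variance subtraction as equations (2) and (4); the function name `caaps_at_z` is hypothetical, and the input values are those listed for each measurand in Tables 2 and 3.

```python
import math

def caaps_at_z(cd, cvi, cvpre, z):
    """CAAPS for a given z: sqrt((CD / (z*sqrt(2)))^2 - CVI^2 - CVPRE^2).
    Returns None when the biological and preanalytical variances
    already exceed the diagnostic budget."""
    cvd = cd / (z * math.sqrt(2))
    var_a = cvd ** 2 - cvi ** 2 - cvpre ** 2
    return math.sqrt(var_a) if var_a > 0 else None

# (CD, CVI, CVPRE) as fractions
measurands = {
    "Urine Alb": (2.30, 0.30, 0.05),
    "Plasma Crea": (0.40, 0.049, 0.01),
    "Blood HbA1c (IFCC, Mon)": (0.10, 0.025, 0.01),
}
for name, (cd, cvi, cvpre) in measurands.items():
    cells = []
    for z in (1.96, 3.0, 4.0):
        value = caaps_at_z(cd, cvi, cvpre, z)
        cells.append("(<0%)" if value is None else f"{100 * value:.1f}%")
    print(f"{name}: " + ", ".join(cells))
```

Because CVI and CVPRE stay fixed while CVD falls with 1/z, the remaining analytical share drops faster than CVD itself, which is the tightening described in the Results.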
Table 4
Impact of repeated measurements on CAAPS. b

Columns: Measurand | Clinically significant difference, CD % | Acceptable CVREP a after n repeated specimens to reach the required CVD using z = 3 (n = 1, 2, 3, 4) | CAAPS b,c assuming variable number of repeats (n = 1, 2, 3, 4) | Biological intra-individual variation, %

Measurand | CD | CVREP n=1 | CVREP n=2 | CVREP n=3 | CVREP n=4 | CAAPS n=1 | CAAPS n=2 | CAAPS n=3 | CAAPS n=4 | CVI
Blood HbA1c (IFCC, Monitoring) | 10% | 2.4% | 3.3% | 4.1% | 4.7% | (<0%) | 2.0% | 3.1% | 3.9% | 2.5%
Blood HbA1c (NGSP, Monitoring) | 7% | 1.6% | 2.3% | 2.9% | 3.3% | (<0%) | 1.2% | 2.1% | 2.6% | 1.7%
Urine Alb | 230% | 54.2% | 76.7% | 93.9% | 108.4% | 44.9% | 70.4% | 88.8% | 104.1% | 30.0%
Plasma Na | 4% | 0.9% | 1.3% | 1.6% | 1.9% | 0.6% | 1.1% | 1.5% | 1.7% | 0.5%
Plasma AmylP | 100% | 23.6% | 33.3% | 40.8% | 47.1% | 22.9% | 32.8% | 40.4% | 46.8% | 4.0%
Plasma LDL-C | 20% | 4.7% | 6.7% | 8.2% | 9.4% | (<0%) | (<0%) | 1.3% | 4.9% | 8.0%
Plasma Crea | 40% | 9.4% | 13.3% | 16.3% | 18.9% | 8.0% | 12.4% | 15.5% | 18.2% | 4.9%

a Acceptable diagnostic variation of repeated measurements, CVREP, was calculated by using the equation (3), CVREP = CVD * √n. The following derivation of equation (1) applies: CD = z * √2 * CVD = z * √2 * CVREP / √n. The statistical power (1-β) = 85% with a mean value at z = 3 was used to define the clinically significant change at a unidirectional α = 2.5% (Fig. 2).
b Abbreviations used: Blood HbA1c, blood hemoglobin A1c; CAAPS, clinically acceptable analytical performance specification; CD, clinically significant difference; CVA, analytical variation; CVD, diagnostic variation; CVI, intra-individual biological variation; CVREP, variation of repeated measurements; IFCC, International Federation of Clinical Chemistry and Laboratory Medicine, mmol/mol unit; Mon, monitoring; NGSP, National Glycohemoglobin Standardization Program, Hgb% unit; n, number of repeats; Plasma AmylP, plasma pancreatic amylase; Plasma Crea, plasma creatinine; Plasma LDL-C, plasma LDL cholesterol; Plasma Na, plasma sodium; Urine Alb, urine albumin; z, Gaussian statistic; 1-β, statistical power, sensitivity to detect a change.
c CAAPS (acceptable CVA) were calculated using the equation CVA² = CVREP²/n − CVI² − CVPRE² (see also Table 2). Achievable CAAPS ranges based on our experience are shown bold.
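The acceptable CVREP values in Table 4 can be recomputed from the relation quoted in footnote a, CVREP = CVD * √n, with CVD taken from CD at z = 3. The sketch below is illustrative only; the function names are hypothetical and it covers the CVREP columns, not the CAAPS columns, which also need CVI and CVPRE.

```python
import math

def diagnostic_cv(cd_percent, z=3.0):
    """CVD = CD / (z * sqrt(2)), in percent."""
    return cd_percent / (z * math.sqrt(2))

def acceptable_cv_rep(cd_percent, n, z=3.0):
    """Footnote a of Table 4: CVREP = CVD * sqrt(n) for n repeated specimens."""
    return diagnostic_cv(cd_percent, z) * math.sqrt(n)

# Two rows of Table 4 (CD in percent), n = 1 to 4
for name, cd in (("Plasma LDL-C", 20.0), ("Plasma Crea", 40.0)):
    print(name, [round(acceptable_cv_rep(cd, n), 1) for n in range(1, 5)])
```

For plasma LDL-C this yields 4.7, 6.7, 8.2 and 9.4, and for plasma Crea 9.4, 13.3, 16.3 and 18.9, in agreement with the tabulated values.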
other estimates of APS for the studied measurands was performed at desirable levels of the other estimates. The compared APS and TE were based on biological variation (Milan Model 2) derived from healthy individuals, except for the alternative APS for HbA1c (CVI of diabetes patients) and another for urine Alb (CVI of albuminuria patients). A further Milan Model 1b example, based on classification errors for blood HbA1c, was also listed [39]. APS were obtained from published CVI estimates [24,30,40,41]. For urine Alb, we additionally used a desirable bias of +/-10% against isotope dilution mass spectrometry (ID-MS), with a maximum analytical imprecision of 8%, to calculate TE = 23.2% [42]. For plasma Na, consolidated recommendations of TE from EQA schemes were also listed for comparison [43] (Table 5). CAAPS derived from detection of pathophysiological states were generally wider than APS based on CVI in health.

4. Discussion

4.1. Applicability of the obtained CAAPS

4.1.1. Blood HbA1c
The CAAPS for monitoring of blood HbA1c (2.4% in IFCC units; 1.6% in NGSP units) was more stringent than that for diagnostic testing (4.3% IFCC; 2.6% NGSP) (Table 2). Laboratories usually perform the HbA1c assay on a single platform, meaning that the CAAPS for monitoring would be applied. The CAAPS for monitoring was no longer achievable with the generally used CD when the statistical power 1-β was increased to 85%, which also challenged diagnostic testing with a CAAPS of 1.9% (IFCC) or 0.8% (NGSP) (Table 3). According to this modeling, repeated follow-up samples (Table 4) would be required in monitoring of glycemic control, considering the current performance of blood HbA1c assays [39].
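As a minimal sketch of why the monitoring CAAPS becomes unachievable, the variance subtraction described in the Abstract (subtracting the variances of CVI and CVPRE from that of CVD) can be written out for HbA1c monitoring in IFCC units. The function names are hypothetical, and CVPRE is set to zero purely as an assumption because its value is not given in this section; any positive CVPRE would only shrink the remaining budget further.

```python
import math

def diagnostic_cv(cd_percent, z):
    """Equation (2): CVD = CD / (z * sqrt(2)), in percent."""
    return cd_percent / (z * math.sqrt(2))

def caaps(cv_d, cv_i, cv_pre=0.0):
    """Acceptable CVA (%) from the variance budget; None if the budget is negative."""
    budget = cv_d**2 - cv_i**2 - cv_pre**2
    return math.sqrt(budget) if budget >= 0 else None

cv_d = diagnostic_cv(10.0, z=3)  # CD 10% (IFCC, monitoring) at z = 3
result = caaps(cv_d, cv_i=2.5)   # CVI 2.5% from Table 4; CVPRE assumed 0
print(round(cv_d, 1), result)    # prints: 2.4 None
```

A CVD of about 2.4% is already smaller than the CVI of 2.5%, so no analytical variation is left to allocate, which corresponds to the "(<0%)" entry for a singleton comparison in Table 4.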
E. Rotgers et al. Clinica Chimica Acta 540 (2023) 117233
Table 5
Comparison of clinically acceptable analytical performance specifications, CAAPS, to other estimates of total allowable error, TE, or APS.a
Columns: Measurand | CAAPS b, singleton comparison at z = 3 (Table 3), % | CAAPS with replicates (n) (Table 4), % | TE c from literature, % | APS from biological variation d, % | Source e | Notes
Blood HbA1C Diagnostic testing: 3.0 (IFCC) [39,40] The model used CVA ≤ 3% (IFCC) with no bias causing 2%
misclassifications
1.9 (IFCC) 3.1 (IFCC) 0.8 (IFCC) [30] Using CVI of healthy individuals
0.8 (NGSP) 2.2 (NGSP) 0.6 (NGSP) [30]
Monitoring:
(< 0) 2.0 (n = 2; IFCC) 3.9 (IFCC) 1.25 (IFCC) [41] Using CVI of diabetics
(< 0) 2.1 (n = 3; NGSP) 2.7 (NGSP) 0.85 (NGSP) [41]
Urine Alb 44.9 23.2 15.0 [24,42] Using CVI of albuminuria patients
CVA ≤ 8% (11/17 procedures), B ≤ 10% (8/17 procedures at
30 mg/L albumin); calculated from these, TE = 23.2%
Plasma Na 0.6 1.5 (n = 3) 0.7 0.25 [30,40]
2.9 [43] EQA recommendation: +/-4 mmol/L
(4 mmol/140 mmol = 2.9 %)
Plasma AmylP 22.9 12.2 2.0 [30]
Plasma LDL-C (< 0) 4.9 (n = 4) 13.7 4.0 [30]
Plasma Crea 8.0 7.4 2.25 [30]
a Expressed as percentage (%) for all estimates.
b Abbreviations: APS, analytical performance specification; B, bias; BV, biological variation; CAAPS, clinically acceptable analytical performance specification; CVA, analytical variation; CVG, between-subject biological variation (healthy individuals); CVI, intra-individual biological variation; EQA, external quality assessment; IFCC, International Federation of Clinical Chemistry and Laboratory Medicine, mmol/mol unit; NGSP, National Glycohemoglobin Standardization Program, Hgb% unit; TE, total (allowable) error.
c Total allowable error TE ≤ 1.65 × 0.5 CVI + 0.25 (CVI² + CVG²)^0.5, calculated from biological variation.
d APS for random variation of measurements, desirable CVA = 0.5 * CVI, assuming no bias after calibrations [41].
e References in this table were the following (details given in the list of References): [24] S.S. Waikar et al., Am. J. Kidney Dis. 2018; [30] A.K. Aarsand et al., EFLM Biological Variation Database, desirable APS shown (https://biologicalvariation.eu, accessed 2022); [39] A.A. Nielsen et al., Clin Chem Lab Med 2014; [40] F. Braga, M. Panteghini, Clin Chem Lab Med 2021; [41] S. Carlsen et al., Clin Chem Lab Med 2011; [42] L.M. Bachmann et al., Clin Chem 2014; [43] S. Westgard (https://www.westgard.com/consolidated-goals-chemistry.htm, accessed 2022).
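The two biological-variation formulas in footnotes c and d of Table 5 can be sketched as follows. The function names are illustrative, and the CVG value used in the total-error call is a made-up input (between-subject variation is not listed in this section); the CVI values are those shown in Table 4.

```python
import math

def desirable_cva(cv_i, factor=0.5):
    """Footnote d: desirable CVA = 0.5 * CVI (factor is adjustable for other levels)."""
    return factor * cv_i

def allowable_total_error(cv_i, cv_g):
    """Footnote c: TE <= 1.65 * 0.5 * CVI + 0.25 * sqrt(CVI^2 + CVG^2)."""
    return 1.65 * 0.5 * cv_i + 0.25 * math.sqrt(cv_i**2 + cv_g**2)

# CVI values (%) from Table 4
print(desirable_cva(8.0))   # Plasma LDL-C
print(desirable_cva(2.5))   # Blood HbA1c (IFCC), CVI of diabetics
print(desirable_cva(0.5))   # Plasma Na

# Hypothetical CVI = 3% and CVG = 4%, for illustration only
print(round(allowable_total_error(3.0, 4.0), 3))
```

The three desirable CVA values (4.0, 1.25 and 0.25) agree with the corresponding entries in the "APS from biological variation" column of Table 5.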
4.1.2. Urine albumin, plasma creatinine and plasma pancreatic amylase
CAAPS were calculated for urine Alb and plasma Crea due to their key role in diagnosing chronic kidney diseases [23]. The diagnostic classification of acute kidney injuries (AKI) also utilizes measurements of plasma creatinine, but well-known limitations of its diagnostic performance excluded AKI from our considerations [44]. The clinically required differences between health and nephropathy-related values could be detected with two singleton results, using 1-β = 85% (z = 3) with an estimated CAAPS of 44.9% (urine Alb) and 8.0% (plasma Crea), or even a CAAPS of 27.0% and 5.0%, respectively, with 1-β = 98% (z = 4) (Table 3). Similarly, measurements of plasma AmylP were estimated to allow a CAAPS of 22.9% (1-β = 85%), or 16.7% (1-β = 98%), for diagnostic classification of suspected pancreatitis in the emergency room (Table 3). For these measurands, the estimated CAAPS were applicable in the chosen clinical settings.

4.1.3. Plasma low-density lipoprotein cholesterol
The large biological intra-individual variation of plasma LDL-C concentrations limits the detection of clinically significant treatment effects in a singleton comparison at a sufficient sensitivity (Table 2). To confirm a 20% change from 1.8 to 1.4 mmol/L after statin treatment, four repeated measurements were needed initially, and another four measurements in the follow-up, according to the Gaussian model at 1-β = 85% (Table 4). Cardiovascular disease risk assessment would benefit from another measurand with a smaller CVI. Unfortunately, the median CVI of apolipoprotein B is also 7.4% [30]. Larger differences, such as detection of familial hypercholesterolemia with a difference between 5 and 3 mmol/L in plasma LDL-C, are attainable from two singleton results at a sensitivity of 85% with the presented CAAPS modeling (data not shown). This is compatible with the ESC/EAS guideline recommendation to detect a −50% reduction of plasma LDL-C concentration as compared to the initial value in high-risk patients [27]. Use of plasma LDL-C in risk assessment tolerates the repetitions, as opposed to, e.g., tumor markers that are used for rapid therapeutic decisions of cancer patients [45]. Statistical sensitivity (1-β), urgency of detection from a single specimen, and a possibility to limit the clinical need to a larger diagnostic difference must be considered when measurands appear to show too wide biological variation.

4.1.4. Plasma sodium
A CAAPS that allows detection of an analytical or clinical change of 5 mmol/L in plasma Na was studied, despite the small relative difference. As an example, an emergency patient may have a true plasma Na of 124 mmol/L that is reported as 129 mmol/L; the difference of 5 mmol/L corresponds to the limiting CD of +4% (Table 2). A false increase could be excluded with the calculated CAAPS of 1.3% (z = 1.96, α = 2.5%). However, if a statistical power of 1-β = 85% is expected in detection of a change in analytics, the acceptable CAAPS diminished to as low as 0.6% with singleton measurements (Table 3). After triplicate measurements both originally and in the follow-up, a currently attainable CAAPS = 1.5% was reached at 1-β = 85% (Table 4). Both repeated sampling and transportation delays to a central laboratory indicate that point-of-care devices are to be adopted in plasma sodium diagnostics in intensive care.

CAAPS frames seem to provide generally useful estimates of acceptable overall analytical variation for clinical laboratory measurements. In addition, clinical laboratories need narrower limits for the internal quality control rules of their analytical measurements to be able to guarantee day-to-day reproducibility in their clinical laboratory environments.

4.2. Comparison of CAAPS with other APS estimates

CAAPS derived from combined diagnostic variation, CVD (Fig. 1), represent Milan APS Model 1b. CAAPS for blood HbA1c measurements seemed to reflect well the performance needs for both diagnostic testing and
monitoring (Table 5). Duplicate or triplicate measurements are needed to monitor disease states. Interestingly, the desirable APS estimated from the CVI of diabetics for HbA1c (Model 2) gave closely related results despite the different background of the estimates (Table 5). The APS estimated as CAAPS (Model 1b) remains our primary choice because of its clinically defined background.

The obvious link to clinical need can be used to guide APS for measurements of urine Alb and plasma AmylP (Table 5) because of their key roles in classification of patients with chronic renal disease or acute abdominal pain, respectively.

The CAAPS for plasma LDL-C emphasizes the limitations of judging treatment effects with singleton comparison only in case of a wide CVI. Our calculation provided an estimate of 4.9% from four repeated comparisons until a change of 20% was noted with 1-β = 85% (Table 4). The CVI-based APS is 4.0% (Table 5), but nevertheless detection of the prognostic targets expressed in the European guidelines requires four repeated measurements [27]. Thus, both CAAPS (Model 1b) and APS (Model 2) challenge the feasibility of the current prognostic categories used in dyslipidemia treatment. This highlights the need for clinical guideline developers to involve laboratory professionals in guideline development so that treatment targets are set with a clear understanding of the impact of both CVI and state-of-the-art analytical performance.

Both the between-subject and intra-individual biological variations of plasma Na are narrow in health, resulting in a desirable APS = 0.25% for plasma sodium (Table 5). Clinically, the accuracy of plasma Na is critical at concentrations distant from the strictly homeostatic health-related reference interval, most importantly in the hyponatremia range. Our modelling provided a CAAPS = 0.6% (1-β = 85%), satisfying the need of critically ill patients. Incidentally, this approached the TE estimate of 0.7% from biological variation. All of these were notably tighter than a traditional acceptance limit in EQA schemes [43]. The wide EQA limits emphasize the need to improve the technical quality of electrolyte measurements, despite it being a difficult task [46].

Plasma Crea was assigned to Milan Model 2 in the consensus proposal [8], which then requires an APS of 2.25% [30]. In clinical practice, renal insufficiency is widely screened and classified using computerized eGFR equations calculated from plasma Crea (Table 1). The frequency of plasma Crea measurements is also explained by the need for GFR estimates to avoid overdosing of drugs in renal insufficiency. These examples show that differentiation between healthy and impaired renal function is the key use of plasma Crea measurements, and establish the clinical link required for Model 1b APS. The CAAPS of plasma Crea measurements was 8.0% (1-β = 85%), or 5.0% (1-β = 98%), the former being close to the 7.4% obtained from biological variation for TE (Table 5).

4.3. Statistical flexibility of obtained CAAPS estimates

The CAAPS model caters for different z statistics of the equation (2), representing different levels of statistical power 1-β. It might be used to define optimum, desirable, or minimum CAAPS for each measurand, like earlier conventions for CVA < 0.25 CVI (optimum), < 0.5 CVI (desirable), or < 0.75 CVI (minimum) [38,47,48]. A major benefit of the CAAPS model is its flexibility if quantitative data are available for classification of disease states at various concentration levels. The CAAPS is only as robust as the variance components used and the available clinical guidelines and applied decision limits it is anchored to. Separate variances for chosen decision limits at different concentration levels of the analyte can be modeled if needed [49]. Disease-associated CVI are larger than the CVI in health, at least in patients with diabetes or chronic renal failure [50], changing the equation of combined variance between two measurements. Replicate measurements are advisable when using diagnostic cut-offs, as they improve imprecision and consequently accuracy of clinical diagnosis [51]. Furthermore, post-analytical uncertainty was not considered in the proposed model of the CD, because it is usually related to discrete non-conformity events rather than increased variance.

4.4. Limitations of the CAAPS approach

4.4.1. Clinical guidelines as sources of APS
Clinically justified analytical performance is essential for clinical laboratory service. However, the complex use of laboratory tests is difficult to translate into needed analytical performance [7]. An APS for a single test is not easily isolated from clinical practice with combined tests, other investigations, and other factors of the health-care environment. Direct Milan Model 1a studies, i.e., studies of the impact of analytical performance of measurements on clinical outcomes, are lacking even with respect to blood HbA1c measurements in diabetes outcomes [9]. The surrogate indirect outcome studies of Milan Model 1b are pragmatic and closely related to clinical classifications and decisions, but they carry inherent limitations, e.g., a tendency to base clinical requirements on state-of-the-art analytical performance [6]. Another limitation is that expert consensus on adequate classification limits may provide a relevant CD, and a consequent APS for laboratory use, but the impact of using such a CD clinically, with an analytical performance corresponding to the given APS in the laboratory, should be verified in a clinical study to show an actual improvement in patients’ health outcome.

A primary prerequisite for the indirect Milan Model 1b is a well-defined link between clinical decisions and the used test [7]. We derived clinically significant differences, CD, from diagnostic categories or prognostic targets from international guidelines to avoid individual opinions (Table 1). In patient monitoring, a difference between two results obtained with optimal analytical performance tends to be always perceived as “significant”, especially if indicating worsening of a clinical situation, regardless of the actual effect on patient outcome, as shown with interpretation of blood HbA1c results [21].

Estimates of CD for monitoring blood HbA1c were available from long-term clinical outcomes of diabetic patients and international studies (Table 1). With other measurands, we minimized state-of-the-art reasoning by using CD from classifications of pathophysiological states (moderate nephropathy, acute pancreatitis), or treatment targets (high-risk dyslipidemia), independently of assay performance. If an accurate CD was missing, a consensus of the authors was used to define the quantitative limits of each CD. In this way, moderate albuminuria was described by the logarithmic midpoint of its range (100 mg/L urine Alb). For plasma AmylP, a sensitized limit from 4xURL to 2xURL was used to improve detection of delayed or severe pancreatitis, based on sensitivity Sn and specificity Sp of elevated plasma AmylP in the diagnostics of pancreatitis. These two midpoint examples show tailoring of the CD criteria to specific clinical purposes [7].

In profound hyponatremia, the clinical need of accuracy is associated with the risk of cerebral edema resulting from too rapid correction of plasma Na (Table 1). The hyponatremia treatment protocol from the European hyponatremia guideline quotes the same critical difference (125–130 mmol/L) as used earlier by Klee in reporting “medical utility CV” for plasma Na [52]. Clinical knowledge may be universal if relevant clinical situations for CD estimates are selected, although confirmation from local clinicians is always recommended.

4.4.2. Limits of modeling acceptable CVA from estimates of CVD and its components
To model APS from clinically significant differences, CD, the concept of reference change value RCV [28,34] was applied from existing examples for measurements of HbA1c [19,21,9]. The guideline-derived CD (=RCV) was used to calculate the maximum acceptable diagnostic variation CVD (equation (2)) that was further divided into its components CVI, CVPRE and CVA (equation (3)), to reach CAAPS (Fig. 1). The modelling assumes Gaussian distributions of CV data with identical variances in repeated measurements. We then simplified the published non-Gaussian distributions of health-related CVI using median values as means [30,41]. To improve sensitivity in detecting a true change from 50% (corresponding 1-β at z = 1.96, α = 2.5%), we increased the z score to 3 or 4 [29] (Fig. 2).
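The correspondence between z scores and statistical power quoted in the text (50% at z = 1.96, 85% at z = 3, 98% at z = 4) is consistent with a one-sided Gaussian model in which the true mean change sits z standard deviations from zero and the detection threshold stays at 1.96. The sketch below assumes that reading; it is not taken from the authors' calculations or from ref. [29], and the function name is hypothetical.

```python
from statistics import NormalDist

def power_at(z, z_alpha=1.96):
    """Probability that a true change of z SDs exceeds the one-sided threshold z_alpha."""
    return NormalDist().cdf(z - z_alpha)

for z in (1.96, 3, 4):
    print(z, round(power_at(z), 2))
```

Under this assumption the three z levels give powers of 0.5, 0.85 and 0.98, matching the values stated in the text.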
A practically important problem is a large CVI of some measurands that exhausts the budget available for CVA from a relatively narrow CVD in subtractions, as shown for plasma LDL-C (Table 2). We offer repeated measurements as an option to reach CAAPS with these measurands (Table 4). An average CVI of ambulatory patients may also be reduced in standardized environments using short collection intervals from in-patients (emergency room, intensive care), to better detect a tendency in repeated results, such as that in plasma troponin I concentrations when cardiac events are suspected. Occasionally, another measurand with a smaller CVI may solve the need for improved diagnostics.

5. Conclusions

Clinically significant differences CD were searched from international guidelines and modelled into two measurements to enable calculating diagnostic variation CVD with traditional RCV statistics and deriving CAAPS for six clinical chemistry measurands representing different areas of clinical diagnostics. The CAAPS provides a new tool to anchor laboratory performance to clinical needs using well-defined settings. The calculations can be further adjusted for different levels of required statistical power and repeated measures, making them applicable to a wide range of clinical scenarios and analytical performance. Thus, the CD-derived CAAPS support granular discussions on test performance between administration, clinicians, and laboratories instead of providing APS as external facts that professionals outside the laboratories cannot challenge.

Funding sources

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Calculations used for Table 4 were shared as a Supplementary Excel table for the readers.

Acknowledgements

Docent Lotta Joutsi-Korhonen, MD, Head Physician of the Department of Clinical Chemistry at HUSLAB, HUS Diagnostic Center, and Professor Satu Mustjoki, MD, Head of the Department of Clinical Chemistry, University of Helsinki, are warmly acknowledged for providing resources and facilities to this study.

Appendix A. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.cca.2023.117233.

References

[1] R.N. Barnett, Medical significance of laboratory results, Am. J. Clin. Pathol. 50 (1968) 671–677.
[2] E. Cotlove, E.K. Harris, G.Z. Williams, Biological and analytic components of variation in long-term studies of serum constituents in normal subjects. III Physiological and medical implications, Clin. Chem. 16 (1970) 1028–1032.
[3] D.B. Tonks, M. Stoeppler, D.E. Douglas, J. Wolska, Evaluation of analytical techniques for measurements of cadmium in body fluids, Ann. Clin. Lab. Sci. 15 (1985) 342–342.
[4] C.G. Fraser, P.H. Petersen, Analytical performance characteristics should be judged against objective quality specifications, Clin. Chem. 45 (1999) 321–323. PMID: 10053031.
[5] D. Kenny, C.G. Fraser, P. Hyltoft Petersen, A. Kallner, Consensus agreement, Scand. J. Clin. Lab. Invest. 59 (1999) 585–585, doi: 10.1080/00365519950185409.
[6] S. Sandberg, C.G. Fraser, A.R. Horvath, R. Jansen, G. Jones, W. Oosterhuis, et al., Defining analytical performance specifications: consensus Statement from the 1st Strategic Conference of the European Federation of Clinical Chemistry and Laboratory Medicine, Clin. Chem. Lab. Med. 53 (2015) 833–835, https://doi.org/10.1515/cclm-2015-0067.
[7] A.R. Horvath, P.M. Bossuyt, S. Sandberg, A.S. John, P.J. Monaghan, W.D. Verhagen-Kamerbeek, et al. for the Test Evaluation Working Group of the European Federation of Clinical Chemistry and Laboratory Medicine, Setting analytical performance specifications based on outcome studies - is it possible? Clin. Chem. Lab. Med. 53 (2015) 841–848, https://doi.org/10.1515/cclm-2015-0214.
[8] F. Ceriotti, P. Fernandez-Calle, G.G. Klee, G. Nordin, S. Sandberg, T. Streichert, et al., Criteria for assigning laboratory measurands to models for analytical performance specifications defined in the 1st EFLM Strategic Conference, Clin. Chem. Lab. Med. 55 (2017) 189–194, https://doi.org/10.1515/cclm-2016-0091.
[9] T.P. Loh, A.F. Smith, K.J.L. Bell, S.J. Lord, F. Ceriotti, G. Jones, et al., Setting analytical performance specifications using HbA1c as a model measurand, Clin. Chim. Acta 523 (2021) 407–414, https://doi.org/10.1016/j.cca.2021.10.016.
[10] Joint Committee for Guides in Metrology, International vocabulary of metrology — Basic and general concepts and associated terms, third ed., VIM 3, 2012. Available from: <https://www.bipm.org/utils/common/documents/jcgm/JCGM_200_2012.pdf> (accessed 20 July 2022).
[11] International Organization for Standardization, ISO/TS 20914:2019, Medical laboratories — practical guidance for the estimation of measurement uncertainty, International Organization for Standardization, Geneva, 2019.
[12] International Organization for Standardization, ISO 17511:2020, In vitro diagnostic medical devices — requirements for establishing metrological traceability of values assigned to calibrators, trueness control materials and human samples, International Organization for Standardization, Geneva, 2020.
[13] E. Theodorsson, Uncertainty in measurement and total error: tools for coping with diagnostic uncertainty, Clin. Lab. Med. 37 (2017) 15–34, https://doi.org/10.1016/j.cll.2016.09.002.
[14] T. Kouri, M. Siloaho, S. Pohjavaara, P. Koskinen, O. Malminiemi, P. Pohja-Nylander, R. Puukka, Pre-analytical factors and measurement uncertainty, Scand. J. Clin. Lab. Invest. 65 (2005) 463–475, https://doi.org/10.1080/00365510500208332.
[15] B. Magnusson, H. Ossowicki, O. Rienitz, E. Theodorsson, Routine internal- and external-quality control data in clinical laboratories for estimating measurement and diagnostic uncertainty using GUM principles, Scand. J. Clin. Lab. Invest. 72 (2012) 212–220, https://doi.org/10.3109/00365513.2011.649015.
[16] REGULATION (EU) 2017/746 OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL of 5 April 2017 on in vitro diagnostic medical devices and repealing Directive 98/79/EC and Commission Decision 2010/227/EU. Available from: <https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32017R0746&qid=1648147617108>.
[17] International Organization for Standardization, ISO 15189:2022, Medical laboratories - requirements for quality and competence, third ed., International Organization for Standardization, Geneva, 2022.
[18] World Health Organization, Classification of diabetes mellitus, World Health Organization, Geneva, 2019.
[19] R.R. Little, C.L. Rohlfing, D.B. Sacks, for the National Glycohemoglobin Standardization Program (NGSP) Steering Committee, Status of hemoglobin A(1c) measurement and goals for improvement: from chaos to order for improving diabetes care, Clin. Chem. 57 (2011) 205–214, https://doi.org/10.1373/clinchem.2010.148841.
[20] C. Weykamp, HbA1c: a review of analytical and clinical aspects, Ann. Lab. Med. 33 (2013) 393–400, https://doi.org/10.3343/alm.2013.33.6.393.
[21] S. Skeie, C. Perich, C. Ricos, A. Araczki, A.R. Horvath, W.P. Oosterhuis, et al., Postanalytical external quality assessment of blood glucose and hemoglobin A1c: an international survey, Clin. Chem. 51 (2005) 1145–1153, https://doi.org/10.1373/clinchem.2005.048488.
[22] R.C. Turner, R.R. Holman, C.A. Cull, I.M. Stratton, D.R. Matthews, V. Frighi, et al., UK Prospective Diabetes Study (UKPDS) Group, Intensive blood-glucose control with sulphonylureas or insulin compared with conventional treatment and risk of complications in patients with type 2 diabetes (UKPDS 33), Lancet 352 (1998) 837–853, https://doi.org/10.1016/S0140-6736(98)07019-6.
[23] Kidney Disease: Improving Global Outcomes (KDIGO) CKD Work Group. KDIGO 2012 Clinical Practice Guideline for the Evaluation and Management of Chronic Kidney Disease. Kidney Int, Suppl. 3 (2013) 1–150. Summary of Recommendation Statements. doi: 10.1038/kisup.2012.77.
[24] S.S. Waikar, C.M. Rebholz, Z.H. Zheng, S. Hurwitz, C.Y. Hsu, H.I. Feldman, et al., Chronic kidney disease biomarkers consortium, biological variability of estimated GFR and albuminuria in CKD, Am. J. Kidney Dis. 72 (2018) 538–546, https://doi.org/10.1053/j.ajkd.2018.04.023.
[25] G. Spasovski, R. Vanholder, B. Allolio, D. Annane, S. Ball, D. Bichet, et al., on behalf of the Hyponatraemia Guideline Development Group, Clinical practice guideline on diagnosis and treatment of hyponatraemia, Eur. J. Endocrinol. 170 (2014) G1–G47, https://doi.org/10.1530/EJE-13-1020.
[26] S. Tenner, J. Baillie, J. DeWitt, S.S. Vege, American college of gastroenterology guideline: management of acute pancreatitis, Am. J. Gastroenterol. 108 (2013) 1400–1415, https://doi.org/10.1038/ajg.2013.218.
[27] F. Mach, C. Baigent, A.L. Catapano, K.C. Koskinas, M. Casula, L. Badimon, et al., the Task Force for the management of dyslipidaemias of the European Society of Cardiology (ESC) and European Atherosclerosis Society (EAS), 2019 ESC/EAS Guidelines for the management of dyslipidaemias: lipid modification to reduce cardiovascular risk, Eur. Heart J. 41 (2020) 111–188, doi: 10.1093/eurheartj/ehz455.
[28] C.G. Fraser, Reference change values, Clin. Chem. Lab. Med. 50 (2012) 807–812, https://doi.org/10.1515/cclm.2011.733.
[29] N. Iglesias Canadell, P. Hyltoft Petersen, E. Jensen, C. Ricos, P.E. Jorgensen, Reference change values and power functions, Clin. Chem. Lab. Med. 42 (2004) 415–422, https://doi.org/10.1515/CCLM.2004.073.
[30] A.K. Aarsand, P. Fernandez-Calle, C. Webster, A. Coskun, E. Gonzales-Lao, J. Diaz-Garzon, et al., on behalf of the European Federation of Clinical Chemistry and Laboratory Medicine Task and Finish Group for the Biological Variation Database, The EFLM Biological Variation Database, 2022. Available from: <https://biologicalvariation.eu/> (accessed 3 November 2022).
[31] M.S. Sylte, T. Wentzel-Larsen, B.J. Bolann, Estimation of the minimal preanalytical uncertainty for 15 clinical chemistry serum analytes, Clin. Chem. 56 (2010) 1329–1335, https://doi.org/10.1373/clinchem.2010.146050.
[32] X. Fuentes-Arderiu, G. Acebes-Frieyro, L. Gavaso-Navarro, M.J. Castiñeiras-Lacambra, Pre-metrological (pre-analytical) variation of some biochemical quantities, Clin. Chem. Lab. Med. 37 (1999) 987–989, https://doi.org/10.1515/CCLM.1999.146.
[33] R. Rigo-Bonnin, D. Munoz-Provencio, F. Canalias, Reference change values based on uncertainty models, Clin. Biochem. 80 (2020) 31–41, https://doi.org/10.1016/j.clinbiochem.2020.03.016.
[34] E.K. Harris, T. Yasaka, On the calculation of a “reference change” for comparing two consecutive measurements, Clin. Chem. 29 (1983) 25–30.
[35] N. Iglesias, P. Hyltoft Petersen, C. Ricos, Power function of the reference change value in relation to cut-off points, reference intervals and index of individuality, Clin. Chem. Lab. Med. 43 (2005) 441–448, https://doi.org/10.1515/CCLM.2005.078.
[36] J.O. Westgard, T. Groth, T. Aronsson, H. Falk, C.H. de Verdier, Performance characteristics of rules for internal quality control: probabilities for false rejection and error detection, Clin. Chem. 23 (1977) 1857–1867.
[37] C.G. Fraser, Test result variation and the quality of evidence-based clinical guidelines, Clin. Chim. Acta 346 (2004) 19–24, https://doi.org/10.1016/j.cccn.2003.12.032.
[38] F. Braga, M. Panteghini, The utility of measurement uncertainty in medical laboratories, Clin. Chem. Lab. Med. 58 (2020) 1407–1413, https://doi.org/10.1515/cclm-2019-1336.
[40] F. Braga, M. Panteghini, Performance specifications for measurement uncertainty of common biochemical measurands according to Milan models, Clin. Chem. Lab. Med. 59 (2021) 1362–1368, https://doi.org/10.1515/cclm-2021-0170.
[41] S. Carlsen, P. Hyltoft Petersen, S. Skeie, Ø. Skadberg, S. Sandberg, Within-subject biological variation of glucose and HbA1c in healthy persons and in type 1 diabetes patients, Clin. Chem. Lab. Med. 49 (2011) 1501–1507.
[42] L.M. Bachmann, G. Nilsson, D.E. Bruns, M.J. McQueen, J.C. Lieske, J.J. Zakowski, W.G. Miller, State of the art for measurement of urine albumin: comparison of routine measurement procedures to isotope dilution tandem mass spectrometry, Clin. Chem. 60 (2014) 471–480, https://doi.org/10.1373/clinchem.2013.210302.
[43] S. Westgard, Consolidated Comparison of Chemistry (and Toxicology) Performance Specifications, 2022. Available from: <https://www.westgard.com/consolidated-goals-chemistry.htm> (accessed 22 July 2022).
[44] B.C. Birkelo, N. Pannu, E.D. Siew, Overview of diagnostic criteria and epidemiology of acute kidney injury and acute kidney disease in the critically ill patient, Clin. J. Am. Soc. Nephrol. 17 (2022) 717–735, https://doi.org/10.2215/CJN.14181021.
[45] G.R.D. Jones, Further issues with using reference change values, Clin. Chim. Acta 528 (2022) 13–14, https://doi.org/10.1016/j.cca.2022.01.008.
[46] S. Pasqualetti, M. Chibireva, F. Borrillo, F. Braga, M. Panteghini, Improving measurement uncertainty of plasma electrolytes: a complex but not impossible task, Clin. Chem. Lab. Med. 59 (2021) e129–e132, https://doi.org/10.1515/cclm-2020-1399.
[47] C.G. Fraser, P. Hyltoft Petersen, J.C. Libeer, C. Ricos, Proposals for setting generally applicable quality goals solely based on biology, Ann. Clin. Biochem. 34 (1997) 8–12, https://doi.org/10.1177/000456329703400103.
[48] M. Panteghini, F. Ceriotti, G. Jones, W. Oosterhuis, M. Plebani, S. Sandberg, on behalf of the Task Force on Performance Specifications in Laboratory Medicine of the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM), Strategies to define performance specifications in laboratory medicine: 3 years on from the Milan Strategic Conference, Clin. Chem. Lab. Med. 55 (2017) 1849–1856, https://doi.org/10.1515/cclm-2017-0772.
[49] G.R.D. Jones, Critical difference calculations revised: inclusion of variation in standard deviation with analyte concentration, Ann. Clin. Biochem. 46 (2009) 517–519, https://doi.org/10.1258/acb.2009.009083.
[50] C. Ricos, N. Iglesias, J.V. Garcia-Lario, M. Simon, F. Cava, A. Hernandez, et al., Within-subject biological variation in disease: collated data and clinical consequences, Ann. Clin. Biochem. 44 (2007) 343–352, https://doi.org/10.1258/000456307780945633.
[51] P.H. Petersen, G.G. Klee, Influence of analytical bias and imprecision on the number of false positive results using Guideline-Driven Medical Decision Limits
[39] A.A. Nielsen, P.H. Petersen, A. Green, C. Christensen, H. Christensen, Clin, Chim. Acta 430 (2014) 1–8, https://doi.org/10.1016/j.cca.2013.12.014.
I. Brandslund, Changing from glucose to HbA1c for diabetes diagnosis: predictive [52] G.G. Klee, Establishment of outcome-related analytic performance goals, Clin.
values of one test and importance of analytical bias and imprecision, Clin. Chem. Chem. 56 (2010) 714–722, https://doi.org/10.1373/clinchem.2009.133660.
Lab. Med. 52 (2014) 1069–1077, https://doi.org/10.1515/cclm-2013-0337.