Enhancing Analytical Reasoning in the Intensive Care Unit
Mark Barash, DO, Rahul S. Nanchal, MD, MS*

KEYWORDS
• Bayes theorem • Bias • Clinical reasoning • Heuristics • Logic • Noise • Probability • Set theory • Venn diagrams

KEY POINTS
• Intensivists often rely on heuristic principles that lead to severe and systematic errors in reasoning.
• Reasoning foundations can be described mathematically using logic, probability, and value theory.
• Intensivists should familiarize themselves with basic statistical and probability principles to enhance analytical reasoning and avoid biases.
• Bayesian reasoning is the framework surrounding the calculation of posterior odds of events.
• Noise is likely pervasive in the intensive care unit and should be mitigated.

INTRODUCTION

Clinical reasoning in critical care presents numerous challenges because decisions must be made expeditiously. Further, there is considerable uncertainty associated with the decision-making process due to patient complexity, severity of illness, and the enormous array of laboratory values and physiologic parameters that require distillation into a diagnosis that is most compatible with the clinical presentation. The cognitive load associated with these complexities, combined with the situational stressors of time-sensitive conditions and rapidly deteriorating patients, leads to reliance on heuristic principles that reduce the intricate tasks of assessing probabilities and assigning predictive values to simpler judgmental operations. Although useful, these heuristics often lead to severe and systematic errors of reasoning.1 Decision making, or judgment, encompasses processes that are amenable to systematic analysis in addition to those that are intangible and value decisions.

Division of Pulmonary and Critical Care Medicine, Hub for Collaborative Medicine, Medical
College of Wisconsin, 8701 Watertown Plank Road, 8th Floor, Milwaukee, WI 53226, USA
* Corresponding author.
E-mail address: [email protected]

Crit Care Clin 38 (2022) 51–67
https://doi.org/10.1016/j.ccc.2021.09.001
criticalcare.theclinics.com
0749-0704/22/© 2021 Elsevier Inc. All rights reserved.

Downloaded for Anonymous User (n/a) at Taipei Medical University from ClinicalKey.com by Elsevier on February 21,
2022. For personal use only. No other uses without permission. Copyright ©2022. Elsevier Inc. All rights reserved.

Reasoning foundations can be described mathematically using probability, symbolic logic, and value theory.2 Intuitive—but ultimately misguided—judgment of these mathematical concepts is what leads to systematic errors. The components of error comprise biases and the emerging concept of noise. In this article, we first describe common errors in judgment due to cognitive biases and the mathematical basis underpinning them. We then describe mathematical concepts with which intensivists should be acquainted to minimize reasoning errors. We cite examples to illustrate concepts where appropriate. Finally, we end with a brief discussion of noise and its pertinence to critical care medicine.

ERRORS IN JUDGMENT AND MATHEMATICAL UNDERPINNINGS

A comprehensive list of heuristics and biases with brief descriptions appears in Table 1. We explain in detail a select few that are encountered most often in clinical practice. Problems in probabilistic and statistical reasoning are the mathematical underpinning for most of these biases.

Base rate neglect


Simply stated, base rate neglect is the failure to correctly account for the probability of a condition or disease, which may lead to consequences such as unnecessary and expensive tests, faulty diagnoses, and inappropriate therapies. Providers often overlook that uncommon presentations of common diseases are more likely than common presentations of rare disorders. For example, an immigrant from a region with endemic tuberculosis presenting with multiple organ failure and laboratory features consistent with primary hemophagocytic lymphohistiocytosis (HLH) is far more likely to have disseminated tuberculosis than primary HLH.
Perhaps the most widespread form of base rate neglect is insensitivity to prior probabilities and anchoring bias to new but incomplete information that frequently becomes available in the context of critical illness. Consider an instance in which laboratory tests are obtained on an elderly woman presenting with confusion and fever. Findings of anemia and thrombocytopenia may prompt an evaluation for thrombotic thrombocytopenic purpura (TTP). However, the prior probability of sepsis is manifold higher than that of TTP, and even with the new information of anemia and thrombocytopenia, the posterior probability of sepsis remains much larger than that of TTP. It would be an error not to evaluate for sepsis and administer antibiotics. Further, even when explicitly presented with base rate information, people often ignore it. An adaptation of a well-known test is as follows: Leptospirosis as a cause of sepsis occurs at a frequency of 1 in 1000. A middle-aged man presents with sepsis; the source is undifferentiated. He is quickly intubated for respiratory distress and multiple organ failure ensues. Further history is not obtainable. Among the array of investigations that are performed, the test for leptospirosis is positive. The false-negative rate of the test is zero and the false-positive rate is 5%. What is the probability that this particular patient has leptospirosis? Most physicians would answer 95%, simply taking into account that the test has a 95% accuracy rate. The correct answer is the conditional probability that the patient is sick given that the test is positive, which is less than 2%. A simple approach to arrive at the answer uses frequencies. Out of 1000 patients with similar presentations and testing, regardless of whether they are suspected of having leptospirosis, only one is expected to have the disease. Out of the 999 that do not have the disease, 5% or approximately 50 will have a positive test (because the false positive rate is 5%). Thus, the probability of having the disease for someone who has a positive test should be the ratio of the

Table 1
Heuristics and biases

Aggregate bias
Description: Associations between variables representing group averages are mistakenly taken to reflect what is true for a particular individual, usually when the individual measures are not available. An individual patient may be treated differently from what has been agreed upon through clinical practice guidelines for a group of patients (there is a tendency for some physicians to treat their own patients as atypical).
Consequences: Physician noncompliance and idiosyncratic approaches may result in patients receiving tests, procedures, and treatment outside of accepted clinical practice guidelines.

Anchoring
Description: A tendency to fixate on specific features of a presentation too early in the diagnostic process. The likelihood of a particular event is based on information at the outset. Some clinicians may fail to adjust their impressions based on new information as it arrives.
Consequences: Anchoring may lead to premature closure of thinking, leading to an incorrect diagnosis early in the patient's presentation.

Ascertainment bias
Description: The physician's thinking is pre-shaped by expectations or by what the physician specifically hopes to find. A physician may be dismissive of abdominal pain in a patient who is admitted frequently for diabetic ketoacidosis. Stereotyping and gender biases are examples of ascertainment biases.
Consequences: Any prejudgment of patients may lead to under-assessing or over-assessing a condition.

Availability and nonavailability
Description: The tendency to diagnose a condition more frequently if it comes more readily to mind. In other words, things that are common will be readily recalled. Nonavailability occurs when insufficient attention is paid to what is not immediately present.
Consequences: Availability and nonavailability lead to disproportionate estimates of the frequency of a particular diagnosis or condition and of starting estimates of the base rate, thus influencing pretest probability.

Base rate neglect
Description: Failure to adequately take into account the prevalence of a particular disease.
Consequences: May result in overestimates of unlikely diagnoses. This may lead to wastefulness and overutilization of resources. The pursuit of esoteric diagnoses is occasionally successful, and the intermittent reinforcement sustains this behavior in some physicians.

Commission bias
Description: Tendency toward action rather than inaction. This may occur in someone who is overconfident and reflects an urge to "do something."
Consequences: Commission errors tend to change the course of events, because they involve an active intervention, and may therefore be less reversible than an error of omission.

Confirmation bias
Description: A tendency to look for confirming evidence to support the hypothesis, rather than look for disconfirming evidence to refute it.
Consequences: Confirmation bias leads to the preservation of hypotheses and diagnoses that were weak in the first place.

Diagnostic momentum
Description: Tendency for a particular diagnosis to become established without adequate evidence. This may involve several intermediaries including the patient and other health care providers. As the diagnosis is passed from person to person, it gathers momentum to the point that it may appear almost certain by the time the patient sees a physician.
Consequences: A diagnosis may gather momentum without gathering verification. Delayed or missed diagnoses lead to the highest disabilities and are the most costly.

Fundamental attribution error
Description: The tendency to blame people when things go wrong rather than circumstances.
Consequences: Reflects a lack of compassion and understanding for certain classes of patients and may result in inappropriate or compromised care.

Hindsight bias
Description: After an event has occurred, there is a tendency to exaggerate the likelihood that would have been assessed for the event before it occurred. This may distort the perception of previous decision making, such as occurs at morbidity and mortality rounds.
Consequences: May prevent a realistic appraisal of what actually occurred and compromise learning from the event. It may lead to both under- and over-estimation of the clinical decision maker's abilities.

Omission bias
Description: A tendency toward inaction or reluctance to treat.
Consequences: While inaction may often be the most appropriate course, omission bias may result in the development of worsening emergencies.

Outcome bias
Description: The tendency to judge the decision being made by its likely outcome. Physicians may prefer decisions that lead to good outcomes rather than those that lead to bad outcomes.
Consequences: Allowing personal hopes and desires to enter clinical decision making reduces objectivity and may significantly compromise the process.

Playing the odds
Description: A physician's opinion of the relative chances that a patient has a particular disease or not. It is influenced by the actual prevalence and incidence of the disease. The decision is primarily determined by inductive thinking rather than objective evidence that has ruled out the disease.
Consequences: Playing the odds runs the risk of important conditions being missed.

Posterior probability error
Description: A physician bases their estimate of the likelihood of disease on what has gone before. A patient may have several admissions with sepsis related to urinary tract infection; to assume that the current admission is related to a urinary tract infection is an example of a posterior probability error.
Consequences: This error may result in a wrong diagnosis being perpetuated or a new diagnosis being missed.

Premature closure
Description: Physicians typically generate several diagnoses early in their encounter with a clinical problem. Premature closure occurs when one of these diagnoses is accepted before it has been fully verified.
Consequences: Premature closure tends to stop further thinking.

Adapted from Croskerry P. Achieving quality in clinical decision making: cognitive strategies and detection of bias. Acad Emerg Med. 2002;9(11):1184-1204. https://doi.org/10.1111/j.1553-2712.2002.tb01574.x; with permission.

number of people who have the disease to the number of true and false positive tests, which here is 1 in 51. The approach of indiscriminate testing (ie, casting a wide net without sound hypotheses, hoping that some test will return positive and will, in turn, lead to a diagnosis), commonly described as a "shotgun approach to medicine," more often than not is a setup for diagnostic error and the downstream administration of inappropriate therapies and iatrogenic harm. The process of deriving posterior or conditional probabilities, commonly known as the Bayesian approach, is useful in interpreting test results. This approach is described in some detail later in the mathematical concepts section.
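The frequency argument above is easy to reproduce. A minimal Python sketch, using only the numbers given in the example (base rate 1 in 1000, false-negative rate zero, false-positive rate 5%):

```python
# Posterior probability of leptospirosis given a positive test,
# reproducing the frequency reasoning from the example above.
prevalence = 1 / 1000        # 1 in 1000 septic patients has leptospirosis
sensitivity = 1.0            # false-negative rate is zero
false_positive_rate = 0.05   # 5% of patients without the disease test positive

true_positives = sensitivity * prevalence
false_positives = false_positive_rate * (1 - prevalence)

posterior = true_positives / (true_positives + false_positives)
print(f"P(disease | positive test) = {posterior:.3f}")  # about 0.02, ie, roughly 1 in 51
```

Despite a test that is "95% accurate," the posterior remains under 2%, because positives are dominated by the roughly 50 false positives among the 999 unaffected patients.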

Insensitivity to sample size


This bias results from the inability to account for sample size when evaluating probabilities from small samples drawn from a larger population. Sampling theory dictates that the probability of deviating from the population average is much higher in smaller samples just by random chance. This phenomenon is particularly true for critical illness, in which the sample for any given disease or syndrome is derived from a much larger universe of persons who are hospitalized or treated as outpatients. Combined with the fact that ICU patients on average represent a far sicker population than their counterparts hospitalized on medical wards or treated as outpatients, posterior probabilities of encountering rare diagnoses are greater in the ICU than in other areas. The label of ICU physicians being great diagnosticians may have more to do with the odds of encountering uncommon diseases and the happenstance of diagnosis than with cognitive ability.
To illustrate, first consider the example of spontaneous intracranial hemorrhage (ICH) in patients without coagulopathy or thrombocytopenia. For any given probability of ICH, one is more likely to find higher rates in the ICU than in the rest of the hospital based on sample size alone. This is akin to the higher probability of getting 100% heads with 3 fair coin tosses (probability = 1/8) than with 5 tosses (probability = 1/32). Using the same logic, it is also more likely to find the lowest rates of ICH in the ICU (the probability of not getting a single head is higher if a fair coin is tossed thrice instead of 5 times). Relatively rare events causing such discrepancies may often be the target of intense scrutiny and quality improvement efforts, one example being the prevention of central line-associated bloodstream infections (CLABSIs); variations in CLABSI rates across ICUs in the same hospital may be nothing but phenomena of random chance. Mathematically, this represents De Moivre's equation,3 which states that the standard error of the mean is inversely proportional to the square root of the sample size.
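The coin-toss arithmetic and De Moivre's equation can both be checked in a few lines; the sketch below is illustrative and not from the article:

```python
import math

# Extreme results (eg, all heads) are more probable in small samples.
def p_all_heads(n_tosses: int) -> float:
    return 0.5 ** n_tosses

print(p_all_heads(3))  # 0.125 (1/8)
print(p_all_heads(5))  # 0.03125 (1/32)

# De Moivre's equation: the standard error of the mean shrinks with
# the square root of the sample size, so small units (eg, a single ICU)
# show wider swings in observed rates by chance alone.
def standard_error(sigma: float, n: int) -> float:
    return sigma / math.sqrt(n)

print(standard_error(1.0, 25))   # 0.2
print(standard_error(1.0, 2500)) # 0.02
```

A 100-fold larger sample shrinks the standard error only 10-fold, which is why rate comparisons between small units are so noisy.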
Illuminating this concept further, now consider an adverse drug reaction, daptomycin-associated eosinophilic pneumonia. For the sake of clarity, let us assume that 20% of patients receiving daptomycin develop this reaction. The use of daptomycin is likely more frequent in the ICU; hence, for simplicity, let us also assume that the probability of daptomycin use in the ICU versus general medical wards is 80% to 20%. The posterior odds that 4 out of 5 consecutive patients who receive daptomycin and develop eosinophilic pneumonia are in the ICU rather than on a general ward are 224:1. Even if the diagnostic accuracy in the ICU was 50%, the odds of diagnosing cases would overwhelmingly be in favor of the ICU physician. Asked another way, given the same circumstances, which would afford more diagnostic confidence: a working diagnosis of eosinophilic pneumonia in 4 out of 5 consecutive patients who received daptomycin in the ICU, or a diagnosis in 10 out of 20 consecutive patients who received daptomycin on the floor? At first glance, the rate of diagnosis in the ICU (80%) far exceeds the rate on the ward (50%), which, given our hypothetical rate of 20%, seems


highly improbable. However, closer scrutiny reveals that the probability of developing daptomycin-related eosinophilic pneumonia in 4 out of 5 patients in the ICU is orders of magnitude higher than in 10 out of 20 patients on the floor. Accurate diagnosis with such a low probability of occurrence on the ward is quite the feat.
Misconception of regression
If 2 variables X and Y have the same distribution and the average X score for a group of selected individuals deviates from the mean of X by k units, then the average of their Y scores usually will deviate from the mean of Y by less than k units. This phenomenon, called regression toward the mean, occurs frequently in everyday life and was described by Galton more than 150 years ago.4 Let us return to the example of CLABSI. Mere observation will clarify that a particularly outstanding period of performance on the occurrence of CLABSI will inevitably be followed by a period in which performance is worse than in the preceding period, and vice versa. Often this prompts extensive critical quality reviews during the period when performance deviated from organizational goals (often set at zero infections!) and accolades when performance exceeds them. This schema of administering rewards and admonishments is widespread and may lead to misperceptions about their effectiveness when changes are most likely to occur secondary to regression alone.
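Regression toward the mean can be demonstrated with a short simulation; the numbers below (a "true" infection-like rate of 10 with noise of 2, across 1000 hypothetical units) are invented for illustration:

```python
import random

random.seed(0)
true_rate = 10.0  # identical underlying rate for every unit

# Two independent noisy measurement periods for 1000 hypothetical units.
period_x = [random.gauss(true_rate, 2.0) for _ in range(1000)]
period_y = [random.gauss(true_rate, 2.0) for _ in range(1000)]

# Select the units with the worst observed rates in period X (top decile).
cutoff = sorted(period_x)[900]
worst = [i for i, x in enumerate(period_x) if x >= cutoff]

mean_x = sum(period_x[i] for i in worst) / len(worst)
mean_y = sum(period_y[i] for i in worst) / len(worst)

# The selected units look bad in period X, but with no intervention at
# all their period Y average falls back toward the true rate of 10.
print(f"worst units, period X mean: {mean_x:.2f}")  # well above 10
print(f"same units,  period Y mean: {mean_y:.2f}")  # close to 10
```

Any admonishment delivered after period X would appear to "work" in period Y, even though every unit had the same underlying rate throughout.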
Fallacies of conjunctive and disjunctive events
These biases, particularly common in medicine, are a consequence of anchoring. Psychological studies5,6 indicate that people tend to overestimate the probability of conjunctive events and underestimate the probability of disjunctive events. An example of a conjunction fallacy occurs when a trainee alleges sepsis from pneumonia and urinary tract infection (UTI) when asked for a diagnosis. Even if the probabilities of pneumonia and UTI are individually high, the probability of them occurring together is quite low—that is, the overall probability of a conjunctive event is lower than the probability of each elementary event. This phenomenon is a simple explanation behind Occam's razor, or parsimony in diagnosis. Of course, deviations from this principle are bound to occur. A patient may have both pneumonia and UTI at the same time; it is possible but less probable. In the same vein, a trainee's judgment that the cause of sepsis is more likely pneumonia rather than pneumonia or UTI constitutes a disjunction fallacy. Although the likelihood of pneumonia may be high, the likelihood of either pneumonia or UTI is higher than the probability of each elementary event.
Biases in the evaluation of compound events are pervasive and influence a myriad of actions such as the administration of therapies (eg, choice of antibiotics) and the obtaining of laboratory/imaging studies, which have numerous downstream influences on iatrogenic harm (eg, Clostridium difficile colitis), costs of care, as well as patient safety and quality.
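The conjunction and disjunction bounds hold for any two probabilities; the values below are illustrative and, for simplicity, assume the two infections occur independently:

```python
# Conjunction vs disjunction: "A and B" is never more probable than
# either event alone; "A or B" is never less probable.
p_pneumonia = 0.6  # illustrative probabilities, independence assumed
p_uti = 0.5

p_both = p_pneumonia * p_uti             # conjunction: P(A and B)
p_either = p_pneumonia + p_uti - p_both  # disjunction: P(A or B)

assert p_both <= min(p_pneumonia, p_uti)
assert p_either >= max(p_pneumonia, p_uti)

print(f"P(pneumonia and UTI) = {p_both:.2f}")   # 0.30, lower than each alone
print(f"P(pneumonia or UTI)  = {p_either:.2f}") # 0.80, higher than each alone
```

Even with two individually likely diagnoses, their conjunction is the least probable statement a trainee could make, and their disjunction the most probable.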

MATHEMATICAL CONCEPTS
Set theory/Venn diagrams/logic concepts
Set theory is a branch of mathematical logic that pertains to the study of sets, or collections of objects. Probability theory uses the language of sets, which can be illustrated in the form of Venn diagrams. Probability is nothing but a scientific method of measuring uncertainty or quantifying randomness. Basic operations on sets include intersection, union, difference, and symmetric difference (Fig. 1).
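These four operations map directly onto, for example, Python's built-in set type; the patient groups below are hypothetical:

```python
# Basic operations on sets (cf. Fig. 1), with hypothetical patient groups.
fever = {"pt1", "pt2", "pt3"}
hypotension = {"pt2", "pt3", "pt4"}

print(fever & hypotension)  # intersection: patients in both groups
print(fever | hypotension)  # union: patients in either group
print(fever - hypotension)  # difference: fever without hypotension
print(fever ^ hypotension)  # symmetric difference: in exactly one group
```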
Fig. 1. Basic operations of sets.

Probability concepts
Probabilistic reasoning asks a clinician to answer 2 basic questions: (1) how representative is the patient's presentation of a known disease and (2) what is the likelihood of encountering that disease in a patient like this?7 The skilled diagnostician—whether formally or intuitively—may start from a pretest probability based on historical and patient facts and, through a likelihood ratio, determine the posttest probability that the disorder is indeed present. This approach is termed Bayesian reasoning. Before discussing Bayesian reasoning, we present concepts that intensivists must familiarize themselves with and that are particularly useful in the interpretation of diagnostic test properties. These concepts, illustrated in the form of 2 × 2 tables, appear in Fig. 2.

Sensitivity
The proportion of true positives among those that have the disorder.

Specificity
The proportion of true negatives among those who do not have the disorder.

Fig. 2. Graphical representation and calculation of properties of diagnostic tests.


Predictive values
The absolute probability that the disorder is present or absent. It is important to note
that predictive values depend on both test characteristics and the prevalence of the
disorder. For example, an early warning system that has a sensitivity and specificity
of 99% will have a positive predictive value of only 33% if the prevalence of clinical
deterioration is 5 out of 1000 admissions. In other words, for every 100 alerts, on
average, 67 will be false positive.
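The early-warning-system arithmetic above can be verified in a few lines:

```python
# Positive predictive value depends on prevalence as well as on the test.
sensitivity = 0.99
specificity = 0.99
prevalence = 5 / 1000  # 5 true deteriorations per 1000 admissions

true_positive = sensitivity * prevalence
false_positive = (1 - specificity) * (1 - prevalence)
ppv = true_positive / (true_positive + false_positive)

print(f"PPV = {ppv:.2f}")  # 0.33: roughly 67 of every 100 alerts are false
```

An excellent test applied to a rare event still generates mostly false alarms, which is exactly the alarm-fatigue problem of ICU early warning systems.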

Likelihood ratios
The probability that a specific result is obtained in patients with the condition divided by the probability that the same result is obtained in patients without the condition. A theoretic advantage of likelihood ratios is that they are independent of prevalence. In the case of dichotomous measures, the likelihood ratio for a positive result can be calculated as sensitivity/(1 − specificity) and the likelihood ratio for a negative result as (1 − sensitivity)/specificity.
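These two formulas translate directly into code; the 90%/80% test characteristics below are illustrative:

```python
# Likelihood ratios from sensitivity and specificity.
def positive_lr(sensitivity: float, specificity: float) -> float:
    return sensitivity / (1 - specificity)

def negative_lr(sensitivity: float, specificity: float) -> float:
    return (1 - sensitivity) / specificity

# Example: a test with 90% sensitivity and 80% specificity.
print(positive_lr(0.90, 0.80))  # ~4.5: a positive result raises suspicion
print(negative_lr(0.90, 0.80))  # ~0.125: a negative result lowers it
```

Note that neither function takes prevalence as an argument, which is the prevalence-independence mentioned above.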

Bayes theorem/reasoning
In 1736, Reverend Thomas Bayes (b. 1701) anonymously published “An introduction
to the doctrine of fluxions: and a defense of mathematicians against the objections of
the author of the analyst.”8 In it were the first echoes of what, many years later, would
be translated and reworked into Bayes theorem.
For epistemic rationality, probability estimates need to follow rules of objective
probability. The most important of these are as follows: (a) Probabilities always vary
between 0 and 1, (b) if an event is certain to happen, its probability is 1.0, (c) if an event
is certain not to happen, then its probability is 0, and (d) if 2 events cannot both
happen, then they are mutually exclusive and the probability of one OR the other
occurring is the probability of each added together.
An important concept is that of conditional probability. Conditional probabilities concern the probability of an event A given that another event B has occurred. If A and B were mutually exclusive, then the probability of A occurring given that B has occurred would be zero. Thus, conditional probabilities usually deal with events that are not mutually exclusive. (A simple example of mutually exclusive events is the diagnosis of cholecystitis in someone who presents with fever, jaundice, and right upper quadrant pain but has previously undergone a cholecystectomy. Given that cholecystectomy has occurred, the probability of cholecystitis is zero.) Bayes theorem describes the probability of an event occurring while using knowledge about the local prevalence or risk factors of the condition itself; these represent the posttest and pretest probabilities, respectively. Mathematically, Bayes theorem is represented by the formula

P(A|B) = P(A)P(B|A) / [P(A)P(B|A) + P(¬A)P(B|¬A)]

where P(A|B) describes the likelihood of A occurring given that B is true, P(B|A) describes the likelihood of B occurring given that A is true, P(A) describes the probability of A occurring, P(¬A) describes the probability of A not occurring, which is just 1 minus the probability of A occurring, and the denominator equals P(B), the total probability of B occurring. For judgment and decision making, Bayes theorem has special importance because it provides a formal framework for updating beliefs in a hypothesis given new evidence. In clinical practice, it can be stated as posttest odds = pretest odds × likelihood ratio. Arriving at, and being confident in, these values is the crux of Bayesian reasoning.
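The odds form of the update is easy to sketch; the pretest probability of 20% and the likelihood ratio of 5 below are illustrative:

```python
# Bayesian updating in odds form:
# posttest odds = pretest odds x likelihood ratio.
def to_odds(p: float) -> float:
    return p / (1 - p)

def to_probability(odds: float) -> float:
    return odds / (1 + odds)

pretest_probability = 0.20  # illustrative clinical estimate
lr_positive = 5.0           # a moderately strong positive result

posttest_odds = to_odds(pretest_probability) * lr_positive
posttest_probability = to_probability(posttest_odds)

print(f"posttest probability = {posttest_probability:.2f}")  # 0.56
```

The conversion to odds and back is what makes the multiplication legitimate; multiplying probabilities by a likelihood ratio directly can exceed 1.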


Using the sensitivity and specificity of a test, a likelihood ratio (LR) is calculated that addresses the question "what is the likelihood this patient has disease 'X' if presenting with complaint or test 'Y'?" Table 2 defines general values for LRs and their approximate effect on changing the posttest probability. Fig. 3 shows a conversion graph for determining likelihood ratios from the sensitivity and specificity of positive and negative test results.
Bayesian reasoning has tremendous utility in the critical care environment, wherein patients often present with a gamut of complex problems, physical examination findings, historical features, and laboratory/imaging investigations. Instead of asking the clinician to consider the full range of differential diagnoses for a particular set of data, Bayesian reasoning asks the clinician to determine which is most likely based on the aforementioned information. Mathematically, if there is a hypothesis (H) and a set of collected data elements (D), then

P(H|D) = P(H)P(D|H) / [P(H)P(D|H) + P(¬H)P(D|¬H)]

where P(H) is the probability of the hypothesis prior to collecting the data and P(¬H) is the probability that some alternate hypothesis is true before collecting the data. It is important to note that P(¬H) may represent any number of different hypotheses and that P(H) + P(¬H) may not equal 1.0. P(H|D) is the probability of the hypothesis after the observed data pattern (posterior probability). Similarly, P(D|H) is the probability of the data given the hypothesis and P(D|¬H) the probability of the data given the alternative hypotheses. Here it is important to realize that P(D|H) and P(D|¬H) are not complements and may not add
up to 1.0. An illustrative example is an elderly patient presenting to the ICU with hypotension and altered mental status, in which the initial probabilities of septic shock and cardiogenic shock may be assigned as 0.5 and 0.2, given that from prior experience 50% and 20% of such patients presented with septic or cardiogenic shock, respectively. If physical examination then reveals cool extremities and delayed capillary refill, and an echocardiogram reveals a wall motion abnormality with a depressed ejection fraction, a clinician may judge that the probability of these findings is 0.6 under cardiogenic shock and 0.2 under septic shock. Thus, the posterior probabilities of cardiogenic and septic shock would be calculated as 0.54 and 0.45, respectively.
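The arithmetic behind the shock example can be checked directly (the exact posteriors are 6/11 ≈ 0.545 and 5/11 ≈ 0.455, quoted in the text as 0.54 and 0.45):

```python
# Two-hypothesis Bayes update from the shock example. Note that the
# priors sum to 0.7, not 1.0: other causes of shock are simply not
# under consideration here.
prior_septic, prior_cardiogenic = 0.5, 0.2
p_data_septic, p_data_cardiogenic = 0.2, 0.6  # P(findings | hypothesis)

evidence = (prior_septic * p_data_septic
            + prior_cardiogenic * p_data_cardiogenic)

posterior_cardiogenic = prior_cardiogenic * p_data_cardiogenic / evidence
posterior_septic = prior_septic * p_data_septic / evidence

print(f"cardiogenic shock: {posterior_cardiogenic:.3f}")  # ~0.545
print(f"septic shock:      {posterior_septic:.3f}")       # ~0.455
```

The examination findings flip the ranking: cardiogenic shock starts with a much lower prior but ends with the higher posterior.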
An important question revolves around clinician estimates of pretest probabilities.
Judgment relies on several factors including the knowledge of the epidemiology of

Table 2
Approximate values for calculated likelihood ratios

Positive likelihood ratio value (+LR): approximate change in probability
  1: 0%
  2: +15%
  5: +30%
  10: +45%
Negative likelihood ratio value (−LR): approximate change in probability
  0.1: −45%
  0.2: −30%
  0.5: −15%
  1: 0%


Fig. 3. Conversion graph for determining likelihood ratio from sensitivity and specificity. (Adapted from Fischer JE, Bachmann LM, Jaeschke R. A readers' guide to the interpretation of diagnostic test properties: clinical example of sepsis. Intensive Care Med. 2003;29(7):1043-1051. https://doi.org/10.1007/s00134-003-1761-8; with permission.)

the suspected disorder, utilization of risk calculators or other validated analyses, and clinical gestalt. Of course, the time available for decision making may influence one's ability to apply these concepts. An applicable scenario is that of a middle-aged female who presents with fevers, hypoxemia, cough with expectoration of dark phlegm, and a dense right basilar infiltrate on CT imaging. A plausible hypothesis may be pulmonary blastomycosis. Using basic epidemiology alone, someone practicing in the southwestern USA might conclude that the likelihood is low and forgo additional investigation. Another clinician, practicing in a rural town in the upper Midwest, where blastomycosis is known to be hyperendemic, may conclude the pretest probability is high enough to either test via urinary antigen or even treat empirically while testing is pending. Risk calculators such as the PERC score9 (for ruling out pulmonary embolism (PE)), the LRINEC score10 (for diagnosing necrotizing soft tissue infection), and the PLASMIC score11 (for predicting ADAMTS13 deficiency in thrombotic thrombocytopenic purpura) are just a few examples of readily available tools that may help a clinician, in the right circumstances, increase or decrease the likelihood that a disorder exists. As with all categorical variables, however, it is vital to understand the population in which these tools were developed and their limitations. The PLASMIC score, for example, may only be used in a patient who already has thrombocytopenia and evidence of microangiopathic hemolytic anemia to then determine the probability that TTP exists.
In clinical practice and at the bedside, it is common for clinicians to use medical
gestalt in estimating the pretest probability of a disorder, but limitations and failures
exist. First, it should be noted that any estimate of the pretest probability of the disease
will be a function of that clinician’s biases and prior experience with such cases,
knowledge about the presentation of the disease state, and understanding of the utility
of a certain complaint or examination finding in increasing or decreasing the likelihood
that the disease exists. Wide variability in clinicians’ assessment of pretest
probability exists, adding to the lack of standardization. For example, Kline12 evalu-
ated clinician gestalt in estimating pretest probability for acute coronary syndrome
(ACS) or PE in patients presenting with chest pain and dyspnea. Not surprisingly, clinicians
significantly overestimated the probability of ACS (17% vs 4%; P < .001) and PE
(12% vs 6%; P < .01) when compared with a validated computerized method. On the

Downloaded for Anonymous User (n/a) at Taipei Medical University from ClinicalKey.com by Elsevier on February 21,
2022. For personal use only. No other uses without permission. Copyright ©2022. Elsevier Inc. All rights reserved.
62 Barash & Nanchal

other hand, Penaloza13 and colleagues performed a retrospective analysis of a prospective
observational cohort of consecutive PE patients and compared gestalt
with the Wells score and revised Geneva score. Clinicians (using gestalt) were more
likely to label patients as either having “low” or “high” clinical probability of PE, and
the prevalence of PE was significantly lower with gestalt in low clinical probability
groups. Likewise, gestalt better identified PE in nonhigh probability groups (low-medium
probability).
Numerous biases can affect one’s estimate of the pretest probability, as previously
mentioned in Table 1. A common bias is anchoring, which is the tendency to fixate on
certain features of a case (ie, the first impression); it commonly leads to confirmation
bias, whereby the clinician then looks for—and places emphasis on—evidence to
confirm their initial diagnosis. For example, a clinician may assume a patient with
recurrent UTIs has a toxic encephalopathy related to a recurrent UTI, assuming
elevated transaminases are the result of cholestasis of sepsis, while in fact disregarding
the presence of palmar erythema, spider nevi, and a palpable liver with international
normalized ratio (INR) elevation suggestive of decompensated cirrhosis. This is also
an example of a posterior probability error that perpetuates a previous diagnosis in
a given patient. Some authors have argued that a pitfall of the probabilistic approach
is that, by nature of its governing rules, it leads the clinician to disregard less common
disease processes as evidenced by their low societal prevalence and therefore low
pretest probability by proxy.14 Thus, the Bayesian approach may accurately predict
the likelihood of the disease in the broader population, but may fail to put the individual
patient, with their idiosyncrasies, into context. In other words, Bayesian reasoning
may provide the probability of a disease, but not necessarily the right answer.
Bayesian approach to diagnosis also assumes that values for sensitivity and speci-
ficity are fixed. These values change with the evolution of the disease process. For
instance, in the diagnosis of eosinophilic granulomatosis with polyangiitis
(EGPA), fixed or migratory pulmonary infiltrates carry a sensitivity and specificity of
15% to 30% and 92% to 94%, respectively.15 However, EGPA is a triphasic disease
with variable presentations and natural history that may take years to present with
classical findings. Although pulmonary infiltrates are a diagnostic criterion, in the initial
prodromic phase, this manifestation is generally absent and may lead a clinician to
disregard the diagnosis outright.
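The effect of such findings on the posterior probability can be made concrete. Assuming, for illustration, the upper ends of the ranges quoted above (sensitivity 30%, specificity 94%) and a hypothetical 20% pretest probability:

```python
def update(pretest, likelihood_ratio):
    """Posterior probability from a pretest probability and a likelihood ratio."""
    pretest_odds = pretest / (1 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)

sens, spec = 0.30, 0.94        # upper ends of the quoted ranges (assumed for illustration)
lr_pos = sens / (1 - spec)     # about 5.0
lr_neg = (1 - sens) / spec     # about 0.74

pretest = 0.20                 # hypothetical pretest probability
print(round(update(pretest, lr_pos), 2))   # infiltrates present: 0.20 -> 0.56
print(round(update(pretest, lr_neg), 2))   # infiltrates absent:  0.20 -> 0.16
```

Because the finding is so insensitive, its absence barely lowers the probability; ruling out the diagnosis on that basis alone would be unjustified, which mirrors the caution above about the prodromic phase.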

Value theory concepts


After a diagnosis has been established, clinicians must decide on appropriate treatment.
Often this is relatively simple, that is, administering available scientifically sound
therapies that are standard for a diagnosis (eg, antibiotics for pneumonia). However,
the choice of therapy frequently involves complex considerations that encompass
therapeutic (benefit vs harm), ethical, moral, social, and economic dilemmas spanning
patients, families, and society. A mathematic approach to these “value problems” is
defined by value theory. One form of this approach is described by the threshold
approach to treatment. In this approach, the clinician may (1) withhold further testing
and treatment, (2) administer empiric therapy, and (3) perform additional testing and
decide on therapy based on the results of this test.16 The choices are driven by a
“testing threshold”—the probability of disease at which there is no difference in the
value of withholding treatment and performing further testing and a “test-treatment
threshold”—the probability of disease at which there is no difference between per-
forming further testing and administering treatment. Calculations of these thresholds
involve the assimilation of data describing risks and benefits of treatment and the risks
of testing and tests characteristics indicating reliability. Treatment is withheld if the

probability of disease is below the testing threshold and given if it is greater than the
test-treatment threshold. Additional testing is performed if the probability lies between
these 2 thresholds with treatment contingent on the results of the test. An excellent
visual and mathematical review of this concept can be viewed in reference 16.
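Under simple expected-utility assumptions (a net benefit of treating the diseased, a net harm of treating the nondiseased, and a small net risk of the test itself), the 2 thresholds can be written in closed form. The sketch below is one common algebraic formulation, and the numbers are illustrative rather than drawn from any study:

```python
def thresholds(sens, spec, benefit, harm, test_risk):
    """Testing and test-treatment thresholds under simple expected-utility assumptions.

    benefit:   net utility gained by treating a diseased patient
    harm:      net utility lost by treating a nondiseased patient
    test_risk: net utility cost of the test itself
    """
    # Testing threshold: "test, then treat if positive" breaks even with withholding
    testing = ((1 - spec) * harm + test_risk) / (sens * benefit + (1 - spec) * harm)
    # Test-treatment threshold: testing breaks even with treating empirically
    test_treatment = (spec * harm - test_risk) / ((1 - sens) * benefit + spec * harm)
    return testing, test_treatment

# Illustrative values: an accurate, low-risk test for a treatable condition
t_test, t_treat = thresholds(sens=0.90, spec=0.90, benefit=1.0, harm=0.5, test_risk=0.01)
print(round(t_test, 3), round(t_treat, 3))   # 0.063 0.8: withhold below, treat above, test between
```

With these inputs, treatment is withheld below a disease probability of about 6%, given above about 80%, and the test is ordered in between.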
A related approach uses game theory first developed by Von Neumann.17 Game
theory is the study of interactions between individuals by applying mathematical
models of conflict and cooperation. These concepts may be applied to understand
the complexities of treatments in the context of a variety of provider, patient, and fam-
ily uncertainties.
Regardless of the approach, an important concept in value theory is that of the “ex-
pected value.” To illustrate this concept, let us assume that a particular patient has
one of the 2 diagnoses D1 and D2 with probabilities of 3/5 and 2/5, respectively.
Let us also assume that there are 2 treatments T1 and T2. T1 is 80% effective for diag-
nosis D1 and 30% effective for diagnosis D2. Similarly, T2 is 20% effective for diag-
nosis D1 and 90% effective for diagnosis D2. The expected value of treatment T1 is
3/5 × 80/100 + 2/5 × 30/100, which is equal to 60/100. The expected value of treatment
T2 is 3/5 × 20/100 + 2/5 × 90/100, which is equal to 48/100. In other words, given
the same circumstances, 60 out of 100 patients would receive effective treatment if T1
were used as opposed to 48 out of 100 if T2 were used; hence T1 should be the
preferred therapy. Assignment of treatment values may often require consideration
of intangibles such as moral and ethical standards which require physician judgment.
Values may be negative in cases where treatment may be associated with consider-
able harm. Often probabilities of diseases or syndromes are not known, and the physi-
cian must use judgment to assign them. However, an alternative method for
determining the optimal treatment is by maximizing the number of patients expected
to be cured. For the same treatment values given above, let F be the fraction of patients
receiving T1 and (1 − F) the fraction receiving T2. If all patients had disease D1, we
would expect to cure 80/100 × F + 20/100 × (1 − F), which equals 0.2 + 0.6F. Similarly,
if all patients had disease D2, then we would expect to cure 30/100 × F + 90/100 × (1 − F),
which equals 0.9 − 0.6F. Equating these 2 expressions, the worst-case number of patients cured is
maximized when F = 7/12, at which point we would expect a cure in 55% of patients.
Thus, a proportion equal to 7/12 should receive treatment T1 and 5/12 should receive
treatment T2. In recent years, these concepts have gained traction in the medical liter-
ature; one study demonstrated a reduction in antibiotic misuse by discretizing clinical
information using a game-theoretic approach.18 Although many other nuances may be
considered in the final decisions regarding disease processes and therapies, these
concepts provide a valuable workable framework to think about complex problems.
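The arithmetic in this section is easy to check mechanically. A short sketch reproducing the numbers above with exact fractions:

```python
from fractions import Fraction as F

p_d1, p_d2 = F(3, 5), F(2, 5)              # probabilities of diagnoses D1 and D2
eff = {"T1": (F(80, 100), F(30, 100)),     # (efficacy for D1, efficacy for D2)
       "T2": (F(20, 100), F(90, 100))}

# Expected value of each treatment across the two diagnoses
ev = {t: p_d1 * e1 + p_d2 * e2 for t, (e1, e2) in eff.items()}
print(ev["T1"], ev["T2"])                  # 3/5 12/25, ie, 60/100 and 48/100

# Mixed strategy: a fraction f of patients receives T1, the rest receive T2
def cure_d1(f):
    return eff["T1"][0] * f + eff["T2"][0] * (1 - f)   # 0.2 + 0.6 f
def cure_d2(f):
    return eff["T1"][1] * f + eff["T2"][1] * (1 - f)   # 0.9 - 0.6 f

# The guaranteed cure rate is maximized where the two lines cross: f = 7/12
f_star = F(7, 12)
print(cure_d1(f_star) == cure_d2(f_star), cure_d1(f_star))   # True 11/20, ie, 55%
```

Exact fractions avoid the floating-point rounding that can obscure whether two expected values are truly equal at the crossover point.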

NOISE

There are two components to errors in judgment: bias and noise. Although much
attention has been paid to bias, the role of noise, which may be equally or more important,
is often ignored. To fully understand error, it is imperative that we understand both
bias and noise and their relative contributions. Simply defined, noise is variability in
judgments that should be identical.19 This variability may occur when different individuals
judge the same situation or when the same individual judges identical situations.
The simplest example of noise in medicine is 2 physicians arriving at separate
diagnoses for the same patient. Diagnosis involves judgment and obviously, the
judgment of one or both physicians is incorrect (we may not know which); the divergent
opinions constitute noise. It is also well known that physicians arrive at different
diagnoses when presented twice with the same case.20–22 Noise invades both

diagnostic judgments as well as treatments.23–25 On closer examination, the pervasiveness
and magnitude of this phenomenon is astonishing.26–28 Variability in judgments
by the same person may be triggered by mood, fatigue, weather, and
sequence effects. Perhaps the triggers most pertinent to critical care medicine, especially
amid the COVID-19 pandemic, are stress and fatigue. One study of over 700,000 pri-
mary care visits demonstrated that physicians were significantly more likely to pre-
scribe opioids at the end of a long day.29 Such patterns were not seen in
prescriptions of nonsteroidal antiinflammatory drugs or physical therapy referrals.30
Similarly, prescriptions for antibiotics were more likely while those of flu shots and can-
cer screenings less likely at the end of a long day.31–33 Even more surprisingly, doctors
seem to wash their hands less at the end of the day.34 Since bias and noise play the
same role in the calculation of overall error (mean squared error, or
MSE = bias² + noise²),19 it is quite likely that there is a hidden epidemic of undiscovered
noise in the ICU. Some important examples of noise in the context of critical care
include poor interobserver reliability in the interpretation of chest radiographs for the
diagnosis of ARDS (reliability does not improve with training),35 clinical assessment
of work of breathing,36 discerning the etiology of acute kidney injury,37 as well as
the identification and timely diagnosis of sepsis.38 One can easily extrapolate how
variability in the diagnosis of such common disorders would, in turn, lead to variability
in the delivery of time-sensitive treatments such as antibiotics, volume resuscitation,
and provision of appropriate techniques of mechanical ventilation. Compounding
the problem and aggravating the variability in the decision-making process is the
occurrence of occasion noise as a result of stress, fatigue, and mood so prevalent
in high-strung ICU environments. Noise can be reduced by improvement in the skill
levels of physicians39; appropriate training is key to the reduction of error and to the
mitigation of both noise and bias. Algorithmic approaches such as deep learning
and artificial intelligence are known to reduce noise.40–42 With rapidly advancing tech-
nology, these approaches are likely to undergo refinements and become increasingly
ubiquitous in the future. Until these advances can occur, a relatively simple method of
reducing noise is through the implementation of guidelines. Guidelines help distill
complex decisions into relatively simpler sub-decisions; judgment is not altogether
eliminated but is simplified using rules and relevant predictors. Relevant examples
include the ABCDEF bundle and the protocols for the early identification and treatment
of sepsis, both of which have been associated with improved outcomes.43,44
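The error decomposition quoted above is an exact algebraic identity when noise is measured as the standard deviation of the judgments. A small check on made-up numbers:

```python
from statistics import fmean, pvariance

truth = 7.0                               # the correct value being judged
judgments = [6.0, 7.5, 8.0, 6.5, 9.0]     # hypothetical judgments of the same case

errors = [j - truth for j in judgments]
bias = fmean(errors)                      # systematic component of the error
noise_sq = pvariance(errors)              # variability of judgments (population variance)
mse = fmean([e * e for e in errors])      # overall error

print(round(mse, 10) == round(bias ** 2 + noise_sq, 10))   # True: MSE = bias^2 + noise^2
```

Because the components enter the decomposition symmetrically, halving noise reduces overall error just as much as halving bias does.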

SUMMARY

Clinical reasoning involves predictive judgment, which is afflicted by systematic error
that comprises bias and noise. Reasoning foundations can be mathematically
described by logic, probability, and value theory. Intuitive judgment of these mathe-
matical concepts leads to systematic errors in reasoning. Clinicians should familiarize
themselves with biases and their mathematical underpinnings to improve reasoning.
Noise is pervasive in medicine and may be reduced using algorithmic approaches
and guidelines.

CLINICS CARE POINTS

• Clinical reasoning is afflicted by systematic errors in judgment or biases
• Misconceptions of statistical and probability concepts underpin most biases
• Intensivists should familiarize themselves with mathematical concepts to improve decision hygiene

• An equally important component to errors in judgment is noise
• Noise is pervasive and attempts should be made to reduce it

DISCLOSURE

The authors (M. Barash and R.S. Nanchal) have nothing to disclose.

REFERENCES

1. Tversky A, Kahneman D. Judgment under uncertainty: heuristics and biases. Science
1974;185(4157):1124–31.
2. Ledley RS, Lusted LB. Reasoning foundations of medical diagnosis; symbolic
logic, probability, and value theory aid our understanding of how physicians
reason. Science 1959;130(3366):9–21.
3. Wainer H. The most dangerous equation. Am Sci 2007;95(3):249.
4. Galton F. Natural inheritance. London: Macmillan; 1889.
5. Bar-Hillel M. On the subjective probability of compound events. Organizational
Behavior & Human Performance 1973;9(3):396–406.
6. Cohen J, Chesnick EI, Haran D. Evaluation of compound probabilities in sequen-
tial choice. Nature 1971;232(5310):414–6.
7. Cahan A. Diagnosis is driven by probabilistic reasoning: counter-point. Diagnosis
(Berl) 2016;3(3):99–101.
8. Bayes T, Noon J. An introduction to the doctrine of fluxions; and, Defense of the
mathematicians against the objections of the author of the analyst, so far as they
are designed to affect their general methods of reasoning. London: Printed for J.
Noon; 1736.
9. Kline JA, Mitchell AM, Kabrhel C, et al. Clinical criteria to prevent unnecessary
diagnostic testing in emergency department patients with suspected pulmonary
embolism. J Thromb Haemost 2004;2(8):1247–55.
10. Wong CH, Khin LW, Heng KS, et al. The LRINEC (Laboratory Risk Indicator for
Necrotizing Fasciitis) score: a tool for distinguishing necrotizing fasciitis from
other soft tissue infections. Crit Care Med 2004;32(7):1535–41.
11. Bendapudi PK, Upadhyay V, Sun L, et al. Clinical scoring systems in thrombotic
microangiopathies. Semin Thromb Hemost 2017;43(5):540–8.
12. Kline JA, Stubblefield WB. Clinician gestalt estimate of pretest probability for
acute coronary syndrome and pulmonary embolism in patients with chest pain
and dyspnea. Ann Emerg Med 2014;63(3):275–80.
13. Penaloza A, Verschuren F, Meyer G, et al. Comparison of the unstructured clini-
cian gestalt, the wells score, and the revised Geneva score to estimate pretest
probability for suspected pulmonary embolism. Ann Emerg Med 2013;62(2):
117–24.e2.
14. Jain BP. Why is diagnosis not probabilistic in clinical-pathological conference
(CPCs): point. Diagnosis (Berl) 2016;3(3):95–7.
15. Masi AT, Hunder GG, Lie JT, et al. The American College of Rheumatology 1990
criteria for the classification of Churg-Strauss syndrome (allergic granulomatosis
and angiitis). Arthritis Rheum 1990;33(8):1094–100.
16. Pauker SG, Kassirer JP. The threshold approach to clinical decision making.
N Engl J Med 1980;302(20):1109–17.
17. Neumann JV, Morgenstern O. Theory of games and economic behavior. Prince-
ton (NJ): Princeton University Press; 1963.

18. Diamant M, Baruch S, Kassem E, et al. A game theoretic approach reveals that
discretizing clinical information can reduce antibiotic misuse. Nat Commun 2021;
12(1):1148.
19. Kahneman D, Sibony O, Sunstein CR. Noise a flaw in human judgement. London:
William Collins; 2021.
20. Robinson PJ, Culpan G, Wiggins M. Interpretation of selected accident and emer-
gency radiographic examinations by radiographers: a review of 11000 cases. Br
J Radiol 1999;72(858):546–51.
21. Detre KM, Wright E, Murphy ML, et al. Observer agreement in evaluating coro-
nary angiograms. Circulation 1975;52(6):979–86.
22. Banky M, Clark RA, Pua YH, et al. Inter- and intra-rater variability of testing veloc-
ity when assessing lower limb spasticity. J Rehabil Med 2019;51(1):54–60.
23. OECD. Geographic Variations in Health Care. 2014.
24. Hurley MP, Schoemaker L, Morton JM, et al. Geographic variation in surgical out-
comes and cost between the United States and Japan. Am J Manag Care 2016;
22(9):600–7.
25. Appleby J, Raleigh V, Frosini F, et al. Variations in health care: the good, the bad
and the inexplicable. London: King’s Fund; 2011.
26. Speciale AC, Pietrobon R, Urban CW, et al. Observer variability in assessing lum-
bar spinal stenosis severity on magnetic resonance imaging and its relation to
cross-sectional spinal canal area. Spine (Phila Pa 1976) 2002;27(10):1082–6.
27. Farmer ER, Gonin R, Hanna MP. Discordance in the histopathologic diagnosis of
melanoma and melanocytic nevi between expert pathologists. Hum Pathol 1996;
27(6):528–31.
28. Palazzo JP, Hyslop T. Hyperplastic ductal and lobular lesions and carcinomas
in situ of the breast: reproducibility of current diagnostic criteria among commu-
nity- and academic-based pathologists. Breast J 1998;4(4):230–7.
29. Neprash HT, Barnett ML. Association of primary care clinic appointment time with
opioid prescribing. JAMA Netw Open 2019;2(8):e1910373.
30. Philpot LM, Khokhar BA, Roellinger DL, et al. Time of day is associated with
opioid prescribing for low back pain in primary care. J Gen Intern Med 2018;
33(11):1828–30.
31. Linder JA, Doctor JN, Friedberg MW, et al. Time of day and the decision to pre-
scribe antibiotics. JAMA Intern Med 2014;174(12):2029–31.
32. Hsiang EY, Mehta SJ, Small DS, et al. Association of primary care clinic appoint-
ment time with clinician ordering and patient completion of breast and colorectal
cancer screening. JAMA Netw Open 2019;2(5):e193403.
33. Kim RH, Day SC, Small DS, et al. Variations in influenza vaccination by clinic
appointment time and an active choice intervention in the electronic health record
to increase influenza vaccination. JAMA Netw Open 2018;1(5):e181770.
34. You X, Tan H, Hu S, et al. Effects of preconception counseling on maternal health
care of migrant women in China: a community-based, cross-sectional survey.
BMC Pregnancy Childbirth 2015;15:55.
35. Goddard SL, Rubenfeld GD, Manoharan V, et al. The randomized educational
acute respiratory distress syndrome diagnosis study: a trial to improve the radio-
graphic diagnosis of acute respiratory distress syndrome. Crit Care Med 2018;
46(5):743–8.
36. de Groot MG, de Neef M, Otten MH, et al. Interobserver Agreement on Clinical
Judgment of Work of Breathing in Spontaneously Breathing Children in the Pedi-
atric Intensive Care Unit. J Pediatr Intensive Care 2020;9(1):34–9. https://doi.org/
10.1055/s-0039-1697679.

37. Koyner JL, Garg AX, Thiessen-Philbrook H, et al. Adjudication of etiology of acute
kidney injury: experience from the TRIBE-AKI multi-center study. BMC Nephrol
2014;15:105.
38. Vincent JL. The clinical challenge of sepsis identification and monitoring. PLoS
Med 2016;13(5):e1002022.
39. Tsugawa Y, Newhouse JP, Zaslavsky AM, et al. Physician age and outcomes in
elderly patients in hospital in the US: observational study. BMJ 2017;357:j1797.
40. Rodriguez-Ruiz A, Lång K, Gubern-Merida A, et al. Stand-alone artificial intelli-
gence for breast cancer detection in mammography: comparison with 101 radi-
ologists. J Natl Cancer Inst 2019;111(9):916–22.
41. Richens JG, Lee CM, Johri S. Improving the accuracy of medical diagnosis with
causal machine learning. Nat Commun 2020;11(1):3923.
42. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep
learning algorithm for detection of diabetic retinopathy in retinal fundus photo-
graphs. JAMA 2016;316(22):2402–10.
43. Seymour CW, Gesten F, Prescott HC, et al. Time to treatment and mortality during
mandated emergency care for sepsis. N Engl J Med 2017;376(23):2235–44.
44. Pun BT, Balas MC, Barnes-Daly MA, et al. Caring for critically ill patients with the
ABCDEF bundle: results of the ICU liberation Collaborative in over 15,000 adults.
Crit Care Med 2019;47(1):3–14.
