
Emergency Department Experiment in Displaying an Algorithmic Wait Time Prediction

In a hospital that aims to have fewer patients leave the Emergency Department without being seen by a
physician (LWBS), we field-tested two approaches for displaying an algorithmic prediction of low-acuity
patients’ wait time to see a physician. The first approach is the prediction rounded to a multiple of 10 minutes,
and the second is an interval designed to communicate that the wait time could be up to 20 minutes longer.
Relative to the control with no wait time information, both approaches significantly reduce the likelihood
of LWBS, with the interval approach being more effective. Improved waiting satisfaction, as indicated by
our incentivized satisfaction survey of ED patients, and a higher anticipated wait time with the interval
approach, indicated by our online experiment, may contribute to these effects. Consistent with prospect
theory, we find that to the extent that patients' actual wait time exceeds the displayed wait time, they are
more likely to LWBS. According to the emergency medicine literature, many patients with ESI level 4-5
and complaint “dental pain” or “medication refill” need not be in the ED. Unfortunately, our intervention
is most effective at reducing LWBS by those patients.

Key words: Health Care Management, Randomized Experiments, Algorithmic Prediction, Delay
Announcement, Causal Inference, Empirical Research

1. Introduction
As a growing number of Emergency Departments display the wait time to see a physician, the Amer-
ican College of Emergency Physicians (ACEP 2012) has raised concern that this deters patients
from waiting for treatment when the displayed wait is long. Moreover, the displayed wait times
are often misleading for patients, so ACEP has called for improved accuracy. The Q-Lasso machine
learning algorithm can predict patients’ wait times with much lower mean squared error than the
rolling averages commonly used by hospitals (Ang et al. 2015).
Could an Emergency Department (ED) reduce the number of patients that choose to leave
without being seen by a physician (LWBS) by displaying an algorithmic wait time prediction?
How? To address those questions, we conducted a field experiment with the San Mateo Medical
Center (SMMC), a public county hospital in California, wherein low-acuity¹ ED patients often
experience long wait times to see a physician. We hung a screen in the SMMC ED waiting room to
display the Q-Lasso algorithmic prediction of low-acuity patients’ wait time from triage to be seen
¹ “Low-acuity” refers to patients who are triaged to Emergency Severity Index (ESI) 3, 4, or 5, indicating that
they can safely wait to be seen by a physician, whereas high-acuity patients, those triaged to ESI 1 or 2, must
be seen by a physician immediately and have preemptive priority, so essentially no high-acuity patients
leave without being seen.


by a physician. Prior to our intervention, approximately 3% of low-acuity patients would LWBS.


Reducing the likelihood that low-acuity patients LWBS is a primary objective for SMMC.
We had to overcome algorithm aversion and nonadherence among the ED nurses. Initially, we
displayed the sharp algorithmic prediction (e.g., “The estimated wait time to see a physician for
low-acuity patients is 41 minutes”) and some nurses would turn off the screen to prevent patients from
seeing the prediction. Though the algorithm is optimized for accuracy, it nevertheless makes some
prediction errors. Nurses complained about errors of the type in which patients wait longer than
predicted, feel dissatisfied and complain to the nurses. Therefore, in collaboration with the nurses,
we identified two candidate ways to display the algorithmic wait time prediction so as to mitigate
patient dissatisfaction, have nurses keep the screen on, and, hopefully, have fewer patients LWBS.
The first way is to round the algorithmic prediction to the nearest multiple of 10 minutes. Cogni-
tive psychology literature (Rosch 1975, Donnelly et al. 2021) shows that people perceive a multiple
of 10 as representative of the numbers that are close to that multiple of 10. For example, people
tend to perceive 42 as essentially the same as 40, whereas they perceive 42 as different from 41,
though the difference 42 − 40 = 2 is larger than 42 − 41 = 1. Hence displaying a wait time of 40 minutes,
rather than the sharper 41 minutes, could cause patients to perceive the displayed information
as essentially correct even when they actually wait longer than the prediction. Presumably that
would mitigate dissatisfaction. Indeed, in an online experiment, we found that participants with
wait time prediction 40 minutes tend to anticipate a longer maximum wait time than do partici-
pants with the larger wait time prediction 41 minutes. Furthermore, with wait time prediction 40
minutes and actual wait time 50 minutes, most participants perceive the wait time prediction as
essentially correct. This suggests that rounding the algorithmic wait time prediction to a multiple
of 10 minutes will mitigate patients’ dissatisfaction in the event that their actual wait time is longer
than the algorithmic prediction.
The second, complementary way, recommended by SMMC nurses, is to communicate that the
wait time could be up to 20 minutes longer, by displaying an interval. The lower end of the interval
is the algorithmic prediction rounded to the nearest 10 minutes. The upper end of the interval is
20 minutes larger. Indeed, in an online experiment, we confirmed that participants with interval
prediction 40-60 minutes tend to anticipate a longer wait time and have less dissatisfaction
with a 70 minute actual wait than do participants with prediction 40 minutes. According to prospect
theory (Tversky and Kahneman 1991), patients are averse to waiting longer than they anticipated,
so our interval prediction, designed to increase patients’ anticipated wait above the algorithmic
prediction, will better mitigate dissatisfaction in the event that the actual wait time is longer
than the algorithmic prediction. The trade-off is that by increasing patients’ anticipated wait time,
displaying the interval might induce patients to LWBS rather than wait to see a physician.

This motivates the design of our field experiment in the SMMC ED. We dynamically rotate
among three different displays: (1) a point estimate of the wait time for low-acuity patients: Q-
Lasso algorithm’s prediction rounded to the nearest multiple of 10 minutes; (2) an interval estimate
of the wait time for low-acuity patients, ranging from our point estimate to 20 minutes larger; (3)
no wait time information.
To investigate whether and how these three different displays affect patient satisfaction, which
may mediate LWBS behavior, we automate a satisfaction survey for low-acuity patients in the
SMMC waiting room. Low survey response rate is a major challenge in EDs (Pines et al. 2018,
Westphal et al. 2022) that we address in two ways. First, our incentivized survey method increased
the response rate by a factor of nearly 5, from a 0.5% response rate to the standard Press Ganey survey²
to a 2.4% response rate to our survey. Second, we measure and correct for any non-response sam-
ple selection bias. This is important because Compton et al. (2019) found that characteristics of
patients who respond to a Press Ganey survey differ from those of the overall population and such
sample selection bias, uncorrected, could result in erroneous causal claims and misdirected policy
(Berg 2005). We develop a Heckman-type (Heckman 1976) two-stage treatment effects model that
involves a first stage to explain patients' survey participation and a second stage to explain patients'
satisfaction while waiting. Importantly, we incorporate instrumental variables that predict survey
participation (selection) but not the outcome of interest (patient satisfaction).
Another major challenge in studying patients’ LWBS behavior is that one cannot observe the
wait time for a patient who chooses to LWBS, because the ED does not record the time at which
he leaves (Batt and Terwiesch 2015). As a substitute, we develop an algorithm to estimate the time
a LWBS patient would have waited to see a physician if he had stayed. By using features of the
backlog of diagnostic tests in our algorithm, we find a large accuracy improvement. This approach
could be useful for researchers and practitioners to more accurately predict wait times for patients
in other health care settings.
In closely-related empirical ED literature, Batt and Terwiesch (2015) find, in an ED that does not
display wait time information, an association between patients' LWBS decisions and the observable
queue of other patients in the waiting room. They advocate providing wait time information to
better manage LWBS, as do Arendt et al. (2003) and Johnson et al. (2012). In a field experiment,
Westphal et al. (2022) provide individualized wait time and/or operational information via patients’
cell phones; those who received wait time and operational information had higher satisfaction but were
more likely to leave without completing treatment than those who received only operational information
(no wait time information). That reinforces the concern raised by ACEP (2012) that providing
wait time information deters patients from waiting.
² Press Ganey (PG) is the largest provider of tools for patient satisfaction measurement and analysis.

The stakes are high. Leaving without being seen increases a patient's risk of later hospitalization or adverse
health outcomes (Mataloni et al. 2018). To improve health outcomes, the Centers for Medicare and
Medicaid Services provides financial incentives for hospitals to reduce the % LWBS (CMS 2014).
However, inducing the “right” patients to LWBS could improve health outcomes for those
patients and others. Some low-acuity patients, including many ESI 4-5 patients with chief com-
plaints such as “dental pain” or “medication refill”, need not be treated in the ED and cannot obtain
the best treatment in the ED, yet their presence exacerbates ED overcrowding (Currie et al. 2017,
Wilsey et al. 2008, Thrasher et al. 2019). Having low-acuity patients LWBS reduces high-acuity
patients’ wait time to triage and wait time to see a physician, which improves health outcomes for
high-acuity patients (Luo et al. 2017). Our display of wait time information has greater effect on
% LWBS among ESI 4-5 patients with “dental pain” or “medication refill” type complaints than
on other low-acuity patients with greater need for ED resources.
In complementary theoretical work, Anunrojwong et al. (2022) show how to provide wait time
information so as to deter the right patients (patients who least need the service) from waiting for
service. In a multi-server queue, Whitt (1999) shows how providing wait time information reduces
customers’ average wait time, by enabling customers to balk when the wait is long. In a model in
which both the service provider and customers act strategically, Allon et al. (2011) show the value
of providing an interval estimate of the wait time rather than the precise wait time. We refer the
reader to Ibrahim (2018) and Westphal et al. (2022) for excellent reviews of the theoretical literature
on how providing wait time information can improve the performance of queueing systems and
Anunrojwong et al. (2022) for an excellent review of the theoretical literature on information design.
Other relevant literature examines how to improve customers’ experience of waiting. Yu et al.
(2017) provide empirical evidence from a call center that predicting a longer wait time reduces
customers’ cost per unit time waiting. Consistent with prospect theory, when the call center cus-
tomers wait longer than they anticipated, their cost of waiting increases; providing an accurate wait
time prediction can change customers’ anticipated wait time and thereby reduce their waiting cost,
whereas inaccurate information is ineffective (Yu et al. 2021). Other ways to improve the customer
experience of waiting include providing a wait time guarantee (Kumar et al. 1997), increasing the
feeling of progress (Soman and Shi 2003, Westphal et al. 2022), service segmentation and time
fillers in between (Carmon et al. 1995), appropriate sequencing (Chase and Dasu 2001), improv-
ing perceived fairness (Larson 1987), shaping memories (Norman 2009), and providing operational
transparency (Buell and Norton 2011, Buell et al. 2017).

2. Online Experiments
We want to understand how rounding an algorithmic wait time prediction (e.g., 41 minutes) to
its nearest multiple of 10 minutes (40 minutes) might affect how people interpret the informa-
tion. Therefore, in our first online experiment (all details of which are provided in EC.1), we

essentially guide participants to imagine abdominal pain, triage, and waiting in the ED to see a
physician. Randomly, we display one of the following to each participant: (1) Estimated wait time
to see a physician: 41 minutes; (2) Estimated wait time to see a physician: 40 minutes; (3) a blank
screen. We then ask: What is the maximum amount of time that you would need to wait to see
a physician? What is the minimum amount of time that you would need to wait to see a physician? We
inform every participant that they actually wait for 50 minutes to see a physician. Of participants
to whom we displayed a wait time estimate, we ask if they perceive the quality of that informa-
tion to be essentially misleading vs. essentially correct. Lastly, we ask participants to rate their
satisfaction. We have approximately 100 participants in each of the three treatments, for a total
of 300.
As is evident from Table 1, with the wait time prediction rounded down to 40 minutes, par-
ticipants tend to anticipate a higher maximum wait time and lower minimum wait time than do
participants with wait time prediction of 41 minutes. Moreover, even though they experience a
larger prediction error (50 minutes - 40 minutes = 10 minutes vs. 50 minutes - 41 minutes = 9
minutes), they are more likely to perceive the quality of the wait time information as essentially cor-
rect. With either wait time estimate, participants tend to have higher satisfaction than do the
participants with none. All the aforementioned results are statistically significant.

Table 1 Online Experiment 1 Summary: the survey responses are organized by experimental conditions:
displaying 41 minutes, 40 minutes, or no information (None). The mean and standard deviations (in parentheses)
are calculated in minutes or in percentages for each condition. The actual wait time for all three conditions is 50
minutes.

Wait time prediction in minutes               41         40         None
Anticipated maximum wait time in minutes      61 (25)    76 (71)    107 (102)
Anticipated minimum wait time in minutes      34 (12)    27 (17)    27 (27)
Actual wait time in minutes                   50         50         50
Perceive information as essentially correct   71%        83%        NA
Satisfaction on scale from 1 to 10            5.6 (2.4)  5.4 (2.3)  4.5 (2.2)

We want to understand how displaying the wait time point prediction of 40 minutes versus the
interval 40-60 minutes might affect people's anticipated wait time and satisfaction. As recommended
by SMMC nurses, the interval additionally informs patients that they may wait up to 20 minutes
longer. Therefore, we repeated our experiment, with the modification that we randomly display
to each participant one of the following: (1) Estimated wait time to see a physician: 40 minutes;
(2) Estimated wait time to see a physician: 40 - 60 minutes, and then elicit the participant's
subjective distribution of the anticipated wait time to see a physician, as illustrated in Figure
EC.2. Next, we randomize among three actual wait times: 30 minutes, 50 minutes, and 70 minutes,
before asking participants if they perceive the quality of the displayed wait time information to be
essentially misleading vs. essentially correct, and asking them to rate their satisfaction. We have
approximately 100 participants in each of the 2 x 3 = 6 treatments, for a total of 600. (All details
of the experiment are in the EC.1.)
As is evident from Table 2, with the interval estimate 40-60 minutes, participants' subjective
distributions of the anticipated wait time have greater mean and coefficient of variation than do
those of participants with the point estimate 40 minutes. Moreover, among participants with an actual
wait time of 70 minutes, those who were displayed the interval estimate of 40-60 minutes are more likely
to perceive that wait time information as essentially correct and have higher satisfaction than do
participants who were displayed only the point estimate 40 minutes. All the aforementioned results
are statistically significant.

Table 2 Online Experiment 2 Summary: the survey responses are organized by experimental conditions,
displaying 40 minutes or 40 to 60 minutes (40 - 60), and for each information displayed, we vary the actual wait
time with three levels (30 minutes, 50 minutes, and 70 minutes). The mean and standard deviations (in
parentheses) are calculated in minutes or in percentages for each condition. C.V. stands for coefficient of
variation.

Wait Time Prediction in minutes               40            40-60
Mean anticipated wait time                    46 (12)       52 (13)
C.V. anticipated wait time                    0.30 (0.16)   0.37 (0.20)

Actual wait time in minutes                   30      50      70      30      50      70
Perceive information as essentially correct   87%     86%     13%     75%     95%     59%
Satisfaction on scale from 1 to 10            6.8 (2.4)  5.0 (2.2)  3.6 (1.9)  6.2 (2.7)  5.4 (2.4)  4.7 (1.8)

A common observation in both experiments is that with an algorithmic prediction of the wait
time rounded to the nearest multiple of 10 minutes (40 minutes) or with the 20 minute interval
(40 - 60 minutes), most participants with an actual wait time within 10 minutes of the prediction
perceive the quality of the wait time information to be essentially correct. This was true even if
the actual wait time was 10 minutes shorter or longer than the point estimate, and even if the
actual wait time was 10 minutes shorter than the lower bound or 10 minutes longer than the upper
bound of the interval estimate. We conclude that people tend to interpret an integer multiple of 10
as representative of the numbers between the preceding lower integer multiple of 10 and the next
higher integer multiple of 10.

3. Design of the Field Experiment


We experimented with three designs for displaying an algorithmic prediction of low-acuity patients'
wait time (or not):
Design 1 (D1): No wait time information.
Design 2 (D2): Q-Lasso prediction rounded to nearest multiple of 10 minutes.
Design 3 (D3): Interval with lower bound as in D2 and upper bound being 20 minutes larger.
Figures 1 - 3 show how D1, D2 and D3 each would appear to low-acuity patients, displayed at
the top of the screen in the SMMC ED where low-acuity patients wait after triage to be seen by a
physician. In this specific example of D2 and D3, at 10:38 AM on May 22nd 2019, the Q-Lasso
prediction was 34.2 minutes, and therefore D2 displayed 30 minutes and D3 displayed 30-50 minutes. The
Q-Lasso prediction of low-acuity patients' wait time from triage to be seen by a physician is based
on the current state of the ED (count of patients of each acuity level by stage in service, time
of day, staffing, etc.) and combines machine learning and queueing theory to minimize the mean
squared error, as described in Ang et al. (2015).
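As a minimal sketch (our illustration, not the deployed display code), the D2 and D3 strings can be derived from a raw Q-Lasso prediction as follows:

```python
# Illustrative sketch: derive the D2 and D3 displays from a raw Q-Lasso
# prediction in minutes; e.g., 34.2 -> D2 "30 minutes", D3 "30-50 minutes".
def build_displays(qlasso_minutes: float) -> tuple[str, str]:
    point = int(round(qlasso_minutes / 10.0)) * 10  # nearest multiple of 10
    d2 = f"{point} minutes"                         # point estimate (D2)
    d3 = f"{point}-{point + 20} minutes"            # interval, +20 min (D3)
    return d2, d3

print(build_displays(34.2))  # ('30 minutes', '30-50 minutes')
```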
Over the course of our experiment, from April 1st, 2019 to December 1st, 2020, we rotated among
D1, D2 and D3 so as to experiment with each design at each time in a 24 hour day (important due
to the diurnal variation in ED conditions) while having few patients observe a change in design
(we discard the data for patients who were waiting while a change occurred). Specifically, at time
intervals of 8 hours and 20 minutes, we changed the design on display, so that a complete cycle
through D1, D2, and D3 took exactly 25 hours. Figure 4 illustrates how the daily schedule for
display of D1, D2 and D3 would advance over time. This rotation schedule provides a pseudo
randomization in the sense that patients’ wait times are not statistically different in our data
collected under D1, D2 and D3, as shown in the summary statistics of patients' actual wait time
across experimental conditions in Table 4.
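A minimal sketch of that rotation logic (our reconstruction from the schedule described above; the exact slot boundaries and start time are assumptions):

```python
# Sketch of the 25-hour rotation: the display changes every 8 h 20 min, so a
# full D1 -> D2 -> D3 cycle takes 25 hours and the daily schedule shifts by
# one hour per cycle, covering every time of day over the experiment.
from datetime import datetime, timedelta

ROTATION = timedelta(hours=8, minutes=20)        # 500 minutes per design
DESIGNS = ["D1", "D2", "D3"]                     # 3 x 500 min = 25 h
START = datetime(2019, 4, 1)                     # Day 1 (assumed midnight start)

def design_on_display(t: datetime) -> str:
    """Return the design shown at time t (t >= START) under the rotation."""
    slots_elapsed = int((t - START) / ROTATION)  # completed 8h20m slots
    return DESIGNS[slots_elapsed % 3]

print(design_on_display(datetime(2019, 5, 22, 10, 38)))  # 'D2' under these assumptions
```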
At the bottom of the screen, we invited all patients to participate in a survey and lottery to win
a $50 Amazon gift card, as shown in Figure 5. As is evident in the screen shot, all text appeared in
both Spanish and English, and patients could choose to take the survey in either of those languages.
When patients texted “EEE” (for English) or “SSS” (for Spanish) to the shortcode, they were
texted a link to a consent form, including a short description of the survey (Figure 7).
Patients who gave consent were routed to a 3-question survey. Question 1 asked patients to rate
their satisfaction with their wait time to be seen by a physician on a scale from 1 “very unsatisfied”
to 5 “very satisfied”. Question 2 asked patients to rate their pain level on a scale from 1 “no
pain” to 10 “the worst pain”. Question 3 asked patients whether or not they were accompanied by
a companion or family member.

Figure 1 The top of the screen for D1

Figure 2 The top of the screen for D2

Figure 3 The top of the screen for D3

Figure 4 The Experiment Schedule: the small shaded boxes at the top indicate the hours of
the day. The experiment started on April 1st, 2019 (indicated as Day 1); the rotation period for
each experimental design is marked with black brackets. The vertical black dashed line indicates
the starting time of the D1, D2, D3 rotation.

Patients who completed the survey were given a link to share their questions, concerns or sug-
gestions, or to update their responses at any later time, though none chose to do so. They were

Figure 5 The bottom of the screen

also given a second consent form (Figure 8) requesting name, phone number and consent to use
that information to match the survey response with the patient’s actual wait time from triage to
be seen by a physician.
Over the course of our experiment (from April 1st, 2019 through December 1st, 2020), unin-
tentional randomization occurred in that the TV screen went blank during random electrical con-
nection interruptions and IT/software maintenance and updating. While the screen was blank, no
wait time information was provided to low-acuity patients in the ED and none could participate
in the satisfaction survey. We treat the occurrences of the blank screen as natural experiments,
and conduct placebo tests of our results using data from patients when the screen was blank. The
placebo tests, reported in Tables EC.12, EC.13, and EC.14, confirm that none of the effects found
while the screen was on occurred while the screen was blank.

4. Analysis and Results


This section presents our analysis and results for the 9787 low-acuity patients who waited after
triage in the ED while our screen was on and did not see an information change on the TV screen
when the Q-Lasso prediction was updated³. Of these patients, 224 completed the satisfaction
survey.

4.1. Estimating the Wait Time that Would Have Occurred for Patients Who LWBS
It is important to control for patients’ actual wait time before being seen by a physician when
evaluating the effect of the provision and format of the wait time information. However, we did not
observe the actual wait time for patients who LWBS. As a result, we used machine learning to
estimate the wait time that would have occurred for patients who left the ED without being seen
by a physician. From April 1st, 2019 to December 31st, 2020, 53703 patients visited the SMMC
ED. We excluded 197 patients who left the ED without being triaged when building our model.

³ The Q-Lasso prediction was updated every 10 minutes, i.e., the online inference prediction pipeline was run
every 10 minutes.

We selected a set of relevant features that could potentially influence the wait time, such as the
time of day, the day of the week, the patient’s triage level, the patient’s mode of arrival, and the
overall workload in the ED. By using features of the backlog of diagnostic tests in our algorithm,
we find a large accuracy improvement. All features and their summary statistics are reported
in EC.1.2.
We used a pool of candidate machine learning algorithms, including Lasso, Ridge, Elastic Net,
Random Forest, XGBoost, and Multi-layer Perceptron (MLP) Neural Network. We trained each
algorithm on a randomly selected 80% of the data, and performed a 5-fold cross-validation to
obtain the optimal hyperparameters that minimize the mean squared error (MSE) for each. We
then used the trained model to predict the wait time of the remaining 20% of the data, and used
the MSE to evaluate the performance of each algorithm. (All details of the model building are in
EC.1.2.)
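A minimal sketch of this procedure in Python/scikit-learn (synthetic placeholder data stand in for the features of EC.1.2, and the hyperparameter grid is illustrative):

```python
# Sketch of the model-selection procedure: 80/20 split, 5-fold CV to tune
# hyperparameters by MSE, then evaluation on the held-out 20%.
import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))        # placeholder ED-state features
y = rng.gamma(2.0, 20.0, size=1000)   # placeholder wait times (minutes)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10]},
    scoring="neg_mean_squared_error",
    cv=5,                              # 5-fold cross-validation
)
search.fit(X_tr, y_tr)

test_mse = mean_squared_error(y_te, search.best_estimator_.predict(X_te))
print(f"held-out MSE: {test_mse:.1f}")
```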
After comparing the performance of the different algorithms, we found that the Random Forest
algorithm performed the best, yielding a test-set MSE of 117.2 with a standard error of
3.1. This indicates good predictive performance for our machine learning model. Table
EC.7 shows the MSEs of the other methods. Also, note that this error is much smaller than
the MSEs of the methods in the literature (see Ang et al. 2015). We used the trained
Random Forest model to estimate the wait time that would have occurred for patients who chose
to LWBS.
In the remainder of the paper, we use these estimates to approximate the wait time that would
have occurred for patients who LWBS.

4.2. Effect of Wait Time Information on Patients’ Likelihood to LWBS


In this section, we evaluate the effect of wait time information (i.e., our display of the Q-Lasso
algorithmic wait time prediction in design D2 and D3) on the likelihood that low-acuity patients
LWBS from the SMMC ED.
We focus on low-acuity patients (patients triaged to ESI 3, 4, or 5 indicating that they can safely
wait to see a physician) because none of the patients triaged to ESI 1 or 2 LWBS, at least not from
SMMC ED during the period for which we have data. Among low-acuity patients, ESI 3 indicates
that these patients need multiple ED resources, whereas ESI 4-5 patients need at most one. In each
of those low-acuity categories, Table 3 reports the total number of patients, number who chose to
LWBS, and LWBS %. Due to the concern raised in the emergency medicine literature that many ESI
4-5 patients with chief complaint of “dental pain” or “medication refill” need not be in the ED and
would not get the best care in the ED (Currie et al. 2017, Wilsey et al. 2008), we also focus on
those patients, which we call “ESI 4-5 minor” for brevity; notice the LWBS percentage is higher
for these ESI 4-5 minor patients than for the other low-acuity patient categories.

Table 3 Number of visits, LWBSs and LWBS % across ESI groups in SMMC ED during April 1, 2019 to
December 31, 2020 while the screen was on.

Triage Level # LWBS LWBS %


Low-acuity patients ESI 3-5 9787 254 2.26%
ESI 3 4906 59 1.20%
ESI 4-5 4630 193 4.17%
ESI 4-5 minor 534 42 7.82%

Table 4 Summary of statistics across experimental conditions

D1 D2 D3
Num. of observations 3280 3398 3109
Mean(sd) Mean(sd) Mean(sd)

Actual W (min) 39(38) 38(34) 39(36)


Age 40(19) 40(19) 41(19)
Triage pain level (out of 10) 5.8 (3.2) 5.4 (3.1) 5.7(3.1)
NumPatientsFirstHour 15(7.2) 16(7.1) 15(7.3)
Q-Lasso prediction 31(7.5) 30(7.5) 32(7.6)

Percent Percent Percent


WT provision indicator 0% 100% 100%
ESI 4|5 48% 50% 51%
Is.Accompanied 46% 46% 49%
Is.Male 52% 51% 50%
Is.AfterStateEmergency 15% 11% 14%
Pod (08:00 - 16:00) 52% 42% 48%
Pod (16:00 - 24:00) 35% 41% 42%

Table 4 summarizes the means and standard deviations of all the variables (defined in Table
5) used in this analysis across experiment groups. At the observational level, the data across
experiment groups are well-balanced. For instance, the average age and male percentage of patients
in each experimental condition were similar. Additionally, there were no statistically significant
differences in patients’ actual wait times across the experimental conditions. We used a one-way
analysis of variance (ANOVA) to test for differences in the means of patients’ wait time across the
three experimental conditions, and the p-value of the test was 0.231 (see Table 12 in Appendix
A.1), indicating that we could not reject the null hypothesis that the means of patients’ wait time
in different experimental conditions are the same. Therefore, any potential effects of wait time
information provision on LWBS cannot be attributed solely to differences in these control variables.
We wish to test the following hypothesis in the field experiment.

Hypothesis 1. When wait time information is provided, patients are less likely to leave the ED
without being seen by a physician.

Table 5 Variable Definitions

Variable Description
Dependent variables
LWBS A binary variable indicating whether the patient chose to leave the
ED while waiting to see a physician.
Other variables
WT provision indicator A binary variable indicating whether wait time information is
provided on the TV screen, i.e., 0 when D1 is implemented, and 1
when D2 or D3 is implemented.
Actual W Patient’s actual wait time after triage before seeing a physician.
ESI 4|5 A categorical variable with two levels: patients with triage level
ESI 3, and patients with triage level ESI 4 or 5.
Is.Accompanied An indicator variable indicating if the patient is accompanied by
family members or friends.
Triage pain level A scale from 0 to 10. Zero means “no pain,” and ten means “the
worst possible pain.”
Is.AfterStateEmergency A binary variable indicating whether the patient visited the ED after
March 4th, 2020, on which the state of California declared an emergency
over the COVID-19 pandemic.
Is.Male A binary indicator variable indicating patient’s biological sex: 1 for
male, 0 for female.
Pod A categorical variable that divides a day into three arrival periods:
midnight to 8:00 AM, 8:00 AM to 4:00 PM, and 4:00 PM to midnight.
Age Patients’ age.
NumPatientsFirstHour The number of patients in the ED during the first hour after the
arrival of the current patient.

We model the likelihood of patients LWBS as a function of IWTinfo , the wait time provision indi-
cator, the patients’ wait time, W , the interaction effect between the two, and all other variables in
Table 5. The dependent variable (whether the patient LWBS) is binary; we use a binary logistic
regression to model the log-odds of the patient's likelihood of LWBS, as specified in Equation (1).
 
logit(LWBS) = log[ Pr(LWBS) / (1 − Pr(LWBS)) ] = β0 + β1 IWTinfo + β2 W + β3 IWTinfo × W + β′X .   (1)
We found that providing wait time information reduces a patient’s likelihood of LWBS. The
coefficient for the wait time provision indicator is negative and statistically significant in Table 6
across all models, indicating that our result is robust: from the simplest model, which has only the
variable of interest, to the second model, which controls for patients' actual wait time and the
interaction effect between the two, to the fully specified model, which controls for all other
variables in Table 5. These results support Hypothesis 1. Coefficients for the other control variables
are in Table EC.8.
To quantify the specific effect of wait time information provision on a low-acuity patient's likeli-
hood of LWBS, we focus on the fully specified model (column 3 in Table 6). Note that in model
(1), the average marginal effect (AME) of the wait time provision indicator is

∂E[logit(LWBS) | IWTinfo , W ] / ∂IWTinfo = β1 + β3 W .

We can then use the mean of W (38.28 minutes) and the estimated β̂1 and β̂3 to obtain the AME
of the wait time provision indicator: the likelihood of a patient's LWBS decreases by 0.0035.
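As a sketch of this estimation (our illustration with statsmodels; synthetic data, illustrative variable names such as WT_provision, and an abridged control set stand in for Table 5):

```python
# Sketch of the LWBS logit (Equation (1)) and the AME of the wait time
# provision indicator on the log-odds: beta1 + beta3 * mean(W).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 9787
df = pd.DataFrame({
    "LWBS": rng.binomial(1, 0.023, n),          # 1 if the patient LWBS
    "WT_provision": rng.binomial(1, 0.66, n),   # 1 under D2 or D3
    "Actual_W": rng.gamma(2.0, 19.0, n),        # wait time in minutes
})
# 'WT_provision * Actual_W' expands to both main effects plus the interaction.
fit = smf.logit("LWBS ~ WT_provision * Actual_W", data=df).fit(disp=False)

ame = (fit.params["WT_provision"]
       + fit.params["WT_provision:Actual_W"] * df["Actual_W"].mean())
print(f"AME of wait time provision on the log-odds: {ame:.4f}")
```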
In addition, we found that providing wait time information is particularly effective at reducing
the likelihood of LWBS among ESI 4-5 minor patients. We replicated the above analysis using ESI
4-5 minor patients’ data and presented results in columns 4-6 in Table 6. Comparing coefficients in
columns 1, 2, and 3 to those in columns 4, 5, and 6, we found that the absolute value of the wait
time provision indicator coefficient is greater, and the p-value is lower, indicating a higher level
of statistical significance. To quantify the specific effect, we focused on the fully specified model
(column 6 in Table 6) and found that providing wait time information decreases the likelihood
that an ESI 4|5 minor patient will LWBS by 0.015. This effect is stronger than the effect of wait
time provision on a low-acuity patient when comparing the percentage change in the likelihood of
LWBS. Additional details for the coefficients are in columns 4-6 of Table EC.8.

Table 6 The effect of providing wait time information on the likelihood that a low-acuity patient will LWBS
and on the likelihood that an ESI 4|5 minor patient will LWBS.

Dependent variable:
ESI 3|4|5 logit(LWBS) ESI 4|5 minor logit(LWBS)
(1) (2) (3) (4) (5) (6)
WT provision indicator    −0.155∗    −0.174∗∗    −0.186∗∗    −0.356∗∗    −0.387∗∗∗    −0.396∗∗∗
                          (0.091)    (0.087)     (0.079)     (0.103)     (0.054)      (0.058)
Actual W 0.009∗∗∗ 0.010∗∗∗ 0.030∗∗ 0.020∗∗
(0.003) (0.003) (0.009) (0.009)
WT provision indicator * Actual W 0.001∗∗ 0.001∗∗ 0.003∗ 0.003∗
(0.0003) (0.0003) (0.001) (0.001)
⋮    ⋮
Observations 9,787 9,787 9,787 581 581 581
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Also, results show that patients’ actual wait time has a small but statistically significant effect
on their likelihood of LWBS. As shown in Table 6, for example, in the fully specified model, the
coefficient for the patient’s actual wait time is 0.01 with p-value < 0.001. This finding is consistent
with the notion that longer wait times lead to time losses and dissatisfaction among patients,
making them more likely to LWBS.
In addition, we found that providing wait time information amplifies the effect of longer wait
times on patients’ likelihood of LWBS. The interaction term between the wait time provision
indicator and the patient’s actual wait time is positive and statistically significant, as shown in Table
6. We used the sample mean of IWTinfo (0.66) and the estimated β̂2 and β̂3 from the fully specified

model to calculate the AME of a patient's actual wait time: the LWBS probability increases by 0.0003
per additional minute of wait time.
The placebo test was then performed by fitting models in Table 6 using patients’ data during
the times when the TV was turned off. We found no statistically significant effect from providing
wait-time information nor from the interaction term between the wait time provision indicator and
the patient’s actual wait time. Results for the placebo tests are in Table EC.12.

4.3. Effect of Waiting Longer vs. Shorter Than the Predicted Wait Time on LWBS
This section considers the effect on patients’ likelihood of LWBS of two types of prediction error,
a “prediction” that is longer than the actual wait (the patient waits less than the prediction) and
a “prediction” that is shorter than the actual wait (the patient waits longer than the prediction),
relative to the base scenario in which the patient experienced no error. We focus on
patients' data during the times when D2 and D3 were in effect.
We used quotation marks in “prediction” because our experimental designs D2 and D3 use integer
multiples of 10 (which people interpret as representative of nearby numbers) in order to reduce
any perceived error in the prediction. Recall from cognitive psychology literature (Rosch 1975) and
our online experiment §2 that people tend to interpret a wait time prediction that is an integer
multiple of ten minutes as representative of wait times between the next lower integer multiple
of ten minutes and next higher integer multiple of 10 minutes, and most people perceive the wait
time information as essentially correct even when their actual wait time deviates by 10 minutes
from the predicted integer multiple of 10 minutes. Therefore, under experimental condition D2, we
say that patients didn't experience an error if their actual wait time⁴ falls in the range [displayed
wait time − 10 minutes, displayed wait time + 10 minutes], and thus the magnitude of error is zero.
Similarly, under the experimental condition D3, we say that patients didn’t experience an error if
their actual wait time falls inside the range [displayed lower bound - 10 minutes, displayed upper
bound + 10 minutes]. Definitions of two different types of error and the magnitude (∆) of each
follow:
When experimental condition D2 is in effect:
• Type 1 error: the patient waits less than the displayed point estimate minus 10 minutes,
∆ = [displayed point estimate − 10 minutes − patient's actual wait time]+ ;
• Type 2 error: the patient waits more than the displayed point estimate plus 10 minutes,
∆ = [patient's actual wait time − (displayed point estimate + 10 minutes)]+ .
When experimental condition D3 is in effect:
• Type 1 error: the patient waits less than the lower bound of the interval minus 10 minutes,
∆ = [lower bound of interval − 10 minutes − patient's actual wait time]+ ;
• Type 2 error: the patient waits more than the upper bound of the interval plus 10 minutes,
∆ = [patient's actual wait time − (upper bound of interval + 10 minutes)]+ .
A code sketch of this classification follows below.

⁴ For patients who LWBS, we use the estimated wait time from §4.1 to approximate this value.
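As referenced above, a hypothetical helper (our illustration, not from the paper's replication code) that implements these definitions:

```python
# Sketch of the error classification: D2 passes the point estimate as both
# bounds (lower == upper); D3 passes the interval bounds. A 10-minute
# tolerance applies on each side, per the definitions above.
def classify_error(actual_w: float, lower: float, upper: float):
    """Return (error type, magnitude Delta) for one patient."""
    if actual_w < lower - 10:
        return "Type 1", (lower - 10) - actual_w   # waited less than predicted
    if actual_w > upper + 10:
        return "Type 2", actual_w - (upper + 10)   # waited longer than predicted
    return "No error", 0.0

print(classify_error(75, 30, 50))  # under D3 "30-50 minutes": ('Type 2', 15)
```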

Hypothesis 2. Patients who experience a type 1 error (the prediction is longer than the actual
wait) are less likely to leave the ED without being seen by a physician than are patients who experience
a type 2 error (the prediction is shorter than the actual wait).

Hypothesis 3. The likelihood that patients choose to leave the ED without being seen decreases
with the magnitude of the type 1 error and increases with the magnitude of the type 2 error.

To test Hypothesis 2, we model the likelihood of a patient LWBS as a function of a categorical
variable with three levels (the patient experiences a type 1 error, a type 2 error, or no error), the
patient's actual wait time, and other controls in Table 5, as shown in Equation (2). During the time
that the experimental conditions D2 and D3 were on, 5853 low-acuity patients came to the ED,
1637 patients experienced no error, 2318 patients experienced a type 1 error, and 1897 patients
experienced a type 2 error. The dependent variable is binary; we used a generalized linear model with
a binomial family and a logit link. The results are shown in Table 7, where column 1 is the simplest
regression model with only the variable of interest, column 2 is the more elaborate model where
we also control for patient’s actual wait time, and column 3 is the fully specified model with all
controls.

logit(LWBS) = β0 + β1 × Error Class + β2 W + β′X .   (2)

Table 7 The effect of waiting less or longer than the predicted wait time on the likelihood that a low-acuity
patient or an ESI 4|5 minor patient will leave the ED without being seen by a physician.

Dependent variable:
ESI 3|4|5 logit(LWBS) ESI 4|5 minor logit(LWBS)
(1) (2) (3) (4) (5) (6)
Is.Type1    −0.404∗∗    −0.561∗∗    −0.461∗∗    −0.627∗∗    −0.674∗∗    −0.741∗∗
            (0.213)     (0.213)     (0.214)     (0.285)     (0.290)     (0.291)
Is.Type2    0.903∗∗     0.919∗∗     0.876∗∗     0.901∗∗     0.981∗∗     0.986∗∗
            (0.311)     (0.315)     (0.315)     (0.302)     (0.303)     (0.302)
Actual W 0.0003∗∗∗ 0.0004∗∗∗ 0.0003∗∗∗ 0.0003∗∗∗
(0.0001) (0.0001) (0.0001) (0.0001)
⋮    ⋮
Observations 5,853 5,853 5,853 357 357 357
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

We found that waiting less than the prediction reduces the likelihood of a patient's LWBS by
0.008. As demonstrated in column 1 of Table 7, the coefficient for experiencing a type 1 error (waiting
less than the prediction) is −0.404 with p-value 0.03. In contrast, we found that experiencing a
type 2 error (waiting more than the prediction) increases the likelihood of LWBS by 0.036 (the
coefficient is 0.903 with p-value 0.04). These results support Hypothesis 2 and provide evidence that
patients are more sensitive to losses (loss-averse), as the effect of waiting longer than the prediction
is much larger than the effect of waiting less than the prediction.
The above results remain robust as we control for the patient's actual wait time (column 2 in Table 7)
and in the fully specified model (column 3 in Table 7). The coefficients for the other controls in the
fully specified model can be found in Appendix EC.1.3, Table EC.10.
As a natural experiment, we used patient data from when no wait time information was provided to
carry out placebo tests. We fitted model (2) again using patients' data from when the TV screen was off.
Table EC.13 presents the results. In none of the models is an error class statistically significant.
These results support Hypothesis 2.
Next, we look at how the magnitude of the error affects patients' likelihood of LWBS. We model
the likelihood that a patient will LWBS as a function of the magnitude of the two types of error,
∆ × IIs.Type1 and ∆ × IIs.Type2 , the patient's actual wait time, and other controls in Table 5. Again,
the dependent variable is binary; we used a generalized linear model with the binomial family and the
logit link, as shown in Equation (3).

logit(LWBS) = β0 + β1 ∆ × IIs.Type1 + β2 ∆ × IIs.Type2 + β3 W + β′X .   (3)

The regression results are in Table 8, where column 1 is the simplest model and column 3 is the
fully specified model with all controls.

Table 8 The effect of the magnitude of the error between the predicted wait time and the patient’s actual wait
time on the likelihood that a low-acuity patient and a ESI 4|5 minor patient will leave the ED without being seen
by physicians.

Dependent variable:
ESI 3|4|5 logit(LWBS) ESI 4|5 minor logit(LWBS)
(1) (2) (3) (4) (5) (6)
∆ × IIs.Type1    −0.015∗∗    −0.013∗∗    −0.013∗∗    −0.022∗∗    −0.024∗∗    −0.021∗∗
                 (0.008)     (0.008)     (0.007)     (0.010)     (0.010)     (0.009)
∆ × IIs.Type2    0.022∗∗     0.033∗∗     0.029∗∗     0.031∗∗     0.034∗∗     0.031∗∗
                 (0.008)     (0.012)     (0.013)     (0.015)     (0.016)     (0.016)
Actual W 0.0002 0.0003∗ 0.0002 0.0003
(0.0002) (0.0002) (0.0002) (0.0002)
⋮    ⋮
Observations 5,853 5,853 5,853 357 357 357
Note: ∗p<0.05; ∗∗p<0.01; ∗∗∗p<0.001

We found that the likelihood of LWBS decreases with the magnitude of the error when a patient
waits less than the prediction, increases with the magnitude of the error when a patient waits
longer than the prediction, and patients are more sensitive to the magnitude of the error when
they wait longer than the prediction. In particular, in the simplest model, column 1 in Table 8, the
coefficient for ∆, βˆ1 , is −0.015 with p−value 0.006, which means that when patients experiences
the type 1 error, the likelihood of LWBS decreases by 0.0004 per unit increase (1 minute) in the
magnitude of the error. On the other hand, the coefficient for the magnitude of error when the
patient experiences the type 2 error is 0.022, which means that the odds of LWBS increases by
0.0006 per unit increase (1 minute) in the magnitude of the error. These effects are robust when
we control for patients’ actual wait time (Column 2 in Table 8) and in the fully specified model
(column 3 in Table 8). The coefficients for the controls in the fully specified model is in Appendix
EC.1.3 Table EC.11.
Again, the placebo test was carried out using patients' data from when the TV screen was off. Table
EC.14 presents the results. Neither β̂1 nor β̂2 is statistically significant in any of the models. These
results support Hypothesis 3.
Furthermore, we found that experiencing a type 1 or type 2 error, and the magnitude of the
error, have more pronounced effects on ESI 4|5 minor patients. We repeated the same analysis for
ESI 4|5 minor patients. In Tables 7 and 8, compared to the coefficients in columns 1, 2, and 3, the
coefficients in columns 4, 5, and 6 have a higher statistical significance level and a larger absolute
value, indicating that both effects are larger for ESI 4|5 minor patients. Taking the fully specified
model as an example: waiting less than the prediction reduces the likelihood of LWBS by 0.039,
and waiting longer than the prediction increases the likelihood of LWBS by 0.107. When patients
wait less than the prediction, a 1-unit (1 minute) increase in the magnitude of the error decreases
the likelihood of LWBS by 0.0015. On the other hand, when patients wait longer than the prediction,
a 1-unit increase in the magnitude of the error increases the odds of LWBS by 0.002. The coefficients
for the fully specified models are in Tables EC.10 and EC.11.
We ran the placebo test again using patients' data from the periods when the TV was turned
off. We did not find a statistically significant effect in any of the models (Table EC.14).

4.4. The Effect of Communicating that the Wait Time Could Be 20 Minutes Longer
In this section, we investigate the impact on LWBS of explicitly informing patients about the
uncertainty of the algorithm's wait time prediction. Specifically, we communicate to patients that
their actual wait time could be up to 20 minutes longer than the predicted wait time. To evaluate
the effectiveness of this approach, we model a low-acuity patient's likelihood of

LWBS as a function of a categorical variable with three levels indicating different experimental
conditions, the patient’s actual wait time, and other controls in Table 5, shown in equation (4).
The dependent variable is binary; again, we used a generalized linear model with a binomial family
and a logit link.

logit(LWBS) = β0 + β1 × Experiment Conditions + β2 W + β′X .   (4)

More importantly, we conduct this analysis using data from patients who fall into each of the three
different scenarios: those who experience a Type 1 error, those who experience no error, and those
who experience a Type 2 error, as defined in §4.3. In addition, we use the no information case (D1)
as the base level for all analyses.

Table 9 The effect of providing no information versus a single number estimate (D2) versus an estimated wait
time interval (D3) on low-acuity patients’ likelihood of leaving the ED without being seen using data from
patients who fall into three different scenarios: those who experience a Type 1 error, those who experience no
error, and those who experience a Type 2 error.

Dependent variable:
logit(LWBS)
Type 1 error                     No error                     Type 2 error

(1) (2) (3) (1) (2) (3) (1) (2) (3)


D2    −0.201∗    −0.208∗    −0.209∗    −0.351∗∗    −0.347∗∗    −0.343∗∗    0.308∗∗    0.303∗∗    0.305∗∗
      (0.205)    (0.209)    (0.203)    (0.203)     (0.207)     (0.205)     (0.177)    (0.183)    (0.181)
D3    −0.271∗    −0.231∗    −0.232∗    −0.369∗∗∗   −0.368∗∗∗   −0.368∗∗∗   0.217∗∗∗   0.218∗∗∗   0.218∗∗∗
      (0.235)    (0.227)    (0.203)    (0.101)     (0.105)     (0.104)     (0.113)    (0.107)    (0.103)
Actual W 0.009 0.010 0.007∗∗ 0.007∗∗ 0.005∗∗∗ 0.005∗∗∗
(0.019) (0.016) (0.015) (0.005) (0.002) (0.002)
⋮    ⋮    ⋮
Observations 5,578 5,578 5,578 7,268 7,268 7,268 4,715 4,715 4,715
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Our results show that providing wait time information reduces the likelihood of patients' LWBS
in the cases of a Type 1 error or no error, and providing the wait time interval does
so to a greater extent. However, providing wait time information increases the likelihood of LWBS
in the case of a Type 2 error, and providing the point estimate has an even stronger
effect.
These findings are consistent with our previous results that waiting less than the prediction
decreases the likelihood of LWBS, but waiting longer than the prediction has the opposite effect
(see §4.3). Furthermore, our online experiment found that providing a wait time interval increases
patients’ anticipated wait time (see §2), which may explain the stronger effect of providing the

wait time interval on reducing the likelihood of LWBS. According to prospect theory (Tversky and
Kahneman 1991), if we assume that the reference point is positively associated with the anticipated
wait time, then an increase in the anticipated wait time should raise the reference point. This, in
turn, may decrease the chance of feeling a time loss and, subsequently, reduce the
likelihood of LWBS.
The robustness of these results can be seen in Table 9, where the coefficients for D2 and D3
are negative and statistically significant (with a larger absolute value for D3) in analyses for
the “Type 1 error” and “No error” scenarios, across all model specifications, from the simplest
model to the fully specified model. In contrast, the coefficients for D2 and D3 are positive and
statistically significant (with a larger absolute value for D2) in analyses of the “Type 2 error”
scenario. Additional coefficients for other variables can be found in Table EC.9.

4.5. Effect of Wait Time Information on Self-reported Wait Satisfaction

Table 10 Summary of statistics for survey participants across experiment conditions

D1 D2 D3
Num. of observations 65 86 73
Mean(sd) Mean(sd) Mean(sd)

Wait Satisfaction (out of 5) 2.8(1.7) 3.4(1.5) 3.5(1.6)


Pain Level (out of 10) 6.4(4.7) 6.7(4.4) 6.7(4.3)
Actual W (min) 45(40) 43(42) 45(28)
Age 40(12) 35(17) 39(22)
NumPatientsFirstHour 16(6.2) 18(8.1) 16(7.3)
Q-Lasso prediction 37(10) 33(11) 38(12)

Percent Percent Percent


WT provision indicator 0% 100% 100%
Is.ESI 3 64% 32% 33%
Is.ESI 4|5 36% 68% 67%
Is.Accompanied 62% 57% 64%
Is.Male 40% 40% 25%
Is.AfterStateEmergency 12% 15% 16%
Pod (08:00 - 16:00) 41% 54% 33%
Pod (16:00 - 24:00) 42% 26% 12%

This section discusses the effect of providing wait time information on low-acuity patients’ self-
reported wait satisfaction. We collected 224 survey responses, matched them with patients'
electronic medical records, and summarize the data in Table 10.
At the observational level, the box plot in Figure 6 shows that providing wait time information
increases patients’ level of satisfaction. This is evident by comparing the mean satisfaction ratings
in the conditions where wait time information was provided (3.4 in D2 and 3.5 in D3) to the

[Box plot: Satisfaction Rating (y-axis, 2.25-3.75) by Information Design (x-axis: D1, D2, D3)]
Figure 6 The self-reported waiting satisfaction ratings of low-acuity patients who visited SMMC ED from April
1st 2019 to December 31st 2020.

condition where no wait time information was provided (2.8 in D1). This indicates that providing
wait time information can increase satisfaction by approximately 25% on average.
To examine whether survey participation differs across the experimental conditions, we
conducted a one-way ANOVA test. We used a binary variable to indicate participation in the
study, where a value of 1 indicated that a low-acuity patient took the survey and a value of 0
indicated that they did not. The p-value obtained from the ANOVA test was 0.313, indicating
that we could not reject the null hypothesis that the mean participation rates in the different
experimental conditions are equal.
In addition to the one-way ANOVA test, we also ran a Probit regression analysis to investigate
how the experimental conditions affect the likelihood of a patient participating in the survey. The
model considered a single categorical variable with three levels (D1, D2, and D3), which represented
the experimental conditions. In the fully specified model, we controlled for other variables in Table
5. However, the coefficients for the experimental condition were not statistically significant in any
of the models. The results of these analyses are detailed in Tables 13 and 14 in Appendix A.1.
These results alleviate our concerns that the experimental conditions affect the likelihood of survey
participation, i.e., the sample selection bias is independent of the experimental conditions and
therefore does not impact the external validity of our results.
Despite our findings, we still need to address the issue of non-random sample selection.
As readers may have already noticed, the wait satisfaction ratings are only available for patients who
participated in the survey. In the remainder of this section, we first introduce our model and
then discuss how we address the non-random sample selection problem.
We wish to test if the following hypothesis holds.

Hypothesis 4. Patients' average wait satisfaction increases when wait-time-related information
is provided.

To test Hypothesis 4, we model wait satisfaction, S, as a function of the indicator variable, IWTinfo
(which equals 1 if wait time information is provided), the actual wait time of the patient, W , and a
vector X of control variables. We define cumulative probabilities Pr{S ≤ j}, j = 1, 2, . . . , 4, since
each patient's self-reported satisfaction level is an integer from 1 to 5, and the responses
are ordered: 1 < 2 < . . . < 5. As a result, the cumulative logits are modeled as:

logit(Pr{S ≤ j}) = θj − α1 IWTinfo − α2 W − β′X ,   j = 1, 2, . . . , 4,   (5)

where logit(Pr{S ≤ j}) = log(Pr{S ≤ j}/Pr{S > j}) and θj is the intercept for j = 1, 2, . . . , 4.
We use Maximum Likelihood to estimate the coefficients of the cumulative link model in Equation (5).
Thus, when an event S > j with wait time information provided is compared to the same event without
wait time information (the baseline), the odds ratio is exp(α1 ). Providing wait time information
increases the odds of higher categories of wait satisfaction when the odds ratio is greater than 1.
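A minimal sketch of Equation (5) using statsmodels' OrderedModel (synthetic responses stand in for our 224 survey observations, and the control set is abridged):

```python
# Sketch of the cumulative (proportional-odds) logit of Equation (5).
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(0)
n = 224
df = pd.DataFrame({
    "S": rng.integers(1, 6, n),               # satisfaction rating, 1..5
    "WT_provision": rng.binomial(1, 0.7, n),  # 1 under D2 or D3
    "Actual_W": rng.gamma(2.0, 22.0, n),      # actual wait time in minutes
})
fit = OrderedModel(df["S"], df[["WT_provision", "Actual_W"]],
                   distr="logit").fit(method="bfgs", disp=False)

# exp(alpha1): odds ratio of a higher satisfaction category when info is shown.
print(np.exp(fit.params["WT_provision"]))
```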
Non-response bias. Our goal is to estimate the effect of providing wait time information (A) on
patients’ wait satisfaction (S), but the problem is that wait satisfaction can only be observed for
patients who participated in the survey. Therefore, a naive estimator would be biased, because we
do not know what the wait satisfaction is for those who did not participate in the survey. This is
known as “incidental truncation”, where the inclusion of a person in the sample is determined by
the person themselves, not the surveyor.
To address this endogeneity in survey participation, we used a two-stage treatment effects model
that involves a first stage to explain patients' survey participation (Equation (7)) and a second
stage to explain patients' wait satisfaction (Equation (6)):

Si∗ = X1 β1 + εi ,   (6)
Ii = 1[Xγ + µi > 0] .   (7)

That is, Si is observed and equals Si∗ iff Ii = 1, where I is a binary indicator of whether the
patient participated in the survey.
First, we used a Probit model to estimate the patient’s likelihood of survey participation. We
included all of the variables in Table 11 and two additional variables that affect the likelihood
of patients’ survey participation but are unrelated to patients’ wait satisfaction, i.e., these two
variables satisfy the exclusion restriction. The first is 'Is.HandInjuries', a binary variable
that indicates whether or not the patient has hand/finger injuries. The second is 'Is.Weekend',
a binary variable that indicates whether the patient arrived on a weekend. We can derive
estimators for γ and the predicted inverse Mills ratio λ̂(xi γ̂) for each observation from the Probit
selection equation. We fit the original model (Equation (5)) plus λ̂ as a predictor in the second
stage. If the coefficient of λ̂ is statistically equal to zero, no sample selection is present and the
estimation results from model (5) are consistent and can be presented. This approach is sometimes
referred to as heckit, after Heckman (1976). In Column 4 of Table 11, the coefficient for λ̂ is
−0.023, with a p-value of 0.813. In addition, we repeated the same heckit analysis using patients'
data from when experimental conditions D1 and D2 were in effect, because the number of observations
in these two groups has a higher discrepancy (55 in D1 and 96 in D2). The coefficient for the inverse
Mills ratio is small and not statistically significant (shown in column 4 of Table 15 in Appendix
B). Thus, there is little evidence of sample selection bias. Other empirical works in our field use
approaches similar to the one we propose here, e.g., Guajardo et al. (2012) and Liu et al. (2015).
To give more insight, Heckman's correction is based on the idea that the presence of sample
selection bias can be evaluated by examining the statistical significance of a transformation of the
predicted values from the first stage of the model, known as the inverse Mills ratio. In our study, the
inverse Mills ratio would represent the selection hazard of the patient participating in the survey.
If the coefficient of the inverse Mills ratio is statistically equal to zero, it indicates that there is no
sample selection bias present in the second stage model, and the estimation results are consistent
and can be interpreted as such.
In our study, we included two additional variables in the first-stage model to avoid collinearity between the inverse Mills ratio and the second-stage regressors and to improve the specification of the selection model. These variables are exogenous and unrelated to patients’ wait satisfaction, which allows us to properly estimate the parameters in the second-stage model. By ensuring the quality of the first-stage model, we obtain more accurate results in the second stage.
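To make the two-stage procedure concrete, here is a minimal sketch of the heckit logic in R; the data frame ed, the participation indicator Participated, and the reduced covariate set are hypothetical placeholders, not our exact specification.

library(MASS)  # polr() for the second-stage cumulative logit

# Stage 1: Probit for survey participation, including the two excluded
# instruments Is.HandInjuries and Is.Weekend.
probit <- glm(Participated ~ WTinfo + W + Age + Is.HandInjuries + Is.Weekend,
              family = binomial(link = "probit"), data = ed)

# Inverse Mills ratio lambda_i = phi(x_i' gamma-hat) / Phi(x_i' gamma-hat),
# computed from the linear predictor of the probit.
xb <- predict(probit, type = "link")
ed$lambda <- dnorm(xb) / pnorm(xb)

# Stage 2: refit the satisfaction model on respondents only, adding lambda.
# A coefficient on lambda statistically equal to zero indicates no
# sample selection bias.
resp <- subset(ed, Participated == 1)
resp$S <- factor(resp$S, levels = 1:5, ordered = TRUE)
summary(polr(S ~ WTinfo + W + Age + lambda, data = resp, Hess = TRUE))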
Our results show that providing wait time information increases patients’ wait satisfaction. In
column 1 of Table 11, the wait time provision indicator has a positive and statistically significant
coefficient: α1 = 0.594 with a p-value of 0.03. This means that when wait time information is
provided, the odds of a higher wait satisfaction increase by 81%. We controlled for the actual wait
time and included additional controls in the model, and the results remained robust. In addition,
the differences between the cumulative logistic regression estimates and the “Heckit” estimates
were small, and the inverse Mills ratio term was statistically insignificant. These findings support
Hypothesis 4.
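For concreteness, the 81% figure is simply the exponentiated coefficient:

\[
\exp(\hat{\alpha}_1) = \exp(0.594) \approx 1.81,
\]

that is, holding the controls fixed, the odds of reporting a higher satisfaction category are about 1.81 times as large when wait time information is displayed. The 62% figure for Is.Accompanied reported below follows from the same calculation, exp(0.485) ≈ 1.62.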
We also found that patients are more satisfied when accompanied by family or friends. The
coefficient for Is.Accompanied is 0.485 with a p-value of 0.06, indicating that if the patient is
accompanied, the odds of a higher wait satisfaction increase by 62%. This result is consistent with

Table 11 Regression results of wait-time information on patient satisfaction. In the first column, only the wait
time provision indicator is included; in the second column, the patient’s actual wait time is included; and in the
third column, all controls are added. The fourth column is the regression result for the second stage model
(Equation (6)).

Dependent variable:
Wait Satisfaction
Cumulative Logistic Heckit
(1) (2) (3) (4)
WT provision indicator 0.594∗∗ 0.600∗∗ 0.684∗∗∗ 0.623∗∗∗
(0.186) (0.187) (0.030) (0.025)
Actual W −0.0002 −0.0001 −0.0003
(0.0001) (0.0001) (0.0003)
PainLevel −0.035 −0.033
(0.060) (0.060)
Age −0.006 −0.006
(0.010) (0.010)
Is.Male 1.182∗∗∗ 1.165∗∗∗
(0.184) (0.165)
ESI 4|5 −0.020 −0.034
(0.366) (0.332)
Is.Accompanied 0.485∗ 0.317
(0.248) (0.261)
NumPatientsFirstHour −0.075 −0.044
(0.068) (0.048)
Is.AfterStateEmergency −1.195∗∗∗ −1.114∗∗
(0.210) (0.383)
λ̂ −0.023
(0.905)
Observations 224 224 224 224
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

previous findings (Lin et al. 2004). Additionally, we found that the actual wait time had no statis-
tically significant effect on patients’ wait satisfaction, confirming Thompson et al. (1996)’s claim
that managing patients’ perceptions and expectations is more effective in improving satisfaction
than reducing wait time.

5. Conclusion
We conducted randomized online experiments and a field experiment to see if and how displaying
an algorithmic wait time prediction can reduce the likelihood that patients leave the ED without
being seen by a physician and improve their experience of waiting. In collaboration with the
nurses, we developed two approaches to displaying the algorithmic prediction: The first approach
is the prediction rounded to a multiple of 10 minutes, and the second is an interval designed
to communicate that the wait time could be even 20 minutes longer. In the online experiments,
we found evidence that using a multiple of 10 improves the perceived quality of the information

displayed, despite increasing the prediction error. Moreover, the interval induces higher anticipated
wait time than the other approach, as intended.
Guided by these insights, in our field experiment at San Mateo Medical Center, we dynamically rotated among three different displays: (1) a point estimate, our algorithmic prediction rounded to the nearest 10 minutes; (2) an interval estimate ranging from our point estimate to 20 minutes larger; and (3) no wait time information.
Relative to the control with no wait time information, both approaches significantly reduce the
likelihood of low-acuity patients leaving the ED without being seen, and the interval does so to
greater extent. We attribute this effect to (1) improved patient satisfaction and (2) a higher antic-
ipated wait time induced by the interval estimate. Our survey results showed a 25% increase in
self-reported waiting satisfaction. To measure the impact, we addressed the low survey response
rate issue with an incentive scheme and corrected for potential non-response bias by leveraging
instrumental variables and a two-stage Heckman treatment effect model. According to prospect
theory (Tversky and Kahneman 1991), increasing patients’ anticipated wait time above the algo-
rithmic prediction should improve patients’ satisfaction in the event that they wait longer than
the algorithmic prediction. This may contribute to the higher self-reported wait time satisfaction
and, in turn, could induce those patients to wait longer rather than LWBS.
Additionally, ED managers want LWBS patients to be the ones who cannot obtain the best
treatment in the ED and do not need immediate treatment. Emergency medicine literature suggests
that ESI 4-5 patients with chief complaints such as “dental pain” or “medication refill” (ESI 4-5 minor) are unable to receive the best care in the ED, but their presence can exacerbate ED
overcrowding. Unfortunately, we found that displaying an algorithmic wait time prediction is more
effective at reducing patients’ LWBS among ESI 4-5 minor patients than among ESI 3 patients.
We also quantified the effect of two types of prediction error on patients’ LWBS and found that
the likelihood of LWBS decreases with the magnitude of the error when a patient waits for less
than the prediction, increases with the magnitude of the error when a patient waits longer than the
prediction, and that patients are more sensitive to errors when they wait longer than the prediction.
These effects are more salient among ESI 4-5 minor patients.
Overall, we hope our insights help improve the design of, adoption of, and adherence to analytics in ED
decision support systems. There are also interesting directions for future research such as comparing
the effectiveness of different wait-time display formats and intervals, as well as examining the
impact on different patient populations with varying levels of acuity. Another potential direction is
to investigate the potential benefits of incorporating feedback from patients and medical staff into
the wait time prediction algorithm, in order to improve its accuracy and effectiveness over time.

References
ACEP. 2012. Publishing wait times for emergency department care: An information paper. Report, American
College of Emergency Physicians, Baltimore.

Allon, Gad, Achal Bassamboo, Itai Gurvich. 2011. “we will be right with you”: Managing customer expec-
tations with vague promises and cheap talk. Operations research 59(6) 1382–1394.

Ang, E., S. Kwasnick, M. Bayati, M. Aratow, E. Plambeck. 2015. Accurate emergency department wait time
prediction. Manufacturing & Service Operations Management 18 141–156.

Anunrojwong, Jerry, Krishnamurthy Iyer, Vahideh Manshadi. 2022. Information design for congested social
services: Optimal need-based persuasion. Management Science .

Arendt, Katherine W, Annie T Sadosty, Amy L Weaver, Christopher R Brent, Eric T Boie. 2003. The
left-without-being-seen patients: what would keep them from leaving? Annals of emergency medicine
42(3) 317–IN2.

Batt, Robert J, Christian Terwiesch. 2015. Waiting patiently: An empirical study of queue abandonment in
an emergency department. Management Science 61(1) 39–59.

Berg, Nathan. 2005. Non-response bias.

Buell, Ryan W, Tami Kim, Chia-Jung Tsay. 2017. Creating reciprocal value through operational trans-
parency. Management Science 63(6) 1673–1695.

Buell, Ryan W, Michael I Norton. 2011. The labor illusion: How operational transparency increases perceived
value. Management Science 57(9) 1564–1579.

Carmon, Ziv, J George Shanthikumar, Tali F Carmon. 1995. A psychological perspective on service seg-
mentation models: The significance of accounting for consumers’ perceptions of waiting and service.
Management Science 41(11) 1806–1815.

Chase, Richard B, Sriram Dasu. 2001. Want to perfect your company’s service? use behavioral science.
Harvard business review 79(6) 78–85.

CMS. 2014. Centers for medicare and medicaid services. Federal Regis-
ter URL https://www.federalregister.gov/documents/2014/11/10/2014-26146/
medicare-and-medicaid-programs-hospital-outpatient-prospective-payment-and-ambulatory-surgical.

Compton, Jocelyn, Natalie Glass, Timothy Fowler. 2019. Evidence of selection bias and non-response bias
in patient satisfaction surveys. The Iowa orthopaedic journal 39(1) 195.

Currie, CC, SJ Stone, J Connolly, J Durham. 2017. Dental pain in the medical emergency department: a
cross-sectional study. Journal of Oral Rehabilitation 44(2) 105–111.

Donnelly, Kristin, Giovanni Compiani, Ellen RK Evers. 2021. Time periods feel longer when they span
more category boundaries: Evidence from the lab and the field. Journal of Marketing Research
00222437211073810.

Gilboy, Nicki, Paula Tanabe, Debbie Travers, AM Rosenau, et al. 2020. Emergency severity index (esi): a
triage tool for emergency department care, version 4. Implementation handbook 2020 1–17.

Guajardo, Jose A, Morris A Cohen, Sang-Hyun Kim, Serguei Netessine. 2012. Impact of performance-based
contracting on product reliability: An empirical analysis. Management Science 58(5) 961–979.

Heckman, James J. 1976. The common structure of statistical models of truncation, sample selection and
limited dependent variables and a simple estimator for such models. Annals of economic and social
measurement, volume 5, number 4 . NBER, 475–492.

Ibrahim, Rouba. 2018. Sharing delay information in service systems: a literature survey. Queueing Systems
89(1) 49–79.

Johnson, Mary Beth, Edward M Castillo, James Harley, David A Guss. 2012. Impact of patient and family
communication in a pediatric emergency department on likelihood to recommend. Pediatric emergency
care 28(3) 243–246.

Kim, Song-Hee, Carri W Chan, Marcelo Olivares, Gabriel Escobar. 2015. Icu admission control: An empirical
study of capacity allocation and its implication for patient outcomes. Management Science 61(1)
19–38.

Kumar, Piyush, Manohar U Kalwani, Maqbool Dada. 1997. The impact of waiting time guarantees on
customers’ waiting experiences. Marketing science 16(4) 295–314.

Larson, Richard C. 1987. Or forum—perspectives on queues: Social justice and the psychology of queueing.
Operations research 35(6) 895–905.

Lin, Herng-Ching, Sudha Xirasagar, James N Laditka. 2004. Patient perceptions of service quality in group
versus solo practice clinics. International Journal for Quality in Health Care 16(6) 437–445.

Liu, Angela, Tridib Mazumdar, Bo Li. 2015. Counterfactual decomposition of movie star effects with star
selection. Management Science 61(7) 1704–1721.

Luo, Danqi, Mohsen Bayati, Erica L Plambeck, Michael Aratow. 2017. Low-acuity patients delay high-acuity
patients in the emergency department. Available at SSRN 3095039 .

Mataloni, Francesca, Paola Colais, Claudia Galassi, Marina Davoli, Danilo Fusco. 2018. Patients who leave
emergency department without being seen or during treatment in the lazio region (central italy):
Determinants and short term outcomes. PLoS One 13(12) e0208914.

Norman, Donald A. 2009. Designing waits that work. MIT Sloan Management Review 50(4) 23.

Pines, Jesse M, Pooja Penninti, Sukayna Alfaraj, Jestin N Carlson, Orion Colfer, Christopher K Corbit,
Arvind Venkat. 2018. Measurement under the microscope: high variability and limited construct validity
in emergency department patient-experience scores. Annals of emergency medicine 71(5) 545–554.

Rosch, Eleanor. 1975. Cognitive reference points. Cognitive psychology 7(4) 532–547.

Soman, Dilip, Mengze Shi. 2003. Virtual progress: The effect of path characteristics on perceptions of progress
and choice. Management Science 49(9) 1229–1250.

Thompson, D.A., P.R. Yarnold, D.R. Williams, S.L. Adams. 1996. Effects of actual waiting time, perceived
waiting time, information delivery, and expressive quality on patient satisfaction in the emergency
department. Annals of Emergency Medicine 28(6) 657–665.

Thrasher, Tony W, Martha Rolli, Robert S Redwood, Michael J Peterson, John Schneider, Lisa Maurer,
Michael D Repplinger. 2019. ‘Medical clearance’ of patients with acute mental health needs in the
emergency department: a literature review and practice recommendations. WMJ: official publication
of the State Medical Society of Wisconsin 118(4) 156.

Tversky, Amos, Daniel Kahneman. 1991. Loss aversion in riskless choice: A reference-dependent model. The
quarterly journal of economics 106(4) 1039–1061.

Westphal, Monika, Galit Yom-Tov, Avi Parush, Anat Rafaeli. 2022. Reducing abandonment and improving
attitudes in emergency departments: Integrating delay announcements into operational transparency
to signal service quality. Available at SSRN 4120485 .

Whitt, Ward. 1999. Improving service by informing customers about anticipated delays. Management science
45(2) 192–207.

Wilsey, Barth L, Scott M Fishman, Alexander Tsodikov, Christine Ogden, Ingela Symreng, Amy Ernst.
2008. Psychological comorbidities predicting prescription opioid abuse among patients in chronic pain
presenting to the emergency department. Pain Medicine 9(8) 1107–1117.

Yu, Qiuping, Gad Allon, Achal Bassamboo. 2017. How do delay announcements shape customer behavior?
an empirical study. Management Science 63(1) 1–20.

Yu, Qiuping, Gad Allon, Achal Bassamboo. 2021. The reference effect of delay announcements: A field
experiment. Management Science 67(12) 7417–7437.

Appendix A: Supplemental Details for §4

Figure 7 The consent form and the three survey questions.

Figure 8 The consent form for asking patients’ name and phone number.

A.1. One-way ANOVA and Probit models


The one-way ANOVA is a statistical method for comparing the means of multiple groups. We performed two one-way ANOVAs to determine whether there were any significant differences in the means of patients’ wait times or in patients’ survey participation rates across experimental conditions. The results of these analyses are shown in Tables 12 and 13. In both cases, the differences between the means were not statistically significant, suggesting that the experimental conditions did not affect patients’ wait times or their likelihood of participating in the survey.
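As an illustration, each of these tests can be reproduced with a standard one-way ANOVA call; the sketch below is written in R with a hypothetical data frame ed whose columns (wait time W, a survey-participation indicator Participated, and a three-level condition factor Cond) stand in for our field data.

# One-way ANOVA: do mean wait times, and mean participation rates, differ
# across the three display conditions? All names here are placeholders.
ed$Cond <- factor(ed$Cond)                     # levels D1 / D2 / D3

summary(aov(W ~ Cond, data = ed))              # F test as in Table 12
summary(aov(Participated ~ Cond, data = ed))   # F test as in Table 13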

Table 12 One-way ANOVA results testing whether there are any statistically significant differences between the means of patients’ wait times across experimental conditions

Degrees of Freedom Sum of Squares Mean Square F value Pr(> F)


Experiment Conditions 2 3708 1854 1.467 0.231
Residual 9786 10160936 1264
*p < 0.05, **p < 0.01, ***p < 0.001

Table 13 One-way ANOVA results testing whether there are any statistically significant differences between the means of patients’ survey participation across experimental conditions

Degrees of Freedom Sum of Squares Mean Square F value Pr(> F)


Experiment Conditions 2 0.05 0.02597 1.161 0.313
Residual 9784 218.82 0.02236
*p < 0.05, **p < 0.01, ***p < 0.001

The results in Table 14 (using a Probit model) show that the experimental conditions have no statistically
significant effect on patients’ likelihood of participating in the survey. This result is robust from the simplest
model to the fully specified model, indicating that the experimental conditions are not correlated with the
likelihood of survey participation.
Appendix B: Sample Selection
Two additional variables in the first stage of the two-stage Heckman model. We found that the variables
Is.HandInjuries and Is.Weekend affect the likelihood of survey participation but are independent of wait
satisfaction. Table 16 shows that Is.HandInjuries has a statistically significant effect on survey participation
in both the simple model (column 1) and the fully specified model (column 2) that includes all covariates.
Columns 3 and 4 show that it does not correlate with wait satisfaction in either model. These results
demonstrate that Is.HandInjuries and Is.Weekend satisfy the exclusion restriction.

Table 14 The effect of different experimental conditions on the likelihood that a patient will participate in the
survey. Selection is a binary variable indicating that a patient participated in the survey.

Dependent variable:
Selection
(1) (2) (3)
D2 0.103 0.095 0.101
(0.068) (0.067) (0.067)
D3 0.071 0.069 0.072
(0.070) (0.071) (0.070)
Triage level 0.003∗∗ 0.003∗∗
(0.001) (0.001)
Actual W 0.0001∗∗∗ 0.0001∗∗
(0.00003) (0.00003)
is.AfterStateEmergency 0.006∗∗ 0.007∗∗
(0.003) (0.003)
Is.Accompanied 0.003 0.002
(0.002) (0.002)
NumPatientsFirstHour 0.0004
(0.0003)
Age −0.0001
(0.00005)
Is.male −0.005∗∗∗
(0.002)
Constant −2.058∗∗∗ −0.005 −0.001
(0.051) (0.005) (0.006)
Observations 9787 9787 9787
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Table 15 Regression results for models looking at the impact of wait-time information provision on patient’s
wait satisfaction, from the simplest model (column 1) to the fully specified model (column 3). The fourth column is the result for the second-stage model discussed in §4.5, using two instrumental variables to address endogeneity.

Dependent variable:
Wait Satisfaction
Cumulative Logistic Heckit
(1) (2) (3) (4)
WT provision indicator 0.380∗ 0.365∗ 0.357∗ 0.350∗∗
(0.106) (0.137) (0.130) (0.104)
Actual W −0.0002 −0.0001 −0.0003
(0.0001) (0.0001) (0.0003)
PainLevel −0.035 −0.031
(0.087) (0.085)
Age −0.003 −0.006
(0.010) (0.012)
Is.Male 0.882∗∗ 0.765∗
(0.184) (0.365)
ESI 4|5 −0.020 −0.134
(0.366) (0.332)
Is.Accompanied 0.285∗ 0.217
(0.108) (0.261)
NumPatientsFirstHour −0.071 −0.088
(0.066) (0.069)
Is.AfterStateEmergency −1.053∗ −1.530∗∗
(0.473) (0.770)
λ̂ −0.031
(0.887)
Observations 151 151 151 151
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Table 16 The effect of having hand injuries (Is.HandInjuries) and arriving at the ED during a weekend (Is.Weekend) on patients’ likelihood of survey participation and on their waiting satisfaction.

Dependent variable:
Selection Satisfaction WT
probit cumulative
logistic
(1) (2) (3) (4) (5) (6)
Is.HandInjuries 0.241∗∗ 0.522∗∗∗ 0.682 0.355
(0.185) (0.190) (0.665) (0.730)
Is.Weekend −0.120∗ −0.115∗∗ 0.300 0.461
(0.050) (0.031) (0.468) (0.496)
WT provision indicator −0.043 0.678∗∗
(0.080) (0.030)
Actual W 0.003∗∗∗ −0.008
(0.001) (0.005)
numPatientsFirstHour 0.021 −0.075
(0.014) (0.068)
ESI 4|5 0.228∗∗∗ −0.040
(0.079) (0.395)
Is.Accompanied 0.097 0.223
(0.079) (0.424)
Is.AfterStateEmergency 0.284∗∗∗ −1.149∗∗
(0.102) (0.552)
Age −0.003 −0.007
(0.002) (0.011)
Is.Male −0.284∗∗∗ 1.386∗∗∗
(0.080) (0.427)
Constant −2.041∗∗∗ −2.311∗∗∗ −2.520∗∗∗
(0.028) (0.041) (0.167)
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Electronic Companion for


“Emergency Department Experiment in Displaying an Algorithmic Wait Time Prediction”
Appendix EC.1: Details of the Online Experiment
We conducted a randomized online experiment on Prolific, a platform for conducting online research
similar to Amazon Mechanical Turk. We posted our study to the U.S. pool and took advantage of
the “balance sample“ feature offered by the platform to ensure that we distributed our study to
an equal number of male and female participants.
If participants open our study, they see an information window where we inform them of the purpose, setup, payment ($0.5 upon completion per person), and length of the study (∼2.5 minutes). There is an “open” button for them to enter the study.
After completing the consent form (shown in Figure EC.1), participants are asked to imagine
that they are being triaged by a nurse who has determined that they are of low-acuity and can
safely wait. We then display the wait time information to the participants on a ‘TV screen’ (please
see Figures EC.8-EC.10 for examples).
Next, we ask participants the following questions:
Q1. How long they expect to wait before being seen by a physician (Figure EC.2).
Q2. Whether they would choose to leave immediately (Figure EC.3).
Q3. How long they would be willing to wait before leaving the ED without being seen (Figure EC.4).
For the first question in particular, we ask participants to distribute 100 points into 9 time-
interval buckets. This is designed to better understand people’s subjective distribution of wait time
outcomes.
After participants answer the above questions, their actual wait time is revealed to them.
We then ask them to rate the quality (Q4) of the wait time information as either “essentially
correct” or “essentially misleading” and the understandability (Q5) of the wait time information
on a scale from 1 to 10, with 1 being “difficult to understand“ and 10 being “easy to understand”.
We also ask them to rate how likely they are to leave complaints (Q6) about the waiting experience
on a scale from 1 to 10, with 1 being “very likely” and 10 being “very unlikely”. Finally, we ask for
their satisfaction with the waiting experience in the ED (Q7) on a scale from 1 to 10, with 1 being
“very negative” and 10 being “very positive” (details of these questions can be found in Figure
EC.5).
Note that the ‘actual wait time of 70 minutes’ in the first question in Figure EC.5 is only one of
three possible scenarios. We will discuss all experimental conditions in more detail below.

Figure EC.1 The description of the online study.

Figure EC.2 Q1: The set of survey questions eliciting participants’ subjective wait time distribution

EC.1.1. Experiment Design and Results


We conducted two sets of online experiments.
Online Experiment 1. In our first online study, we employed a between-subject design with
three experimental conditions: 3 (different wait time information) × 1 (same actual wait time).
Each participant was randomly assigned to one of these conditions. The three different wait time

Figure EC.3 Q2: A survey question asking if the participant would leave immediately

Figure EC.4 Q3: A question asking how long the participant would be willing to wait before leaving.

information cases included two point estimates of 40 and 41 minutes, and a no information case
(as shown in Figures EC.7 - EC.9). For all three cases, participants’ true wait time was 50 minutes.
Thus, for all participants, there was a prediction error of about 10 minutes, with their actual wait
time being about 10 minutes longer than the estimated wait time provided to them.
Note a modification in this first trial: when we asked participants the first question regarding how long they expected to wait before being seen, we did not ask them to distribute 100 points into 9 consecutive time-interval buckets. Rather, we asked them to enter, as single numbers, the maximum and minimum amounts of time that they expected to wait before being seen (see Figure EC.6). The results of the first trial are summarized in Table EC.1.

Table EC.1 Summary of Statistics of Online Experiment 1

Wait Time Prediction 41 minutes 40 minutes None


Number of observations 92 91 86
Anticipated maximum wait time in minutes 61 (25) 76 (71) 107 (102)
Anticipated minimum wait time in minutes 34 (12) 30 (17) 27 (27)
Actual wait time in minutes 50 50 50
Perceived information as essentially correct 71% 83% NA
Satisfaction on scale from 1 to 10 5.6(2.4) 5.4(2.3) 4.5(2.2)

Finding 1 We found that when providing an estimated wait time that is an integer multiple
of 10 minutes (despite a prediction error of up to 10 minutes), participants perceive the quality of
the wait time information to be essentially correct, rather than misleading. In this case, when 40
minutes is provided as the estimated wait time, despite the actual wait time being 50 minutes, 83%

Figure EC.5 Q4 - Q7: four questions asking participants to rate the quality and understandability of the estimated wait time information, the likelihood of leaving complaints, and the waiting satisfaction in the ED on a 1 to 10 scale, respectively.

of participants reported that the estimated wait time is essentially correct, rather than misleading
(see Table EC.1).
Finding 2 We found that giving a patient an estimated wait time that is an integer multiple
of 10 minutes leads to a longer expected wait time and a longer maximum amount of time that a
patient is willing to wait before leaving the ED than giving a sharp integer. As shown in Table EC.1,
when 40 minutes is provided, the empirical mean of the “Expected Maximum WT” is higher (76
minutes) than when 41 minutes is provided (61 minutes). Additionally, the empirical mean of the
“Maximum WT before leaving” is higher (101 minutes) when 40 minutes is provided than when 41

Figure EC.6 The question asking the maximum and the minimum amount of time the participant expects to
wait after the estimated wait time information is given.

minutes is provided (84 minutes). We ran two t-tests comparing the means of participants’ responses to each question when 40 minutes is provided versus when 41 minutes is provided. The p-value of the t-test comparing the means of the “Expected Maximum WT” is 0.07, and the p-value of the t-test comparing the means of the “Maximum WT before leaving” is 0.09, indicating that the means under these two experimental conditions differ at the 10% significance level.
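These comparisons are plain two-sample t-tests; in R, with hypothetical vectors holding each group’s responses, each test amounts to a single call (the satisfaction comparisons in Finding 3 follow the same pattern).

# max_wt_40 and max_wt_41: hypothetical vectors of “Expected Maximum WT”
# responses from the 40-minute and 41-minute display groups, respectively.
t.test(max_wt_40, max_wt_41)   # Welch two-sample t-test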
Finding 3 Our findings show that providing wait time estimates leads to higher self-reported
wait time satisfaction compared to not providing this information. As shown in Table EC.1, the
mean wait time satisfaction ratings are 5.4 and 5.6 out of 10 when 40 and 41 minutes are provided,
respectively, which are both higher than the rating of 4.5 when no wait time information is given.
We conducted two t-tests comparing the responses to the “Wait Satisfaction” question under the different conditions. The p-values of the first and second t-tests were 0.005 and 0.001, respectively, indicating that self-reported satisfaction is significantly higher when wait time information is provided.
Online Experiment 2. In the second set of experiments, we used a between-subjects design with
3 levels of wait time information (40 minutes, 41 minutes, and 40-60 minutes) × 3 levels of actual
wait time (30 minutes, 50 minutes, and 70 minutes), for a total of 9 experimental conditions.
Participants were randomly assigned to one of the 9 conditions. The three different estimated wait
times are shown in Figures EC.8-EC.10, including two point estimates and one interval estimate.
Next, we describe how we compute the mean and the coefficient of variation of each participant’s subjective distribution. When we elicit a participant’s subjective distribution (shown in Figure EC.2), we ask the participant to distribute 100 points into 9 consecutive time-interval buckets. The first bucket is less than 10 minutes, the second bucket is 10 to 19 minutes, and so on, up to the ninth

bucket, which is greater than 80 minutes. One can view participant i’s input to bucket j, j ∈ {1, 2, . . . , 9}, as a weight w_ij, and we approximate the weighted mean (μ_i) and the coefficient of variation (cv_i) of the subjective distribution using Equations (EC.1) and (EC.3):
\[
\mu_i = \frac{1}{100}\sum_{j=1}^{9} w_{ij}\, b_j, \tag{EC.1}
\]
\[
\sigma_i^2 = \frac{1}{100}\sum_{j=1}^{9} w_{ij}\,(b_j - \mu_i)^2, \tag{EC.2}
\]
\[
cv_i = \frac{\sigma_i}{\mu_i}, \tag{EC.3}
\]

where b_j is the mean of time interval j, with b = [5, 15, 25, 35, 45, 55, 65, 75, 85]. The values for µ̄ and c̄v reported in Table EC.2 are obtained by taking the averages of {µ_i} and {cv_i}.
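For a single participant’s 100-point allocation across the nine buckets, the computation in Equations (EC.1)–(EC.3) reduces to a few lines of R; the allocation w below is a made-up example.

b <- c(5, 15, 25, 35, 45, 55, 65, 75, 85)   # bucket midpoints (minutes)
w <- c(0, 5, 10, 30, 30, 15, 5, 5, 0)       # hypothetical 100-point allocation

mu    <- sum(w * b) / 100                   # weighted mean, Equation (EC.1)
sigma <- sqrt(sum(w * (b - mu)^2) / 100)    # weighted SD, from Equation (EC.2)
cv    <- sigma / mu                         # coefficient of variation (EC.3)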

Table EC.2 Summary of statistics of Online Experiment 2: µ̄ and c̄v are the averages of the means and coefficients of variation of the subjective distributions (calculated using Equations (EC.1) and (EC.3)) of the corresponding participants.

Wait Time Prediction in minutes 40 40-60


Number of observations 317 299

Mean anticipated wait time 46(12) 52(13)


C.V. anticipated wait time 0.30 0.37
(0.16) (0.20)
Actual wait time in minutes 30 50 70 30 50 70
Number of observations 97 121 99 123 90 86
Perceived information as essentially correct 87% 86% 13% 75% 95% 59%
Satisfaction on scale from 1 to 10 6.8(2.4) 5.0(2.2) 3.6(1.9) 6.2(2.7) 5.4(2.4) 4.7(1.8)

Figure EC.7 A screen showing no wait time information

Finding 4 In addition to Finding 1, we also found that when providing an estimated wait
time that is an integer multiple of 10 minutes, participants perceive the quality of the wait time

Figure EC.8 A screen showing the estimated wait time is 40 minutes for low-acuity patients.

Figure EC.9 A screen showing the estimated wait time is 41 minutes for low-acuity patients.

Figure EC.10 A screen showing the estimated wait time is 40 - 60 minutes for low-acuity patients.

information to be essentially correct if the actual wait time is within 10 minutes of the estimate. As reported in Table EC.2, in the ‘40’ group with either 30 minutes or 50 minutes as the actual wait time, 87% and 86% of participants, respectively, reported that the provided estimate was essentially correct rather than misleading. This suggests that participants perceive an estimate of 40 minutes to be essentially correct if their actual wait time falls within the range of 30 to 50 minutes.
Related to this, in the ‘40-60’ group, 75%, 95%, and 59% of the participants, with either 30, 50, or 70 minutes as the actual wait time, respectively, reported the provided estimate as essentially correct rather than misleading. Thus, participants perceive that 40–60 minutes is mostly correct

when their actual wait time falls inside 30–70 minutes, i.e., from 10 minutes below the lower bound to 10 minutes above the upper bound.
Finding 5 The results show that providing an interval estimate instead of a point estimate
leads to higher self-reported wait time satisfaction, especially when the actual wait time is long. As
shown in Table EC.2, when the participants’ actual wait time was 70 minutes, the ‘40-60’ group has the highest average self-reported waiting satisfaction. We ran a t-test comparing the responses to this question for the ‘40’ versus ‘40-60’ groups with an actual wait time of 70 minutes; the p-value is less than 0.0001. This may be because participants perceive the ‘40-60’ estimate as essentially correct, while the ‘40’ estimate is perceived as incorrect when the actual wait time is 70 minutes, leading to negative feelings when rating the waiting experience in those cases.
Finding 6 The results show that, compared to providing a point estimate, providing an interval estimate induces a subjective distribution with a higher mean, i.e., it increases participants’ expected wait time. Table EC.2 reports that the average of the µ_i is highest in the ‘40-60’ condition. We ran a t-test comparing the means of the corresponding subjective distributions for ‘40’ versus ‘40-60’ when the actual wait time was 70 minutes. The results show that the means of the induced subjective distributions are significantly different, with p-values less than 0.0001.

EC.1.2. Details on estimating the wait time that would have occurred for patients who LWBS
For the machine learning methods discussed in §4.1, the features that we used are listed below.
Among these variables, we found that the total number of backlogged diagnostic tests that have been sent, are being processed, or are waiting to be examined by physicians had the highest predictive power.
All variables used in our model building process are shown below, and the summary statistics
of these variables are in Table EC.3.
• Experiment conditions: a categorical variable indicating the experimental condition that the
patient experiences while waiting: either D1, D2 or D3.
• pod: a categorical variable with three levels indicating the time window in which the patient arrived: 00:00 - 08:00, 08:00 - 16:00, 16:00 - 00:00.
• month: a categorical variable for the 12 months in a year.
• year: a categorical variable for the year.
• season: a categorical variable with 4 levels indicating the astronomical seasons of the year in
Northern California: Spring (March 21 to June 20), Summer (June 21 to September 20), Autumn
(September 21 to December 20), and Winter (December 21 to March 20).
• ESI level: a categorical variable with 5 levels indicating the triage levels from 1 (most urgent)
to 5 (least urgent).

• Mode of arrival: a categorical variable with 9 levels indicating the patient’s way of arrival: ambulance, auto driven by oneself, auto accompanied by family members or friends, bus, taxi, on foot, wheelchair, escorted by police, or other.
• Complaint types: a categorical variable with 12 different levels indicating patient’s complaint
types including “Abdominal Pain”, “Medical Clearance”, “Influenza Like Illness”, “Multiple
Complaints”, “Headache”, “Chest Pain”, “Rash”, “Back Pain”, “fever”, “Cough”, “Dizzi-
ness”, or other complaint types.
• is.weekend: an indicator variable indicating whether the patient arrives during a weekend.
• is.fasttrack: an indicator variable indicating whether the patient arrives while the ‘Fast Track’ is in effect. SMMC operates a Fast Track: a dedicated set of providers treat only low-acuity patients, mostly ESI level 4 and 5 patients.
• is.busy: an indicator variable indicating whether the patient arrives at SMMC when the system is highly congested. We define the busy status as the ED occupancy level exceeding 50% of the maximum patient census over the month. This metric is motivated by a similar performance measure in Kim et al. (2015).
• is.ICU: an indicator variable indicating whether the patient is assigned to the ICU after triage.
• is.covid: an indicator variable indicating whether the patient is suspected to be COVID-related, i.e., showing COVID-related symptoms.
• is.afterStateEmergency: an indicator variable indicating whether the patient arrives after March 4th, 2020, the first time Governor Gavin Newsom declared a state of emergency due to COVID-19.
• N : a numerical variable indicating the total number of patients in the ED upon the patient’s
arrival.
• N_lab^1: a numerical variable indicating the total number of laboratory tests that have been ordered but are not yet being processed at the lab unit.
• N_lab^2: a numerical variable indicating the total number of laboratory tests being processed at the lab unit.
• N_lab^3: a numerical variable indicating the total number of laboratory tests waiting to be evaluated.
• N_rad^1: a numerical variable indicating the total number of radiology tests that have been ordered but are not yet being processed.
• N_rad^2: a numerical variable indicating the total number of radiology tests being processed.
• N_rad^3: a numerical variable indicating the total number of radiology tests waiting to be evaluated in the ED.

Here we provide an overview of the diagnostic tests, including laboratory (lab) tests and radiology (rad) tests, at SMMC. There are 147 different lab tests in our data set; the top five most ordered lab tests are Urinalysis, Complete Blood Count (CBC), Comprehensive Metabolic Panel (CMP), Pregnancy Test, Urine (HCG), and Influenza A and B Antigen (AG)5. These five most-ordered lab tests comprise 50% of the total number of lab tests ordered by physicians. There are 167 different rad tests in our data set. The most ordered rad tests at SMMC are Electrocardiogram (EKG), Posteroanterior (PA) chest view, Portable Chest X-ray, CT Scan for Head Without Contrast, and Abdomen and Pelvis CT Scan with Contrast. These five most-ordered rad tests comprise 50% of the total number of rad tests ordered by physicians.
The best-performing model among the six methods that we considered (see Table EC.7 for details) is constructed using a random forest. The out-of-sample mean squared error is 117.2 (standard error 3.1), in squared minutes. The hyperparameters that we tuned with 5-fold cross-validation on the training set are the number of trees in the forest, the maximum number of features considered for splitting a node, the maximum depth (number of levels) of each decision tree, the minimum number of data points placed in a node before the node is split, and the minimum number of data points allowed in a leaf node. We use the “caret” package in R to perform the hyperparameter tuning.
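As a sketch of this tuning step: caret’s built-in random forest methods expose only a subset of these hyperparameters (for example, the “ranger” method tunes mtry, splitrule, and min.node.size), so the code below is an illustrative simplification with a hypothetical training frame train_set and outcome W, not our full grid.

library(caret)    # cross-validated model tuning
library(ranger)   # random forest backend

set.seed(1)
ctrl <- trainControl(method = "cv", number = 5)   # 5-fold cross-validation

grid <- expand.grid(
  mtry          = c(5, 10, 20),      # features considered at each split
  splitrule     = "variance",        # regression split criterion
  min.node.size = c(5, 10, 25)       # minimum data points in a leaf node
)

rf_fit <- train(W ~ ., data = train_set, method = "ranger",
                trControl = ctrl, tuneGrid = grid, num.trees = 500)
rf_fit$bestTune   # hyperparameters minimizing cross-validated RMSE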

EC.1.3. Results for the Fully Specified Models


Tables EC.8, EC.9, EC.10, and EC.11 show the results for the fully specified models.

EC.1.4. Placebo Tests Results


Tables EC.12, EC.13, and EC.14 show the results for the placebo tests.

5. Influenza A and B are the two most common types of influenza that cause epidemic seasonal infections.

Table EC.3 Summary of statistics for variables in building the machine learning model for predicting the wait
time for patients who LWBS.
Factor Variables Count or Mean (Standard Deviation)
pod (00:00 - 08:00): 7256, (08:00 - 16:00): 25196, (16:00 - 00:00): 20558
Month January - December; Table EC.4 shows the count of patients’ arrivals per month.
Year 2019, 2020
seasons Table EC.4 shows the count of patient’s arrival per season.
ESI level 1: 47, 2: 2771, 3: 25458, 4:22834, 5:2593
Mode of Arrival Table EC.5 shows the count of each arrival method.
Experiment conditions D1:17361, D2:18253, D3:18089
Complaint types Table EC.6 shows the count of top 20 chief complaint types.
Indicator Variables
is.weekend 1 : 13293 0: 40410
is.fasttrack 1: 24796, 0: 28907
is.busy 1: 26433, 0: 27256
is.ICU 1: 447, 0: 53256
is.covid 1: 626, 0: 53077
is.afterstateEmergency 1: 18749, 0:34954
Numerical Variables
N 15.7 (7.3)
N_lab^1 28.1 (28.3)
N_lab^2 10.4 (9.1)
N_lab^3 16.1 (12.5)
N_rad^1 4.5 (1.7)
N_rad^2 0.9 (0.8)
N_rad^3 1.6 (1.2)

Table EC.4 The count of patients’ arrivals per month from April 1st, 2019 to December 31st, 2020, and the number of COVID-19-related cases (the sum of COVID-suspected and COVID-confirmed cases, with ICD.9 codes “U07.1” and “U07.2”) during each month.

year month patient arrivals COVID-19 related


2019 - 4 3310 0
2019 - 5 3328 0
2019 - 6 3141 0
2019 - 7 3097 0
2019 - 8 3100 0
2019 - 9 3080 0
2019 - 10 3192 0
2019 - 11 3061 0
2019 - 12 3158 0
2020 - 1 3475 0
2020 - 2 3156 0
2020 - 3 920 3
2020 - 4 675 2
2020 - 5 1793 91
2020 - 6 2017 57
2020 - 7 2180 99
2020 - 8 2229 102
2020 - 9 2323 86
2020 - 10 2397 59
2020 - 11 2062 122
2020 - 12 2009 5

Table EC.5 Modes of arrival for patients who came to SMMC for ED service from April 1st, 2019 to December 31st, 2020

Mode of Arrival count


Ambulance 5148
Auto driven by oneself 15444
By foot 5551
Wheel Chair 538
Auto accompanied by family or friends 23430
Bus 978
Police 1149
Taxi (including services provided by ride sharing companies) 821
Others 635

Table EC.6 Count of chief complaint types for patients who came to SMMC for ED service from April 1st, 2019 to December 31st, 2020

Top 20 Chief Complaint Types count


Abdominal Pain 3497
Medical Clearance 2823
Influenza like illness 2528
Chest pain 1841
Headache 1530
Rash 1317
Cough 1102
Multiple Complaints 975
Back pain 973
Dizziness 905
Shortness of breath 828
Fever 824
Dysuria 747
Sore throat 726
Alcohol intoxication 712
Flank pain 579
Dental pain 577
Medication Refill 546
Epigastric pain 532
Acute Respiratory Illness 511

Table EC.7 Test set mean squared error (and standard error) of methods for predicting wait time for patients
in SMMC.

Predicting Method Mean Squared Error (standard error)


Lasso 978.2 (30.1)
Ridge 914.3 (31.3)
Elastic Net 654.3 (58.9)
Neural Network (with maximum 2 hidden layers) 543.2 (25.7)
XGBoost 317.2 (11.5)
Random Forest 117.2 (3.1)

Table EC.8 The effect of providing wait time information on the likelihood that a low-acuity patient will LWBS
and on the likelihood that an ESI 4|5 minor patient will LWBS.

Dependent variable:
ESI 3|4|5 logit(LWBS) ESI 4|5 minor logit(LWBS)
(1) (2) (3) (4) (5) (6)
WT provision indicator −0.155∗ −0.174∗∗ −0.186∗∗ −0.356∗∗ −0.387∗∗∗ −0.396∗∗∗
(0.091) (0.087) (0.079) (0.103) (0.054) (0.058)
Actual W 0.009∗∗∗ 0.010∗∗∗ 0.030∗∗ 0.020∗∗
(0.003) (0.003) (0.009) (0.009)
WT provision indicator * Actual W 0.001∗∗ 0.001∗∗ 0.003∗ 0.003∗
(0.0003) (0.0003) (0.001) (0.001)
NumPatientsFirstHour −0.164∗∗∗ −0.307∗∗
(0.043) (0.080)
Pod (08:00 - 16:00) 0.818∗∗ 1.705∗
(0.349) (1.074)
Pod (16:00 - 24:00) 0.500 1.414
(0.346) (1.369)
Is.Male 0.076 −0.116
(0.194) (0.224)
Age −0.009 −0.011∗
(0.005) (0.006)
Triage pain level −0.174 −0.114
(0.365) (0.353)
Is.Accompanied −0.899∗∗∗ −0.734∗∗
(0.247) (0.292)
Is.AutoSelf −0.229 −0.381
(0.230) (0.258)
Is.Weekend −0.271 −0.247
(0.244) (0.266)
Is.AfterStateEmergency −0.225∗∗ −0.314
(0.095) (0.253)
ESI 4|5 1.225∗∗∗
(0.231)
Constant −5.643∗∗∗ −6.131∗∗∗ −6.426∗∗∗ −3.848∗∗∗ −4.189∗∗∗ −2.889∗∗∗
(0.409) (0.572) (1.294) (0.176) (0.263) (0.511)
Observations 9,787 9,787 9,787 581 581 581
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
Table EC.9 The effect of providing no information versus a single number estimate (D2) versus an estimated wait time interval (D3) on low-acuity patients’ likelihood of leaving the ED without being seen, using data from patients who fall into three different scenarios: those who experience a Type 1 error, those who experience no error, and those who experience a Type 2 error.

Dependent variable: logit(LWBS)
Type 1 error No error Type 2 error
(1) (2) (3) (1) (2) (3) (1) (2) (3)


D2 −0.201∗ −0.208∗ −0.209∗ −0.351∗∗ −0.347∗∗ −0.343∗∗ 0.308∗∗ 0.303∗∗ 0.305∗∗
(0.205) (0.209) (0.203) (0.203) (0.207) (0.205) (0.177) (0.183) (0.181)
D3 −0.271∗ −0.231∗ −0.232∗ −0.369∗∗∗ −0.368∗∗∗ −0.368∗∗∗ 0.217∗∗∗ 0.218∗∗∗ 0.218∗∗∗
(0.235) (0.227) (0.203) (0.101) (0.105) (0.104) (0.113) (0.107) (0.103)
Actual W 0.009 0.010 0.007∗∗ 0.007∗∗ 0.005∗∗∗ 0.005∗∗∗
(0.019) (0.016) (0.015) (0.005) (0.002) (0.002)
NumPatientsFirstHour −0.183∗∗∗ −0.210∗∗∗ −0.310∗∗∗
(0.043) (0.063) (0.086)
Pod (08:00 - 16:00) 0.569 0.571 0.603
(0.634) (0.627) (0.638)
Pod (16:00 - 24:00) 0.066 0.057 0.056
(0.197) (0.192) (0.323)
Is.Weekend −0.406 −0.206 −0.306
(0.478) (0.472) (0.558)
Is.Male 0.222 0.231 0.237
(0.273) (0.293) (0.301)
Age −0.007 −0.007 −0.007
(0.008) (0.008) (0.009)
Is.Accompanied −1.309∗∗∗ −1.012∗∗∗ −1.827∗∗∗
(0.340) (0.340) (0.373)


Is.AutoSelf −0.720∗∗ −0.810∗∗ −0.853∗∗
(0.395) (0.326) (0.526)


ESI 4|5 1.373∗∗∗ 1.439∗∗∗ 1.795∗∗∗
(0.432) (0.312) (0.532)
Is.AfterStateEmergency −1.107∗ −1.189∗ −1.237∗
(0.610) (0.593) (0.621)
Constant −4.428∗∗∗ −5.112∗∗∗ −5.198∗∗∗ −4.478∗∗∗ −6.703∗∗∗ −6.098∗∗∗ −4.258∗∗∗ −5.651∗∗∗ −5.147∗∗∗
(0.170) (0.268) (0.734) (0.390) (0.291) (0.561) (0.208) (0.351) (0.632)
Observations 5,578 5,578 5,578 7,268 7,268 7,268 4,715 4,715 4,715
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Table EC.10 The effect of experiencing an over-estimation error or an under-estimation error on the likelihood
that ESI 3|4|5 patients and ESI 4|5 minor patients will leave the ED without being seen by physicians.

Dependent variable:
ESI 3|4|5 logit(LWBS) ESI 4|5 minor logit(LWBS)
(1) (2) (3) (4) (5) (6)
Is.Type1 −0.404∗∗ −0.561∗∗ −0.461∗∗ −0.627∗∗ −0.674∗∗ −0.741∗∗
(0.213) (0.213) (0.214) (0.185) (0.190) (0.191)
Is.Type2 0.903∗∗ 0.919∗∗ 0.876∗∗ 0.901∗∗ 0.981∗∗ 0.986∗∗
(0.211) (0.215) (0.215) (0.102) (0.103) (0.102)
Actual W 0.0003∗∗∗ 0.0004∗∗∗ 0.0003∗∗∗ 0.0003∗∗∗
(0.0001) (0.0001) (0.0001) (0.0001)
NumPatientsFirstHour 0.0002 0.001∗
(0.0002) (0.0003)
Pod (08:00 - 16:00) 0.741 1.019∗
(0.861) (1.069)
Pod (16:00 - 24:00) 0.689 1.571∗
(0.698) (0.867)
Is.Weekend −0.063 −0.094
(0.300) (0.353)
Is.Male 0.158 0.185
(0.250) (0.298)
Age −0.011 −0.017∗∗
(0.007) (0.009)
Is.Accompanied −1.246∗∗∗ −1.488∗∗∗
(0.310) (0.393)
Is.AutoSelf −0.495∗ −0.286
(0.291) (0.328)
ESI 4|5 0.915∗∗∗
(0.277)
Constant −7.348∗∗∗ −7.677∗∗∗ −7.176∗∗∗ −7.401∗∗∗ −7.514∗∗∗ −7.185∗∗∗
(0.103) (0.147) (1.351) (1.641) (1.214) (1.037)
Observations 5,853 5,853 5,853 357 357 357
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Table EC.11 The effects of over-estimation error and under-estimation error on the likelihood of low-acuity patients leaving the ED without being seen by physicians

Dependent variable:
ESI 3|4|5 logit(LWBS) ESI 4|5 minor logit(LWBS)
∗∗ ∗∗ ∗∗
∆ × IIs.Type1 −0.015 −0.013 −0.013 −0.022∗∗ −0.024∗∗ −0.021∗∗
(0.008) (0.008) (0.007) (0.010) (0.010) (0.009)
∆ × IIs.Type2 0.022∗∗ 0.033∗∗ 0.029∗∗ 0.031∗∗ 0.034∗∗ 0.031∗∗
(0.008) (0.012) (0.013) (0.015) (0.016) (0.016)
Actual W 0.0004 0.0004 0.0002 0.0003
(0.0003) (0.0003) (0.0002) (0.0002)
NumPatientsFirstHour 0.0002 0.001
(0.0002) (0.0003)
Pod (08:00 - 16:00) 0.678 1.902∗
(0.873) (1.079)
Pod (16:00 - 24:00) 0.658 1.466∗
(0.720) (0.886)
Is.Weekend −0.091 −0.139
(0.299) (0.352)
Is.Male 0.172 0.209
(0.250) (0.298)
Age −0.011 −0.017∗
(0.007) (0.009)
Is.Accompanied −1.250∗∗∗ −1.496∗∗∗
(0.309) (0.393)
Is.AutoSelf −0.514∗ −0.315
(0.291) (0.328)
ESI 4|5 0.897∗∗∗
(0.277)
Constant −4.591∗∗∗ −5.059∗∗∗ −5.026∗∗∗ −4.142∗∗∗ −4.526∗∗∗ −4.079∗∗∗
(0.204) (0.406) (0.778) (0.235) (0.472) (0.885)
Observations 5,853 5,853 5,853 357 357 357
Note: ∗p<0.05; ∗∗p<0.01; ∗∗∗p<0.001

Table EC.12 Placebo tests using patients’ data during the time that the TV screen is off: The effect of
providing wait time information on the likelihood that a low-acuity patient will LWBS and on the likelihood that an ESI 4|5 minor patient (an ESI level 4|5 patient with chief complaint “dental pain” or “medication refill”) will LWBS.

LWBS TV OFF
ESI 3|4|5 logit(LWBS) ESI 4|5 minor logit(LWBS)
(1) (2) (3) (4) (5) (6)
WT provision indicator 0.052 −0.042 0.021 0.456 0.250 0.240
(0.120) (0.120) (0.121) (0.822) (1.212) (1.232)
Actual W 0.008∗∗∗ 0.009∗∗∗ 0.030∗∗ 0.020∗∗
(0.002) (0.002) (0.002) (0.002)
WT provision indicator * Actual W 0.001 0.001 0.007 0.008
(0.002) (0.002) (0.023) (0.024)
NumPatientsFirstHour −0.098∗∗∗ −0.371∗∗
(0.023) (0.179)
Pod (08:00 - 16:00) 0.911∗∗∗ 1.911∗
(0.218) (1.218)
Pod (16:00 - 24:00) 0.841∗∗∗ 1.841∗
(0.214) (1.214)
Is.Male 0.227∗∗ −0.347
(0.103) (0.776)
Is.Weekend −0.431∗∗ −0.580
(0.232) (0.930)
Age −0.006∗∗ −0.032
(0.003) (0.024)
Triage pain level −0.097 −0.194
(0.301) (0.386)
Is.Accompanied −0.984∗∗∗ −0.901∗∗∗
(0.126) (0.100)
Is.AutoSelf −0.309∗∗ −0.749∗
(0.125) (0.342)
Is.AfterStateEmergency −0.114 −0.214
(0.132) (0.253)
ESI 4|5 1.150∗∗∗
(0.119)
Constant −4.348∗∗∗ −4.677∗∗∗ −5.176∗∗∗ −4.401∗∗∗ −4.514∗∗∗ −4.185∗∗∗
(0.083) (0.117) (0.311) (0.711) (1.074) (1.017)
Observations 35,041 35,041 34,924 5347 5347 5347
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Table EC.13 Placebo tests: The effect of waiting less or longer than the predicted wait time on the likelihood
that a low-acuity patient and an ESI 4|5 minor patient will leave the ED without being seen by physicians.

LWBS TV OFF
ESI 3|4|5 logit(LWBS) ESI 4|5 minor logit(LWBS)
(1) (2) (3) (4) (5) (6)
Is.Type1 0.224 0.287 0.276 0.331 -0.131 -0.247
(0.719) (0.720) (0.720) (0.508) (0.503) (0.507)
Is.Type2 0.459 0.312 0.297 0.201 0.381 0.227
(0.716) (0.724) (0.725) (0.502) (0.603) (0.602)
Actual W 0.0002∗∗∗ 0.0002∗∗∗ 0.0002∗∗∗ 0.0002∗∗∗
(0.00004) (0.00004) (0.00004) (0.00005)
NumPatientsFirstHour 0.0002 0.0001
(0.0001) (0.0002)
Pod (08:00 - 16:00) 0.870 0.976
(0.521) (0.668)
Pod (16:00 - 24:00) 0.374 0.559
(0.430) (0.468)
Is.Weekend −0.243 −0.094
(0.184) (0.353)
Is.Male 0.084 0.182
(0.146) (0.165)
Age −0.003 −0.003
(0.004) (0.004)
Is.Accompanied −0.345∗∗ −0.626∗∗
(0.098) (0.196)
Is.AutoSelf −0.375∗∗ −0.606∗∗∗
(0.183) (0.207)
Is.AfterStateEmergency 0.216 0.276
(0.175) (0.195)
ESI4|5 1.253∗∗∗
(0.182)
Constant −4.700∗∗∗ −5.048∗∗∗ −6.388∗∗∗ −4.7380∗∗∗ −5.037∗∗∗ −5.313∗∗∗
(0.710) (0.714) (0.833) (0.812) (0.754) (0.894)
Observations 17,747 17,747 17,681 2327 2327 2327
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Table EC.14 Placebo tests: The effect of the magnitude of the error between the predicted wait time and the
patient’s actual wait time on the likelihood that a low-acuity patient and an ESI 4|5 minor patient will leave the
ED without being seen by physicians.

LWBS TV OFF
ESI 3|4|5 logit(LWBS) ESI 4|5 minor logit(LWBS)
(1) (2) (3) (4) (5) (6)
∆ × Iis.Type1 0.011 −0.0004 0.0002 0.016 −0.001 −0.00004
(0.009) (0.007) (0.008) (0.018) (0.009) (0.009)
∆ × Iis.Type2 0.013 −0.007 −0.006 -0.012 −0.014 −0.016
(0.008) (0.007) (0.007) (0.023) (0.027) (0.028)
Actual W 0.0003∗∗∗ 0.0003∗∗∗ 0.0004∗∗∗ 0.0005∗∗∗
(0.0001) (0.0001) (0.0001) (0.0001)
NumPatientsFirstHour 0.0002 0.0002
(0.0001) (0.0002)
Pod (08:00 - 16:00) 0.808 0.886
(0.535) (0.590)
Pod (16:00 - 24:00) 0.386 0.333
(0.443) (0.486)
Is.Weekend −0.244 −0.090
(0.183) (0.195)
Is.Male 0.102 0.176
(0.145) (0.165)
Age −0.003 −0.003
(0.004) (0.004)
Is.Accompanied −0.359∗∗ −0.439∗∗
(0.177) (0.196)
Is.AutoSelf −0.396∗∗ −0.609∗∗
(0.183) (0.307)
Is.AfterStateEmergency 0.194 0.204
(0.175) (0.193)
ESI 4| 5 1.364∗∗∗
(0.179)
Constant −4.578∗∗∗ −5.202∗∗∗ −6.463∗∗∗ −5.531∗∗∗ −6.304∗∗∗ −6.791∗∗∗
(0.119) (0.234) (0.495) (0.203) (0.234) (0.495)
Observations 17,747 17,747 17,681 2327 2327 2327
Note: ∗p<0.05; ∗∗p<0.01; ∗∗∗p<0.001
