

Use of Statistical Analysis of Cytologic Interpretation to Determine the Causes of Interobserver Disagreement and in Quality Improvement

Andrew A. Renshaw, M.D.
Kenneth R. Lee, M.D.
Scott R. Granter, M.D.

Division of Cytology, Departments of Pathology, Brigham & Women's Hospital and Harvard Medical School, Boston, Massachusetts.

BACKGROUND. Disagreements in cytologic interpretation can have several causes, including differences in diagnostic threshold and diagnostic accuracy. These can be distinguished by a combination of statistical analyses.

METHODS. For demonstration purposes, a nonrandom collection of 80 cervicovaginal smears, the majority of which (74) were originally diagnosed as atypical cells of undetermined significance (ASCUS), were reviewed by 3 separate observers and classified as either negative, negative and reactive, ASCUS favor reactive, ASCUS not otherwise specified, ASCUS suggestive of a squamous intraepithelial lesion (SIL), low grade SIL, or high grade SIL. The results were compared with corresponding biopsies and analyzed with distribution analysis, the kappa statistic, threshold analysis, and receiver operating characteristic (ROC) curve analysis.

RESULTS. Distribution analysis of diagnoses from the three observers demonstrated statistically significant differences in how cases were classified and a low level of agreement. Kappa analysis confirmed very poor interobserver agreement. Threshold analysis revealed that one observer used a threshold between negative and ASCUS that was statistically more specific but less sensitive than the other observers. ROC curve analysis showed that another observer was more accurate than this observer.

CONCLUSIONS. Variation in cytologic interpretation may have several causes. Distribution, threshold, and ROC analysis allow distinction between differences in diagnostic accuracy and diagnostic thresholds. This approach to analyzing cytologic interpretation may be useful for quality improvement efforts. Cancer (Cancer Cytopathol) 1997;81:212-9. © 1997 American Cancer Society.

KEYWORDS: cytopathology, Papanicolaou smear, cervicovaginal, kappa statistic, receiver operating characteristic curves, statistics.

Address for reprints: Andrew Renshaw, M.D., Department of Pathology, Brigham & Women's Hospital, 75 Francis St., Boston, MA 02115.

Received April 17, 1997; revision received June 10, 1997; accepted June 12, 1997.

© 1997 American Cancer Society

The interpretation of cytologic specimens may vary between laboratories and between observers. For example, in cervicovaginal (Papanicolaou [Pap]) smears, the incidence of atypical squamous cells of undetermined significance (ASCUS) varies considerably between laboratories,1 as does the rate of squamous intraepithelial lesions (SIL) on subsequent biopsies.2-4 Although the interobserver agreement for Pap smears is good over the entire range of diagnoses,5 the interobserver agreement for ASCUS is poor.3,5-7 Although the causes of this variability currently are unknown, there are several different statistical methods available to analyze this problem. These include distribution analysis, kappa statistics, threshold analysis, and receiver operating characteristic (ROC) curve analysis.



TABLE 1
The Distribution of Diagnoses

                          Observer 1          Observer 2          Observer 3
Cytologic diagnosis(b)    Negative(a)  SIL    Negative(a)  SIL    Negative(a)  SIL
Negative                  3            1      4            0      9            4
N-rct                     8            0      3            1      11           10
A-rct                     14           10     5            3      6            7
Atypical                  16           12     19           16     10           3
A-sugsil                  4            4      15           8      9            5
LSIL                      4            3      2            3      3            3
HSIL                      0            1      0            1      0            0

SIL: squamous intraepithelial lesion; N-rct: negative with reactive changes; A-rct: atypical squamous cells of undetermined significance favor reactive changes; Atypical: atypical squamous cells of undetermined significance not otherwise specified; A-sugsil: atypical squamous cells of undetermined significance suggestive of squamous intraepithelial lesion; LSIL: low grade squamous intraepithelial lesion; HSIL: high grade squamous intraepithelial lesion.
(a) Biopsy diagnosis.
(b) Cytologic diagnosis.

Distribution analysis is performed by assigning each diagnostic category a numeric value. The resulting distribution of diagnoses usually will not be normal, and analysis of the distribution usually will require nonparametric methods. The Mann-Whitney U test is a commonly used method to compare medians of nonparametric distributions. Using this test, one can determine whether the median diagnosis of one observer is different from another, and whether one observer is more likely to diagnose a lesion with a greater degree of atypia compared with another observer.
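To make the distribution step concrete, a minimal sketch follows, assuming SciPy is available; the ordinal scale mirrors the 1 to 7 coding used later in the Methods, but the observer arrays are invented placeholders rather than the study data.

```python
# Distribution analysis of ordinal cytologic diagnoses with a Mann-Whitney U test.
# The diagnosis lists below are invented placeholders, not the study data.
from scipy.stats import mannwhitneyu

# Ordinal coding used in the paper: negative = 1 ... high grade SIL = 7.
SCALE = {"negative": 1, "negative-reactive": 2, "ascus-favor-reactive": 3,
         "ascus-nos": 4, "ascus-favor-sil": 5, "lsil": 6, "hsil": 7}

observer_1 = [SCALE[d] for d in ("ascus-nos", "lsil", "ascus-favor-sil",
                                 "ascus-nos", "ascus-favor-reactive")]
observer_3 = [SCALE[d] for d in ("negative", "ascus-favor-reactive",
                                 "negative-reactive", "ascus-nos", "negative")]

# Two-sided test of whether one observer tends to assign a higher degree of atypia.
stat, p_value = mannwhitneyu(observer_1, observer_3, alternative="two-sided")
print(f"U = {stat}, two-sided P = {p_value:.3f}")
```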
The most common method for examining the degree of variability between observers is with the kappa statistic.5,8,9 This statistic measures the amount of diagnostic agreement between several observers or between repeated observations by the same observer. Although the kappa statistic and a consensus diagnosis as a gold standard have been used as a measure of "accuracy,"5 this is not correct. Accuracy is a measure of how often a particular diagnosis agrees with a standard. That standard must be measured independently of the test that is being performed. Using a consensus diagnosis to compare diagnoses made by the same method as the consensus is a measure of the spread or variability of diagnoses, not accuracy. To illustrate this further, it is possible for most observers to agree on an answer (high kappa value) and still be wrong (low accuracy). However, when a consensus diagnosis is the gold standard and kappa analysis is the test, by definition, the consensus diagnosis cannot have a low accuracy.
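A minimal sketch of the kappa calculation follows, assuming scikit-learn; the ratings are invented, and only the pairwise (Cohen) form is shown, which is enough to illustrate that kappa quantifies agreement rather than accuracy.

```python
# Pairwise Cohen's kappa between two observers' categorical diagnoses.
# Ratings are hypothetical placeholders; assumes scikit-learn is installed.
from sklearn.metrics import cohen_kappa_score

observer_1 = ["ascus", "negative", "sil", "ascus", "ascus", "negative"]
observer_2 = ["ascus", "ascus",    "sil", "sil",   "ascus", "negative"]

# Kappa corrects raw percent agreement for the agreement expected by chance alone.
kappa = cohen_kappa_score(observer_1, observer_2)
print(f"Cohen's kappa = {kappa:.2f}")  # < 0.40 poor, > 0.60 good (the paper's cutoffs)
```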
Examining the sensitivity and specificity of diagnoses at particular thresholds (threshold analysis) is one of the most common methods of evaluating cytologic interpretations.10 The sensitivity and specificity will be affected directly by the point at which the low and high threshold for a diagnosis is set. A lower threshold will have a higher sensitivity, and a higher threshold will have a higher specificity. For example, assuming two observers are equally accurate and that one observer's criteria for ASCUS are more sensitive and less specific than the other, it can be concluded that this observer has a lower threshold for ASCUS.
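The dichotomization that underlies threshold analysis can be written out directly. The sketch below, with invented scores and biopsy outcomes, collapses the 1 to 7 ordinal diagnoses at a chosen cutoff and computes sensitivity and specificity against the biopsy result; sens_spec is a hypothetical helper, not part of the study's software.

```python
# Sensitivity and specificity of cytologic calls at a chosen threshold,
# using the binary biopsy outcome as the reference standard.
# Scores follow the paper's 1-7 ordinal scale; the data here are invented.

def sens_spec(scores, biopsy_sil, threshold):
    """Count a case as test-positive when its ordinal score >= threshold."""
    tp = sum(1 for s, d in zip(scores, biopsy_sil) if s >= threshold and d)
    fn = sum(1 for s, d in zip(scores, biopsy_sil) if s < threshold and d)
    tn = sum(1 for s, d in zip(scores, biopsy_sil) if s < threshold and not d)
    fp = sum(1 for s, d in zip(scores, biopsy_sil) if s >= threshold and not d)
    return tp / (tp + fn), tn / (tn + fp)

observer_scores = [1, 3, 4, 5, 6, 2, 4, 7]                          # hypothetical cytology scores
biopsy_sil = [False, False, True, True, True, False, False, True]   # SIL on biopsy?

# Threshold 3 = "ASCUS favor reactive or worse"; threshold 6 = "LSIL or worse".
for cutoff in (3, 6):
    sens, spec = sens_spec(observer_scores, biopsy_sil, cutoff)
    print(f"threshold >= {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Lowering the cutoff raises sensitivity at the cost of specificity, which is exactly the trade-off described above.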
In cytology practice there are often multiple thresholds, or diagnoses. Criteria that result in a higher sensitivity often result in a lower specificity. In this situation, with multiple thresholds and variable sensitivities and specificities, it can be difficult to determine whether one set of criteria is more accurate than another. To assess the overall accuracy of cytologic interpretation, ROC curve analysis is the method of choice.11-18 In brief, ROC curves are constructed by calculating the true-positive rate (sensitivity) and false-positive rate (1 - specificity) of different observers or tests at multiple thresholds. These different thresholds are plotted, and the points connected to give a continuous display of how sensitivity and specificity interact. The area under such a curve is a measure of the overall accuracy of an observer or test. ROC analysis has been used in nongynecologic19-21 and gynecologic cytology,22,23 histology,24 and interlaboratory settings.25 A limitation of ROC analysis is that although this method may be an excellent measure of overall accuracy, it does not determine the accuracy for an individual point or threshold along that curve. For example, it is possible for one observer to have a higher overall accuracy whereas another observer is more accurate at a particular threshold.
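For comparison, an empirical (trapezoidal) ROC sketch is shown below, assuming scikit-learn and invented data; the study itself fits smooth binormal curves with the CORROC2 program described in the Methods, so this is only a simplified stand-in.

```python
# Empirical ROC curve and area under the curve (AUC) from ordinal cytology
# scores, using the binary biopsy outcome as the gold standard.
# Data are hypothetical; the paper uses a maximum-likelihood binormal fit instead.
from sklearn.metrics import roc_auc_score, roc_curve

biopsy_sil      = [0, 0, 1, 1, 1, 0, 0, 1]   # 1 = SIL on biopsy
observer_scores = [1, 3, 4, 5, 6, 2, 4, 7]   # 1-7 ordinal cytologic scores

# Each distinct score acts as a candidate threshold along the curve.
fpr, tpr, thresholds = roc_curve(biopsy_sil, observer_scores)
auc = roc_auc_score(biopsy_sil, observer_scores)

for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold {th}: sensitivity {t:.2f}, 1 - specificity {f:.2f}")
print(f"area under the ROC curve (trapezoidal) = {auc:.2f}")  # 0.5 = chance
```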
To assess and illustrate the advantages of this combined statistical method to distinguish causes of diagnostic disagreement, the authors determined the sources of interobserver variability and differences in diagnostic accuracy using the diagnosis of ASCUS on Pap smears as a model.


FIGURE 1. Distribution of diagnoses for the three observers.

TABLE 2
Diagnoses of the Three Observers(a)

              Observer 1   Observer 2   Observer 3
Mean          3.7          4.1          3.1
Median        4.0          4.0          3.0
Variance      1.7          1.4          2.5

(a) negative = 1; negative with reactive changes = 2; atypical squamous cells of undetermined significance favor reactive changes = 3; atypical squamous cells of undetermined significance not otherwise specified = 4; atypical squamous cells of undetermined significance suggestive of a squamous intraepithelial lesion = 5; low grade squamous intraepithelial lesion = 6; high grade squamous intraepithelial lesion = 7.

METHODS
Case Selection and Review
Cases were retrieved from the files of the Cytology Division of the Department of Pathology, Brigham & Women's Hospital in Boston, Massachusetts. Eighty cervicovaginal smears with biopsy follow-up within 90 days of the smear were selected. Seventy-four were originally diagnosed as ASCUS, 2 as negative, 2 as low grade SIL (LSIL), and 2 as high grade SIL (HSIL). Forty-eight smears were found to be SIL on subsequent biopsy (31 LSIL and 17 HSIL) and 32 were negative. Cases were selected to be a mixture of those in which the original favored cytologic diagnosis (i.e., favor SIL or favor reactive) correlated with the corresponding biopsy and those in which it did not. This selection bias was used to magnify any differences in interobserver variability, diagnostic threshold, or diagnostic accuracy. When reviewing the cases, each author knew that the original diagnosis for most cases was ASCUS, that each had a subsequent biopsy that was either negative or SIL, and that the cases were selected to be difficult. However, the authors did not know the actual distribution of diagnoses, what was favored, or the biopsy results. All cases were diagnosed according to the Bethesda system.26 Biopsy reports were used as the gold standard; individual biopsies were not reviewed for this study.

All cases were examined by each author without clinical information or knowledge of the biopsy result. Cases were classified as negative, negative with reactive changes, ASCUS favor reactive changes, ASCUS not otherwise specified (NOS) (atypical), ASCUS suggestive of a SIL, LSIL, or HSIL. For statistical analysis, these categories were assigned a number on a scale from 1 to 7, respectively. In each case the original diagnosis from the biopsy was used as the gold standard. Patients with a biopsy diagnosis of either LSIL or HSIL were interpreted as having disease, whereas patients with other diagnoses were interpreted as not having disease.

Statistical Analysis
Distribution analysis was performed by assigning a numeric value (1 to 7) to each diagnosis (negative through HSIL, respectively) and performing a Mann-Whitney U test to determine whether the median diagnosis was different. Kappa analysis was performed using a one-tailed test model. Kappa values < 0.40 represented poor agreement, and values > 0.60 represented good agreement. Statistical analysis of diagnostic thresholds was performed using a two-tailed chi-square test. ROC analysis was performed using the CORROC2 program available from the Department of Radiology at the University of Chicago, written in part by Dr. Charles E. Metz and Helen B. Kronman.27 The program calculates a maximum likelihood estimate based on an effective pair of underlying bivariate-normal decision variable distributions. Statistical differences in accuracy are calculated using a two-tailed univariate Z score test of the differences between the areas under the two ROC curves.
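As a rough illustration of the threshold comparison step (not the CORROC2 workflow), the sketch below applies a two-tailed chi-square test to a 2 x 2 table of detected versus missed biopsy-proven SIL cases for two observers; the counts are invented, not taken from the study.

```python
# Two-tailed chi-square comparison of two observers' sensitivity at one
# diagnostic threshold (e.g., negative vs. ASCUS). Counts are hypothetical.
from scipy.stats import chi2_contingency

# Rows: observers; columns: SIL-on-biopsy cases called positive vs. missed.
table = [[45, 5],    # Observer A: 45 SIL cases called ASCUS or worse, 5 missed
         [30, 20]]   # Observer B: 30 SIL cases called ASCUS or worse, 20 missed

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, df = {dof}, two-tailed P = {p_value:.4f}")
```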


FIGURE 2. Level of agreement between the three observers. SIL: squamous intraepithelial lesion.

TABLE 3
Comparison between the Three Observers' Diagnoses

              Observer 1 vs. 2   Observer 1 vs. 3   Observer 2 vs. 3
P value(a)    0.009              0.018              0.00008
Kappa value   0.37               0.21               0.15

(a) Two-tailed Mann-Whitney U test for differences in medians between the three observers' diagnoses.

TABLE 4
Sensitivity and Specificity for the Prediction of SIL on Biopsy of the Three Observers at Different Thresholds

              Observer 1     Observer 2     Observer 3
              Sens   Spec    Sens   Spec    Sens   Spec
ASCUS         97     23      97     15      56     42
SIL           16     92      13     96      9      94

SIL: squamous intraepithelial lesion; ASCUS: atypical squamous cells of undetermined significance; Sens: sensitivity (%); Spec: specificity (%).

RESULTS
The distribution of diagnoses for the three observers is shown in Figure 1 and Table 1. Statistical analysis of these responses is shown in Table 2. Because none of the distributions in Figure 1 was normal (normal distribution test or Z test), variance rather than standard deviations were reported. A two-tailed Mann-Whitney U test for the three observers (Table 3) showed that the median diagnosis of each was significantly different from the others. That is, Observer 3 was statistically more likely to diagnose a case as a lesser degree of atypia than either Observer 1 or 2, and Observer 2 was statistically more likely to diagnose a case with a higher degree of atypia than either Observer 1 or 3.

The percentage of cases in which all three, two of three, or none of the observers had the same cytologic diagnosis is shown in Figure 2. Results were stratified according to biopsy diagnoses. Complete agreement was obtained in only 11% of cases, and in only 2 cases that were originally diagnosed as ASCUS. Using all 7 categories and all 3 observers, the kappa value was 0.19. To increase the level of interobserver agreement, the diagnostic categories were reduced to only negative, ASCUS, and SIL; the resulting kappa values are shown in Table 3. The highest level of agreement (between Observers 1 and 2) had a kappa value of only 0.37, which was poor (Table 3).


FIGURE 3. Receiver operator characteristic curves for the three observers.

TABLE 5
Statistical Significance of the Differences in Sensitivity and Specificity at Two Thresholds for the Three Observers(a)

                        Observer 1 vs. 2   Observer 1 vs. 3   Observer 2 vs. 3
Sensitivity
  Negative vs. ASCUS    0.43               0.08               0.006
  ASCUS vs. SIL         0.67               0.67               0.67
Specificity
  Negative vs. ASCUS    1.0                0.0004             0.0004
  ASCUS vs. SIL         0.72               0.71               0.69

ASCUS: atypical squamous cells of undetermined significance; SIL: squamous intraepithelial lesion.
(a) Two-tailed P values.

Using only the three cytologic categories (negative, ASCUS, and SIL), the sensitivity and specificity for a diagnosis of SIL on biopsy for each observer with a diagnosis of ASCUS or SIL is shown in Table 4. These thresholds represent different points along the ROC curve as demonstrated in Figure 3. Specifically, when the threshold was set at ASCUS, all cytologic cases diagnosed as negative were interpreted as negative and all cytologic cases diagnosed as either ASCUS or SIL were interpreted as positive. In contrast, when the threshold was set at SIL, all negative and ASCUS cytology cases were interpreted as negative and only cytology cases interpreted as SIL were interpreted as positive. Analysis of the differences in these thresholds is shown in Table 5. Observer 2 was significantly (P < 0.05) more sensitive in predicting SIL on biopsy at the negative/ASCUS threshold than Observer 3, and Observer 1 was almost statistically more sensitive than Observer 3. Analysis of the differences in specificity at different thresholds showed that Observer 3 was significantly more specific for SIL on biopsy at the negative/ASCUS threshold than either Observer 1 or 2.

The ROC curves for the three observers are shown in Figure 3. A summary of the observers' overall accuracy is shown in Table 6. The significance of the differences in accuracy between each pair of curves is shown in Table 7. The accuracy of Observer 1 was statistically significantly higher than that of Observer 3.

DISCUSSION
This study illustrates the use of several different methods to examine interobserver variability and accuracy in the interpretation of cervicovaginal smears. Although cases were not randomly selected, they provided an excellent group of cases to demonstrate the usefulness of these various methods in determining the causes of interobserver variability. A valid interpretation of ASCUS variability awaits application of these methods of analysis to a larger, unbiased, and random sample,28-32 which the authors currently are trying to assemble.


TABLE 6
Accuracy(a) for the Three Observers

                     Observer 1   Observer 2   Observer 3
Accuracy             0.66         0.59         0.51
Standard deviation   0.06         0.07         0.07

(a) Accuracy as determined by area under the curve.

TABLE 7
Statistical Significance of the Differences Between the ROC Curves(a)

           Observer 1 vs. 2   Observer 1 vs. 3   Observer 2 vs. 3
P value    0.24               0.017              0.12

ROC: receiver operating characteristic.
(a) Two-tailed P value.
The current data illustrate the fact that poor interobserver agreement may have several causes. The distribution analysis data demonstrate that each of the three observers categorized these cases differently. Observer 3 was statistically more likely to categorize a case as less atypical than the other observers, and Observer 2 was more likely to categorize a case as more atypical than the other observers. The percentage of cases with complete agreement among all three observers was low, and kappa analysis confirmed that the interobserver agreement was poor, even after reducing the number of categories from seven to three. Thus, as shown by others,5,7 the level of interobserver agreement for the diagnosis of ASCUS in cervicovaginal smears was poor. However, kappa analysis could not determine the cause of the variability. This requires threshold and ROC curve analysis.

Despite the poor interobserver agreement, the diagnostic thresholds set by Observers 1 and 2 (as observed in Tables 4, 5, and 6) were very similar. In contrast, Observer 3 was using a statistically different threshold than the other two observers. This clearly was a source of some of the interobserver variability in this study. However, it was not possible from the threshold data alone to determine whether there were differences in diagnostic accuracy. In this situation, ROC analysis may provide an answer. In this study, ROC analysis confirmed that all three observers were more accurate at classifying the cases than chance alone (area = 0.50). However, Observer 1 (and not Observer 2) was significantly more accurate at classifying the cases than Observer 3. ROC analysis combined with threshold analysis demonstrates that the thresholds used by Observers 2 and 3 represent a trade-off between sensitivity and specificity, whereas the thresholds established by Observer 1 were more accurate than those of Observer 3. Thus, in this study, poor interobserver agreement was the result of differences in both diagnostic thresholds as well as in diagnostic accuracy.
Several additional observations can be reached from this analysis. First, although interobserver agreement is desirable, it is not necessary for the interpretations of different observers to be relatively accurate. In this study all three observers were more accurate than chance (admittedly a very low level of overall accuracy), but interobserver agreement was very low. Similarly, others have shown that it is possible, although admittedly uncommon, to have good diagnostic reproducibility and accuracy without as good interobserver agreement.33 For example, consider ten observers examining ten cases and each diagnosing nine correctly and one incorrectly, but the case that is diagnosed incorrectly is different for each observer. The test is repeated and the exact same results are achieved. In this scenario, there is poor interobserver agreement (no case has concordance of all 10 observers; overall concordance, 80%), yet each observer was 90% accurate and 100% reproducible. Although this scenario is somewhat unlikely, it serves to illustrate that accuracy, reproducibility, and interobserver agreement are not synonymous. The results of kappa analysis may be misleading if the causes of the variability are not investigated. Of course, interobserver agreement is necessary to establish criteria for diagnosis, may have important medicolegal implications, and is desirable for a laboratory in light of the many quality control efforts that rely on review of cases by more than one observer. However, interobserver agreement is not necessary or sufficient for diagnostic accuracy.
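The arithmetic in the ten-observer scenario above can be checked with a short script. The sketch below simply encodes the described pattern (observer i misses only case i, identically on a repeat round) and reports the resulting accuracy, reproducibility, and pairwise concordance; it is an illustration of that thought experiment, not an analysis of study data.

```python
# Illustration of the ten-observer thought experiment described above:
# each observer misses exactly one case, and a different case for each observer.
from itertools import combinations

n = 10
# answers[i][j] is True when observer i calls case j correctly.
answers = [[j != i for j in range(n)] for i in range(n)]
repeat = [row[:] for row in answers]  # a second, identical round of review

accuracy = [sum(row) / n for row in answers]                   # 0.9 for every observer
reproducible = all(answers[i] == repeat[i] for i in range(n))  # True: 100% reproducible

# Pairwise concordance per case: fraction of observer pairs giving the same call.
pairs = list(combinations(range(n), 2))                        # 45 pairs
concordance = [sum(answers[a][j] == answers[b][j] for a, b in pairs) / len(pairs)
               for j in range(n)]

print(f"per-observer accuracy: {accuracy[0]:.0%}")              # 90%
print(f"fully reproducible: {reproducible}")                    # True
print(f"pairwise concordance per case: {concordance[0]:.0%}")   # 80%, never unanimous
```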


In addition, simple analysis of the distribution of diagnoses can aid in understanding the source of disagreement determined by kappa analysis. The distribution of diagnoses clearly showed that Observer 2 was more likely to call a smear more atypical, and Observer 3 was more likely to diagnose a smear as less atypical than the other two observers. The results of this type of analysis delineate in what way an observer should change his or her interpretation to more closely agree with others. If attaining a higher level of interobserver reproducibility is part of a laboratory's goals, these results may be very useful in quality improvement efforts.

Although one is able to determine the thresholds and accuracy of different observers from these data, one is not able to determine which set of diagnostic criteria is optimum or most clinically useful. The diagnostic usefulness of a particular diagnosis depends on the situation. For example, although having a high accuracy is desirable, it may not always be clinically useful. An observer may be more accurate overall because he or she is very good at distinguishing ASCUS favor reactive from ASCUS NOS, but this higher accuracy may not be useful to the clinician. Similarly, whether a threshold with a higher sensitivity is more useful than one with a higher specificity generally depends on whether the test is being used as a screening or a diagnostic test. In addition, it is possible that an observer with a lower overall accuracy may be more diagnostically useful than an observer with a higher accuracy due to differences in diagnostic thresholds. For example, because of the uncertainties in patient management with a diagnosis of ASCUS, some clinicians may find an observer who is more willing to diagnose borderline specimens as either negative or SIL more useful than one who is more accurate but diagnoses many cases as ASCUS. Nevertheless, the authors believe selecting the most diagnostically useful criteria is much easier when the differences between diagnostic thresholds and accuracy are clearly defined. This allows one to determine the value of several diagnostic criteria more effectively and objectively and select those that are most appropriate.

Finally, although ROC curve analysis is clearly useful, the main disadvantage is that there must be a gold standard. Fortunately, biopsy provides a reasonable gold standard for most cytologic studies and often is available. Although biopsy has limitations related to both sampling and interpretation, especially in cervicovaginal smears,34-39 inherent errors using it as the gold standard should be random and thus should not affect comparisons of accuracy between observers examining large samples.

In conclusion, this study demonstrated that a complete understanding of the results of cytologic interpretation can come only from a comprehensive statistical analysis. Kappa analysis can determine the level of interobserver agreement, but distribution analysis allows one to determine in what way different observers disagree. Threshold analysis is independent of interobserver agreement, but provides only a limited measure of accuracy. Overall accuracy can best be determined using ROC analysis. Poor interobserver agreement can be the result of either differences in diagnostic thresholds or diagnostic accuracy. If one wishes to improve one's performance, then it is important to know in what way it is deficient; otherwise, efforts may be misdirected. These different statistical methods can be used for quality improvement in cytologic interpretation, and provide an objective measurement in determining overall diagnostic usefulness.

REFERENCES
1. Davey DD, Naryshkin S, Nielsen ML, Kline TS. Atypical squamous cells of undetermined significance: interlaboratory comparison and quality assurance monitors. Diagn Cytopathol 1994;11:390-6.
2. Williams ML, Rimm DL, Pedigo MA, Frable WJ. Atypical squamous cells of undetermined significance: correlative histologic and follow-up studies from an academic center. Diagn Cytopathol 1997;16:1-7.
3. Howell LP, Davis RL. Follow-up of Papanicolaou smears diagnosed as atypical squamous cells of undetermined significance. Diagn Cytopathol 1996;14:20-4.
4. Kaye KS, Dhurandhar NR. Atypical cells of undetermined significance: follow-up biopsy and Pap smear findings. Am J Clin Pathol 1993;99:332.
5. Cocchi V, Carretti D, Fanti S, Baldazzi P, Casotti MT, Piazza R, et al. Intralaboratory quality assurance in cervical/vaginal cytology: evaluation of intercytologist diagnostic reproducibility. Diagn Cytopathol 1997;16:87-92.
6. Sidaway MK, Tabbara SO. Reactive change and atypical squamous cells of undetermined significance in Papanicolaou smears: a cytohistologic correlation. Diagn Cytopathol 1993;9:423-9.
7. Sherman ME, Schiffman MH, Lorincz AT, Manos MM, Scott DR, Kurman RJ, et al. Toward objective quality assurance in cervical cytopathology. Correlation of cytopathologic diagnoses with detection of high-risk human papillomavirus types. Am J Clin Pathol 1994;102:182-7.
8. Epstein JI, Grignon DJ, Humphrey PA, McNeal JE, Sesterhenn IA, Troncoso P, et al. Interobserver reproducibility in the diagnosis of prostatic intraepithelial neoplasia. Am J Surg Pathol 1995;19:873-86.
9. Allam CK, Bostwick DG, Hayes JA, Upton MP, Wade GG, Domanowski GF, et al. Interobserver variability in the diagnosis of high grade prostatic intraepithelial neoplasia and adenocarcinoma. Mod Pathol 1996;9:742-51.
10. Raab SS, Isacson C, Layfield LJ, Lenel JC, Slagel DD, Thomas PA. Atypical glandular cells of undetermined significance. Am J Clin Pathol 1995;104:574-82.
11. Beck JR, Shultz EK. The use of relative operating characteristic (ROC) curves in test performance evaluation. Arch Pathol Lab Med 1986;110:13-20.
12. Raab SS. Diagnostic accuracy in cytopathology. Diagn Cytopathol 1994;10:68-75.
13. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29-36.
14. McNeil BJ, Hanley JA. Statistical approaches to the analysis of receiver operating characteristic (ROC) curves. Med Decis Making 1984;4:137-50.
15. Hanley JA. Receiver operating characteristic (ROC) methodology: the state of the art. Crit Rev Diagn Imaging 1989;29:307-35.
16. Metz CE, Kronman HB. Statistical significance tests for binormal ROC curves. J Math Psych 1980;22:218-43.
17. Dorfman DD, Alf E. Maximum-likelihood estimation of parameters of signal-detection theory and determination of confidence intervals: rating-method data. J Math Psych 1969;6:487-96.
18. Giard RWM, Hermans J. Interpretation of diagnostic cytology with likelihood ratios. Arch Pathol Lab Med 1990;114:852-4.
19. Raab SS, Slagel DD, Jensen CS, Teague MW, Savell VH, Ozkutlu D, et al. Transitional cell carcinoma: cytologic criteria to improve diagnostic accuracy. Mod Pathol 1996;9:225-31.
20. Cohen MB, Rodgers C, Hales MS, Gonzales JM, Ljung BME, Beckstead JH, et al. Influence of training and experience in fine needle aspiration biopsy of the breast. Arch Pathol Lab Med 1987;111:518-20.


21. Raab SS, Thomas PA, Lenel JC, Bottles K, Fitzsimmons KM, Zaleski MS, et al. Pathology and probability. Am J Clin Pathol 1995;103:588-93.
22. Bacus JW, Wiley EL, Galbraith W, Marshall PN, Wilbanks GD, Weinstein RS. Malignant cell detection and cervical cancer screening. Anal Quant Cytol Histol 1984;6:121-30.
23. Raab SS, Snider TE, Potts SA, McDaniel HL, Robinson RA, Nelson DL, et al. Atypical glandular cells of undetermined significance. Diagnostic accuracy and interobserver variability using select cytologic criteria. Am J Clin Pathol 1997;107:299-307.
24. Langley FA, Buckley CH, Tasker M. The use of ROC curves in histopathologic decision making. Anal Quant Cytol Histol 1985;7:167-73.
25. Hermann GA, Herrera N, Sugiura HT. Comparison of interlaboratory survey data in terms of receiver operating characteristic (ROC) indices. J Nucl Med 1982;23:525-31.
26. The Bethesda Committee. The Bethesda system for reporting cervical/vaginal cytologic diagnoses. Acta Cytol 1993;37:115-24.
27. Metz CE, Wang PL, Kronman HB. A new approach for testing the significance of differences between ROC curves measured from correlated data. In: DeConinck F, editor. Information processing in medical imaging. The Hague: Martinus Nijhoff, 1984:432-45.
28. Chhieng DC, Taylor J, Schmee J, McKenna BJ. Cytologic criteria for subclassification of ASCUS improve correlation with biopsy outcome [abstract]. Mod Pathol 1997;10:32A.
29. Ettler HC, Downing P, Wright VC, Joseph MG. Atypical squamous cells of undetermined significance: a cytohistologic study in a colposcopy unit [abstract]. Mod Pathol 1997;10:33A.
30. Flynn C, Pitman M. Cytohistological correlation of subclassified ASCUS Pap smears [abstract]. Mod Pathol 1997;10:33A.
31. Collins LC, Wang HH, Abu-Jawdeh GM. Qualifiers of atypical squamous cells of undetermined significance help in patient management. Mod Pathol 1996;9:677-81.
32. Kline MJ, Davey DD. Atypical squamous cells of undetermined significance qualified: a follow-up study. Diagn Cytopathol 1996;14:380-4.
33. Cramer SF. Interobserver variability in surgical pathology. In: Weinstein RS, editor. Advances in pathology and laboratory medicine. Vol. 9. St. Louis: C.V. Mosby, 1996:3-82.
34. Rohr R. Quality assurance in gynecologic cytology. Am J Clin Pathol 1990;94:754-8.
35. Dodd LG, Sneige N, Villarreal Y, Fanning CV, Staerkel GA, Caraway NP, et al. Quality-assurance study of simultaneously sampled, non-correlating cervical cytology and biopsies. Diagn Cytopathol 1993;9:138-44.
36. Cramer H, Schlenk E. An analysis of discrepancies between the cervical cytologic diagnosis and subsequent histopathologic diagnosis in 1260 cases [abstract]. Acta Cytol 1994;38:812.
37. Tritz DM, Weeks JA, Spires SE, Sattich M, Banks H, Cibull ML, et al. Etiologies for non-correlating cervical cytologies and biopsies. Am J Clin Pathol 1995;103:594-7.
38. Joste NE, Crum CP, Cibas ES. Cytologic/histologic correlation for quality control in cervicovaginal cytology. Am J Clin Pathol 1995;103:32-4.
39. Jones BA, Novis DA. Cervical biopsy-cytology correlation. Arch Pathol Lab Med 1996;120:523-31.
