
CHAPTER 2

Review Criteria
ABSTRACT

Following the common IMRaD format for scientific research reports, the authors present review criteria and discuss background information and issues related to the review criteria for each section of a research report.

Introduction. The authors discuss the criteria reviewers should be aware of for establishing the context for the research study: prior literature to introduce and describe the problem statement, the conceptual framework (theory) underlying the problem, the relevance of the research questions, and the justification of their research design and methods.

Method. The authors discuss a variety of methods used to advance knowledge and practice in the health professions, including quantitative research on educational interventions, qualitative observational studies, test and measurement development projects, case reports, expository essays, and quantitative and qualitative research synthesis. As background information for reviewers, the authors discuss how investigators use these and other methods in concert with data-collection instruments, samples of research participants, and data-analysis procedures to address educational, policy, and clinical questions. The authors explain the key role that research methods play in scholarship and the role of the reviewer in judging their quality, details, and richness.

Results. The author describes issues related to reporting statistical analyses in the results, particularly data that do not have many of the properties that were anticipated when the data analysis was planned. Further, the author discusses the presentation of the body of evidence collected within the study, offering information for reviewers on evaluating the selection and organization of data, the balance between descriptive and inferential statistics, narrative presentation, contextualization of qualitative data, and the use of tables and figures.

Discussion. The authors provide information to enable reviewers to evaluate whether the interpretation of the evidence is adequately discussed and appears reliable, valid, and trustworthy. Further, they discuss how reviewers can weigh interpretations, given the strengths and limitations of the study, and can judge the generalizability and practical significance of conclusions drawn by investigators.

Title, authors, and abstract. The author discusses a reviewer's responsibility in judging the title, authors, and abstract of a manuscript submitted for publication. While this triad orients the reader at the beginning of the review process, only after the manuscript is analyzed thoroughly can these elements be effectively evaluated.

Other. The authors discuss the reviewer's role in evaluating the clarity and effectiveness of a study's written presentation and issues of scientific conduct (plagiarism, proper attribution of ideas and materials, prior publication, conflict of interest, and institutional review board approval).

Acad. Med. 2001;76:922–951.

922 ACADEMIC MEDICINE, VOL. 76, NO. 9 / SEPTEMBER 2001


MANUSCRIPT INTRODUCTION

Problem Statement, Conceptual Framework, and Research Question

William C. McGaghie, Georges Bordage, and Judy A. Shea*

REVIEW CRITERIA
■ The introduction builds a logical case and context for the problem statement.
■ The problem statement is clear and well articulated.
■ The conceptual (theoretical) framework is explicit and justified.
■ The research question (research hypothesis where applicable) is clear, concise, and complete.
■ The variables being investigated are clearly identified and presented.

ISSUES AND EXAMPLES RELATED TO THE CRITERIA

Introduction

A scholarly manuscript starts with an Introduction that tells a story. The Introduction orients the reader to the topic of the report, moving from broad concepts to more specific ideas.1 The Introduction should convince the reader, and all the more the reviewer, that the author has thought the topic through and has developed a tight, "researchable" problem. The Introduction should move logically from the known to the unknown. The actual components of an Introduction (including its length, complexity, and organization) will vary with the type of study being reported, the traditions of the research community or discipline in which it is based, and the style and tradition of the journal receiving the manuscript. It is helpful for the reviewer to evaluate the Introduction by thinking about its overall purpose and its individual components: problem statement, conceptual framework, and research question. Two related articles, "Reference to the Literature" and "Relevance," follow the present article.

Problem Statement

The Introduction to a research manuscript articulates a problem statement. This essential element conveys the issues and context that gave rise to the study. Two examples of problem statements are: "With the national trend toward more patient care in outpatient settings, the numbers of patients on inpatient wards have declined in many hospitals, contributing to the inadequacy of inpatient wards as the primary setting for teaching students,"2 and "The process of professional socialization, regardless of the philosophical approach of the educational program, can be stressful . . . few studies have explored the unique stressors associated with PBL in professional education."3 These statements help readers anticipate the goals of each study. In the case of the second example, the Introduction ended with the following statement: "The purpose of this qualitative study was to identify stressors perceived by physiotherapy students during their initial unit of study in a problem-based program." In laying out the issues and context, the Introduction should not contain broad generalizations or sweeping claims that will not be backed up in the paper's literature review. (See the next article.)

Conceptual Framework

Most research reports cast the problem statement within the context of a conceptual or theoretical framework.4 A description of this framework contributes to a research report in at least two ways because it (1) identifies research variables, and (2) clarifies relationships among the variables. Linked to the problem statement, the conceptual framework "sets the stage" for presentation of the specific research question that drives the investigation being reported. For example, the conceptual framework and research question would be different for a formative evaluation study than for a summative study, even though their variables might be similar.

*Lloyd Lewis, PhD, emeritus professor of the Medical College of Georgia, participated in early meetings of the Task Force and contributed to the earliest draft of this section.



Scholars argue that a conceptual or theoretical framework always underlies a research study, even if the framework is not articulated.5 This may seem incongruous, because many research problems originate from practical educational or clinical activities. Questions often arise such as "I wonder why such an event did not [or did] happen?" For example, why didn't the residents' test-interpretation skills improve after they were given feedback? There are also occasions when a study is undertaken simply to report or describe an event, e.g., pass rates for women versus men on high-stakes examinations such as the United States Medical Licensing Examination (USMLE) Step 1. Nevertheless, it is usually possible to construct at least a brief theoretical rationale for the study. The rationale in the USMLE example may be, for instance, about gender equity and bias and why these are important issues. Frameworks are usually more elaborate and detailed when the topics that are being studied have long scholarly histories (e.g., cognition, psychometrics) where active researchers traditionally embed their empirical work in well-established theories.

Research Question

A more precise and detailed expression of the problem statement cast as a specific research question is usually stated at the end of the Introduction. To illustrate, a recent research report states, "The research addressed three questions. First, do students' pulmonary physiology concept structures change from random patterns before instruction to coherent, interpretable structures after a focused block of instruction? Second, can an MDS [multidimensional scaling] solution account for a meaningful proportion of variance in medical and veterinary students' concept structures? Third, do individual differences in the ways in which medical and veterinary students intellectually organize the pulmonary physiology concepts as captured by MDS correlate with course examination achievement?"6

Variables

In experimental research, the logic revealed in the Introduction might result in explicitly stated hypotheses that would include specification of dependent and independent variables.7 By contrast, much of the research in medical education is not experimental. In such cases it is more typical to state general research questions. For example, "In this [book] section, the meaning of medical competence in the worlds of practicing clinicians is considered through the lens of an ethnographic story. The story is about the evolution of relationships among obstetrical providers and transformations in obstetrical practice in one rural town in California, which I will call 'Coast Community,' over the course of a decade."8

For some journals, the main study variables (e.g., medical competence) will be defined in the Introduction. Other journals will place this in the Methods section. Whether specific hypotheses or more general research questions are stated, the reviewer (reader) should be able to anticipate what will be revealed in the Methods.

SUMMARY

The purpose of the Introduction is to construct a logical "story" that will educate the reader about the study that follows. The order of the components may vary, with the problem statement sometimes coming after the conceptual framework, while in other reports the problem statement may appear in the first paragraph to orient the reader about what to expect. However, in all cases the Introduction will engage, educate, and encourage the reader to finish the manuscript.

REFERENCES

1. Zeiger M. Essentials of Writing Biomedical Research Papers. 2nd ed. London, U.K.: McGraw–Hill, 1999.
2. Fincher RME, Case SM, Ripkey DR, Swanson DB. Comparison of ambulatory knowledge of third-year students who learned in ambulatory settings with that of students who learned in inpatient settings. Acad Med. 1997;72(10 suppl):S130–S132.
3. Soloman P, Finch E. A qualitative study identifying stressors associated with adapting to problem-based learning. Teach Learn Med. 1998;10:58–64.
4. Chalmers AF. What is This Thing Called Science? St. Lucia, Qld., Australia: University of Queensland Press, 1982.
5. Hammond KR. Introduction to Brunswikian theory and methods. In: Hammond KR, Wascoe NE (eds). New Directions for Methodology of Social and Behavioral Sciences, No. 3: Realizations of Brunswik's Representative Design. San Francisco, CA: Jossey–Bass, 1980.
6. McGaghie WC, McCrimmon DR, Thompson JA, Ravitch MM, Mitchell G. Medical and veterinary students' structural knowledge of pulmonary physiology concepts. Acad Med. 2000;75:362–8.
7. Fraenkel JR, Wallen NE. How to Design and Evaluate Research in Education. 4th ed. New York: McGraw–Hill, 2000.
8. DelVecchio Good M-J. American Medicine: The Quest for Competence. Berkeley, CA: University of California Press, 1995.

RESOURCES

American Psychological Association. Publication Manual of the American Psychological Association. 4th ed. Washington, DC: APA, 1994:11–2.
Creswell JW. Research Design: Qualitative and Quantitative Approaches. Thousand Oaks, CA: Sage Publications, 1994:1–16.
Day RA. How to Write and Publish a Scientific Paper. 5th ed. Phoenix, AZ: Oryx Press, 1998:33–35.
Erlandson DA, Harris EL, Skipper BL, Allen SD. Doing Naturalistic Inquiry: A Guide to Methods. Newbury Park, CA: Sage Publications, 1993:42–65.
Glesne C, Peshkin A. Becoming Qualitative Researchers: An Introduction. White Plains, NY: Longman Publishing Group, 1992:13–37.



Reference to the Literature and Documentation

Sonia J. Crandall, Addeane S. Caelleigh, and Ann Steinecke

REVIEW CRITERIA
■ The literature review is up-to-date.
■ The number of references is appropriate and their selection is judicious.
■ The review of the literature is well integrated.
■ The references are mainly primary sources.
■ Ideas are acknowledged appropriately (scholarly attribution) and accurately.
■ The literature is analyzed and critically appraised.

ISSUES AND EXAMPLES RELATED TO THE CRITERIA

Research questions come from observing phenomena or reading the literature. Regardless of what inspired the research, however, study investigators must thoroughly review existing literature to adequately understand the scope of the issues relevant to their questions. Although systematic reviews of the literature conducted in the social and biomedical sciences, such as those produced by the Cochrane Collaboration (for clinical issues) and the Campbell Collaboration (for areas of social science), may be quite different in terms of the types of evidence provided and the natures of the outcomes, their goals are the same, that is, to present the best evidence to inform research, practice, and policy. These reviews are usually carried out by large teams, which follow strict protocols common to the whole collaboration. Individual researchers also conduct thorough reviews, albeit usually less structured and in-depth. They achieve three key research aims through a thorough analysis of the literature: refinement of their research questions, defense of their research design, and ultimately support of their interpretations of outcomes and conclusions. Thus, in the research report, the reviewer should find a clear demonstration of the literature's contribution to the study and its context.1

Before discussing the specifics of each of the three aims, it is important to offer some distinctions regarding the research continuum. Where researchers fit along the quantitative–qualitative continuum influences how they use literature within a study, although there are no rigid rules about how to use it. Typically, at the quantitative end of the spectrum, researchers review the bulk of the literature primarily at the beginning of the study in order to establish the theoretical or conceptual framework for the research question or problem. They also use the literature to validate the use of specific methods, tools, and (statistical) analyses, adding citations in the appropriate sections of the manuscript. At the qualitative end of the spectrum, the researchers weave the relevant literature into all phases of the study and use it to guide the evolution of their thinking as data are gathered, transcribed, excerpted, analyzed, and placed before the reader.2 They also use the literature to reframe the problem as the study evolves. Although the distinction is not crystal-clear, the difference between the ends of the continuum might be viewed as the difference between testing theory-driven hypotheses (quantitative) and generating theory-building hypotheses (qualitative).

Researchers all along this continuum use the literature to inform their early development of research interests, problems, and questions and later in the conduct of their research and the interpretation of their findings. A review of relevant literature sets the stage for a study. It provides a logically organized world view of the researcher's question, or of the situation the researcher has observed—what knowledge exists relevant to the research question, how the question or problem has been previously studied (types of designs and methodologic concerns), and the concepts and variables that have been shown to be associated with the problem (question).3 The researcher evaluates previous work "in terms of its relevance to the research question of interest,"4 and synthesizes what is known, noting relationships that have been well studied and identifying areas for elaboration, questions that remain unanswered, or gaps in understanding.1,3,5,6 The researcher documents the history and present status of the study's question or problem. The literature reviewed should not only be current, but also reflect the contributions of salient published and unpublished research, which may be quite dated but play a significant role in the evolution of the research. Regardless of perspective (qualitative, quantitative, or mixed method), the researcher must frame the problem or research questions as precisely as possible from a chronologic and developmental perspective, given the confines of the literature.2 For example, when presenting the tenets of adult learning as a basis for program evaluation, an author would be remiss if he or she omitted the foundational writings of Knowles,7 Houle,8 and perhaps Lindeman9 from the discussion.

Equally important to using the literature to identify current knowledge is using it to defend and support the study and to inform the design and methods.10 The researcher interprets and weighs the evidence, presents valid points making connections between the literature and the study design, reasons logically for specific methods, and describes in detail the variables or concepts that will be scrutinized. Through the literature, the researcher provides a map guiding the reader to the conclusion that the current study is important and necessary and the design is appropriate to answer the questions.6

Once they have the study outcomes, researchers offer explanations, challenge assumptions, and make recommendations considering the literature used initially to frame the research problem. Authors may apply some of the most salient literature at the end of the manuscript to support their conclusions (fully or partially), refute current knowledge, revise a hypothesis, or reframe the problem.5 The authors use literature to bring the reader back to the theory tested (quantitative) or the theory generated (qualitative).

Reviewers must consider the pertinence of the literature and documentation with regard to the three key research aims stated earlier. They should also consider the types of resources cited and the balance of the perspectives discussed within the literature reviewed. When considering the types of resources cited, reviewers should determine whether the references are predominantly general sources (textbooks),4 primary sources (research articles written by those who conducted the research),4 or secondary sources (articles where a researcher describes the work of others).4 References should be predominantly primary sources, whether published or unpublished. Secondary sources are acceptable, and desirable, if primary sources are unavailable or if they provide a review (a meta-analysis, for example) of what is known about the research problem. Researchers may use general resources as a basis for describing, for example, a theoretical or methodologic principle, or a statistical procedure.

Researchers may have difficulty finding all of the pertinent literature because it may not be published (dissertations), and not all published literature is indexed in electronic databases. Manual searching is still necessary. Reviewers are cautioned to look for references that appear inclusive of the whole body of existing literature. For example, some relevant articles are not indexed in Medline, but are indexed in ERIC. Reviewers can tell whether multiple databases were searched for relevant literature by the breadth of disciplines represented by the citations. Thus, it is important that the researcher describe how he or she found the previous work used to study his or her problem.11

A caveat for reviewers is to be wary of researchers who have not carried out a thorough review of the literature. They may report that there is a paucity of research in their area when in fact plenty exists. At times, authors must be pushed. At the very minimum, reviewers should comment on whether the researchers described to the reviewers' satisfaction how they found study-related literature and the criteria used to select the sources that were discussed. Reviewers must decide whether this process was satisfactorily described. If only published reports found in electronic databases are discussed, then the viewpoint presented "may be biased toward well-known research" that presents only statistically significant outcomes.1

When considering the perspectives presented by the author, reviewers should pay attention to whether the discussion presents all views that exist in the literature base, that is, conflicting, consensus, or controversial opinions.5,12 The thoroughness of the discussion also depends upon the author's explanation of how literature was located and chosen for inclusion. For example, Bland and colleagues13 have provided an excellent example of how the process of location and selection was accomplished.

The mechanics of citing references are covered in "Presentation and Documentation" later in this chapter.

REFERENCES

1. Haller KB. Conducting a literature review. MCN: Am Maternal Child Nurs. 1988;13:148.
2. Haller EJ, Kleine PF. Teacher empowerment and qualitative research. In: Haller EJ, Kleine PF (eds). Using Educational Research: A School Administrator's Guide. New York: Addison Wesley Longman, 2000:193–237.
3. Rodgers J, Smith T, Chick N, Crisp J. Publishing workshops number 4. Preparing a manuscript: reviewing literature. Nursing Praxis in New Zealand. 1997;12:38–42.
4. Fraenkel JR, Wallen NE. How to Design and Evaluate Research in Education. 4th ed. Boston, MA: McGraw–Hill Higher Education, 2000.
5. Martin PA. Writing a useful literature review for a quantitative research project. Appl Nurs Res. 1997;10:159–62.
6. Bartz C. It all starts with an idea. Alzheimer Dis and Assoc Dis. 1999;13:S106–S110.
7. Knowles MS. The Modern Practice of Adult Education: From Pedagogy to Andragogy. Chicago, IL: Association Press, 1980.
8. Houle CO. The Inquiring Mind. Madison, WI: University of Wisconsin Press, 1961.
9. Lindeman EC. The Meaning of Adult Education. Norman, OK: University of Oklahoma Research Center for Continuing Professional and Higher Education, 1989 [Orig. pub. 1926].
10. Glesne C, Peshkin A. Becoming Qualitative Researchers: An Introduction. White Plains, NY: Longman Publishing Group, 1992.
11. Smith AJ, Goodman NW. The hypertensive response to intubation: do researchers acknowledge previous work? Can J Anaesth. 1997;44:9–13.
12. Bruette V, Fitzig C. The literature review. J NY State Nurs Assoc. 1993;24:14–5.
13. Bland CJ, Meurer LN, Maldonado G. Determinants of primary care specialty choice: a non-statistical meta-analysis of the literature. Acad Med. 1995;70:620–41.

RESOURCES

Bartz C. It all starts with an idea. Alzheimer Dis and Assoc Dis. 1999;13:S106–S110.
Best Evidence in Medical Education (BEME). http://www.mailbase.ac.uk/lists/beme. Accessed 3/30/01.
Bland CJ, Meurer LN, Maldonado G. A systematic approach to conducting a non-statistical meta-analysis of research literature. Acad Med. 1995;70:642–53.
The Campbell Collaboration. http://campbell.gse.upenn.edu. Accessed 3/30/01.
The Cochrane Collaboration. http://www.cochrane.org. Accessed 3/30/01.
Cook DJ, Mulrow CD, Haynes RB. Systematic reviews: synthesis of best evidence for clinical decisions. Ann Intern Med. 1997;126:376–80.
Cooper HM. Synthesizing Research: A Guide for Literature Reviews. 3rd ed. Thousand Oaks, CA: Sage, 1998.
Fraenkel JR, Wallen NE. Reviewing the literature. In: How to Design and Evaluate Research in Education. 4th ed. Boston, MA: McGraw–Hill Higher Education, 2000:70–101.
Gall JP, Gall MD, Borg WR. Applying Educational Research: A Practical Guide. 4th ed. White Plains, NY: Longman Publishing Group, 1998: chapters 2, 3, 4.
Mulrow CD. Rationale for systematic reviews. BMJ. 1994;309:597–9.

(Although the following Web sites are learning resources for evidence-based research and practice, the information is applicable across research disciplines.)

Middlesex University. Teaching/Learning Resources for Evidence Based Practice. http://www.mdx.ac.uk/www/rctsh/ebp/main.htm. Accessed 3/30/01.
Centres for Health Evidence. Users' Guides to Evidence-Based Practice. http://www.cche.net/principles/content_all.asp. Accessed 3/30/01.

Relevance

Louis Pangaro and William C. McGaghie

REVIEW CRITERIA
■ The study is relevant to the mission of the journal or its audience.
■ The study addresses important problems or issues; the study is worth doing.
■ The study adds to the literature already available on the subject.
■ The study has generalizability because of the selection of subjects, setting, and educational intervention or materials.

ISSUES AND EXAMPLES RELATED TO THE CRITERIA

An important consideration for editors in deciding whether to publish an article is its relevance to the community (or, usually, communities) the journal serves. Relevance has several connotations and all are judged with reference to a specific group of professionals and to the tasks of that group. Indeed, one thing is often spoken of as being "relevant to" something else, and that something is the necessary context that establishes relevance.

First, editors and reviewers must gauge the applicability of the manuscript to problems within the journal's focus; the more common or important the problem addressed by an article is to those involved in it, the more relevant it is. The essential issue is whether a rigorous answer to this study's question will affect what readers will do in their daily work, for example, or what researchers will do in their next study, or even what policymakers may decide. This can be true even if a study is "negative," that is, does not confirm the hypothesis at hand. For studies without hypotheses (for instance, a systematic review of prior research or a meta-analysis), the same question applies: Does this review achieve a synthesis that will directly affect what readers do?

Second, a manuscript, especially one involving qualitative research, may be pertinent to the community by virtue of its contribution to theory building, generation of new hypotheses, or development of methods. In this sense, the manuscript introduces, refines, or critiques issues that, for example, underlie the teaching and practice of medicine, such as cognitive psychology, ethics, and epistemology. Thus a study may be quite relevant even though its immediate, practical application is not worked out.

Third, each manuscript must be judged with respect to its appropriateness to the mission of the specific journal. Reviewers should consider these three elements of relevance irrespective of the merit or quality of an article.

The relevance of an article is often most immediately apparent in the first paragraphs of a manuscript, especially in how the research question or problem posed by the paper is framed. As discussed earlier in "Problem Statement, Conceptual Framework, and Research Question," an effective article explicitly states the issue to be addressed, in the form of either a question to be answered or a controversy to be resolved. A conceptual or theoretical framework underlies a research question, and a manuscript is stronger when this framework is made explicit. An explicit presentation of the conceptual framework helps the reviewer and makes the study's importance or relevance more clear.

The relevance of a research manuscript may be gauged by its purpose or the intention of the study, and a vocabulary drawn from clinical research is quite applicable here. Feinstein classifies research according to its "architecture," the effort to create and evaluate research structures that have both "the reproducible documentation of science and the elegant design of art."1 Descriptive research provides collections of data that characterize a problem or provide information; no comparisons are inherent in the study design, and the observations may be used for policy decisions or to prepare future, more rigorous studies. Many papers in social science journals, including those in health professions education, derive their relevance from such an approach. In cause–effect research, specific comparisons (for instance, to the subjects' baseline status or to a separate control group) are made to reach conclusions about the efficacy or impact of an intervention (for instance, a new public health campaign or an innovative curriculum). The relevance of such research architecture derives from its power to establish the causality, or at least the strong effects, from innovations. In research that deals with process issues, as defined by Feinstein, the products of a new process or the performance of a particular procedure (for instance, a new tool for the assess- […]

[…] that of the topic per se, and the relevance includes the importance of the topic as well as whether the execution of the study or of the discussion is powerful enough to affect what others in the field think or do.

Relevance is, at times, a dichotomous, or "yes–no," decision; but often it is a matter of degree, as illustrated by the criteria. In this more common circumstance, relevance is a summary conclusion rather than a simple observation. It is a judgment supported by the applicability of the principles, methods, instruments, and findings that together determine the weight of the relevance. Given a limited amount of space in each issue of a journal, editors have to choose among competing manuscripts, and relevance is one way of summarizing the importance of a manuscript's subject, thesis, and conclusions to the journal's readership.

Certain characteristics or strengths can establish a manuscript's relevance: Would a large part of the journal's community—or parts of several of its overlapping communities—consider the paper worth reading? Is it important that this paper be published even though the journal can publish only a fixed percentage of the manuscripts it receives each year? As part of their recommendation to the editor (see Chapter 3), reviewers are asked to rate how important a manuscript is: extremely, very, moderately, slightly, or not important. Issues that may influence reviewers and editors to judge a paper to be relevant include:

1. Irrespective of a paper's methods or study design, the topic at hand would be considered common and/or serious by the readership. As stated before, relevance is a summary judgment and not infallible. One study of clinical research papers showed that readers did not always agree with reviewers on the relevance of studies to their own practice.3 Editors of medical education research journals, for example, must carefully choose to include the perspective of educational practitioners in their judgment of relevance, and try to reflect the concerns of these readers.

2. Irrespective of immediate and practical application, the author(s) provides important insights for understanding theory, or the paper suggests innovations that have the potential to advance the field. In this respect, a journal leads its readership and does not simply reflect it. The field of clinical
ment of clinical competence) are studied as an indication of medicine is filled with examples of great innovations, such
the quality or value of the process or procedure. In this case as the initial description of radioimmunoassay or the citric
relevance is not from a cause-and-effect relationship but acid cycle by Krebs, that were initially rejected for publica-
from a new measurement tool that could be applied to a tion.4 To use medical education as the example again, spe-
wide variety of educational settings.1,p.15–16 cific evaluation methods, such as using actors to simulate
The relevance of a topic is related to, but is not the same patients, gradually pervaded undergraduate medical educa-
as, the feasibility of answering a research question. Feasibility tion but initially might have seemed unfeasible.5
is related to study design and deals with whether and how 3. The methods or conclusions described in the paper are
we can get an answer. Relevance more directly addresses applicable in a wide variety of settings.
whether the question is significant enough to be worth ask-
ing.2 The relevance of a manuscript is more complex than In summary, relevance is a necessary but not sufficient

928 ACADEMIC MEDICINE, VOL. 76, NO. 9 / SEPTEMBER 2001
criterion for the selection of articles to publish in journals. The rigorous study of a trivial problem, or one already well studied, would not earn pages in a journal that must deal with competing submissions. Reviewers and editors must decide whether the question asked is worth answering at all, whether its solution will contribute, immediately or in the longer term, to the work of medical education and, finally, whether the manuscript at hand will be applicable to the journal's readership.

REFERENCES

1. Feinstein AR. Clinical Epidemiology: The Architecture of Clinical Research. Philadelphia, PA: W. B. Saunders, 1985:4.
2. Fraenkel JR, Wallen NE. How to Design and Evaluate Research in Education. 4th ed. New York: McGraw–Hill Higher Education, 2000:30–7.
3. Justice AC, Berlin JA, Fletcher SW, Fletcher RH, Goodman SN. Do readers and peer reviewers agree on manuscript quality? JAMA. 1994;272:117–9.
4. Horrobin DF. The philosophical basis of peer review and the suppression of innovation. JAMA. 1990;263:1438–41.
5. Barrows HS. Simulated patients in medical teaching. Can Med Assoc J. 1968;98:674–6.

RESOURCES

Feinstein AR. Clinical Epidemiology: The Architecture of Clinical Research. Philadelphia, PA: W. B. Saunders, 1985.
Fraenkel JR, Wallen NE. How to Design and Evaluate Research in Education. 4th ed. New York: McGraw–Hill Higher Education, 2000.
Fincher RM (ed). Guidebook for Clerkship Directors. Washington, DC: Association of American Medical Colleges, 2000.
METHOD

Research Design

William C. McGaghie, Georges Bordage, Sonia Crandall, and Louis Pangaro

REVIEW CRITERIA

■ The research design is defined and clearly described, and is sufficiently detailed to permit the study to be replicated.
■ The design is appropriate (optimal) for the research question.
■ The design has internal validity; potential confounding variables or biases are addressed.
■ The design has external validity, including subjects, settings, and conditions.
■ The design allows for unexpected outcomes or events to occur.
■ The design and conduct of the study are plausible.
ISSUES AND EXAMPLES RELATED TO THE CRITERIA

Research design has three key purposes: (1) to provide answers to research questions, and (2) to provide a road map for conducting a study using a planned and deliberate approach that (3) controls or explains quantitative variation or organizes qualitative observations.1 The design helps the investigator focus on the research question(s) and plan an orderly approach to the collection, analysis, and interpretation of data that address the question.

Research designs have features that range on a continuum from controlled laboratory investigations to observational studies. The continuum is seamless, not sharply segmented, going from structured and formal to evolving and flexible. A simplistic distinction between quantitative and qualitative inquiry does not work because research excellence in many areas of inquiry often involves the best of both. The basic issues are: (1) Given a research question, what are the best research design options? (2) Once a design is selected and implemented, how is its use justified in terms of its strengths and limits in a specific research context?

Reviewers should take into account key features of research design when evaluating research manuscripts. The key features vary in different sciences, of course, and reviewers, as experts, will know the ones for their fields. Here the example is from the various social sciences that conduct research into human behavior, including medical education research. The key features for such studies are stated below as a series of five general questions addressing the following topics: appropriateness of the design, internal validity, external validity, unexpected outcomes, and plausibility.

Is the research design appropriate (or as optimal as possible) for the research question? The matter of congruence, or "fit," is at issue because most research in medical education is descriptive, comparative, or correlational, or addresses new developments (e.g., creation of measurement scales, manipulation of scoring rules, and empirical demonstrations such as concept mapping2,3).

Scholars have presented many different ways of classifying or categorizing research designs. For example, Fraenkel and Wallen4 have recently identified seven general research methods in education: experimental, correlational, causal–comparative, survey, content analysis, qualitative, and historical. Their classification illustrates some of the overlap (sometimes confusion) that can exist among design, data-collection strategies, and data analysis. One could use an experimental design and then collect data using an open-ended survey and analyze the written answers using a content analysis. Each method or design category can be subdivided further. Rigorous attention to design details encourages an investigator to focus the research method on the research question, which brings precision and clarity to a study. To illustrate, Fraenkel and Wallen4 break down experimental research into four subcategories: weak experimental designs, true experimental designs, quasi-experimental designs, and factorial designs. Medical education research reports should clearly articulate the link between research question and research design and should embed that description in citations to the methodologic literature to demonstrate awareness of fine points.

Does the research have internal validity (i.e., integrity) to address the question rigorously? This calls for attention to a potentially long list of sources of bias or confounding variables, including selection bias, attrition of subjects or participants, intervention bias, strength of interventions (if any), measurement bias, reactive effects, study management, and many more.

Does the research have external validity? Are its results generalizable to subjects, settings, and conditions beyond the research situation? This is frequently (but not exclusively) a matter of sampling subjects, settings, and conditions as deliberate features of the research design.

Does the research design permit unexpected outcomes or events to occur? Are allowances made for expression of surprise results the investigator did not consider or could not anticipate? Any research design too rigid to accommodate the unexpected may not properly reflect real-world conditions or may stifle the expression of the true phenomenon studied.

Is the research design implausible, given the research question, the intellectual context of the study, and the practical circumstances where the study is conducted? Common flaws in research design include failure to randomize correctly in a controlled trial, small sample sizes resulting in low statistical power, brief or weak experimental interventions, and missing or inappropriate comparison (control) groups. Signals of research implausibility include an author's failure to describe the research design in detail, failure to acknowledge context effects on research procedures and outcomes, and the presence of features of a study that appear unbelievable, e.g., perfect response rates, flawless data. Often there are tradeoffs in research between theory and pragmatics, precision and richness, elegance and application. Is the research design attentive to such compromises?

Kenneth Hammond explains the bridge between design and conceptual framework, or theory:

Every method, however, implies a methodology, expressed or not; every methodology implies a theory, expressed or not. If one chooses not to examine the methodological base of [one's] work, then one chooses not to examine the theoretical context of that work, and thus becomes an unwitting technician at the mercy of implicit theories.1

REFERENCES

1. Hammond KR. Introduction to Brunswikian theory and methods. In: Hammond KR, Wascoe NE (eds). New Directions for Methodology of Social and Behavioral Sciences, No. 3: Realizations of Brunswik's Representative Design. San Francisco, CA: Jossey–Bass, 1980:2.
2. McGaghie WC, McCrimmon DR, Mitchell G, Thompson JA, Ravitch MM. Quantitative concept mapping in pulmonary physiology: comparison of student and faculty knowledge structures. Am J Physiol: Adv Physiol Educ. 2000;23:72–81.
3. West DC, Pomeroy JR, Park JK, Gerstenberger EA, Sandoval J. Critical thinking in graduate medical education: a role for concept mapping assessment? JAMA. 2000;284:1105–10.
4. Fraenkel JR, Wallen NE. How to Design and Evaluate Research in Education. 4th ed. New York: McGraw–Hill, 2000.

RESOURCES

Campbell DT, Stanley JC. Experimental and Quasi-experimental Designs for Research. Boston, MA: Houghton Mifflin, 1981.
Cook TD, Campbell DT. Quasi-experimentation: Design and Analysis Issues for Field Settings. Chicago, IL: Rand McNally, 1979.
Fletcher RH, Fletcher SW, Wagner EH. Clinical Epidemiology: The Essentials. 3rd ed. Baltimore, MD: Williams & Wilkins, 1996.
Hennekens CH, Buring JE. Epidemiology in Medicine. Boston, MA: Little, Brown, 1987.
Kazdin AE (ed). Methodological Issues and Strategies in Clinical Research. Washington, DC: American Psychological Association, 1992.
Patton MQ. Qualitative Evaluation and Research Methods. 2nd ed. Newbury Park, CA: Sage, 1990.
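The plausibility criterion above lists small sample sizes and low statistical power among the common design flaws reviewers should notice. As a rough illustration of the arithmetic behind that judgment, the sketch below (our own, not the chapter's; it uses a normal-approximation, two-sided two-sample z-test) estimates power for a standardized effect size d:

```python
from statistics import NormalDist

def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test for a
    standardized mean difference d with n subjects per group."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)        # 1.96 when alpha = .05
    ncp = abs(d) * (n_per_group / 2) ** 0.5  # expected location of the test statistic
    # chance the statistic lands outside the +/- z_crit rejection bounds
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)

# A "medium" effect (d = 0.5) studied with 20 subjects per group is
# detected only about a third of the time ...
print(round(power_two_sample(0.5, 20), 2))  # 0.35
# ... while roughly 64 per group are needed to reach the usual 80%.
print(round(power_two_sample(0.5, 64), 2))  # 0.81
```

A reviewer who sees 20 subjects per group and a claimed medium-sized effect can therefore suspect an underpowered design without any re-derivation.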
Instrumentation, Data Collection, and Quality Control

Judy A. Shea, William C. McGaghie, and Louis Pangaro

REVIEW CRITERIA

■ The development and content of the instrument are sufficiently described or referenced, and are sufficiently detailed to permit the study to be replicated.
■ The measurement instrument is appropriate given the study's variables; the scoring method is clearly defined.
■ The psychometric properties and procedures are clearly presented and appropriate.
■ The data set is sufficiently described or referenced.
■ Observers or raters were sufficiently trained.
■ Data quality control is described and adequate.
ISSUES AND EXAMPLES RELATED TO CRITERIA

Instrumentation refers to the selection or development and the later use of tools to make observations about variables in a research study. The observations are collected, recorded, and used as primary data.

In the social and behavioral sciences—covering health outcomes, medical education, and patient education research, for example—these instruments are usually "paper-and-pencil" tools. In contrast, the biological sciences and physical sciences usually rely on tools such as microscopes, CAT scans, and many other laboratory technologies. Yet the goals and process in developing and using instruments are the same across the sciences, and therefore each field has appropriate criteria within the overall standards of scientific research. Throughout this section, the focus and examples are from the social sciences and in particular from health professions research, although the general principles of the criteria apply across the sciences.

Instrumentation builds on the study design and problem statement and assumes that both are appropriately specified. In considering the quality of instrumentation and data collection, the reviewer should focus on the rigor with which data collection is executed. Reviewers are looking for or evaluating four aspects of the execution: (1) selecting or developing the instrument, (2) creating scores from the data captured by the instrument, (3) using the instrument appropriately, and (4) a sense that the methods employed met at least minimum quality standards.

Selection and Development

Describing the instrumentation starts with specifying in what way(s) the variables will be captured or measured. The reviewer needs to know what was studied and how the data were collected. There are many means an author can choose. A broad definition is used here that includes, but is not limited to, a wide variety of tools such as tests and examinations, attitude measures, checklists, surveys, abstraction forms, interview schedules, and rating forms. Indeed, scholars recommend that investigators use multiple measures to address the same research construct, a process called triangulation.1 Instrumentation is often relatively direct because existing and well-known tools are used to capture a variable of interest (e.g., Medical College Admission Test [MCAT] for medical school "readiness" or "aptitude"; National Board of Medical Examiners [NBME] subject examinations for "acquisition of medical knowledge"; Association of American Medical Colleges [AAMC] Graduation Questionnaire for "curricular experiences"). But sometimes the process is less straightforward. For example, if clinical competence of medical students after a required core clerkship is the variable of interest, it may be measured from a variety of perspectives. One approach is to use direct observations of students performing a clinical task, perhaps with standardized patients. Another approach is to use a written test to ask them what they would do in hypothetical situations. Another option is to collect ratings made by clerkship directors at the end of the clerkship that attest to students' clinical skills. Other alternatives are peer- and self-ratings of competence. Or patient satisfaction data could be collected. Choosing among several possible measures of a variable is a key decision when planning a research study.

Often a suitable measurement instrument is not available, and instruments must be developed. Typically, when new instruments are used for research, more detail about their development is expected than when existing measures are employed. Reviewers do not have to be experts in instrument development, but they need to be able to assess that the authors did the right things. Numerous publications describe the methods that should be followed in developing academic achievement tests,2,3 rating and attitude scales,4,6 checklists,7 and surveys.8 There is no single best approach to instrument development, but the process should be described rigorously and in detail, and reviewers should look for citations provided for readers to access this information.

Instrument development starts with specifying the content domain, conducting a thorough review of past work to see what exists, and, if necessary, beginning to create a new instrument. If an existing instrument is used, the reviewer needs to know and learn from the manuscript the rationale and original sources. When new items are developed, the content can be drawn from many sources such as potential subjects, other instruments, the literature, and experts. What the reviewer needs to see is that the process followed was more rigorous than a single investigator (or two) simply putting thoughts on paper. The reviewers should make sure that the items were critically reviewed for their clarity and meaning, and that the instrument was pilot tested and revised, as necessary. For some instruments, such as a data abstraction form, pilot testing might mean as little as trying out the form on a sample of hospital charts. More stringent testing is needed for instruments that are administered to individuals.

Creating Scores

For any given instrument, the reviewer needs to be able to discern how scores or classifications are derived from the instrument. For example, how were questionnaire responses summed or dichotomized such that respondents were grouped into those who "agreed" and "disagreed" or those who were judged to be "competent" and "not competent"? If a manuscript is about an instrument, as opposed to the more typical case, when authors use an instrument to assess some question, investigators might present methods for formal scale development and evaluation, often focusing on subscale definition, reliability estimation, reproducibility, and homogeneity.9 Large development projects for instruments designed to measure individual differences on a variable of interest will also need to pay attention to validity issues, sensitivity, and stability of scores.10 Other types of instruments do not lend themselves well to aggregated scores. Nevertheless, reviewers need to be clear about how investigators operationalized research variables and judged the technical properties (i.e., reliability and validity) of research data.

Decisions made about cut-scores and classifications also need to be conveyed to readers. For example, in a study on the perceived frequency of feedback from preceptors and residents to students, the definition of "feedback" needs to be reported and justified. For example, is it a report of any feedback in a certain amount of time, or is it feedback at a higher frequency, maybe more than twice a day? Investigators make many decisions in the course of conducting a study. Not all need to be reported in a paper but enough should be present to allow readers to understand the operationalization of the variables of interest.

This discussion of score creation applies equally when the source of data is an existing data set, such as the AAMC Faculty Roster or the AMA Master File. These types of data raise yet more issues about justification of analytic decisions. A focus of these manuscripts should be how data were selected, cleaned, and manipulated. For example, if the AMA Master File is being used for a study on primary care providers, how exactly was the sample defined? Was it by training, board certification, or self-reports of how respondents spent their professional time? Does it include research and administrative as well as clinical time? Does it include both family medicine and internal medicine physicians? When researchers do secondary data analyses, they lose intimate knowledge of the database and yet must provide information. The reviewer must look for evidence of sound decisions about sample definition and treatment of missing data that preceded the definition of scores.

Use of the Instrument

Designing an instrument and selecting and scoring it are only two parts of instrumentation. The third and complementary part involves the steps taken to ensure that the instrument is used properly. For many self-administered forms, the important information may concern incentives and processes used to gather complete data (e.g., contact of non-responders, location of missing charts). For instruments that may be more reactive to the person using the forms (e.g., rating forms, interviews), it is necessary to summarize coherently the actions that were taken to minimize differences related to the instrument user. This typically involves discussions of rater or interviewer training and computation of inter- or intra-rater reliability coefficients.5

General Quality Control

In addition to reviewing the details about the actual instruments used in the study, reviewers need to gain a sense that
a study was conducted soundly.11 In most cases, it is impossible and unnecessary to report internal methods that were put in place for monitoring data collection and quality. This level of detail might be expected for a proposal application, but it does not fit in most manuscripts. Still, depending on the methods of the study under review, the reviewer must assess a variety of issues such as unbiased recruitment and retention of subjects, appropriate training of data collectors, and sensible and sequential definitions of analytic variables. The source of any funding must also be reported.

These are generic concerns for any study. It would be too unwieldy to consider here all possible elements, but the reviewer needs to be convinced that the methods are sound—sloppiness or incompleteness in reporting (or worse) should raise a red flag. In the end the reviewer must be convinced that appropriate rigor was used in selecting, developing, and using measurement tools for the study. Without being an expert in measurement, the reviewer can look for relevant details about instrument selection and subsequent score development. Optimally the reviewer would be left confident and clear about the procedures that the author followed in developing and implementing data collection tools.

REFERENCES

1. Campbell DT, Fiske DW. Convergent and discriminant validation by the multitrait–multimethod matrix. Psychol Bull. 1959;56:81–105.
2. Linn RL, Gronlund NE. Measurement and Assessment in Teaching. 7th ed. Englewood Cliffs, NJ: Prentice–Hall, 1995.
3. Millman J, Green J. The specification and development of tests of achievement and ability. In: Linn RL (ed). Educational Measurement. 3rd ed. New York: Macmillan, 1989:335–66.
4. Medical Outcomes Trust. Instrument review criteria. Med Outcomes Trust Bull. 1995;2:I–IV.
5. Streiner DL, Norman GR. Health Measurement Scales: A Practical Guide to Their Development and Use. 2nd ed. Oxford, U.K.: Oxford University Press, 1995.
6. DeVellis RF. Scale Development: Theory and Applications. Applied Social Research Methods Series, Vol. 26. Newbury Park, CA: Sage, 1991.
7. McGaghie WC, Renner BR, Kowlowitz V, et al. Development and evaluation of musculoskeletal performance measures for an objective structured clinical examination. Teach Learn Med. 1994;6:59–63.
8. Woodward CA. Questionnaire construction and question writing for research in medical education. Med Educ. 1998;22:347–63.
9. Kerlinger FN. Foundations of Behavioral Research. 3rd ed. New York: Holt, Rinehart and Winston, 1986.
10. Nunnally JC. Psychometric Theory. New York: McGraw–Hill, 1978.
11. McGaghie WC. Conducting a research study. In: McGaghie WC, Frey JJ (eds). Handbook for the Academic Physician. New York: Springer Verlag, 1986:217–33.

RESOURCES

Fraenkel JR, Wallen NE. How to Design and Evaluate Research in Education. 3rd ed. New York: McGraw–Hill, 1996.
Linn RL, Gronlund NE. Measurement and Assessment in Teaching. 8th ed. Englewood Cliffs, NJ: Merrill, 2000.
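The "Use of the Instrument" discussion above notes that authors typically report rater training and inter- or intra-rater reliability coefficients. One widely used coefficient for two raters is Cohen's kappa (our illustrative choice; the chapter does not prescribe a particular statistic), which can be sketched as:

```python
from collections import Counter

def cohen_kappa(rater1, rater2):
    """Chance-corrected agreement between two raters who each
    assigned one categorical label per subject."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    counts1, counts2 = Counter(rater1), Counter(rater2)
    # agreement expected by chance if the raters labeled independently
    expected = sum(counts1[c] * counts2[c] for c in counts1) / (n * n)
    return (observed - expected) / (1 - expected)

# Two raters scoring six students as competent (1) or not (0):
print(round(cohen_kappa([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1]), 3))  # 0.333
```

Raw percent agreement in this toy example is 67%, but kappa discounts the 50% agreement expected by chance alone, which is why reviewers look for a chance-corrected coefficient rather than simple agreement.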
Population and Sample

William C. McGaghie and Sonia Crandall*

REVIEW CRITERIA

■ The population is defined clearly, for both subjects (participants) and stimulus (intervention), and is sufficiently described to permit the study to be replicated.
■ The sampling procedures are sufficiently described.
■ Subject samples are appropriate to the research question.
■ Stimulus samples are appropriate to the research question.
■ Selection bias is addressed.
ISSUES AND EXAMPLES RELATED TO CRITERIA

Investigators in health outcomes, public health, medical education, clinical practice, and many other domains of scholarship and science are expected to describe the research population(s), sampling procedures, and research sample(s) for the empirical studies they undertake. These descriptions must be clear and complete to allow reviewers and research consumers to decide whether the research results are valid internally and can be generalized externally to other research samples, settings, and conditions. Given necessary and sufficient information, reviewers and consumers can judge whether an investigator's population, sampling methods, and research sample are appropriate to the research question.

Sampling from populations has become a key issue in 20th- and 21st-century applied research. Sampling from populations addresses research efficiency and accuracy. To illustrate, the Gallup Organization achieves highly accurate (±3 percentage points) estimates about opinions of the U.S. population (280 million) using samples of approximately 1,200 individuals.1

Sampling from research populations goes in at least two dimensions: from subjects or participants (e.g., North American medical students), and from stimuli or conditions (e.g., clinical problems or cases). Some investigators employ a third approach—matrix sampling—to address research subjects and stimuli simultaneously.2 In all cases, however, reviewers should find that the subject and stimulus populations and the sampling procedures are defined and described clearly.

Given a population of interest (e.g., North American medical students), how does an investigator define a population subset (sample) for the practical matter of conducting a research study? Textbooks provide detailed, scholarly descriptions of purist sampling procedures.3,4 Other scholars, however, offer practical guides. For example, Fraenkel and Wallen5 identify five sampling methods that a researcher may use to draw a representative subset from a population of interest. The five sampling methods are: random, simple, systematic, stratified random, and cluster.

Experienced reviewers know that most research in medical education involves convenience samples of students, residents, curricula, community practitioners, or other units of analysis. Generalizing the results of studies done on convenience samples of research participants or other units is risky unless there is a close match between research subjects and the target population where research results are applied. In some areas, such as clinical studies, the match is crucial, and there are many excellent guides (for example, see Fletcher, Fletcher and Wagner6). Sometimes research is deliberately done on "significant"7 or specifically selected samples, such as Nobel Laureates or astronauts and cosmonauts,8 where descriptions of particular subjects, not generalization to a subject population, is the scholarly goal.

Once a research sample is identified and drawn, its members may be assigned to study conditions (e.g., treatment and control groups in the case of intervention research). By contrast, measurements are obtained uniformly from a research sample for single-group observational studies looking at statistical correlations among variables. Qualitative observational studies of intact groups such as the surgery residents described in Forgive and Remember9 and the internal medicine residents in Getting Rid of Patients10 follow a similar approach but use words, not numbers, to describe their research samples.

*Lloyd Lewis, PhD, emeritus professor of the Medical College of Georgia, participated in early meetings of the Task Force and contributed to the earliest draft of this section.
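The Gallup figures and the Fraenkel–Wallen list of sampling methods can be made concrete with a short sketch. The roster, sample sizes, and function names below are hypothetical illustrations of ours, not the authors':

```python
import math
import random

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a 95% confidence interval for a sample proportion;
    p = 0.5 is the worst case quoted in polling summaries."""
    return z * math.sqrt(p * (1 - p) / n)

# Roughly the +/-3 percentage points Gallup reports for n near 1,200:
print(round(100 * margin_of_error(1200), 1))  # 2.8

rng = random.Random(0)
population = [f"student_{i:03d}" for i in range(300)]  # hypothetical roster

simple = rng.sample(population, 30)             # simple random sample
systematic = population[rng.randrange(10)::10]  # every 10th, random start
strata = {"year1": population[:150], "year2": population[150:]}
stratified = [s for group in strata.values()    # stratified random:
              for s in rng.sample(group, 15)]   # equal draw per stratum

assert len(simple) == len(systematic) == len(stratified) == 30
```

The point of the Gallup illustration shows up directly in the formula: for a large population, accuracy is governed by the absolute sample size n, not by the fraction of the 280 million who are sampled.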
Systematic sampling of subjects or other units of analysis from a population of interest allows an investigator to generalize research results beyond the information obtained from the sample values. The same logic holds for the stimuli or independent variables involved in a research enterprise (e.g., clinical cases and their features in problem-solving research). Careful attention to stimulus sampling is the cornerstone of representative research.11–13

An example may make the issue clearer. (The specifics here are from medical education and are directly applicable to health professions education and generally applicable to wide areas of social sciences.) Medical learners and practitioners are expected to solve clinical problems of varied degrees of complexity as one indicator of their clinical competence. However, neither the population of eligible problems nor clear-cut rules for sampling clinical problems from the parent population have been made plain. Thus the problems, often expressed as cases, used to evaluate medical personnel are chosen haphazardly. This probably contributes to the frequently cited finding of case specificity (i.e., non-generalizability) of performance in research on medical problem solving.14 An alternative hypothesis is that case specificity has more to do with how the cases are selected or designed than with the problem-solving skill of physicians in training or practice.

Recent work on construction of examinations of academic achievement in general15,16 and medical licensure examinations in particular17 is giving direct attention to stimulus sampling and representative design. Conceptual work in the field of facet theory and design18 also holds promise as an organizing framework for research that takes stimulus sampling seriously.

Research protocols that make provisions for systematic, simultaneous sampling of subjects and stimuli use matrix sampling.2 Matrix sampling is especially useful when an investigator aims to judge the effects of an overall program on a broad spectrum of participants.

Isolating and ruling out sources of bias is a persistent problem […] a valid estimate of population parameters and have low statistical power. Reviewers must be attentive to these potential flaws. Research reports should also describe use of incentives, compensation for participation, and whether the research participants are volunteers.

REFERENCES

1. Gallup Opinion Index. Characteristics of the Sample. Princeton, NJ: Gallup Organization, 1999.
2. Sirotnik KA. Introduction to matrix sampling for the practitioner. In: Popham WJ (ed). Evaluation in Education: Current Applications. Berkeley, CA: McCutchan, 1974.
3. Henry GT. Practical sampling. In: Applied Social Research Methods Series, Vol. 21. Newbury Park, CA: Sage, 1990.
4. Patton MQ. Qualitative Evaluation and Research Methods. 2nd ed. Newbury Park, CA: Sage, 1990.
5. Fraenkel JR, Wallen NE. How to Design and Evaluate Research in Education. 4th ed. Boston, MA: McGraw–Hill, 2000.
6. Fletcher RH, Fletcher SW, Wagner EH. Clinical Epidemiology: The Essentials. 3rd ed. Baltimore, MD: Williams & Wilkins, 1996.
7. Simonton DK. Significant samples: the psychological study of eminent individuals. Psychol Meth. 1999;4:425–51.
8. Santy PA. Choosing the Right Stuff: The Psychological Selection of Astronauts and Cosmonauts. Westport, CT: Praeger, 1994.
9. Bosk CL. Forgive and Remember: Managing Medical Failure. Chicago, IL: University of Chicago Press, 1979.
10. Mizrahi T. Getting Rid of Patients: Contradictions in the Socialization of Physicians. New Brunswick, NJ: Rutgers University Press, 1986.
11. Brunswik E. Systematic and Representative Design of Psychological Experiments. Berkeley, CA: University of California Press, 1947.
12. Hammond KR. Human Judgment and Social Policy. New York: Oxford University Press, 1996.
13. Maher BA. Stimulus sampling in clinical research: representative design revisited. J Consult Clin Psychol. 1978;46:643–7.
14. van der Vleuten CPM, Swanson DB. Assessment of clinical skills with standardized patients: state of the art. Teach Learn Med. 1990;2:58–76.
15. Linn RL, Gronlund NE. Measurement and Assessment in Teaching. 7th ed. Englewood Cliffs, NJ: Prentice–Hall, 1995.
16. Millman J, Green J. The specification and development of tests of achievement and ability. In: Linn RL (ed). Educational Measurement. 3rd ed. New York: Macmillan, 1989.
lem when identifying research samples. Subject-selection 17. LaDuca A. Validation of professional licensure examinations: profes-
sions theory, test design, and construct validity. Eval Health Prof. 1994;
bias is more likely to occur when investigators fail to specify
17:178–97.
and use explicit inclusion and exclusion criteria; when there 18. Shye S, Elizur D, Hoffman M. Introduction to Facet Theory: Content
is differential attrition (drop out) of subjects from study con- Design and Intrinsic Data Analysis in Behavioral Research. Applied
ditions; or when samples are insufficient (too small) to give Social Methods Series Vol. 35. Thousand Oaks, CA: Sage, 1994.
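Matrix sampling, cited above, pairs subsets of subjects with subsets of stimuli so that an entire item pool is covered without any one subject taking every item. The sketch below is purely illustrative (the subject names, pool size, and number of forms are invented, not taken from this chapter), but it shows the core bookkeeping:

```python
import random

def matrix_sample(subjects, items, n_forms, seed=0):
    """Assign each subject one of n_forms item subsets ("forms") so that,
    across all subjects, every item in the pool is administered."""
    rng = random.Random(seed)
    pool = items[:]
    rng.shuffle(pool)
    # Split the shuffled pool into n_forms interleaved, equal-sized forms.
    forms = [pool[i::n_forms] for i in range(n_forms)]
    # Rotate the forms across subjects in turn.
    return {s: forms[i % n_forms] for i, s in enumerate(subjects)}

subjects = [f"student_{i}" for i in range(6)]
items = [f"case_{j}" for j in range(12)]
assignment = matrix_sample(subjects, items, n_forms=3)
# Every case is administered to someone, yet each student sees only 4 of 12.
```

A design like this lets an evaluator estimate program-level performance on the full stimulus pool while keeping each individual's testing burden small.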

ACADEMIC MEDICINE, VOL. 76, NO. 9 / SEPTEMBER 2001 935


Data Analysis and Statistics

William C. McGaghie and Sonia Crandall*

REVIEW CRITERIA
■ Data-analysis procedures are described in sufficient detail to permit the study to be replicated.
■ Data-analysis procedures conform to the research design; hypotheses, models, or theory drives the data analyses.
■ The assumptions underlying the use of statistics are fulfilled by the data, such as measurement properties of the data and normality of distributions.
■ Statistical tests are appropriate (optimal).
■ If statistical analysis involves multiple tests or comparisons, proper adjustment of significance level for chance outcomes was applied.
■ Power issues are considered in statistical studies with small sample sizes.
■ In qualitative research that relies on words instead of numbers, basic requirements of data reliability, validity, trustworthiness, and absence of bias were fulfilled.
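The multiple-comparisons criterion is easy to check with arithmetic. As a minimal sketch (not drawn from this chapter; the numbers are illustrative), the familywise error rate for k independent tests and the corresponding Bonferroni-adjusted threshold can be computed directly:

```python
def familywise_error(alpha, k):
    """Probability that at least one of k independent tests comes out
    "significant" by chance alone when each is run at level alpha."""
    return 1 - (1 - alpha) ** k

def bonferroni_threshold(alpha, k):
    """Bonferroni adjustment: test each of the k comparisons at alpha / k."""
    return alpha / k

# With 20 comparisons at alpha = .05, a spurious "significant" result
# is more likely than not, which is why an adjustment is required.
print(round(familywise_error(0.05, 20), 2))      # 0.64
print(round(bonferroni_threshold(0.05, 20), 4))  # 0.0025
```

The Bonferroni procedure is simple but conservative when tests are correlated; standard biostatistics texts cover less conservative step-down alternatives.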

ISSUES AND EXAMPLES RELATED TO THE CRITERIA

Data analysis along the "seamless web" of quantitative and qualitative research (see "Research Design," earlier in this chapter) must be performed and reported according to scholarly conventions. The conventions apply to statistical treatment of data expressed as numbers and to qualitative data expressed as observational records, field notes, interview reports, abstracts from hospital charts, and other archival records. Data analysis must "get it right" to ensure that the research progression of design, methods (including data analysis), results, and conclusions and interpretation is orderly and integrated. Amplification of the seven data-analysis and statistical review criteria in this section underscores this assertion. The next article, entitled "Reporting of Statistical Analyses," extends these ideas.

Quantitative

Statistical, or quantitative, analysis of research data is not the keystone of science. It does, however, appear in a large proportion of the research papers submitted to medical education journals. Reviewers expect a clear and complete description of research samples and data-analysis procedures in such papers.

Statistical analysis methods such as t-tests or analysis of variance (ANOVA) used to assess group differences, correlation coefficients used to assess associations among measured variables within intact groups, or indexes of effect such as odds ratios and relative risk in disease studies flow directly from the investigator's research design. (Riegelman and Hirsch1 give specific examples.) Designs focused on differences between experimental and control groups should use statistics that feature group contrasts. Designs focused on within-group associations should report results as statistical correlations in one or more of their many forms. Other data-analytic methods include meta-analysis,2 i.e., quantitative integration of research data from independent investigations of the same research problem; procedures used to reduce large, complex data sets into more simplified structures, as in factor analysis or cluster analysis; and techniques to demonstrate data properties empirically, as in reliability analyses of achievement-test or attitude-scale data, multidimensional scaling, and other procedures. However, in all cases research design dictates statistical analysis of research data. Statistical analyses, when they are used, must be driven by the hypotheses, models, or theories that form the foundation of the study being judged.

*Lloyd Lewis, PhD, emeritus professor of the Medical College of Georgia, participated in early meetings of the Task Force and contributed to the earliest draft of this section.

Statistical analysis of research data often rests on assumptions about data measurement properties and the normality



of data distributions, and many other features. These assumptions must be satisfied to make the data analysis legitimate. By contrast, nonparametric, or "distribution-free," statistical methods can be used to evaluate group differences or the correlations among variables when research measurements are in the form of categories (female–male, working–retired) or ranks (tumor stages, degrees of edema). Reviewers need to look for signs that the statistical analysis methods were based on sound assumptions about characteristics of the data and research design.

A reviewer must be satisfied that statistical tests presented in a research manuscript have been used and reported properly. Signs of flawed data analysis include inappropriate or suboptimal analysis (e.g., wrong statistics) and failure to specify post hoc analyses before collecting data.

Statistical analysis of data sets that is done without attention to an explicit research design or an a priori hypothesis can quickly become an exercise in "data dredging." The availability of powerful computers, user-friendly statistical software, and large institutional data sets increases the likelihood of such mindless data analyses. Being able to perform hundreds of statistical tests in seconds is not a proxy for thoughtful attention to research design and focused data analysis. Reviewers should also be aware that, for example, in the context of only 20 statistical comparisons, one of the tests will be likely to achieve "significance" solely by chance. Multiple statistical tests or comparisons call for adjustment of significance levels (p-values) using the Bonferroni or a similar procedure to ensure accurate data interpretation.3

Research studies that involve small numbers of participants often lack enough statistical power to demonstrate significant results.4 This shortfall can occur even when a larger study would show a significant effect for an experimental intervention or for a correlation among measured variables. Whenever a reviewer encounters a "negative" study, the power question needs to be posed and ruled out as the reason for a nonsignificant result.

Qualitative

Analysis of qualitative data, which involves manipulation of words and symbols rather than numbers, is also governed by rules and rigor. Qualitative investigators are expected to use established, conventional approaches to ensure data quality and accurate analysis. Qualitative flaws include (but are not limited to) inattention to data triangulation (i.e., cross-checking information sources); insufficient description (lack of "thick description") of research observations; failure to use recursive (repetitive) data analysis and interpretation; lack of independent data verification by colleagues (peer debriefing); lack of independent data verification by stakeholders (member checking); and absence of the a priori expression of the investigator's personal orientation (e.g., homeopathy) in the written report.

Qualitative data analysis has a deep and longstanding research legacy in medical education and medical care. Well-known and influential examples are Boys in White, the classic study of student culture in medical school, published by Howard Becker and colleagues5; psychiatrist Robert Coles' five-volume study, Children of Crisis6; the classic participant observation study by clinicians of patient culture on psychiatric wards published in Science7; and Terry Mizrahi's observational study of the culture of residents on the wards, Getting Rid of Patients.8 Reviewers should be informed about the scholarly contribution of qualitative research in medical education. Prominent resources on qualitative research9–13 provide research insights and methodologic details that would be useful for the review of a complex or unusual study.

REFERENCES

1. Riegelman RK, Hirsch RP. Studying a Study and Testing a Test: How to Read the Medical Literature. 2nd ed. Boston, MA: Little, Brown, 1989.
2. Wolf FM. Meta-Analysis: Quantitative Methods for Research Synthesis. Sage University Paper Series on Quantitative Applications in the Social Sciences, No. 59. Beverly Hills, CA: Sage, 1986.
3. Dawson B, Trapp RG. Basic and Clinical Biostatistics. 3rd ed. New York: Lange Medical Books/McGraw–Hill, 2001.
4. Cohen J. Statistical Power Analysis for the Behavioral Sciences. Rev. ed. New York: Academic Press, 1977.
5. Becker HS, Geer B, Hughes EC, Strauss A. Boys in White: Student Culture in Medical School. Chicago, IL: University of Chicago Press, 1961.
6. Coles R. Children of Crisis: A Study of Courage and Fear. Vols. 1–5. Boston, MA: Little, Brown, 1967–1977.
7. Rosenhan DL. On being sane in insane places. Science. 1973;179:250–8.
8. Mizrahi T. Getting Rid of Patients: Contradictions in the Socialization of Physicians. New Brunswick, NJ: Rutgers University Press, 1986.
9. Glaser BG, Strauss AL. The Discovery of Grounded Theory: Strategies for Qualitative Research. Chicago, IL: Aldine, 1967.
10. Miles MB, Huberman AM. Qualitative Data Analysis: An Expanded Sourcebook. 2nd ed. Thousand Oaks, CA: Sage, 1994.
11. Harris IB. Qualitative methods. In: Norman GR, van der Vleuten CPM, Newble D (eds). International Handbook for Research in Medical Education. Dordrecht, The Netherlands: Kluwer, 2001.
12. Giacomini MK, Cook DJ. Users' guide to the medical literature. XXIII. Qualitative research in health care. A. Are the results of the study valid? JAMA. 2000;284:357–62.
13. Giacomini MK, Cook DJ. Users' guide to the medical literature. XXIII. Qualitative research in health care. B. What are the results and how do they help me care for my patients? JAMA. 2000;284:478–82.

RESOURCES

Goetz JP, LeCompte MD. Ethnography and Qualitative Design in Educational Research. Orlando, FL: Academic Press, 1984.
Guba EG, Lincoln YS. Effective Evaluation. San Francisco, CA: Jossey–Bass, 1981.
Fleiss JL. Statistical Methods for Rates and Proportions. 2nd ed. New York: John Wiley & Sons, 1981.
Pagano M, Gauvreau K. Principles of Biostatistics. Belmont, CA: Duxbury Press, 1993.
Patton MQ. Qualitative Evaluation and Research Methods. 2nd ed. Newbury Park, CA: Sage, 1990.
Winer BJ. Statistical Principles in Experimental Design. 2nd ed. New York: McGraw–Hill, 1971.
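The power issue raised in this section can be made concrete with a small calculation. This sketch is not from the chapter: it uses the common normal approximation to the power of a two-sided, two-group comparison of means, with an invented standardized effect size d and hypothetical group sizes.

```python
from math import erf, sqrt

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def approx_power(d, n_per_group, z_crit=1.96):
    """Approximate power of a two-sided, two-group mean comparison at
    alpha = .05 (normal approximation), for standardized effect size d."""
    return normal_cdf(d * sqrt(n_per_group / 2) - z_crit)

# A medium effect (d = 0.5) with 20 subjects per group is badly
# underpowered, while the same effect with 100 per group is not.
print(round(approx_power(0.5, 20), 2))   # 0.35
print(round(approx_power(0.5, 100), 2))  # 0.94
```

A reviewer facing a "negative" study with 20 subjects per group should therefore ask whether a real, moderate effect could plausibly have been missed. (The approximation is slightly optimistic relative to an exact t-based calculation; the section's reference 4, Cohen, treats power analysis in full.)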

RESULTS

Reporting of Statistical Analyses

Glenn Regehr

REVIEW CRITERIA
■ The assumptions underlying the use of statistics are considered, given the data collected.
■ The statistics are reported correctly and appropriately.
■ The number of analyses is appropriate.
■ Measures of functional significance, such as effect size or proportion of variance accounted for, accompany hypothesis-testing analyses.
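Restriction of range, one of the assumption problems the first criterion points to, can be demonstrated in a few lines. The data below are invented for illustration, and the correlation is computed from the plain Pearson formula:

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical exam scores (x) and supervisor ratings (y): strongly
# related across the full range of x...
x = [55, 60, 65, 70, 75, 80, 85, 90, 95]
y = [52, 63, 61, 72, 70, 86, 83, 85, 84]
full = pearson_r(x, y)

# ...but if only high scorers (x >= 80) enter the sample, most of the
# variation in x is gone and the observed correlation collapses.
keep = [(a, b) for a, b in zip(x, y) if a >= 80]
restricted = pearson_r([a for a, _ in keep], [b for _, b in keep])
print(round(full, 2), round(restricted, 2))  # 0.93 -0.4
```

A low correlation in such a restricted sample says little about the relationship in the full population, which is exactly the reviewer's concern discussed in this section.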

ISSUES AND EXAMPLES RELATED TO THE CRITERIA

Even if the planned statistical analyses as reported in the Method section are plausible and appropriate, it is sometimes the case that the implementation of the statistical analysis as reported in the Results section is not. Several issues may have arisen in performing the analyses that render them inappropriate as reported in the Results section. Perhaps the most obvious is the fact that the data may not have many of the properties that were anticipated when the data analysis was planned. For example, although a correlation between two variables was planned, the data from one or the other (or both) of the variables may demonstrate a restriction of range that invalidates the use of a correlation. When a strong restriction of range exists, the correlation is bound to be low, not because the two variables are unrelated, but because the range of variation in the particular data set does not allow for the expression of the relationship in the correlation. Similarly, it may be the case that a t-test was planned to compare the means of two groups, but on review of the data, there is a bimodal distribution that raises doubts about the use of a mean and standard deviation to describe the data set. If so, the use of a t-test to evaluate the differences between the two groups becomes inappropriate. The reviewer should be alert to these potential problems and ensure, to the extent possible, that the data as collected continue to be amenable to the statistics that were originally intended. Often this is difficult because the data necessary to make this assessment are not presented. It is often necessary simply to assume, for example, that the sample distributions were roughly normal, since the only descriptive statistics presented are the mean and standard deviation. When the opportunity does present itself, however, the reviewer should evaluate the extent to which the data collected for the particular study satisfy the assumptions of the statistical tests that are presented in the Results section.

Another concern that reviewers should be alert to is the possibility that while appropriate analyses have been selected and performed, they have been performed poorly or inappropriately. Often enough data are presented to determine that the results of the analysis are implausible given the descriptive statistics, that "the numbers just don't add up." Alternatively, it may be the case that data and analyses are insufficiently reported for the reviewer to determine the accuracy or legitimacy of the analyses. Either of these situations is a problem and should be addressed in the review.

A third potential concern in the reporting of statistics is the presence in the Results section of analyses that were not anticipated in the Method section. In practice, the results of an analysis or a review of the data often lead to other



obvious questions, which in turn lead to other obvious analyses that may not have been anticipated. This type of expansion of analyses is not necessarily inappropriate, but the reviewer must determine whether it has been done with control and reflection. If the reviewer perceives an uncontrolled proliferation of analyses or if the new analyses appear without proper introduction or explanation, then a concern should be raised. It may appear to the reviewer that the author has fallen into a trap of chasing an incidental finding too far, or has enacted an unreflective or unsystematic set of analyses to "look for anything that is significant." Either of these possibilities implies the use of inferential statistics for purposes beyond strict hypothesis testing and therefore stretches the statistics beyond their intended use.

On a similar note, reviewers should be mindful that as the number of statistical tests increases, the likelihood that at least one of the analyses will be "statistically significant" by chance alone also increases. When analyses proliferate it is important for the reviewer to determine whether the significance levels (p-values) have been appropriately adjusted to reflect the need to be more conservative.

Finally, it is important to note that statistical significance does not necessarily imply practical significance. Tests of statistical significance tell an investigator the probability that chance alone is responsible for study outcomes. But inferential statistical tests, whether significant or not, do not reveal the strength of association among research variables or the effect size. Strength of association is gauged by indexes of the proportion of variance in the dependent variable that is "explained" or "accounted for" by the independent variables in an analysis. Common indexes of explained variation are eta-squared (η²) in ANOVA and R² (coefficient of determination) in correlational analyses. Reviewers must be alert to the fact that statistically significant research results tell only part of the story. If a result is statistically significant, but the independent variable accounts for only a very small proportion of the variance in the dependent variable, the result may not be sufficiently interesting to warrant extensive attention in the Discussion section. If none of the independent variables accounts for a reasonable proportion of the variance, then the study may not warrant publication.

RESOURCES

Begg C, Cho M, Eastwood S, et al. Improving the quality of reporting of randomized controlled trials: the CONSORT statement. JAMA. 1996;276:637–9.
Cohen J. The earth is round (p < .05). Am Psychol. 1994;49:997–1003.
Dawson B, Trapp RG. Basic and Clinical Biostatistics. 3rd ed. New York: Lange Medical Books/McGraw–Hill, 2001.
Hays WL. Statistics. New York: Holt, Rinehart and Winston, 1988.
Hopkins KD, Glass GV. Statistical Methods in Education and Psychology. Boston, MA: Allyn & Bacon, 1995.
Howell DC. Statistical Methods for Psychology. 4th ed. Belmont, CA: Wadsworth, 1997.
Lang TA, Secic M. How to Report Statistics in Medicine. Philadelphia, PA: College of Physicians, 1997.
Meehl PE. Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. J Consult Clin Psychol. 1978;46:806–34.
Moher D, Cook DJ, Eastwood S, Olkin I, Rennie D, Stroup DF. Improving the quality of reports of meta-analyses of randomised controlled trials: the QUOROM statement. Quality of Reporting of Meta-analyses. Lancet. 1999;354:1896–900.
Norman GR, Streiner DL. Biostatistics: The Bare Essentials. St. Louis, MO: Mosby, 1994 [out of print].
Norusis MJ. SPSS 9.0 Guide to Data Analysis. Upper Saddle River, NJ: Prentice–Hall, 1999.
Rennie D. CONSORT revised—improving the reporting of randomized trials. JAMA. 2001;285:2006–7.
Stroup DF, Berlin JA, Morton SC, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA. 2000;283:2008–12.



Presentation of Results

Glenn Regehr

REVIEW CRITERIA
■ Results are organized in a way that is easy to understand.
■ Results are presented effectively; the results are contextualized.
■ Results are complete.
■ The amount of data presented is sufficient and appropriate.
■ Tables, graphs, or figures are used judiciously and agree with the text.

ISSUES AND EXAMPLES RELATED TO CRITERIA

The Results section of a research paper lays out the body of evidence collected within the context of the study to support the conclusions and generalizations that are presented in the Discussion section. To be effective in supporting conclusions, the study results and their relation to the research questions and discussion points must be clear to the reader. Unless this relationship is clear, the reader cannot effectively judge the quality of the evidence or the extent to which it supports the claims in the Discussion section. Several devices can maximize this presentation, and reviewers need to be aware of these techniques so that they can effectively express their concerns about the Results section and provide useful feedback to the authors.

Organization of the Data and Analyses

The organization of the data and analyses is critical to the coherence of the Results section. The data and analyses should be presented in an orderly fashion, and the logic inherent in that order should be made explicit. There are several possible ways to organize the data, and the choice of organization ought to be strategic, reflecting the needs of the audience and the nature of the findings being presented. The reviewer should be alert to the organization being adopted and determine whether this particular organization is effective in conveying the results coherently.

One very helpful type of organization is to use a parallel structure across the entire research paper, that is, to make the organization of the results consistent with the organization of the other sections of the paper. Thus, the organization of the Results section would mirror the organization of the research questions that were established in the Introduction, it would be foreshadowed by the descriptions provided in the Method section, and it would anticipate the organization of points to be elaborated in the Discussion. If there are several research questions, hypotheses, or important findings, the Results section may be best presented as a series of subsections, with each subsection presenting the results that are relevant to a given question, hypothesis, or set of findings. This type of organization clarifies the point of each set of results or analyses and thus makes it relatively easy to determine how the results or analyses speak to the research questions. In doing so, this organization also provides an easy method for determining whether each of the research questions has been addressed appropriately and completely, and it provides a structure for identifying post hoc or additional analyses and serendipitous findings that might not have been initially anticipated.

However, there are other ways to organize a Results section that also maintain clarity and coherence and may better represent the data and analyses. Many of these methods are used in the context of qualitative research, but may also be relevant to quantitative/experimental/hypothesis-testing research designs. Similar to the description above, the results may be grouped according to themes arising in response to articulated research objectives (although, because themes often overlap, care must be taken to focus the reader on the theme under consideration while simultaneously identifying and explaining its relationship to the others). Alternately, the data may be organized according to the method of collection (interviews, observations, documents) or to critical phases in the data-analysis process (e.g., primary node coding and axial coding).

Regardless of the choice of organization, if it does not clearly establish the relevance of the data presented and the analyses performed, then the point of the presentation has not been properly established and the Results section has



failed in its purpose. If the results are not coherent, the reviewer must consider whether the problem lies in a poor execution of the analyses or in a poor organization of the Results section. If the first, the paper is probably not acceptable. If the second, the reviewer might merely want to suggest an organizational structure that would convey the results effectively.

Selection of Qualitative Data for Presentation

Qualitative research produces great amounts of raw material. And while the analysis process is designed to order and explain this raw material, at the point of presenting results the author still possesses an overwhelming set of possible excerpts to provide in a Results section. Selecting which data to present in a Results section is, therefore, critical. The logic that informs this selection process should be transparent and related explicitly to the research questions and objectives. Further, the author should make clear any implicit relationships among the results presented in terms of trends, contrasting cases, voices from a variety of perspectives on an issue, etc. Attention should be paid to ensuring that the selection process does not distort the overall gist of the entire data set. Further, narrative excerpts should be only as long as required to represent a theme or point of view, with care taken that the excerpts are not minimized to the point of distorting their meaning or diluting their character. This is a fine line, but its balance is essential to the efficient yet accurate presentation of findings about complex social phenomena.

The Balance of Descriptive and Inferential Statistics for Quantitative Data

In quantitative/hypothesis-testing papers, a rough parallel to the qualitative issue of selecting data for presentation is the balance of descriptive and inferential statistics. One common shortcoming in quantitative/hypothesis-testing papers is that the Results section focuses very heavily on inferential statistics with little attention paid to proper presentation of descriptive statistics. It is often forgotten that the inferential statistics are presented only to aid in the reasonable interpretation of the descriptive statistics. If the data (or pattern of data) to which the inferential statistics are being applied are not clear, then the point of the inferential statistics has not been properly established and the Results section has failed in its purpose. Again, however, this is a fine balance. Excessive presentation of descriptive statistics that do not speak to the research objectives may also make the Results section unwieldy and uninterpretable.

The Use of Narration for Quantitative Data

The Results section is not the place to elaborate on the implications of data collected, how the data fit into the larger theory that is being proposed, or how they relate to other literature. That is the role of the Discussion section. This being said, however, it is also true that the Results section of a quantitative/hypothesis-testing study should not be merely a string of numbers and Greek letters. Rather, the results should include a narrative description of the data, the point of the analysis, and the implications of the analysis for the data. The balance between a proper and complete description of the results and an extrapolation of the implications of the results for the research questions is a fine line. The distinction is important, however. Thus, it is reasonable—in fact, expected—that a Results section include a statement such as "Based on the pattern of data, the statistically significant two-way interaction in the analysis of variance implies that the treatment group improved on our test of knowledge more than the control group." It is not appropriate for the Results section to include a statement such as "The ANOVA demonstrates that the treatment is effective" or, even more extreme, "the ANOVA demonstrates that we should be using our particular educational treatment rather than the other." The first statement is a narrative description of the data interpreted in the context of the statistical analysis. The second statement is an extrapolation of the results to the research question and belongs in the Discussion. The third is an extreme over-interpretation of the results, a highly speculative value judgment about the importance of the outcome variables used in the study relative to the huge number of other variables and factors that must be weighed in any decision to adopt a new educational method (and, at least in the form presented above, should not appear anywhere in the paper). It is the reviewer's responsibility to determine whether the authors have found the appropriate balance of description. If not, areas of concern (too little description or too much interpretation) should be identified in feedback to the authors.

Contextualization of Qualitative Data

Again, there is a parallel issue regarding the narrative presentation of data in qualitative studies. In the process of selecting material from a set of qualitative data (for example, when carving out relevant narrative excerpts from analyzed focus group transcripts), it is important that data not become "disconnected" and void of their original meaning(s). Narrative results, like numeric data, cannot stand on their own. They require descriptions of their origins in the data set, the nature of the analysis conducted, and the implications of the



analysis for the understandings achieved. A good qualitative Results section provides a framework for the selected data to ensure that their original contexts are sufficiently apparent that the reader can judge whether the ensuing interpretation is faithful to and reflects those contexts.

The Use of Tables and Figures

Tables and figures present tradeoffs because they often are the best way to convey complex data, yet they are also generally expensive of a journal's space. This is true for print (that is, paper) journals; but the situation is often different with electronic journals or editions. Most papers are still published in print journals, however. Thus, the reviewer must evaluate whether the tables and figures presented are the most efficient or most elucidating method of presenting the data and whether they are used appropriately and sparingly. If it would be easy to present the data in the text without losing the structure or pattern of interest, this should be the preferred method of presentation. If tables or figures are used, every effort should be made to combine data into only a few. In addition, if data are presented in tables or figures, they should not be repeated in their entirety in the text. Rather, the text should be used to describe the table or figure, highlighting the key elements in the data as they pertain to the relevant research question, hypothesis, or analysis. It is also worth noting that, although somewhat mundane, an important responsibility of the reviewer is to determine whether the data in the tables, the figures, and the text are consistent. If the numbers or descriptions in the text do not match those in the tables or figures, serious concern must be raised about the quality control used in the data analysis and interpretation.

The author gratefully acknowledges the extensive input and feedback for this chapter provided by Dr. Lorelei Lingard.

RESOURCES

American Psychological Association. Publication Manual. 4th ed. Washington, DC: American Psychological Association, 1994.
Harris IB. Qualitative methods. In: Norman GR, van der Vleuten CPM, Newble D (eds). International Handbook for Research in Medical Education. Amsterdam, The Netherlands: Kluwer, 2001.
Henry GT. Graphing Data: Techniques for Display and Analysis. Applied Social Research Methods Series Vol. 36. Thousand Oaks, CA: Sage, 1995.
Regehr G. The experimental tradition. In: Norman GR, van der Vleuten CPM, Newble D (eds). International Handbook for Research in Medical Education. Amsterdam, The Netherlands: Kluwer, 2001.
Tufte ER. The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press, 1983 (1998 printing).

DISCUSSION AND CONCLUSION


Discussion and Conclusion: Interpretation

Sonia J. Crandall and William C. McGaghie

REVIEW CRITERIA
■ The conclusions are clearly stated; key points stand out.
■ The conclusions follow from the design, methods, and results; justification of conclusions is well articulated.
■ Interpretations of the results are appropriate; the conclusions are accurate (not misleading).
■ The study limitations are discussed.
■ Alternative interpretations for the findings are considered.
■ Statistical differences are distinguished from meaningful differences.
■ Personal perspectives or values related to interpretations are discussed.
■ Practical significance or theoretical implications are discussed; guidance for future studies is offered.



ISSUES AND EXAMPLES RELATED TO THE CRITERIA

Research follows a logical process. It starts with a problem statement and moves through design, methods, and results. Researchers' interpretations and conclusions emerge from these four interconnected stages. Flaws in logic can arise at any of these stages and, if they occur, the author's interpretations of the results will be of little consequence. Flaws in logic can also occur at the interpretation stage. The researcher may have a well-designed study but obscure the true meaning of the data by misreading the findings.1

Reviewers need to have a clear picture of the meaning of research results. They should be satisfied that the evidence is discussed adequately and appears reliable, valid, and trustworthy. They should be convinced that interpretations are justified given the strengths and limitations of the study. In addition, given the architecture, operations, and limitations of the study, reviewers should judge the generalizability and practical significance of its conclusions.

The organization of the Discussion section should match the structure of the Results section in order to present a coherent interpretation of data and methods. Reviewers need to determine how the discussion and conclusions relate to the original problem and research questions. Most important, the conclusions must be clearly stated and justified, illustrating key points. Broadly, important aspects to consider include whether the conclusions are reasonable based on the description of the results; how the study results relate to other research outcomes in the field, including consensus, conflicting, and unexpected findings; how the study outcomes expand the knowledge base in the field and inform future research; and whether limitations in the design, procedures, and analyses of the study are described. Failure to discuss the limitations of the study should be considered a serious flaw.

On a more detailed level, reviewers must evaluate whether the authors distinguish between (1) inferences drawn from the results, which are based on data-analysis procedures, and (2) extrapolations to the conceptual framework used to design the study. This is the difference between formal hypothesis testing and theoretical discussion.

Quantitative Approaches

From the quantitative perspective, when interpreting hypothesis-testing aspects of a study, authors should discuss the meaning of both statistically significant and non-significant results. A statistically significant result, given its p-value and confidence interval, may have no implications for practice.2 Authors should explain whether each hypothesis is confirmed or refuted and whether each agrees or conflicts with previous research. Results or analyses should not be discussed unless they are presented in the Results section.

Data may be misrepresented or misinterpreted, but more often errors come from over-interpreting the data from a theoretical perspective. For example, a reviewer may see a statement such as "The sizeable correlation between test scores and 'depth of processing' measures clearly demonstrates that the curriculum should be altered to encourage students to process information more deeply." The curricular implication may be true but it is not supported by data. Although the data show that encouraging an increased depth of processing improves test scores, this outcome does not demonstrate the need to change curriculum. The intent to change the curriculum is a value statement based on a judgment about the utility of high test scores and their implications for professional performance. Curricular change is not implied directly from the connection between test scores and professional performance.

The language used in the Discussion needs to be clear and precise. For example, in research based on a correlation design, the Discussion needs to state whether the correlations derive from data collected concurrently or over a span of time.3 Correlations over time suggest a predictive relationship among variables, which may or may not reflect the investigator's intentions. The language used to discuss such an outcome must be unambiguous.

Qualitative Approaches

Qualitative researchers must convince the reviewer that their data are trustworthy. To describe the trustworthiness of the collected data, the author may use criteria such as credibility (internal validity) and transferability (external validity) and explain how each was addressed.4 (See Giacomini and Cook, for example, for a thorough explanation of assessing validity in qualitative health care research.5) Credibility may be determined through data triangulation, member checking, and peer debriefing.4,6 Triangulation compares multiple data sources, such as a content analysis of curriculum documents, transcribed interviews with students and the faculty, patient satisfaction questionnaires, and observations of standardized patient examinations. Member checking is a process of "testing" interpretations and conclusions with the individuals from whom the data were collected (interviews).4 Peer debriefing is an "external check on the inquiry process" using disinterested peers who parallel the analytic procedures of the researcher to confirm or expand interpretations and conclusions.4 Transferability implies that research findings can be used in other educational contexts (generalizability).6,7 The researcher cannot, however, establish external validity in the same way as in quantitative research.4 The



reviewer must judge whether the conclusions transfer to other contexts.

Biases

Both qualitative and quantitative data are subject to bias. When judging qualitative research, reviewers should carefully consider the meaning and impact of the author's personal perspectives and values. These potential biases should be clearly explained because of their likely influence on the analysis and presentation of outcomes. Those biases include the influence of the researcher on the study setting, the selective presentation and interpretation of results, and the thoroughness and integrity of the interpretations. Peshkin's work is a good example of announcing one's subjectivity and its potential influence on the research process.8 He and other qualitative researchers acknowledge their responsibility to explain how their values may affect research outcomes. Reviewers of qualitative research need to be convinced that the influence of subjectivity has been addressed.6

REFERENCES

1. Day RA. How to Write and Publish a Scientific Paper. 5th ed. Phoenix, AZ: Oryx Press, 1998.
2. Rosenfeld RM. The seven habits of highly effective data users [editorial]. Otolaryngol Head Neck Surg. 1998;118:144–58.
3. Fraenkel JR, Wallen NE. How to Design and Evaluate Research in Education. 4th ed. Boston, MA: McGraw–Hill Higher Education, 2000.
4. Lincoln YS, Guba EG. Naturalistic Inquiry. Newbury Park, CA: Sage, 1985 [chapter 11].
5. Giacomini MK, Cook DJ. Users' guide to the medical literature. XXIII. Qualitative research in health care. A. Are the results of the study valid? JAMA. 2000;284:357–62.
6. Grbich C. Qualitative Research in Health. London, U.K.: Sage, 1999.
7. Erlandson DA, Harris EL, Skipper BL, Allen SD. Doing Naturalistic Inquiry: A Guide to Methods. Newbury Park, CA: Sage, 1993.
8. Peshkin A. The Color of Strangers, the Color of Friends. Chicago, IL: University of Chicago Press, 1991.

RESOURCES

Day RA. How to Write and Publish a Scientific Paper. 5th ed. Phoenix, AZ: Oryx Press, 1998 [chapter 10].
Erlandson DA, Harris EL, Skipper BL, Allen SD. Doing Naturalistic Inquiry: A Guide to Methods. Newbury Park, CA: Sage, 1993.
Fraenkel JR, Wallen NE. How to Design and Evaluate Research in Education. 4th ed. Boston, MA: McGraw–Hill Higher Education, 2000 [chapters 19, 20].
Gehlbach SH. Interpreting the Medical Literature. 3rd ed. New York: McGraw–Hill, 1992.
Guiding Principles for Mathematics and Science Education Research Methods: Report of a Workshop. Draft. Workshop on Education Research Methods, Division of Research, Evaluation and Communication, National Science Foundation, November 19–20, 1998, Ballston, VA. Symposium presented at the meeting of the American Education Research Association, April 21, 1999, Montreal, Quebec, Canada. 〈http://bear.berkeley.edu/publications/report11.html〉. Accessed 5/1/01.
Huth EJ. Writing and Publishing in Medicine. 3rd ed. Baltimore, MD: Williams & Wilkins, 1999.
Lincoln YS, Guba EG. Naturalistic Inquiry. Newbury Park, CA: Sage Publications, 1985 [chapter 11].
Miller WL, Crabtree BF. Clinical research. In: Denzin NK, Lincoln YS (eds). Handbook of Qualitative Research. Thousand Oaks, CA: Sage, 1994:340–53.
Patton MQ. Qualitative Evaluation and Research Methods. 2nd ed. Newbury Park, CA: Sage, 1990.
Peshkin A. The goodness of qualitative research. Educ Res. 1993;22:23–9.
Riegelman RK, Hirsch RP. Studying a Study and Testing a Test: How to Read the Health Science Literature. 3rd ed. Boston, MA: Little, Brown, 1996.
Teaching/Learning Resources for Evidence Based Practice. Middlesex University, London, U.K. 〈http://www.mdx.ac.uk/www/rctsh/ebp/main.htm〉. Accessed 5/1/01.
Users' Guides to Evidence-Based Practice. Centres for Health Evidence [Canada]. 〈http://www.cche.net/principles/content_all.asp〉. Accessed 5/1/01.



TITLE, AUTHORS, AND ABSTRACT

Title, Authors, and Abstract

Georges Bordage and William C. McGaghie

REVIEW CRITERIA
■ The title is clear and informative.
■ The title is representative of the content and breadth of the study (not misleading).
■ The title captures the importance of the study and the attention of the reader.
■ The number of authors appears to be appropriate given the study.
■ The abstract is complete (thorough); essential details are presented.
■ The results in the abstract are presented in sufficient and specific detail.
■ The conclusions in the abstract are justified by the information in the abstract and the text.
■ There are no inconsistencies in detail between the abstract and the text.
■ All of the information in the abstract is present in the text.
■ The abstract overall is congruent with the text; the abstract gives the same impression as the text.

ISSUES AND EXAMPLES RELATED TO THE CRITERIA

When a manuscript arrives, the reviewer immediately sees the title and the abstract, and in some instances—depending on the policy of the journal—the name of the authors. This triad of title, authors, and abstract is both the beginning and the end of the review process. It orients the reviewer, but it can be fully judged only after the manuscript is analyzed thoroughly.

Title

The title can be viewed as the shortest possible abstract. Consequently, it needs to be clear and concise while accurately reflecting the content and breadth of the study. As one of the first "outside" readers of the manuscript, the reviewer can judge if the title is too general or misleading, whether it lends appropriate importance to the study, and if it grabs the reader's attention.

The title of an article must have appeal because it prompts the reader's decision to study the report. A clear and informative title orients the readers and reviewers to relevant information. Huth1 describes two key qualities of titles, "indicative" and "informative." The indicative aspect of the title tells the reader about the nature of the study, while the informative aspect presents the message derived from the study results. To illustrate, consider the following title: "A Survey of Academic Advancement in Divisions of General Internal Medicine." This title tells the readers what was done (i.e., it is indicative) but fails to convey a message (i.e., it is not informative). A more informative title would read "A Survey of Academic Advancement in Divisions of General Internal Medicine: Slower Rate and More Barriers for Women." The subtitle now conveys the message while still being concise.

Authorship

Reviewers are not responsible for setting criteria for authorship. This is a responsibility of editors and their editorial boards. When authors are revealed to the reviewer, however, the reviewer can help detect possible "authorship inflation" (too many authors) or "ghost authors" (too few true authors).

The Uniform Requirements for Manuscripts Submitted to Biomedical Journals2 covers a broad range of issues and contains perhaps the most influential single definition of authorship, which is that



    Each author should have participated sufficiently in the work to take public responsibility for the content. Authorship credit should be based only on substantial contributions to (a) conception and design, or analysis and interpretation of data; and to (b) drafting the article or revising it critically for important intellectual content; and on (c) final approval of the version to be published. Conditions (a), (b), and (c) must all be met.

Furthermore, "Any part of an article critical to its main conclusions must be the responsibility of at least one author," that is, a manuscript should not contain any statement or content for which none of the authors can take responsibility. More than 500 biomedical journals have voluntarily allied themselves with the Uniform Requirements standards, although not all of them accept this strict definition of authorship. Instead, they use different numbers of authors and/or combinations of the conditions for their definitions. Also, different research communities have different traditions of authorship, some of which run counter to the Uniform Requirements definition.

The number of authors per manuscript has increased steadily over the years, both in medical education and in clinical research. Dimitroff and Davis report that the number of articles with four or more authors in medical education is increasing faster than the number of papers with fewer authors.3 Comparing numbers in 1975 with those in 1998, Drenth found that the mean number of authors of original articles in the British Medical Journal steadily increased from 3.21 (SD = 1.89) to 4.46 (SD = 2.04), a 1.4-fold jump.4 While having more authors is likely to be an indication of the increased number of people involved in research activities, it could also signal inflation in the number of authors to build team members' curricula vitae for promotion. From an editorial standpoint, this is "unauthorized" authorship.

More and more journals are publishing their specific criteria for authorship to help authors decide who should be included in the list of authors. Some journals also require each author to complete and sign a statement of authorship indicating their significant contributions to the manuscript. For example, the Annals of Internal Medicine offers a list of contribution codes that range from conception and design of the study to obtaining funds or collecting and assembling data, as well as a space for "other contributions." The contribution codes and signed statement are a sound reminder and acknowledgement for authors and a means for editors to judge eligibility of authorship.

Huth argues that certain conditions alone do not justify authorship. These conditions include acquiring funds, collecting data, administering the project, or proofreading or editing manuscript drafts for style and presentation, not ideas.5,6 Under these conditions, doing data processing without statistical conceptualization is insufficient to qualify for authorship. Such contributions can be recognized in a footnote or in an acknowledgement. Other limited or indirect contributions include providing subjects, participating in a pilot study, or providing materials or research space.7 Finally, some so-called "contributions" are honorary, such as crediting department chairpersons, division chiefs, laboratory directors, or senior faculty members for pro forma involvement in creative work.8

Conversely, no person involved significantly in the study should be omitted as an author. Flanagin et al.8 found that 11% of articles in three large-circulation general medicine journals in 1996 had "ghost authors," individuals who were not named as authors but who had contributed substantially to the work. A reviewer may suspect ghost authorship when reviewing a single-authored manuscript reporting a complex study.

When authors' names are revealed on a manuscript, reviewers should indicate to the editor any suspicion about there being too many or too few authors.

Abstracts

Medical journals began to include abstracts with articles in the late 1960s. Twenty years later an ad hoc working group proposed "more informative abstracts" (MIAs) based on published criteria for the critical appraisal of the medical literature.9 The goals of the MIAs were threefold: "(1) assist readers to select appropriate articles more quickly, (2) allow more precise computerized literature searches, and (3) facilitate peer review before publication." The group proposed a 250-word, seven-part abstract written in point form (versus narrative). The original seven parts were soon increased to eight10,11: objective (the exact question(s) addressed by the article), design (the basic design of the study), setting (the location and level of clinical care [or education]), patients or participants (the manner of selection and numbers of patients or participants who entered and completed the study), interventions (the exact treatment or intervention, if any), main outcome measures (the primary study outcome measure), results (key findings), and conclusions (key conclusions including direct clinical [or educational] applications).

The working group's proposal was published in the Annals of Internal Medicine and was called by Annals editor Edward Huth the "structured abstract."12 Most of the world's leading clinical journals followed suit. Journal editors anticipated that giving reviewers a clear summary of salient features of a manuscript as they begin their review would facilitate the review process. The structured abstract provides the reviewer with an immediate and overall sense of the reported study right from the start of the review process. The "big picture" offered by the structured abstract helps reviewers frame their analysis.



The notion of MIAs, or structured abstracts, was soon extended to include review articles.13 The proposed format of the structured abstract for review articles contained six parts: purpose (the primary objective of the review article), data identification (a succinct summary of data sources), study selection (the number of studies selected for review and how they were chosen), data extraction (the type of guidelines used for abstracting data and how they were applied), results of data synthesis (the methods of data analysis and key results), and conclusions (key conclusions, including potential applications and research needs).

While there is evidence that MIAs do provide more information,14,15 some investigators found that substantial amounts of information expected in the abstract were still missing even when that information was present in the text.16 A study by Pitkin and Branagan showed that specific instructions to authors about three types of common defects in abstracts—inconsistencies between abstract and text, information present in the abstract but not in the text, and conclusions not justified by the information in the abstract—were ineffective in lowering the rate of defects.17 Thus reviewers must be especially attentive to such defects.

REFERENCES

1. Huth EJ. Types of titles. In: Writing and Publishing in Medicine. 3rd ed. Baltimore, MD: Williams & Wilkins, 1999:131–2.
2. International Committee of Medical Journal Editors. Uniform requirements for manuscripts submitted to biomedical journals. 5th ed. JAMA. 1997;277:927–34. 〈http://jama.ama-assn.org/info/auinst〉. Accessed 5/23/01.
3. Dimitroff A, Davis WK. Content analysis of research in undergraduate education. Acad Med. 1996;71:60–7.
4. Drenth JPH. Multiple authorship. The contribution of senior authors. JAMA. 1998;280:219–21.
5. Huth EJ. Chapter 4, Preparing to write: materials and tools; Appendix A, Guidelines on authorship; and Appendix B, The "uniform requirements" document: an abridged version. In: Writing and Publishing in Medicine. 3rd ed. Baltimore, MD: Williams & Wilkins, 1999:41–4, 293–6, 297–9.
6. Huth EJ. Guidelines on authorship of medical papers. Ann Intern Med. 1986;104:269–74.
7. Hoen WP, Walvoort HC, Overbeke JPM. What are the factors determining authorship and the order of the authors' names? JAMA. 1998;280:217–8.
8. Flanagin A, Carey LA, Fontanarosa PB, et al. Prevalence of articles with honorary authors and ghost authors in peer-reviewed medical journals. JAMA. 1998;280:222–4.
9. Ad Hoc Working Group for Critical Appraisal of the Medical Literature. A proposal for more informative abstracts of clinical articles. Ann Intern Med. 1987;106:598–604.
10. Altman DG, Gardner MJ. More informative abstracts (letter). Ann Intern Med. 1987;107:790–1.
11. Haynes RB, Mulrow CD, Huth EJ, Altman DG, Gardner MJ. More informative abstracts revisited. Ann Intern Med. 1990;113:69–76.
12. Huth EJ. Structured abstracts for papers reporting clinical trials. Ann Intern Med. 1987;106:626–7.
13. Mulrow CD, Thacker SB, Pugh JA. A proposal for more informative abstracts of review articles. Ann Intern Med. 1988;108:613–5.
14. Comans ML, Overbeke AJ. The structured summary: a tool for reader and author. Ned Tijdschr Geneeskd. 1990;134:2338–43.
15. Taddio A, Pain T, Fassos FF, Boon H, Ilersich AL, Einarson TR. Quality of nonstructured and structured abstracts of original research articles in the British Medical Journal, the Canadian Medical Association Journal and the Journal of the American Medical Association. Can Med Assoc J. 1994;150:1611–4.
16. Froom P, Froom J. Deficiencies in structured medical abstracts. J Clin Epidemiol. 1993;46:591–4.
17. Pitkin RM, Branagan MA. Can the accuracy of abstracts be improved by providing specific instructions? A randomized controlled trial. JAMA. 1998;280:267–9.

RESOURCES

American College of Physicians. Resources for Authors—Information for authors: Annals of Internal Medicine. 〈http://www.acponline.org/journals/resource/info4aut.htm〉. Accessed 9/27/00.
Fye WB. Medical authorship: traditions, trends, and tribulations. Ann Intern Med. 1990;113:317–25.
Godlee F. Definition of authorship may be changed. BMJ. 1996;312:1501–2.
Huth EJ. Writing and Publishing in Medicine. 3rd ed. Baltimore, MD: Williams & Wilkins, 1999.
Lundberg GD, Glass RM. What does authorship mean in a peer-reviewed medical journal? [editorial]. JAMA. 1996;276:75.
National Research Press. Part 4: Responsibilities. In: Publication Policy. 〈http://www.monographs.nrc.ca/cgi-bin/cisti/journals/rp/rp2_cust_e?pubpolicy〉. Accessed 6/5/01.
Pitkin RM, Branagan MA, Burmeister LF. Accuracy of data in abstracts of published research articles. JAMA. 1999;281:1110–1.
Rennie D, Yank V, Emanuel L. When authorship fails. A proposal to make contributors accountable. JAMA. 1997;278:579–85.
Shapiro DW, Wenger NS, Shapiro MF. The contributions of authors to multiauthored biomedical research papers. JAMA. 1994;271:438–42.
Slone RM. Coauthors' contributions to major papers published in the AJR: frequency of undeserved coauthorship. Am J Roentgenol. 1996;167:571–9.
Smith J. Gift authorship: a poisoned chalice? Not usually, but it devalues the coinage of scientific publication. BMJ. 1994;309:1456–7.



OTHER

Presentation and Documentation

Gary Penn, Ann Steinecke, and Judy A. Shea

REVIEW CRITERIA
■ The text is well written and easy to follow.
■ The vocabulary is appropriate.
■ The content is complete and fully congruent.
■ The manuscript is well organized.
■ The data reported are accurate (e.g., numbers add up) and appropriate; tables and figures are used effectively and agree with the text.
■ Reference citations are complete and accurate.

ISSUES AND EXAMPLES RELATED TO THE CRITERIA

Presentation refers to the clarity and effectiveness with which authors communicate their ideas. In addition to evaluating how well the researchers have constructed their study, collected their data, and interpreted important patterns in the information, reviewers need to evaluate whether the authors have successfully communicated all of these elements. Ensuring that ideas are properly presented, then, is the reviewer's final consideration when assessing papers for publication.

Clear, effective communication takes different forms. Straight prose is the most common; carefully chosen words, sentences, and paragraphs convey as much or as little detail as necessary. The writing should not be complicated by inappropriate vocabulary such as excessive jargon; inaccurately used words; undefined acronyms; or new, controversial, or evolving vocabulary. Special terms should be defined, and the vocabulary chosen for the study and presentation should be used consistently. Clarity is also a function of a manuscript's organization. In addition to following a required format, such as IMRaD, a manuscript's internal organization (sentences and paragraphs) should follow a logical progression that supports the topic. All information contained in the text should be clearly related to the topic.

In addition to assessing the clarity of the prose, reviewers should be prepared to evaluate graphic representations of information—tables, lists, and figures. When well done, they present complex information efficiently, and they reveal ideas that would take too many words to tell. Tables, lists, and figures should not simply repeat information that is given in the text; nor should they introduce data that are not accounted for in the Method section or contradict information given in the text.

Whatever form the presentation of information takes, the reviewer should be able to grasp the substance of the communication without having to work any harder than necessary. Of course, some ideas are quite complex and require both intricate explanation and great effort to comprehend, but too often simple ideas are dressed up in complicated language without good reason. The reviewer needs to consider how well the author has matched the level of communication to the complexity of the substance in his or her presentation.

Poor presentation may, in fact, directly reflect poor content. When the description of the method of a study is incomprehensible to the reviewer, it may hint at the researcher's own confusion about the elements of his or her study. Jargon-filled conclusions may reflect a researcher's inability to apply his or her data to the real world. This is not always true, however; some excellent researchers are simply unable to transfer their thoughts to paper without assistance. Sorting these latter authors from the former is a daunting task, but the reviewer should combine a consideration of the presentation of the study with his or her evaluation of the methodologic and interpretive elements of the paper.

The reviewer's evaluation of the presentation of the manuscript should also extend to the presentation of references. Proper documentation ensures that the source of material cited in the manuscript is accurately and fully acknowledged. Further, accurate documentation allows readers to quickly retrieve the referenced material. And finally, proper documentation allows for citation analysis, a count of the times a published article is cited in subsequent articles. Journals describe their documentation formats in their instructions to authors, and the Uniform Requirements for Manuscripts Submitted to Biomedical Journals details suggested formats. Reviewers should not concern themselves with the specific details of a reference list's format; instead, they should look to see whether the documentation appears to provide complete and up-to-date information about all the material cited in the text (e.g., author's name, title, journal, date, volume, page number). Technologic advances in the presentation of information have meant the creation of citation formats for a wide variety of media, so reviewers can expect there to be documentation for any type of material presented in the text.

The extent to which a reviewer must judge presentation depends on the journal. Some journals (e.g., Academic Medicine) employ editors who work closely with authors to clearly shape text and tables; reviewers, then, can concentrate on the substance of the study. Other journals publish articles pretty much as authors have submitted them; in those cases, the reviewers' burden is greater. Reviewers may not be expected to edit the papers, but their comments can help authors revise any presentation problems before final acceptance.

Because ideas are necessarily communicated through words and pictures, presentation and substance often seem to overlap. As much as possible, the substantive aspects of the criteria for this section are covered in other sections of this guide.

RESOURCES

Becker HS, Richards P. Writing for Social Scientists: How to Start and Finish Your Thesis, Book, or Article. Chicago, IL: University of Chicago Press, 1986.
Browner WS. Publishing and Presenting Clinical Research. Baltimore, MD: Lippincott, Williams & Wilkins, 1999.
Day RA. How to Write and Publish a Scientific Paper. 4th ed. Phoenix, AZ: Oryx Press, 1994.
Day RA. Scientific English: A Guide for Scientists and Other Professionals. Phoenix, AZ: Oryx Press, 1992.
Fishbein M. Medical Writing: The Technic and the Art. 4th ed. Springfield, IL: Charles C Thomas, 1972.
Hall GM. How to Write a Paper. London, U.K.: BMJ Publishing Group, 1994.
Howard VA, Barton JH. Thinking on Paper: Refine, Express, and Actually Generate Ideas by Understanding the Processes of the Mind. New York: William Morrow and Company, 1986.
International Committee of Medical Journal Editors. Uniform Requirements for Manuscripts Submitted to Biomedical Journals. Ann Intern Med. 1997;126:36–47; 〈www.acponline.org/journals/annals/01janr97/unifreq〉 (updated May 1999).
Kirkman J. Good Style: Writing for Science and Technology. London, U.K.: E & FN Spon, 1997.
Matkin RE, Riggar TF. Persist and Publish: Helpful Hints for Academic Writing and Publishing. Niwot, CO: University Press of Colorado, 1991.
Morgan P. An Insider's Guide for Medical Authors and Editors. Philadelphia, PA: ISI Press, 1986.
Sheen AP. Breathing Life into Medical Writing: A Handbook. St. Louis, MO: C. V. Mosby, 1982.
Tornquist EM. From Proposal to Publication: An Informal Guide to Writing about Nursing Research. Menlo Park, CA: Addison–Wesley, 1986.
Tufte ER. Envisioning Information. Cheshire, CT: Graphics Press, 1990.
Tufte ER. The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press, 1983.
Tufte ER. Visual Explanations. Cheshire, CT: Graphics Press, 1997.
Zeiger M. Essentials of Writing Biomedical Research Papers. 2nd ed. New York: McGraw–Hill, 1999.

ACADEMIC MEDICINE, VOL. 76, NO. 9 / SEPTEMBER 2001 949


Scientific Conduct

Louis Pangaro and William C. McGaghie

REVIEW CRITERIA
■ There are no instances of plagiarism.
■ Ideas and materials of others are correctly attributed.
■ Prior publication by the author(s) of substantial portions of the data or study is appropriately acknowledged.
■ There is no apparent conflict of interest.
■ There is an explicit statement of approval by an institutional review board (IRB) for studies directly involving human subjects or data about them.

ISSUES AND EXAMPLES RELATED TO THE CRITERIA

Reviewers provide an essential service to editors, journals, and society by identifying issues of ethical conduct that are implicit in manuscripts.1 Concerns for reviewers to consider include issues of "authorship" (defining who is responsible for the material in the manuscript—see "Title, Authors, and Abstract" earlier in this chapter), plagiarism (attributing others' words or ideas to oneself), lack of correct attribution of ideas and insights (even if not attributing them to oneself), falsifying data, misrepresenting publication status,2 and deliberate, inappropriate omission of important prior research. Because authors are prone to honest omissions in their reviews of prior literature, or in their awareness of others' work, reviewers may also be useful by pointing out missing citations and attributions. It is not unusual for authors to cite their own work in a manuscript's list of references, and it is the reviewer's responsibility to determine the extent and appropriateness of these citations (see "Reference to the Literature and Documentation" earlier). Multiple publication of substantially the same studies and data is a more vexing issue. Reviewers cannot usually tell whether parts of the study under review have already been published, or detect when part or all of the study is also "in press" with another journal. Some reviewers try to do a "search" on the topic of a manuscript and, when authorship is not masked, on the authors themselves. This may detect prior or duplicate publication and also aid in a general review of citations.

Finally, reviewers should be alert to authors' suppression of negative results. A negative study, one with conclusions that do not ultimately confirm the study's hypothesis (or that fail to reject the "null hypothesis"), may be quite valuable if the research question was important and the study design was rigorous. Such a study merits, and perhaps even requires, publication, and reviewers should not quickly dismiss such a paper without full consideration of the study's relevance and its methods.3 Yet authors may not have the confidence to include results that do not support the hypothesis. Reviewers should be alert to this fear about negative results and read carefully to detect the omission of data that would be expected. (It is important to note that nowhere in this document of guidance for reviewers is there a criterion that labels a "negative study" as flawed because it lacks a "positive" conclusion.)

Reviewers should be alert to several possible kinds of conflict of interest. The most familiar is material gain for the author from specific outcomes of a study. In their scrutiny of methods (as covered in all articles in the "Method" section of this chapter), reviewers safeguard the integrity of research, but financial interest in an educational project may not be apparent. Reviewers should look for an explicit statement concerning financial interest when any marketable product (such as a CD-ROM or software program) either is used or is the subject of investigation. Such an "interest" does not preclude publication, but the reviewer should expect a clear statement either that there is no commercial interest or of how such a conflict of interest has been handled.

Recently, regulations for the protection of human subjects have been interpreted as applying to areas of research at universities and academic medical centers to which they had not been applied before.4 For instance, studying a new educational experience with a "clinical research" model that uses an appropriate control group might reveal that one of the two groups had had a less valuable educational experience. Hence, informed consent and other protections would be the expected standard for participation, as approved by



an IRB.5 In qualitative research, structured qualitative interviews could place a subject at risk if unpopular opinions could be attributed to the individual. Here again, an ethical and legal responsibility must be met by the researchers. We should anticipate that medical education research journals (and perhaps health professions journals also) will require statements about IRB approval in all research papers.

In summary, manuscripts should meet standards of ethical behavior, both in the process of publication and in the conduct of research. Any field that involves human subjects—particularly fields in the health professions—should meet the ethical standards for such research, including the new requirements for education research. Therefore, reviewers fulfill an essential function in maintaining the integrity of academic publications.

REFERENCES

1. Caelleigh A. Role of the journal editor in sustaining integrity in research. Acad Med. 1993;68(9 suppl):S23–S29.
2. LaFollette MC. Stealing Into Print: Fraud, Plagiarism, and Misconduct in Scientific Publishing. Berkeley, CA: University of California Press, 1992.
3. Chalmers I. Underreporting research is scientific misconduct. JAMA. 1990;263:1405–6.
4. Code of Federal Regulations, Title 45, Public Welfare, Part 46—Protection of Human Subjects, Department of Health and Human Services. <http://www.etsu.edu/ospa/exempt2.htm>. Accessed 4/1/00.
5. Casarett D, Karlawish J, Sugarman J. Should patients in quality improvement activities have the same protections as participants in research studies? JAMA. 2000;284:1786–8.

RESOURCES

The Belmont Report [1976]. <http://ddonline.gsm.com/demo/consult/belm_int.htm>. Accessed 5/23/01.
Committee on Publication Ethics. The COPE Report 1998. <http://www.bmj.com/misc/cope/tex1.shtml>. Accessed 5/9/01.
Committee on Publication Ethics. The COPE Report 2000. <http://www.bmjpg.com/publicationethics/cope/cope.htm>. Accessed 5/9/01.
Council of Biology Editors. Ethics and Policy in Scientific Publication. Bethesda, MD: Council of Biology Editors, 1990.
Council for International Organizations of Medical Sciences (CIOMS). International Guidelines for Ethical Review of Epidemiological Studies. Geneva, 1991. In: King NMP, Henderson GE, Stein J (eds). Beyond Regulations: Ethics in Human Subjects Research. Chapel Hill, NC: University of North Carolina Press, 1999.
The Hastings Center's Bibliography of Ethics, Biomedicine, and Professional Responsibility. Frederick, MD: University Publications of America in Association with the Hastings Center, 1984.
Henry RC, Wright DE. When are medical students considered subjects in program evaluation? Acad Med. 2001;76:871–5.
National Research Press. Part 4: Responsibilities. In: Publication Policy. <http://www.monographs.nrc.ca/cgi-bin/cisti/journals/rp/rp2_cust_e?pubpolicy>. Accessed 6/5/01.
Roberts LW, Geppert C, Connor R, Nguyen K, Warner TD. An invitation for medical educators to focus on ethical and policy issues in research and scholarly practice. Acad Med. 2001;76:876–85.

