
ORIGINAL RESEARCH ARTICLE

NEUROINFORMATICS
published: 06 September 2011
doi: 10.3389/fninf.2011.00017

The cognitive atlas: toward a knowledge foundation for cognitive neuroscience

Russell A. Poldrack 1*, Aniket Kittur 2, Donald Kalar 3, Eric Miller 4, Christian Seppa 4, Yolanda Gil 5, D. Stott Parker 6, Fred W. Sabb 7 and Robert M. Bilder 7

1 Imaging Research Center and Departments of Psychology and Neurobiology, University of Texas, Austin, TX, USA
2 Human–Computer Interaction Institute, Carnegie Mellon University, Pittsburgh, PA, USA
3 National Aeronautics and Space Administration, Ames Research Center, Mountain View, CA, USA
4 Interactive Design, Squishymedia, Portland, OR, USA
5 Information Sciences Institute, University of Southern California, Marina Del Rey, CA, USA
6 Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA
7 Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, USA

Cognitive neuroscience aims to map mental processes onto brain function, which begs the question of what “mental processes” exist and how they relate to the tasks that are used to manipulate and measure them. This topic has been addressed informally in prior work, but we propose that cumulative progress in cognitive neuroscience requires a more systematic approach to representing the mental entities that are being mapped to brain function and the tasks used to manipulate and measure mental processes. We describe a new open collaborative project that aims to provide a knowledge base for cognitive neuroscience, called the Cognitive Atlas (accessible online at http://www.cognitiveatlas.org), and outline how this project has the potential to drive novel discoveries about both mind and brain.

Keywords: ontology, informatics, neuroimaging, cognitive science

Edited by:
Daniel Gardner, Weill Cornell Medical College, USA

Reviewed by:
Mihail Bota, University of Southern California, USA
Neil R. Smalheiser, University of Illinois–Chicago, USA
Stephen C. Strother, University of Toronto, Canada

*Correspondence:
Russell A. Poldrack, Imaging Research Center and Departments of Psychology and Neurobiology, University of Texas, 3925-B W. Braker Lane, Austin, TX 78759, USA.
e-mail: [email protected]

“We’re drowning in information and starving for knowledge” – Rutherford B. Rogers

The field of cognitive neuroscience faces an increasingly critical challenge: How can we integrate knowledge from an exploding number of studies across multiple methodologies in order to characterize how mental processes are implemented in the brain? The creation of neuroimaging databases containing data from large numbers of studies has provided the basis for powerful meta-analyses (Laird et al., 2005). However, the semantic infrastructure for characterizing the psychological aspects of these studies has lagged far behind the technical infrastructure for databasing and analyzing the imaging results. We propose that cumulative progress in cognitive neuroscience requires such a semantic infrastructure, and that this problem must be addressed through the development of knowledge bases of mental processes (Price and Friston, 2005; Bilder et al., 2009). Here we outline a new project called the Cognitive Atlas (CA)1 that aims to develop such a framework through collaborative social knowledge building.

1 By “cognitive” we mean to refer to mental processes very broadly, which we take to include domains such as emotion or motivation that have historically been distinguished from cognition.

WHAT IS THE PROBLEM?
The cognitive neuroscientist wishes to answer questions such as: “What are the neural substrates of working memory?” According to PubMed, as of February 2011 there were 2613 published research papers that mentioned “working memory” along with either functional magnetic resonance imaging (fMRI), positron emission tomography (PET), EEG/ERP, or lesion analysis. Despite this substantial body of published research, it remains difficult to integrate across this work in order to understand the concept of “working memory” and how it relates to brain function, for two major reasons: ambiguous terminology and confounding of cognitive processes with the tasks used to measure these.

AMBIGUOUS TERMINOLOGY
There is substantial ambiguity in the way that terms are used in cognitive neuroscience. On the one hand, many terms are used to denote multiple, potentially distinct processes. For example, the term “working memory” has several distinct definitions in the neuroscience literature:

• holding information online in memory, as used by Goldman-Rakic (1995) and measured in non-human primates using tasks such as the oculomotor delayed response task
• manipulating information held in memory, as used by Baddeley (1992) and measured in humans using tasks such as the letter–number sequencing task
• memory for temporally varying aspects of a task (roughly equivalent to the concept of episodic memory), as used by Olton et al. (1979) and measured in rodents using a radial arm maze task with varied food locations

Frontiers in Neuroinformatics www.frontiersin.org September 2011 | Volume 5 | Article 17 | 1



Thus, searching for “working memory” may retrieve papers that are relevant to a range of different specific psychological processes. On the other hand, many processes are described in the literature using several different terms. For example, the first sense of “working memory” listed above is often described using other terms including “short-term memory” or “active maintenance.” For this reason, searches that only include “working memory” will fail to retrieve papers that use those other terms unless the search is expanded to include those other terms, whereas query expansions that do include those other terms may yield unacceptably high numbers of irrelevant documents.

TASKS VERSUS CONSTRUCTS
There is a longstanding tendency within the cognitive neuroscience literature to equate tasks with mental constructs. For example, the “Sternberg item recognition task” is often referred to as the “Sternberg working memory task,” which implies that it measures a specific mental construct (“working memory”). This conflation of tasks and constructs causes a number of difficulties. First, the measurement of a psychological construct requires a comparison between specific task conditions (Sternberg, 1969); thus, whereas the contrast of particular conditions within the Sternberg task (e.g., high load versus low load) may indeed be associated with the construct of working memory, other contrasts may not (e.g., probe match versus probe mismatch). Second, any link between tasks and constructs reflects a particular theory about how the task is performed; thus, equating tasks with constructs makes theoretical assumptions that may not be shared throughout the community (and further, those community assumptions may be incorrect). For example, the color–word Stroop task is sometimes referred to as an “inhibition task” (Donohoe et al., 2006). However, the role of an active inhibitory process in producing the Stroop effect has been questioned by a number of investigators (Cohen et al., 1990). Similarly, while the N-back task is often referred to as the “N-back working memory task,” serious questions have been raised regarding whether it truly measures the construct of “working memory” (Kane et al., 2007). The equation of tasks and processes thus produces substantial confusion about what is actually being measured by cognitive neuroscience studies.

One problem that arises from this is that a single task is often associated with multiple constructs in the literature. As an example, Sabb et al. (2008) used literature mining tools to examine the published literature related to the construct of “cognitive control.” They found that this construct was associated with a number of other constructs (including “working memory,” “response inhibition,” “response selection,” and “task/set switching”), and that there were no tasks that were uniquely associated with the construct of “cognitive control” in the literature; each task was also associated with at least one of those other constructs. Further, the association in the literature between these tasks and constructs changed over time. This lack of consistency in the way that tasks and concepts are treated in the literature makes it difficult to draw meaningful inferences from existing literature and limits the cumulative value of the knowledge represented in this literature.

TOWARD AN ONTOLOGY FOR COGNITION
What is urgently needed is an informatics resource that can solve the problems listed above. Such a resource would provide the ability to identify the particular usage of terms, to allow browsing for related concepts, and to allow the identification of relevant evidence from the literature that is related to these concepts. This would allow intelligent aggregation of research findings, which could help overcome the information overload that currently afflicts researchers. We propose that this challenge can be best addressed through the development and widespread implementation of an ontology for cognitive neuroscience. In philosophy, “ontology” refers to the study of existence or being. However, in bioinformatics the term is increasingly used in the sense defined by Gruber (1993) as an “explicit specification of a conceptualization,” or a structured knowledge base meant to support the sharing of knowledge as well as automated reasoning about that knowledge. Ontologies have also provided the basis for effective knowledge accumulation in molecular biology and genomics (Bard and Rhee, 2004). One of the best known examples is the Gene Ontology2 (GO; Ashburner et al., 2000). This ontology provides consistent descriptors for gene products, including cellular components (e.g., “ribosome”), biological processes (e.g., “signal transduction”), and molecular functions (e.g., “catalytic activity”). GO provides the basis on which to annotate datasets regarding their function, which prevents the common problem of different researchers using different names to describe the same biological structure or process across different organisms. It also provides the ability to traverse the ontology in order to discover larger-scale regularities by expanding the search to include the subordinate terms in the ontology. There are increasingly powerful tools that are built around ontologies such as GO; given a dataset (such as a gene expression pattern), these tools provide a broad range of functions such as the comparison of genetic datasets based on the similarity of their GO annotation patterns (Ruths et al., 2009) and the extraction of novel biological facts from the text of articles (Müller et al., 2004). Ontologies have also been developed in a number of other domains in neuroscience (Martone et al., 2004); most relevant to cognitive neuroscience, there are well-developed ontologies of brain structure (Bowden and Dubach, 2003).

2 http://www.geneontology.org

A large body of research in cognitive science has developed detailed domain-specific theories of mental processes, but there has been very little work to systematically characterize how these processes are defined and how they fit together into a larger structure. In part this likely reflects the functionalist character of modern psychology, which arose in reaction to the structuralist approach of the nineteenth century (e.g., as seen in the so-called “faculty psychology” that was employed by phrenologists; Boring, 1950). There have been some attempts at larger-scale “unified theories of cognition” such as Anderson’s ACT-R (Anderson et al., 2004) and Newell’s SOAR (Laird et al., 1987), but these approaches have primarily focused on the development of general unifying computational principles rather than on a systematic characterization of the broad range of cognitive processes.

Other extant vocabularies, such as the medical subject headings (MeSH), contain some content relevant to cognitive neuroscience, but suffer from serious limitations. For example, the MeSH


hierarchy for “Cognition” includes just the following concepts: Awareness, Cognitive Dissonance, Comprehension, Consciousness, Imagination, and Intuition. These terms possess no meaningful relation to the current conceptual framework of cognitive science. In addition, the MeSH terms are a mixture of mental processes (e.g., “comprehension”), experimental phenomena (e.g., “illusions”), and experimental procedures (e.g., “maze learning”), along with outdated terms such as “neurolinguistic programming” (which is best characterized as a pseudoscience). Given that MeSH is the lexicon used for indexing articles and expanding queries in PubMed, this suggests that searches of this literature could be greatly improved through the use of vocabularies that better reflect current thinking.

The development of formal ontologies of cognition faces a distinct challenge in comparison to other domains in biology, such as neuroanatomy or cellular function: There is precious little consensus across the field regarding the basic units of mental function. Given that a formal ontology is generally meant to express the shared ontological commitments of a group, this poses a difficult challenge to the development of an ontology of mental processes. There are two alternatives in this case. The first would be to forge ahead and develop a single ontology based on the consensus obtained within a small group of individuals. This would have the benefit of providing an ontology approved by consensus of its architects, but it would be useless to anyone who did not share the group’s ontological commitments. An alternative approach, which we adhere to in the present work, is to allow and capture disagreement, in order to represent the range of views that are present in the field. Our approach to this issue is inspired by the success of social collaborative knowledge building projects such as Wikipedia, which allow discussion and the expression of divergent views in service of developing a broader consensus, and one that can be modified flexibly over time as new knowledge emerges.

THE COGNITIVE ATLAS
To address the need for a formal knowledge base that captures the broad range of conceptual structure within cognitive science, we have developed the CA (accessible online at http://www.cognitiveatlas.org). The system is under continuous development, and new features will be added in the future, but the current system provides the basic functionality for specification of knowledge about cognitive processes and tasks. The system has been designed with the intention of making interaction with the knowledge base as easy as possible, without requiring users to possess expertise in ontologies or knowledge base development (Miller et al., 2010). In addition, the system uses standard mechanisms to enable programmatic access to the database (such as SPARQL Protocol and RDF Query Language, SPARQL), which allows other sites or databases to use the content in an automated manner.

An important guiding principle in the design of the CA has been the distinction between mental tasks and mental processes. Mental processes are not directly accessible, but psychological tasks can be used to manipulate and measure them, and behavior or brain activity observed during those tasks is interpreted as reflecting those latent mental constructs. The ontological status of psychological tasks is not in question (i.e., nearly everyone will agree on what the “Stroop task” is), but the relation of those tasks to the latent mental constructs is at the center of many debates in cognitive science. For this reason, we propose that it is essential to make a clear distinction between mental processes and psychological tasks, and to develop separate ontologies for those two domains (resulting in two separate but interlinked ontologies that form a bipartite graph).

STRUCTURE OF THE KNOWLEDGE BASE
Development of the schemas for the CA knowledge base required an analysis of the kinds of knowledge structures that are used in cognitive science. An initial vocabulary of more than 800 terms was identified manually through analysis of a broad set of publications on cognitive psychology and cognitive neuroscience and curated by three of the authors (Russell A. Poldrack, Robert M. Bilder, Fred W. Sabb). These entities were classified into two broad classes: mental concepts and mental tasks.

MENTAL CONCEPTS
A mental concept is a latent unobservable construct postulated by a psychological theory. Although these mental concepts are ultimately instantiated in brain tissue, the mental concept entity in the knowledge base refers to the latent construct (e.g., at Marr’s computational or algorithmic levels) rather than its physical instantiation. Some potential kinds of mental concepts include (but are not limited to) mental representations and mental processes. Mental representations are mental entities that stand in relation to some physical entity (e.g., a mental image of a visual scene stands in relation to, or is isomorphic with, some arrangement of objects in the physical world) or abstract concept (which could be another mental entity). Mental processes are entities that transform or operate on mental representations (e.g., a process that searches a mental representation of the visual scene for a particular object). In order to accommodate the widest possible range of theories of cognition (including non-representational theories such as Edelman, 1989), the knowledge base does not require that mental concepts be specified into these subclasses, and indeed it is agnostic about this distinction, permitting but not demanding that cognitive “representations” exist. Mental concepts in the CA knowledge base are modeled using the Concept class from the simple knowledge organization system (SKOS; Bechhofer and Miles, 2009), which describes the basic structure of conceptual entities. An overview of the database schema for mental concepts is presented in Figure 1.

MENTAL TASKS
A mental task is a prescribed activity meant to engage or manipulate mental function in an effort to gain insight into the underlying mental processes. The structure of the representation of mental tasks in the CA builds upon the cognitive paradigm ontology (CogPO3; Turner and Laird, 2011), which has a basic class of Behavioral Experimental Paradigm that describes mental tasks. An overview of the database schema for mental tasks is presented in Figure 2.

3 www.cogpo.org


FIGURE 1 | An overview of the database schema for representation of mental concepts in the Cognitive Atlas. Blue boxes reflect external ontologies,
and dashed lines reflect class inheritance, while solid lines reflect ontological relations.
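
As a minimal sketch of the two-part structure described above (mental concepts on one side, mental tasks on the other, joined only by links that cross between the two sets), the following Python outline may help. The class names, the `kind` field, and the `KnowledgeBase` container are hypothetical and chosen for illustration; they are not the CA database schema, and the two example links merely echo labels that the literature sometimes attaches to these tasks.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class MentalConcept:
    """A latent, unobservable construct postulated by a psychological theory."""
    label: str
    # "representation", "process", or None. The subclass is optional because
    # the knowledge base permits but does not require this distinction.
    kind: Optional[str] = None


@dataclass(frozen=True)
class MentalTask:
    """A prescribed activity meant to engage or manipulate mental function."""
    label: str


@dataclass
class KnowledgeBase:
    """Hypothetical container: two node sets plus concept-to-task links."""
    concepts: set = field(default_factory=set)
    tasks: set = field(default_factory=set)
    links: set = field(default_factory=set)  # (concept, task) pairs only

    def link(self, concept: MentalConcept, task: MentalTask) -> None:
        # Keep the graph bipartite: every link joins one concept to one task,
        # never two nodes of the same kind.
        if not isinstance(concept, MentalConcept) or not isinstance(task, MentalTask):
            raise TypeError("a link must join a MentalConcept to a MentalTask")
        self.concepts.add(concept)
        self.tasks.add(task)
        self.links.add((concept, task))

    def tasks_for(self, concept: MentalConcept) -> set:
        return {t for c, t in self.links if c == concept}

    def concepts_for(self, task: MentalTask) -> set:
        return {c for c, t in self.links if t == task}


# Illustrative entries only; the associations themselves are contested claims.
kb = KnowledgeBase()
working_memory = MentalConcept("working memory")
inhibition = MentalConcept("response inhibition", kind="process")
stroop = MentalTask("color-word Stroop task")
n_back = MentalTask("N-back task")
kb.link(working_memory, n_back)
kb.link(inhibition, stroop)
```

Keeping concepts and tasks as separate types means that a claim such as "this task measures that construct" is always stored as an explicit link that can be questioned or revised, rather than being implied by the name of the task.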

Tasks often evoke overt responses (such as motor actions), but this is not necessary; e.g., a brain imaging study could measure the neural responses evoked by a particular form of stimulation (such as watching a movie) without any overt behavior. Any particular task has a number of different features that need to be distinguished and/or measured.

Experimental conditions
Experimental conditions are the subsets of an experiment that define the relevant experimental manipulation. For example, in the color–word Stroop task there are generally three conditions (congruent, incongruent, and neutral trials). This could also be extended to include parametric manipulations. These are defined according to the Behavioral Experimental Paradigm Condition class in CogPO.

Indicators
An indicator is a specific quantitative or qualitative variable that is recorded for analysis. These may include behavioral variables (such as response time, accuracy, or other measures of performance) or physiological variables (including genetics, psychophysiology, lesion effects, or neuroimaging data). In the current implementation of the CA, we focus primarily on behavioral indicators, but we intend the system to be generally applicable to any indicators measured in the context of mental function, including physiological measurements, genetics, and imaging data.

Contrasts
Although absolute measures of behavior may occasionally be meaningful (e.g., scores on a standardized test), it is usually the comparison of indicators across different experimental conditions that we associate with particular mental processes, through subtraction logic or other experimental designs. In the CA, we define a contrast as any function over experimental conditions (borrowing this usage from the notion of linear contrasts in the general linear model, as commonly used in the neuroimaging literature).


FIGURE 2 | An overview of the database schema for representation of mental tasks in the Cognitive Atlas. Blue boxes reflect external ontologies, and dashed lines reflect class inheritance, while solid lines reflect ontological relations. The dashed line connecting activation maps to imaging databases is meant to reflect an empirical relation, as these databases do not currently expose formal ontologies.
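
As a rough illustration of a contrast treated as a function over experimental conditions, the following Python sketch computes linear contrasts as weighted sums of one indicator across the Stroop conditions. The response-time values and the `linear_contrast` helper are assumptions made for this example; they are not taken from the CA or CogPO, and non-linear contrasts are not covered.

```python
from typing import Dict

# Hypothetical mean response times (ms) for one indicator, keyed by
# experimental condition. The numbers are invented for illustration.
indicator_by_condition: Dict[str, float] = {
    "congruent": 600.0,
    "incongruent": 700.0,
    "neutral": 630.0,
}


def linear_contrast(weights: Dict[str, float], indicators: Dict[str, float]) -> float:
    """Return the weighted sum of an indicator across conditions."""
    missing = set(weights) - set(indicators)
    if missing:
        raise KeyError(f"no indicator recorded for: {sorted(missing)}")
    return sum(w * indicators[cond] for cond, w in weights.items())


# Simple contrast: the indicator value for a single condition.
simple_incongruent = linear_contrast({"incongruent": 1.0}, indicator_by_condition)

# Difference contrast (incongruent minus congruent).
stroop_effect = linear_contrast(
    {"incongruent": 1.0, "congruent": -1.0}, indicator_by_condition
)

print(simple_incongruent)  # 700.0
print(stroop_effect)       # 100.0
```

Expressing a contrast as a set of weights keeps it separate from the task itself, so the same task can carry several contrasts, each of which may be linked to a different mental process.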

The simplest contrast is the indicator value for a specific condition; in some cases this will be meaningful in absolute terms, whereas in other cases (e.g., with neuroimaging data) this would reflect a comparison with a more basic baseline condition, such as rest or visual fixation. More complex contrasts include linear or non-linear functions of the indicator across different experimental conditions. For example, in the Stroop task, there would be simple contrasts for each of the three conditions, in addition to contrasts such as (incongruent–congruent) and (incongruent–neutral) that index the well-known “Stroop effect.” These are defined according to the Behavioral Experimental Paradigm Contrast class in CogPO.

RELATIONS
While a well-defined vocabulary is critical to our knowledge base, it is the relations between the terms within the vocabulary that are of greatest interest, because they express the theoretical claims of cognitive theories. The CA includes a number of different types of relations (though not all have been implemented in the current release). Wherever possible, we have reused existing ontologies.

BASIC ONTOLOGICAL RELATIONS
A standard set of ontological relations has been codified into the open biomedical ontologies (OBO) Relational Ontology (Smith et al., 2005), which provides guidelines regarding the consistent use of specific ontological relations across different ontologies. We have adopted several of the basic ontological relations from this ontology:

• is-a (e.g., “declarative memory is a kind of memory”)
• part-of (e.g., “memory retrieval is a part of declarative memory”)
• transformation-of (e.g., “consolidated memory is a transformation of encoded memory”)
• preceded-by (e.g., “memory consolidation is preceded by memory encoding”)

We have excluded a set of spatial relations defined in the OBO Relational Ontology, because they only apply to entities that have a spatial location. We have also excluded a set of participation relations (has_participant, has_agent); these operators express relations between processes and continuants (things), and in the context


of latent mental entities it is not yet clear how to conceptualize those constructs.

RELATIONS BETWEEN PROCESSES AND TASKS
In addition to the basic relations from the OBO Relational Ontology, we also define a measured by relation, which denotes the relation between a cognitive process and a particular contrast on a task [e.g., “conflict processing is measured by the contrast of (incongruent–congruent) in the Stroop task”]. This is meant to reflect the primary form of theoretical claim made by cognitive psychologists, namely that some particular task manipulation affects a particular mental process.

RELATIONS AMONG TASKS
A second new relation introduced in CA is the descended-from relation, which represents historical and/or conceptual relations between tasks. It is clear that within a broad class of tasks (e.g., “Color–Word Stroop task”), there will be a large number of potential variations that could have functional implications, and these develop over time. In order to capture these relationships, we use the concept of “task phylogeny” (Bilder et al., 2009), which treats tasks according to a family tree in which tasks inherit particular features from earlier tasks. Thus, the descended-from relation reflects something like a biological inheritance relationship. In this case, “speciation” is determined by whether the resulting data are commensurate for meta-analysis. If they are not, then one task would be considered as derived from another, rather than being considered slightly different variants of the same task.

LITERATURE RELATIONS
All of the entities in the CA knowledge base (including concepts, tasks, and relations) can be associated with literature citations, using a built-in interface to the PubMed literature database. The knowledge base also allows annotation of relations to specific citations, using the relations defined in the citation typing ontology (CiTO; Shotton, 2010). This ontology supports annotation that specifies whether particular citations support or refute a particular claim, as well as many other aspects of citation. Currently, we have only implemented a single literature relation, which is equivalent to the CiTO “citesForInformation” relation.

RELATIONS TO OBSERVED DATA
The CA does not directly store data; instead, in order to support the annotation of relations between tasks and observed data, the system provides the ability to relate specific task contrasts to entities or data that are stored in external databases. These will include brain regions (as represented in databases of brain structure such as the Foundational Model of Anatomy or Brainmap), genes and genetic variants (as represented in dbSNP and EntrezGene), and cellular functions (as represented in GO).

TECHNICAL INFRASTRUCTURE
The CA knowledge base is stored natively in a custom MySQL relational database. We chose to do this, rather than storing the knowledge natively in an ontology language or resource description format (RDF) triplestore, in order to maximize the flexibility with which the knowledge can be stored. Instead, we have developed a pipeline to generate a Web Ontology Language (OWL) ontology from the database; this allows us to expose the ontological knowledge in a standard format, while still retaining flexibility to store information that may not be easily represented in a formal ontology language. The OWL representation of the CA is available via the NCBO BioPortal4 and the Python code used to generate the OWL representation from the database dumps is available at https://github.com/poldrack/cogat.

4 http://bioportal.bioontology.org/

In order to maximize the ability to interact directly and automatically with other projects, the CA project is built around a set of Semantic Web (Berners-Lee et al., 2001) technologies (Miller et al., 2010). First, the representation of every concept in the knowledge base is available in RDF, which is a format for representing semantic resources in a machine-understandable way. Second, the CA site exposes a web service known as a SPARQL endpoint, which allows direct queries of the knowledge base by humans or other computer systems and returns results from the knowledge base in a standards-based format that preserves conceptual relationships and valuable contextual information. Together, these services provide other projects with the ability to directly and effectively access the current state of the knowledge base. This allows substantially greater interoperability between systems representing different kinds of information, and supports the automation of such interactions based on common standards for interoperability and knowledge sharing.

In addition to the specific benefits for cognitive neuroscience, the infrastructure developed as part of the CA should also serve as a building block for other projects that aim to build collaborative knowledge bases. Given that the informatics community is still converging on standards for interoperability between projects, we hope that our demonstration of the effectiveness of Semantic Web technologies will provide further impetus for their use in such projects.

The web-based interface utilizes HTML 5, CSS 3, and custom JavaScript to generate interactive features. It relies on the jQuery libraries and a large number of jQuery utility plugins. The server-side software uses a standard LAMP stack (Linux, Apache, MySQL, and PHP). Graphviz is used for generating RDF visualizations, and Arc2/Semsol for RDF parsing and SPARQL libraries. The front end and back end communicate with AJAX techniques, passing JSON objects from the end-user page to the server and back, as well as with standard POST and GET requests. PDF support is provided by the wkhtmltopdf libraries. Infamous powers the RSS bubble visualizations with Feedburner generating the RSS feed, while PubBrain powers the fMRI imaging references. The PubMed Entrez API is used for bibliographic citations and abstract lookups.

BENEFITS AND CHALLENGES OF COLLABORATIVE KNOWLEDGE BUILDING
The Internet has made it possible to tap into the knowledge of people across the globe on an unprecedented scale. Hundreds of thousands of people have worked together to write the software


that runs the Internet, write the largest encyclopedia in human history5, discover new stars and galaxies, and achieve many other goals that would be impossible through either humans or computers working alone. Such systems serve as both existence proofs and design models for collaborative knowledge building in science. For example, thousands of individuals may contribute to a single Wikipedia article, each with different knowledge and points of view, with results rivaling those of expert-written encyclopedias in quality (Giles, 2005) and vastly exceeding them in scope. However, the process through which such high-quality collaborative knowledge building happens is by no means a given, requiring the evolution of community norms, explicit rules, technological features, and significant effort spent on coordination and conflict resolution (Kittur and Kraut, 2008). Our goal in the CA project is to leverage the emerging understanding of crowd-driven collaborative knowledge building to develop a system for scientists that captures a wide variety of viewpoints and builds consensus across fields.

5 Wikipedia.org

CAPTURING AND RESOLVING DISAGREEMENT
Though seemingly simple at first glance, Wikipedia is a sophisticated engine for collaborative knowledge building with highly developed mechanisms for lowering participation costs, promoting collaboration, updating information, resolving conflict, and consolidating content. Over a third of all work in Wikipedia goes not to editing articles but instead to coordination activities such as debate about policies and procedures, maintenance activities such as deleting non-conforming pages, and negotiation about content and issues (Kittur et al., 2007). The very large and increasing amount of effort being spent on coordination emphasizes the need for a collaborative knowledge creation system to focus on supporting collaboration at least as much as supporting knowledge creation. Wikipedia supports collaboration through a wide variety of mechanisms. One of the simplest but most important is the presence of a “discussion” section on every page, which contains a record of all current and past conversations regarding a page separate from the content of the page itself. New users can view past discussions in order to take advantage of the information accrued in past conversations and avoid repeating past mistakes. The CA adds to this functionality to deal with relations, capturing discussion not only for concepts but also for relations between them. Furthermore, discussions are integrated directly into concept and relation pages, surfacing them and making them salient to readers as well as contributors. When discussion alone cannot resolve disagreement, concepts can be “forked” or merged as discussed further below.

CURATION
Although completely open systems such as Wikipedia work well for general encyclopedic knowledge, some of the greatest successes in scientific knowledge building have come from curated models such as GO. The CA aims to strike a balance between the two extremes, with curation done on an as-needed basis by the core team and volunteer curators. However, curation will not be done in a vacuum; viewpoints and discussion from the community will be elicited for curation decisions, and decisions are expected to reflect the consensus of the community. At the same time, a guiding principle is that curatorial decisions should be “evidence-based” to ensure that questions about terminology, concepts, and relationships are not determined only by popular fiat. As the Atlas grows, it will likely need to develop more sophisticated procedures for dealing with curation (as have been developed in Wikipedia: Forte et al., 2009), which will be developed in collaboration with and by the scientific user community.

UTILITY OF THE COGNITIVE ATLAS
The CA will provide a formalization of what are often implicit conceptual schemes in cognitive neuroscience, and in particular will make clear the mapping of particular task contrasts onto particular mental processes. We envision a number of ways in which the CA could impact cognitive neuroscience research.

CLEARER VOCABULARY
The controlled vocabulary of the CA provides a way for researchers to use terms in a more precise way, and to help reduce polysemy, wherein the same term is used by different researchers to mean different things. This occurs surprisingly often within the neuroscience literature. An example is the term “working memory,” which has several distinct meanings in the literature, as discussed above. This can lead to complications in automated processing of the literature, since it is not possible to know which of these senses is implied in any particular usage.

In the CA, each term begins with a single concept definition. Researchers who do not agree with this definition can discuss their disagreement using the built-in discussion feature, similar to the way that conflicts are resolved in Wikipedia. However, if it becomes clear from the discussion that there is an irreconcilable conceptual difference, the concept can be “forked,” in which case the original concept is broken into a number of senses, each of which would have its own separate concept page and participate separately in relations with tasks. For example, in the working memory example, the original concept “working memory” would be broken into separate senses, such as “working memory (maintenance),” “working memory (manipulation),” and “working memory (temporal),” with the original page being converted to a disambiguation page for the different senses. An example of such a page for the concept of “behavioral inhibition” is shown in Figure 3.

Conversely, it is also common that different terms are used to describe the same underlying processes. For example, the terms “declarative memory” and “explicit memory” are often used to refer to the same mental function. Within the CA, it is possible to specify terms as synonyms, such that subsequent analyses using the knowledge base will recognize the terms as synonyms. This is preferred to the merging of the concepts into a single concept, because it retains the original terms in the knowledge base while still noting them as referring to the same process.

IMPROVED QUERY EXPANSION
The precision and recall of literature searches can be greatly improved by the use of ontological knowledge to guide the expansion of search queries. For example, PubMed currently


FIGURE 3 | A screenshot of the disambiguation page for the concept of “behavioral inhibition,” which points to two separate senses of the term.

expands queries using the MeSH lexicon, which (as discussed above) is not reflective of the state of the art. For example, the query (sublexical route) is expanded by PubMed as [sublexical[All Fields] AND (“drug administration routes”[MeSH Terms] OR (“drug”[All Fields] AND “administration”[All Fields] AND “routes”[All Fields]) OR “drug administration routes”[All Fields] OR “route”[All Fields])]. The confusion of MeSH regarding the meaning of “route” in this context leads to incorrect query expansion, whereas the use of a database that included “sublexical route” as a concept would more specifically target that particular phrase. In addition, it could potentially expand the query to include other related terms such as “phonological assembly,” leading to search results that are more likely to find relevant literature.

METADATA ANNOTATION AND META-ANALYSIS
The availability of large databases of neuroimaging data, particularly the Brainmap.org database (Laird et al., 2005), has enabled powerful meta-analyses. However, the ability to perform meta-analysis is limited by the metadata that are associated with each data set; in order to assess which brain systems are associated with particular mental processes, the data need to be annotated using an ontology of mental processes. The Brainmap database currently uses a relatively coarse ontology of mental processes, which limits the ability to make finer assessments about structure–function associations (Poldrack, 2006). However, the availability of a more detailed cognitive ontology would provide the ability to perform such mappings. As an example, Poldrack et al. (2009) used a set of annotated task-process relations, along with latent variables identified from fMRI data obtained across a set of eight tasks, in order to identify which mental processes were mapped onto brain networks. With larger data sets annotated using a more detailed ontology, this kind of analysis could provide new insights into which ontological distinctions in the mental process ontology are biologically realized and which are not (Lenartowicz et al., 2010).

THEORY TESTING
It is also possible to envision the use of meta-analysis with the CA to test larger theories of cognitive organization. For example, within the psychology of categorization there is a longstanding debate between theories that posit single processes underlying both categorization and recognition memory versus separate processes (Poldrack and Foerde, 2008). Each of these theories would make different ontological claims regarding cognitive processes and their relations with mental tasks. These ontological claims could potentially be translated into claims about the covariance structure in the data obtained on those tasks, and different theories could be compared for their relative fit to the data using covariance structure modeling methods. While the system is not currently able to support such theory testing, it remains an important goal for the future of the system.
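Returning to the query expansion use case described earlier in this section, the idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the function name expand_query, and the two entries are hypothetical and do not describe the CA's actual schema or API. The terms themselves come from the examples in the text, and the [All Fields] tag follows the PubMed expansion quoted above.

```python
# Hypothetical mini knowledge base: concept -> synonyms and related terms.
# The entries mirror examples from the text; the structure is an assumption.
KNOWLEDGE_BASE = {
    "sublexical route": {"synonyms": [], "related": ["phonological assembly"]},
    "declarative memory": {"synonyms": ["explicit memory"], "related": []},
}


def expand_query(term, include_related=True):
    """Return a PubMed-style boolean query string for `term`.

    A known concept is searched as an exact phrase and ORed with its
    synonyms (and, optionally, its related terms). An unknown term is
    returned unchanged so the search engine's default handling applies.
    """
    key = term.lower()
    entry = KNOWLEDGE_BASE.get(key)
    if entry is None:
        return term
    phrases = [key] + entry["synonyms"]
    if include_related:
        phrases += entry["related"]
    # Quote each phrase so it is matched as a unit rather than word by word.
    return " OR ".join(f'"{phrase}"[All Fields]' for phrase in phrases)


print(expand_query("sublexical route"))
```

Treating the concept as a quoted phrase avoids the word-by-word expansion that led MeSH to interpret "route" as a drug administration route, and keeping synonyms and related terms in separate lists lets a caller choose between a narrower and a broader search.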


TRANSLATIONAL MENTAL HEALTH RESEARCH
Within psychiatry, there is an increasing movement toward the use of dimensional rather than categorical approaches to characterizing psychiatric disorders (Kraemer, 2007). Ongoing efforts such as the NIH Research Domain Criteria project (Insel et al., 2010) aim to characterize these underlying dimensions in terms of their cognitive and neural bases. The social collaborative knowledge building tools that are provided by the CA offer the ability for such projects to interactively develop their knowledge base and annotate a rich set of links between cognitive processes and data at other levels, such as neural circuits, cellular signaling pathways, or genes. The CA will serve as an essential link in allowing relationships to be made between the neural level and the level of psychiatric symptoms and syndromes (see Figure 4).

PRESENT STATUS AND FUTURE PLANS
The development of the CA began in 2008. In the design phase, we analyzed the structure of the knowledge to be represented and developed an initial schema for the database. In the initial implementation phase, we worked with the development team to implement the basic functionality for presenting and editing the knowledge base (Miller et al., 2010), and also worked with a small number of investigators to begin to populate the database and refine the interaction design for the site. In the current phase, we are continuing to implement new features as well as refining the existing interface, with a particular focus on scaling to larger amounts of content. We have also begun soliciting open contributions from researchers across the field. The CA currently has entries for 904 terms, including 708 mental constructs and 196 tasks, with definitions present for 795 of these terms; most of these definitions would not be viewed as sufficient by an expert in the area, but are provided as a starting point for those experts to edit and refine. To date the database has fewer than 900 relations specified; whereas the first phase of the project focused on concept definitions, a major goal in the next phase of the project will be to enlist a wide range of researchers to contribute their knowledge of these relations.

Beyond the addition of content, future development of the site will focus on three areas. First, we plan to add personalization features to allow users to keep better track of relevant information. This will include tracking of their own contributions, tracking of recent changes to topics of interest, and recommendation of

FIGURE 4 | The Cognitive Atlas provides a framework for relating biological functions and processes to psychiatric symptoms and syndromes. The links between each level in this graph reflect proposed empirical relations; the strength of each link (noted by its width) is proportional to the literature association between each set of terms (defined as the Jaccard coefficient between the two search terms derived from PubMed). Each link can also be associated with specific empirical results, as noted in the box demonstrating a particular annotation for one of the edges.
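As a rough illustration of the link strength described in the Figure 4 caption, the Jaccard coefficient between two search terms can be computed from three hit counts: the number of articles matching each term alone and the number matching both. The sketch below assumes those counts have already been obtained from a literature search. The function name and the example counts are hypothetical, and the text does not specify how the counts were retrieved.

```python
def jaccard_from_counts(hits_a, hits_b, hits_both):
    """Jaccard coefficient |A and B| / |A or B| from literature hit counts.

    hits_a:    number of articles matching term A
    hits_b:    number of articles matching term B
    hits_both: number of articles matching both terms (A AND B)
    """
    if min(hits_a, hits_b, hits_both) < 0:
        raise ValueError("hit counts must be non-negative")
    if hits_both > min(hits_a, hits_b):
        raise ValueError("joint count cannot exceed either single-term count")
    union = hits_a + hits_b - hits_both  # inclusion-exclusion
    if union == 0:
        return 0.0  # neither term appears in the literature
    return hits_both / union


# Illustrative (made-up) counts, not real PubMed results.
strength = jaccard_from_counts(hits_a=100, hits_b=50, hits_both=25)
print(round(strength, 3))  # 25 / (100 + 50 - 25) = 0.2
```

The coefficient ranges from 0 (the terms never co-occur) to 1 (the terms always co-occur), which makes it convenient to map directly onto an edge width.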


content and/or publications that are relevant to the user’s interests (based on their past contributions). Second, we plan to include content based on mining of the published literature. For example, we might include relations in the database that are based on association between terms in the published literature, which could then serve as input to the manual annotation process. Third, we plan to implement a greater degree of integration with other databases. The CA lexicon already is integrated with free web services enabling: mining of literature associations (PubAtlas6); mapping associations of literature with a three-dimensional probabilistic atlas of brain structure (PubBrain7); and collaborative entry of quantitative annotations for meta-analysis of findings from cognitive studies (PhenoWiki8). While the CA currently represents the structure of tasks in a relatively coarse way, the CogPO (Turner and Laird, 2011; see text footnote 2) project is currently developing much more detailed task ontologies. As those become available we will link directly to them, providing a much more thorough way of modeling the fine-grained details of tasks. The CA lexicon will be included in a future version of the NeuroLex database9, and we will also implement greater integration with biological ontologies such as the GO (Ashburner et al., 2000).

Finally, it is important to be clear that the CA itself will not contain any data, but we plan to link to empirical databases from within the CA. In the short term, this will include direct links to coordinate-based neuroimaging databases with exposed APIs such as the NeuroSynth project (Yarkoni et al., 2011) or SumsDB10, which can provide automated access to meta-analytic empirical data linking the concepts in the database to brain systems. Currently, the NeuroSynth database includes forward and reverse inference maps for all concept terms in the CA. In addition, if other databases become available (e.g., genome-wide association data, lesion mapping data, patient behavioral data, etc.) these can also be linked directly to the database. It should also be noted that while the CA is currently focused on concepts from the human psychology literature, it will also be important to encompass concepts from non-human animal literatures, in order to provide truly systematic coverage of the literature.

6 http://www.pubatlas.org
7 http://www.pubbrain.org
8 http://www.phenowiki.org
9 http://www.neurolex.org
10 http://sumsdb.wustl.edu:8081/sums/index.jsp

CONCLUSION
The mapping of mental processes to brain systems has to date relied largely upon informal representations of mental processes and the tasks that are used to manipulate them. We propose that this approach is fundamentally limited, and that continued scientific progress in cognitive neuroscience will require the development and adoption of formal knowledge bases that provide a more systematic explication of cognitive theories and their relation to empirical data. The CA aims to provide such a resource that reflects the views of the entire community, and we welcome the contribution of interested researchers through the cognitiveatlas.org web site. By including the contributions of researchers across the field, we hope that the CA can become the standard ontology for mental function.

ACKNOWLEDGMENTS
This work was supported by NIH grants RO1MH082795 (to Russell A. Poldrack), and the Consortium for Neuropsychiatric Phenomics [NIH Roadmap for Medical Research grants UL1-DE019580, PL1MH083271 (to Robert M. Bilder) and RL1LM009833 (to D. Stott Parker)]. Thanks to Rajeev Raizada for helpful comments on a draft of this paper.

REFERENCES
Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., and Qin, Y. (2004). An integrated theory of the mind. Psychol. Rev. 111, 1036–1060.
Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., Davis, A. P., Dolinski, K., Dwight, S. S., Eppig, J. T., Harris, M. A., Hill, D. P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J. C., Richardson, J. E., Ringwald, M., Rubin, G. M., and Sherlock, G. (2000). Gene ontology: tool for the unification of biology. the gene ontology consortium. Nat. Genet. 25, 25–29.
Baddeley, A. (1992). Working memory. Science 255, 556–559.
Bard, J. B. L., and Rhee, S. Y. (2004). Ontologies in biology: design, applications and future challenges. Nat. Rev. Genet. 5, 213–222.
Bechhofer, S., and Miles, A. (2009). SKOS Simple Knowledge Organization System Reference. W3C recommendation, W3C. Available at: http://www.w3.org/TR/2009/REC-skos-reference-20090818/
Berners-Lee, T., Hendler, J., and Lassila, O. (2001). The semantic web. Sci. Am. 284, 34–43.
Bilder, R. M., Sabb, F. W., Parker, D. S., Kalar, D., Chu, W. W., Fox, J., Freimer, N. B., and Poldrack, R. A. (2009). Cognitive ontologies for neuropsychiatric phenomics research. Cogn. Neuropsychiatry 14, 419–450.
Boring, E. G. (1950). A History of Experimental Psychology, 2nd Edn. Englewood Cliffs, NJ: Prentice Hall.
Bowden, D. M., and Dubach, M. F. (2003). Neuronames 2002. Neuroinformatics 1, 43–59.
Cohen, J. D., Dunbar, K., and McClelland, J. L. (1990). On the control of automatic processes: a parallel distributed processing account of the stroop effect. Psychol. Rev. 97, 332–361.
Donohoe, G., Corvin, A., and Robertson, I. H. (2006). Evidence that specific executive functions predict symptom variance among schizophrenia patients with a predominantly negative symptom profile. Cogn. Neuropsychiatry 11, 13–32.
Edelman, G. M. (1989). The Remembered Present: a Biological Theory of Consciousness. New York: Basic Books.
Forte, A., Larco, V., and Bruckman, A. (2009). Decentralization in Wikipedia governance. J. Manag. Inf. Syst. 26, 49–72.
Giles, J. (2005). Internet encyclopaedias go head to head. Nature 438, 900–901.
Goldman-Rakic, P. S. (1995). Cellular basis of working memory. Neuron 14, 477–485.
Gruber, T. (1993). A translation approach to portable ontology specifications. Knowl. Acquisition 5, 199–220.
Insel, T., Cuthbert, B., Garvey, M., Heinssen, R., Pine, D. S., Quinn, K., Sanislow, C., and Wang, P. (2010). Research domain criteria (rdoc): toward a new classification framework for research on mental disorders. Am. J. Psychiatry 167, 748–751.
Kane, M. J., Conway, A. R. A., Miura, T. K., and Colflesh, G. J. H. (2007). Working memory, attention control, and the n-back task: a question of construct validity. J. Exp. Psychol. Learn. Mem. Cogn. 33, 615–622.
Kittur, A., and Kraut, R. E. (2008). “Harnessing the wisdom of crowds in wikipedia: quality through coordination,” in CSCW 2008: Proceedings of the ACM Conference on Computer-Supported Cooperative Work (New York: ACM Press).
Kittur, A., Suh, B., Pendleton, B. A., and Chi, E. (2007). “He says, she says: conflict and coordination in wikipedia,” in CHI 2007: Proceedings of the ACM Conference on Human-factors in Computing Systems (New York, NY: ACM Press).


Kraemer, H. C. (2007). Dsm categories and dimensions in clinical and research contexts. Int. J. Methods Psychiatr. Res. 16(Suppl. 1), S8–S15.
Laird, A. R., Lancaster, J. L., and Fox, P. T. (2005). Brainmap: the social evolution of a human brain mapping database. Neuroinformatics 3, 65–78.
Laird, J., Newell, A., and Rosenbloom, P. (1987). Soar: an architecture for general intelligence. Artif. Intell. 33, 1–64.
Lenartowicz, A., Kalar, D., Congdon, E., and Poldrack, R. A. (2010). Towards an ontology of cognitive control. Top. Cogn. Sci. 2, 678–692.
Martone, M. E., Gupta, A., and Ellisman, M. H. (2004). E-neuroscience: challenges and triumphs in integrating distributed data from molecules to brains. Nat. Neurosci. 7, 467–472.
Miller, E., Seppa, C., Kittur, A., Sabb, F., and Poldrack, R. A. (2010). The cognitive atlas: employing interaction design processes to facilitate collaborative ontology creation. Nat. Proc. [Epub ahead of print].
Müller, H.-M., Kenny, E. E., and Sternberg, P. W. (2004). Textpresso: an ontology-based information retrieval and extraction system for biological literature. PLoS Biol. 2, e309. doi: 10.1371/journal.pbio.0020309
Olton, D. S., Becker, J. T., and Handelmann, G. E. (1979). Hippocampus, space, and memory. Behav. Brain Sci. 2, 313–365.
Poldrack, R. A. (2006). Can cognitive processes be inferred from neuroimaging data? Trends Cogn. Sci. (Regul. Ed.) 10, 59–63.
Poldrack, R. A., and Foerde, K. (2008). Category learning and the memory systems debate. Neurosci. Biobehav. Rev. 32, 197–205.
Poldrack, R. A., Halchenko, Y. O., and Hanson, S. J. (2009). Decoding the large-scale structure of brain function by classifying mental states across individuals. Psychol. Sci. 20, 1364–1372.
Price, C., and Friston, K. (2005). Functional ontologies for cognition: the systematic definition of structure and function. Cogn. Neuropsychol. 22, 262–275.
Ruths, T., Ruths, D., and Nakhleh, L. (2009). Gs2: an efficiently computable measure of go-based similarity of gene sets. Bioinformatics 25, 1178–1184.
Sabb, F. W., Bearden, C. E., Glahn, D. C., Parker, D. S., Freimer, N., and Bilder, R. M. (2008). A collaborative knowledge base for cognitive phenomics. Mol. Psychiatry 13, 350–360.
Shotton, D. (2010). Cito, the citation typing ontology. J Biomed. Semantics 1(Suppl. 1), S6. doi: 10.1186/2041-1480-1-S1-S6
Smith, B., Ceusters, W., Klagges, B., Köhler, J., Kumar, A., Lomax, J., Mungall, C., Neuhaus, F., Rector, A. L., and Rosse, C. (2005). Relations in biomedical ontologies. Genome Biol. 6, R46.
Sternberg, S. (1969). The discovery of processing stages: Extensions of donders’ method. Acta Psychol. (Amst.) 30, 276–315.
Turner, J. A., and Laird, A. R. (2011). The cognitive paradigm ontology: design and application. Neuroinformatics. Available at: http://www.springerlink.com/content/w23070351h220513/
Yarkoni, T., Poldrack, R. A., Nichols, T. E., Van Essen, D., and Wager, T. (2011). Large-scale automated synthesis of human functional neuroimaging data. Nat. Methods 8, 665–670.

Conflict of Interest Statement: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Received: 31 March 2011; accepted: 17 August 2011; published online: 06 September 2011.

Citation: Poldrack RA, Kittur A, Kalar D, Miller E, Seppa C, Gil Y, Parker DS, Sabb FW and Bilder RM (2011) The cognitive atlas: toward a knowledge foundation for cognitive neuroscience. Front. Neuroinform. 5:17. doi: 10.3389/fninf.2011.00017

Copyright © 2011 Poldrack, Kittur, Kalar, Miller, Seppa, Gil, Parker, Sabb and Bilder. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.

