
Aiello et al.

Insights Imaging (2021) 12:164

https://doi.org/10.1186/s13244-021-01081-8

STATEMENT Open Access

How does DICOM support big data management? Investigating its use in medical imaging community
Marco Aiello1* , Giuseppina Esposito2, Giulio Pagliari2, Pasquale Borrelli1, Valentina Brancato1 and
Marco Salvatore1

Abstract
The diagnostic imaging field is experiencing considerable growth, followed by increasing production of massive amounts of data. The lack of standardization and privacy concerns are considered the main barriers to big data capitalization. This work aims to verify whether the advanced features of the DICOM standard, beyond imaging data storage, are effectively used in research practice. This issue will be analyzed by investigating the publicly shared medical imaging databases and assessing how much the most common medical imaging software tools support DICOM in all its potential. Therefore, 100 public databases and ten medical imaging software tools were selected and examined using a systematic approach. In particular, the DICOM fields related to privacy, segmentation and reporting have been assessed in the selected databases; software tools have been evaluated for reading and writing the same DICOM fields. From our analysis, less than a third of the databases examined use the DICOM format to record meaningful information to manage the images. Regarding software, the vast majority does not allow the management, reading and writing of some or all the DICOM fields. Surprisingly, if we observe chest computed tomography data sharing to address the COVID-19 emergency, there are only two datasets out of 12 released in DICOM format. Our work shows how DICOM can potentially fully support big data management; however, further efforts are still needed from the scientific and technological community to promote the use of the existing standard, encouraging data sharing and interoperability for a concrete development of big data analytics.
Keywords: DICOM, Big data, Data curation, COVID-19, Data analytics

Key points
• DICOM supports key actions for imaging data management.
• The majority of shared research databases do not fully exploit the DICOM format.
• Imaging software tools do not fully support DICOM advanced features.
• There is a need to fully promote DICOM in data sharing and software development.
• Standardization is crucial for big data capitalization.

*Correspondence: [email protected]
1 IRCCS SDN, Via Emanuele Gianturco 113, 80143 Naples, Italy
Full list of author information is available at the end of the article

© The Author(s) 2021. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the
original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or
other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line
to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this
licence, visit http://creativecommons.org/licenses/by/4.0/.

Introduction
The modern era is going through a rapid technological evolution, and we are witnessing the production of a huge amount of information that can gain value through appropriate management and analysis. We are, in fact, in the so-called big data era, where the value of data can be the real engine of innovation [1].
The healthcare sector, where multi-source patient information is routinely collected, is gaining volume and complexity. Among big data types, imaging data can be

considered the largest in volume. In fact, it covers not only gigapixel images, such as tissues or organs at subcellular resolutions, but also metadata and quantitative measurements. For instance, neuroimaging is currently producing more than 10 petabytes of data every year with a staggering ninefold increase in data complexity (i.e., data acquisition modalities) over the last three decades [2, 3].
Therefore, the standardization of medical imaging formats plays a crucial role in the effective exploitation of the data and subsequent clinical decision making [1].
The World Health Organization (WHO) recognizes the lack of standardization, together with privacy concerns, as the main barrier to big data exploitation [4]. The Digital Imaging and Communications in Medicine (DICOM) format is the current standard for storing and transmitting medical images, enabling the integration of medical imaging devices such as scanners, servers, workstations, printers, network hardware and picture archiving and communication systems (PACS) from multiple manufacturers [5]. DICOM encompasses raw imaging data and all the metadata related to the procedures of image acquisition and curation, including a series of processes such as de-identification of sensitive data, annotation of regions of interest within the medical image, image enhancement or structured reporting.
In addition to DICOM, the most used formats designed for medical images are the Analyze format and its most recent version, Neuroimaging Informatics Technology Initiative (NIfTI), MetaImage (mhd) and nearly raw raster data (NRRD). They are mostly used for saving images for post-processing operations [6]. It is important to note, however, that DICOM remains the standard for the clinical management of images, which are routinely acquired in DICOM and eventually transformed into other formats. For this reason, our work does not focus on the already consolidated use of DICOM as a clinical standard but on the extension of the standard toward big data analytics, as a putative bridge between the clinic and research.
In fact, during the last years, the DICOM steering committee has carefully followed these needs by defining and extending the standard [7, 8]. Nevertheless, the simple definition of a standard is not enough to satisfy needs since the availability to users and the support with appropriate software tools are crucial.
This work aims to evaluate the effective implementation of the DICOM standard in the big data perspective and, in particular:

• To recognize and introduce how the DICOM standard effectively supports the current challenges in diagnostic imaging management and analytics.
• To verify whether the standard is implemented in clinical and research practice, investigating the public data shared by the research community and checking how much the standard is actually supported in the main software for management and processing of diagnostic images.

The scientific literature shows different initiatives that aim to evaluate the use of DICOM and related software tools for research purposes, in particular for the management of diagnostic data oriented to quantitative imaging [7, 9, 10] or for operations such as de-identification [11]; in this work, we intend to carry out a comprehensive evaluation that includes all the main data curation operations both for software tools and released datasets.
The next paragraphs deepen the introduction of this context and show the current support of the DICOM format in the diagnostic workflow.

Big data workflow
Despite large strides in the introduction of PACS over the past few decades and the acceptance of the international DICOM standard for the storage and transfer of medical imaging data, there still remain significant barriers for the effective implementation of big data analytics on diagnostic imaging data.
Diagnostic images constitute a huge amount of data circulating in health care.
Unfortunately, medical data are often stored in qualitative reports, which do not allow recovering the images of interest as well as the related details. The clinical decision, in fact, is usually made on information deriving from several sources, which may vary from the patient to the diagnostic systems, but may not be stored in an appropriate and standardized way. Each of these single attributes could be considered as a fundamental element for the consequent definition of algorithms or, e.g., supervised or unsupervised models.
Figure 1 shows a typical radiological workflow, in which the patient undergoes a diagnostic imaging examination. The results of the diagnostic procedure are elaborated by the instrumentation (e.g., computed tomography (CT) or magnetic resonance (MR) scanner) and stored in the DICOM format after the image formation procedure. Series, scans or reports are usually safely stored in the hospital’s PACS and could be recalled by a radiologist or a clinician whenever needed. For instance, data retrieval is a fundamental step for reporting the result of the examination, identifying any potential pathologic condition and giving the patient or other practitioners the relevant indications. Many critical and significant elements may be lost during the conventional workflow because of the lack of

[Figure 1: workflow diagram. Labels: Patient/clinical issue; Imaging scan; DICOM; PACS; Imaging report/exam; Clinical reporting; Imaging DB; Analytics; DICOM-SEG; DICOM-SR; Segmentation; Annotation; Measurement; Clinical comments. Legend: conventional workflow; big-data workflow.]

Fig. 1 Conventional radiological workflow and collection of information for big data analytics using specific DICOM tags. Blue arrows refer to the radiological workflow; green arrows refer to data collection for big data analytics

standardization, the use of different formats or errors in the recording stage. We can recognize three fundamental actions, as shown in Fig. 1:

1. De-identification/anonymization: A DICOM file contains both the image and a large variety of data in the header. All of these elements can include identifiable information about the patient, the study and the institution. Sharing such sensitive data demands proper protection in order to ensure data safety and maintain patient privacy [12].
2. Annotation/segmentation: This phase includes all the operations of delineation, demarcation, localization and measurement of regions of interest (organs, lesions, suspicious or notable areas) within the diagnostic images [1]. As a final result, some quantitative data are reported with the images and could be used to build datasets for the development and validation of algorithms, models and the so-called semantic segmentation [13].
3. Clinical reporting: This operation includes the collection of data related to the patient’s pathological state, before and after the diagnostic session. A report usually includes critical data such as the clinical outcomes, clinical characterization of the subjects and information about therapeutic treatments. Such information has a huge potential in terms of data analysis but is often inaccessible due to the qualitative and descriptive nature of these documents.

All the information processed and recorded during these steps should be prepared as efficiently as possible: The resulting datasets should be analyzed readily and should not require additional curation steps. In fact, any additional procedure would be infeasible, especially considering large amounts of data.
The FAIR Guiding Principles [14, 15] state that good data management is “the key conduit leading to knowledge discovery and innovation, and to subsequent data and knowledge integration and reuse by the community after the data publication process.” FAIR, in fact, stands for findability, accessibility, interoperability and reusability, which are perceived as the four key factors affecting data quality.
It is important to note that, in the age of machine learning, “reusability” refers to reuse not only by humans but also by machines. Consequently, it is important to consider how to make data readable by machines in order to make best use of modern technologies.

Privacy and de-identification
Privacy is one of the most discussed topics related to data collection, analysis and interpretation since, nowadays, these activities and their results are considered as a new

business. Medical data may have several legitimate secondary uses, such as research projects or teaching, but it is strictly necessary to receive the informed consent from the patients. Another example is the development of decision support systems, which could be easily commercialized as soon as the results of the digital toolkit are considered as adequate. Additionally, most personal data should be removed securely, even if in many cases some data or a link to personal data must be kept. The increasing demand for predictive models, clinical decision support systems and data analysis has led to a situation where the research needs are often in conflict with privacy rules [16]. At this moment, there is no commonly accepted solution since the amount and type of personal data requested by researchers varies on a case-by-case basis and the choice of information to be kept, modified or deleted depends on the purposes and regulations to comply with.
To overcome these problems, governments and institutions have developed rules and regulations that bring together privacy, data and research purposes. To this aim, some of the most requested and used regulations that must be considered to use data for research purposes are:

• HIPAA, the Health Insurance Portability and Accountability Act of 1996,
• GDPR, the General Data Protection Regulation (EU) 2016/679.

Specifically, the GDPR gives EU citizens or residents the right to request the deletion, modification or access of their data, while the HIPAA does not confer this right. The GDPR compliance affects all personal data, while the HIPAA is limited to the “Protected Health Information” (PHI), defined as information that can be used to directly or indirectly identify an individual in relation to their past, present or future health condition. In addition, the HIPAA contains the “Safe Harbor” method, which lists 18 types of identifiers to be removed or modified, whereas there are no explicit lists of the elements to be eliminated in the GDPR.
Pseudonymization and anonymization are different ways to perform de-identification. Specifically, following the ISO 25237:2017 standard, anonymization is defined as a “process by which personal data are irreversibly altered in such a way that a data subject can no longer be identified directly or indirectly, either by the data controller alone or in collaboration with any other party,” whereas pseudonymization is a “particular type of de-identification that both removes the association with a data subject and adds an association between a particular set of characteristics relating to the data subject and one or more pseudonyms.”
From these definitions, anonymization can be used in accordance with HIPAA and both anonymization and pseudonymization are considered as “data processing” under the GDPR. In the latter case, the right of access, modification and cancellation of the citizen’s personal data must be guaranteed. However, as the anonymous data have no direct or indirect links to identify the original patient, any additional processing performed on that dataset falls outside the scope of the GDPR.
The implementation of de-identification on clinical diagnostic examinations passes through the search, elimination and replacement of identifiers within the images and tags that describe them. Most DICOM objects contain demographic and associated medical images and information about the patient, which must be kept confidential or removed in case the tests are to be used for research purposes. As reported in the document “Security and Privacy in DICOM” [12], since 1999 the DICOM standard has included options to encrypt and protect data that move through network connections in response to the implementation of HIPAA and not in response to cybersecurity problems; furthermore, in 2001 DICOM extended the use of the CMS (Cryptographic Message Syntax) to encrypt DICOM data, allowing the encryption of the PHI of a DICOM object.
Profiles and options to address the removal and replacement of attributes within a DICOM Dataset are reported in the Annex E of DICOM PS3.15 2020d. Specifically, it contains a Basic Profile with several retain or clean options, and other implementations such as “In standard Compliance of IOD.” Each attribute can be either replaced, cleaned, removed or kept, depending on the confidentiality level and the importance of the identifier. The following tags should be added after the application of one of the de-identification profiles, to retain the history of metadata as in Table 1.
There are various software tools to perform de-identification on DICOM images. Aryanto et al. [11] offer a complete overview and critical analysis.
It is very important to mention that simply removing or modifying metadata in DICOM images may not be sufficient to prevent re-identification of the subject. In fact, topograms for CT data and ultrasound images may have patient information burned into the pixel data [17]. To manage this critical issue, some specific studies and tools are available to the scientific community [18–21]. In addition, it has been demonstrated that humans or specific software could identify individual subjects by reconstructing facial images contained in cranial MR or CT [22–24]. Further efforts are needed to develop reliable de-identification methods for medical images containing identifiable anatomical details, such as facial features.
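The distinction between pseudonymization and anonymization drawn in the ISO 25237:2017 definitions quoted above can be illustrated with a minimal Python sketch. It is not taken from the paper: the record is a plain dictionary, the key `SECRET_KEY` and the helper names `pseudonymize` and `strip_identifiers` are hypothetical, and a keyed hash (HMAC-SHA256) is only one possible way to derive a pseudonym. As the text stresses, dropping header fields alone may not make real imaging data anonymous.

```python
import hashlib
import hmac

# Hypothetical secret held only by the data controller (an assumption of this sketch).
SECRET_KEY = b"replace-with-a-securely-stored-key"


def pseudonymize(patient_id: str, key: bytes = SECRET_KEY) -> str:
    """Derive a stable pseudonym from a patient ID with a keyed hash.

    The same ID and key always give the same pseudonym, so records of one
    subject stay linked, and the key holder can re-establish the association.
    """
    digest = hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "PSN-" + digest[:16]


def strip_identifiers(record: dict, identifying_fields: set) -> dict:
    """Return a copy of the record without the listed identifying fields."""
    return {k: v for k, v in record.items() if k not in identifying_fields}


record = {"PatientID": "12345", "PatientName": "DOE^JANE", "Modality": "CT"}
identifying = {"PatientID", "PatientName"}

# Pseudonymization: identifiers are removed and a pseudonym is associated instead.
pseudonymized = strip_identifiers(record, identifying)
pseudonymized["PatientID"] = pseudonymize(record["PatientID"])

# Anonymization (in this toy model): identifiers are removed and no link is kept.
anonymized = strip_identifiers(record, identifying)
```

In this sketch the pseudonymized record can still be grouped by subject, whereas the anonymized one keeps only the non-identifying field.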

Table 1 The DICOM attributes to add after the application of the DICOM attribute confidentiality profiles
Attribute | Tag | Description | Type
Private data element characteristics sequence attribute | (0008, 0300) | Characteristics of private data elements within or referenced in the current SOP instance | 3
Deidentification action sequence attribute | (0008, 0305) | Actions to be performed on element within the block that are not safe from identity leakage | 3
Patient identity removed | (0012, 0062) | The true identity of the patient has been removed from the attributes and the pixel data | 3
De-identification method | (0012, 0063) | A description or label of the mechanism or method used to remove the patient’s identity. May be multi-valued if successive de-identification steps have been performed | 1C
De-identification method code sequence | (0012, 0064) | A code describing the mechanism or method used to remove the patient’s identity | 1C
Burned in annotation attribute | (0028, 0301) | Indicates whether or not image contains sufficient burned in annotation to identify the patient and date the image was acquired | 3
Recognizable visual features attribute | (0028, 0302) | Indicates whether or not the image contains sufficiently recognizable visual features to allow the image or a reconstruction from a set of images to identify the patient | 3
Longitudinal temporal information modified | (0028, 0303) | Indicates whether or not the date and time attributes in the instance have been modified during de-identification | 3
Encrypted attributes sequence attribute | (0400, 0500) | Sequence of items containing encrypted DICOM data | 1C
Original attributes sequence attribute | (0400, 0561) | Sequence of items containing all attributes that were removed or replaced by other values in the top level dataset | 3
Type 3: optional; Type 1C: conditional
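As a rough illustration of how the attributes in Table 1 record that de-identification took place, the following Python sketch applies replace/remove/keep actions to a header and then adds (0012, 0062) and (0012, 0063). It rests on several assumptions: the header is modelled as a plain dictionary keyed by (group, element) pairs rather than by a real DICOM library, the `ACTIONS` table is a small illustrative subset and not the Annex E profile, and the function name `deidentify` is hypothetical.

```python
# Tags are (group, element) pairs; the header is a plain dict, not a real
# DICOM dataset (an assumption made to keep the sketch standard-library only).
PATIENT_NAME = (0x0010, 0x0010)
PATIENT_ID = (0x0010, 0x0020)
PATIENT_BIRTH_DATE = (0x0010, 0x0030)
INSTITUTION_NAME = (0x0008, 0x0080)
MODALITY = (0x0008, 0x0060)

# Two of the attributes listed in Table 1.
PATIENT_IDENTITY_REMOVED = (0x0012, 0x0062)
DEIDENTIFICATION_METHOD = (0x0012, 0x0063)

# Illustrative subset only; the real profile covers many more attributes.
ACTIONS = {
    PATIENT_NAME: "replace",
    PATIENT_ID: "replace",
    PATIENT_BIRTH_DATE: "remove",
    INSTITUTION_NAME: "remove",
}


def deidentify(header: dict, pseudonym: str, method_label: str) -> dict:
    """Return a de-identified copy of the header plus the bookkeeping tags."""
    out = {}
    for tag, value in header.items():
        action = ACTIONS.get(tag, "keep")
        if action == "remove":
            continue  # drop the attribute entirely
        out[tag] = pseudonym if action == "replace" else value
    # Record what was done. "YES" also makes a claim about the pixel data,
    # which this sketch does not inspect; the caller must verify it separately.
    out[PATIENT_IDENTITY_REMOVED] = "YES"
    out[DEIDENTIFICATION_METHOD] = method_label
    return out


header = {
    PATIENT_NAME: "DOE^JANE",
    PATIENT_ID: "12345",
    PATIENT_BIRTH_DATE: "19700101",
    INSTITUTION_NAME: "Example Hospital",
    MODALITY: "CT",
}
clean = deidentify(header, "PSN-0001", "Illustrative subset of actions")
```

Keeping the method label in the output is what lets a downstream user of a shared dataset see which de-identification procedure was applied.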

Moreover, the creation of specific DICOM attributes that encode this operation is expected to fully standardize the de-identification procedures.

Annotation and segmentation
The key actions to extract fundamental features from medical images for both clinical decision making and research are annotation and segmentation. The annotation procedure focuses on labeling images with additional information useful for data detection, classification and grouping [25–28]. In particular, this routine makes it possible to transform descriptive and qualitative image features into machine-readable data, thus making them suitable for automatic image analyses such as supervised artificial intelligence (AI) methodologies [29]. Segmentation can be considered as a special case of annotation where one or more image areas are isolated in so-called regions of interest (ROI). Indeed, segmentation is based on drawing (manually, semi-automatically or fully automatically) a binary mask of pixels belonging to the ROI.
In the context of big data analytics, annotation and segmentation routines are fundamental processes for data description, sharing and analysis. Furthermore, the huge amount of data to be managed requires approaches able to codify and standardize the annotation data.
To address this issue, the National Electrical Manufacturers Association (NEMA) developed DICOM-RT, i.e., the first extension of the DICOM standard that could include information regarding annotations [30]. DICOM-RT was developed to specifically address the standardization of data deriving from radiotherapy (e.g., external beam, treatment planning, dose, radiotherapy images). In particular, five DICOM-RT objects were defined to manage: areas of significance (DICOM-RT Structure Set), transfer of treatment plans (DICOM-RT Plan), dose distribution of the radiation therapy plan (DICOM-RT Dose), radiotherapy images (DICOM-RT Image) and treatment session report (DICOM-RT Treatment Record). Segmentation information provided as ROIs was specifically included in the DICOM-RT Structure Set object and, conditionally for data containing dose points or dose curves, in the DICOM-RT Dose object.
More recently, DICOM-SEG [31] has been introduced. It is a dedicated modality in which the annotation routine is encoded as text (Table 2) and the positional information of the annotation is specified by a codified segmentation image (image data).
It is worth mentioning that software to include segmentation/annotation information as DICOM-SEG modality has been developed, thus allowing the management of such information in DICOM. In particular, dedicated software packages such as DCMTK [32], ITK [33, 34], dcmqi [10] (built upon DCMTK and ITK) and pydicom-seg [35] are suitable for this purpose.

Structured report
With the recent technical advances, the need to achieve full interoperability with the increasing amount of

Table 2 Most pertinent/specific DICOM tags related to DICOM-SEG modality
Attribute | Tag | Description | Type
Image type | (0008, 0008) | Value reflecting if the image is primary or derived (Value 1 shall be DERIVED) | 1
Instance number | (0020, 0013) | SOP instance number | 1
Segmentation type | (0062, 0001) | Encoding properties of the segmentation (BINARY or FRACTIONAL) | 1
Segmentation fractional type | (0062, 0010) | The meaning of fractional value (required for FRACTIONAL Segmentation type) | 1C
Segments overlap | (0062, 0013) | Specify if one pixel can be in more than one segment | 3
Segment sequence | (0062, 0002) | Description of the segment(s) | 1
Segment number | (0062, 0004) | The number of the segment (unique) | 1
Segment label | (0062, 0005) | User-defined label identifying the segment | 1
Segment description | (0062, 0006) | User-defined description of the segment | 3
Segment algorithm type | (0062, 0008) | Type of the algorithm used to generate the segment (AUTOMATIC, SEMIAUTOMATIC, MANUAL) | 1
Segmented property category / Segmented property type | (0062, 0003) / (0062, 000F) | Sequence defining the specific property the segment represents | 1
Definition source sequence | (0008, 1156) | Source sequence of the segment(s) | 3
Segment algorithm name | (0062, 0009) | Name of the algorithm to generate the segment (required if (0062, 0008) is AUTOMATIC or SEMIAUTOMATIC) | 1C
Segmentation algorithm identification attribute | (0062, 0007) | A description of the segmentation algorithm | 3
Type 1: required (valid value); Type 3: optional; Type 1C: conditional
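The Type 1 and Type 1C rules in Table 2 can be read as simple consistency checks on segmentation metadata. The Python sketch below encodes a few of them. It is only an illustration under stated assumptions: the metadata is a plain dictionary whose keys mirror the attribute names in Table 2, the function name `validate_segmentation` is hypothetical, and only the rules spelled out in the table are checked, not the full DICOM-SEG definition.

```python
ALGORITHM_TYPES = {"AUTOMATIC", "SEMIAUTOMATIC", "MANUAL"}
SEGMENTATION_TYPES = {"BINARY", "FRACTIONAL"}


def validate_segmentation(seg: dict) -> list:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []

    seg_type = seg.get("SegmentationType")
    if seg_type not in SEGMENTATION_TYPES:
        problems.append("SegmentationType must be BINARY or FRACTIONAL")
    # Type 1C: required only when the segmentation is FRACTIONAL.
    if seg_type == "FRACTIONAL" and not seg.get("SegmentationFractionalType"):
        problems.append("SegmentationFractionalType is required for FRACTIONAL")

    segments = seg.get("SegmentSequence") or []
    if not segments:
        problems.append("SegmentSequence must describe at least one segment")

    numbers = [s.get("SegmentNumber") for s in segments]
    if len(set(numbers)) != len(numbers):
        problems.append("SegmentNumber values must be unique")

    for s in segments:
        number = s.get("SegmentNumber")
        if number is None:
            problems.append("a segment is missing SegmentNumber")
        if not s.get("SegmentLabel"):
            problems.append(f"segment {number}: SegmentLabel is required")
        algorithm = s.get("SegmentAlgorithmType")
        if algorithm not in ALGORITHM_TYPES:
            problems.append(f"segment {number}: invalid SegmentAlgorithmType")
        # Type 1C: a name is required unless the segment was drawn manually.
        elif algorithm != "MANUAL" and not s.get("SegmentAlgorithmName"):
            problems.append(f"segment {number}: SegmentAlgorithmName is required")
    return problems


good = {
    "SegmentationType": "BINARY",
    "SegmentSequence": [
        {"SegmentNumber": 1, "SegmentLabel": "Lesion",
         "SegmentAlgorithmType": "MANUAL"},
        {"SegmentNumber": 2, "SegmentLabel": "Liver",
         "SegmentAlgorithmType": "AUTOMATIC",
         "SegmentAlgorithmName": "hypothetical-model-v1"},
    ],
}
bad = {
    "SegmentationType": "FRACTIONAL",
    "SegmentSequence": [
        {"SegmentNumber": 1, "SegmentLabel": "Lesion",
         "SegmentAlgorithmType": "AUTOMATIC"},
    ],
}
```

Calling `validate_segmentation(good)` returns an empty list, while `validate_segmentation(bad)` reports the missing fractional type and the missing algorithm name.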

multi-modal patient data is arising. Structured reporting (SR) is becoming essential for clinical decision-making and research applications, including big data and machine learning.
SR aims to standardize both the format and lexicon used in radiology reports [36]. A definition for SR is set by describing three increasing levels of SR according to Weiss and Bolos [37]: The first and basic level consists of a structured format with paragraphs and subheadings; the second is marked by a consistent organization with items reported in a certain order; and the third and more complex is characterized by the consistent use of dedicated lexicon and ontology.
The main reasons prompting the move from traditional free text reporting to standardized and structured reporting are summarized below and encompass both clinical and research considerations [38–40]. First of all, the use of checklist-style SR and standardized SR templates ensures that all relevant items for a particular examination are addressed. This may reduce diagnostic error, improve report clarity and quality and ensure consistent use of terminology across practices. Secondly, the use of standardized lexicon and structure prevents ambiguity and facilitates comparability of disease states, treatments and any type of clinical results. Even if this constitutes guidance for referring physicians, it should be highlighted that a simple comparison of the results is crucial for clinical purposes.
Finally, the capability of SR to include quantitative imaging biomarkers (radiomics) and parameters (e.g., laboratory results) might make SR a “big-data container” leading not only to an integrated and precise clinical decision (e.g., diagnosis, treatment option) but also to a substantial support for modern clinical research. The standardized structure and vocabulary typical of SR are well suited to analysis by computers, thus facilitating data sharing (e.g., registries and biobanks) or data mining in research [38–41].
In summary, a wide adoption of SR is critical not only for communicating results to physicians or patients, but also for making diagnostic imaging data suitable for AI algorithms. Clinical decision-making and research applications in AI and big data in medical imaging heavily depend on data and standardization. One of the main challenges for the development of AI solutions for health care and radiology remains the unstructured nature of the data stored in electronic health records. In particular, radiological report data are often available only as unstructured narrative text [41–43].
The implementation of SR is complex and still scarce in clinical routine for several reasons. One of the biggest challenges in SR implementation is resistance to switching from the traditional narrative reporting to SR. Another issue concerns the risk of errors in case of improper use in clinical routine. Moreover, including unnecessary or irrelevant information in a template report may negatively impact the coherence of the report and the subsequent comprehension by referring physicians. Finally, the SR checklist schema may interfere with the radiologist’s reasoning and ability with a negative impact on the

search pattern and visual attention. The so-called eye-dwell phenomenon may occur when radiologists focus more on the report template than on the images. This may not only increase reporting time, but also generate errors or missed findings [38–40].
Several steps have been taken by healthcare providers in order to overcome the above-mentioned limitations and encourage the use of uniform language and structure in radiology reporting, which are the basis for successfully implementing SR in clinical practice. For instance, RSNA developed RadLex, a constantly updated, standardized ontology of radiological terms developed starting from SNOMED-CT. RadLex can be used together with popular medical lexicons such as SNOMED-CT, ICD-10, CPT and BrainInfo [44]. Moreover, RSNA started the so-called Reporting Initiative with the aim of developing and providing vendor-neutral reporting templates [45]. This led to the publication of the Management of Radiology Report Templates profile by IHE, which extensively describes the concepts and technical details for interoperable, standardized and structured report templates [46].
Since an essential requirement for the successful implementation of SR is to respect the current radiology workflow, the DICOM standard plays a key role [5]. Given its potential, NEMA introduced the DICOM-SR, which defines the syntax and semantics of structured and standardized diagnostic reports. An exhaustive description of DICOM-SR can be found in Clunie’s work [47]. Briefly, like a DICOM image, the DICOM-SR has a header, which encodes the information of the patient and study identification, and a content, which instead is responsible for the coding of the report itself. The information elements in the report are hierarchically connected in a tree model, identifying the source and target nodes and their relationships. Each element has a name and a value, forming name-value pairs [48, 49]. DICOM-SR contains text with links to other data such as images, waveforms and spatial or temporal coordinates. Although DICOM-SR is not as widespread as DICOM for digital images, its use has many advantages [50, 51]. DICOM-SR documents can be stored and sent along with the images belonging to the same study in PACS. In addition, DICOM-SR supports unified lexicons such as RadLex, ICD-10 and SNOMED. Finally, DICOM-SR templates have been defined to constrain the possible structures and to provide the basic codes that can be used to encode specific reports [50]. Specifically, a DICOM-SR template is applied to the document content to harmonize its structure. Each template is assigned a unique template identifier (TID) with a related name and is specified by a table where each line corresponds to a so-called node with defined content item or indicates another template to be included in the SR document [48]. A specific limitation of DICOM-SR is that even if it provides a data structure which embeds structured reports in a standard “container” that can be read across different software applications, it does not define how the content should be structured or standardized.
Table 3 and Fig. 2 show the definition of the DICOM-SR Template for a Measurement Report and its sub-templates (TID 1500 from DICOM PS3.16 and PS3.21).
Similar to DICOM-SEG, several efforts have been made in software development supporting the management of structured reports in DICOM standard [10, 32–34].

Materials and methods
To evaluate the degree of effective use of the DICOM standard in the previously detailed actions (i.e., de-identification/anonymization, segmentation/annotation and clinical reporting), two different experiments have been designed and will be described in the following paragraphs. The first explores the DICOM coding of information in the main databases shared by the scientific community, whereas the second investigates and evaluates the possibility of managing DICOM fields with a systematic selection of DICOM visualization software.

Imaging database evaluation
The databases included in our benchmark have been collected by a quasi-systematic internet survey, starting from four well-known lists of public repositories [52–55].
The study of databases aims to identify the type of information released and to evaluate its standardization. In particular, in this analysis we intend to verify whether the DICOM standard is used to collect, where possible, the information available.
Each database has been evaluated considering its modality of access and several key features, described hereafter. Furthermore, we performed a three-step review of the available repositories, in order to select which of them were suitable for this project. The datasets released up to March 1, 2021, were included. The detailed description of the database selection criteria is as follows:
Step 1:

• Access: We looked for open or public access. In particular, repositories which required an application with a project, or private datasets, were not considered.
• Format: Only datasets providing imaging scans in DICOM format were selected.
Table 3 DICOM SR template for measurement report, template ID 1500 (from DICOM PS3.16)
NL Rel with parent VT Concept name VM RT Cond Value set constraint

1 > CONTAINER DCID 7021 “Measurement report document titles” 1 M Root node
2 > HAS CONCEPT MOD INCLUDE DTID 1204 “Language of content item and descendants” 1 U
3 > HAS OBS CONTEXT INCLUDE DTID 1001 “Observation context” 1 M
4 > HAS CONCEPT MOD CODE EV (121058, DCM, “Procedure reported”) 1-n U BCID 100 “Quantitative Diagnostic Imaging Procedures”
5 > CONTAINS INCLUDE DTID 1600 “Image library” 1 U
6 > CONTAINS CONTAINER EV (126010, DCM, “Imaging measurements”) 1 C IF row 10 and 12 are absent
6b >> HAS CONCEPT MOD INCLUDE DTID 4019 “Algorithm identification” 1 U
7 >> CONTAINS INCLUDE DTID 1410 “Planar ROI measurements and qualitative evaluations” 1-n U $Measurement = BCID 218 “Quantitative Image Features”; $Units = BCID 7181 “Abstract Multi-dimensional Image Model Component Units”; $Derivation = BCID 7464 “General Region of Interest Measurement Modifiers”; $Method = BCID 6147 “Response Criteria”; $QualModType = BCID 210 “Qualitative Evaluation Modifier Types”; $QualModValue = BCID 211 “Qualitative Evaluation Modifier Values”
8 >> CONTAINS INCLUDE DTID 1411 “Volumetric ROI measurements and qualitative evaluations” 1-n U $Measurement = BCID 218 “Quantitative Image Features”; $Units = BCID 7181 “Abstract Multi-dimensional Image Model Component Units”; $Derivation = BCID 7464 “General Region of Interest Measurement Modifiers”; $Method = BCID 6147 “Response Criteria”; $QualModType = BCID 210 “Qualitative Evaluation Modifier Types”; $QualModValue = BCID 211 “Qualitative Evaluation Modifier Values”
9 >> CONTAINS INCLUDE DTID 1501 “Measurement and qualitative evaluation group” 1-n U $Measurement = BCID 218 “Quantitative Image Features”; $ImagePurpose = BCID 7551 “Generic Purpose of Reference to Images and Coordinates in Measurements”; $Units = BCID 7181 “Abstract Multi-dimensional Image Model Component Units”; $Derivation = BCID 7464 “General Region of Interest Measurement Modifiers”; $Method = BCID 6147 “Response Criteria”; $QualModType = BCID 210 “Qualitative Evaluation Modifier Types”; $QualModValue = BCID 211 “Qualitative Evaluation Modifier Values”
10 > CONTAINS CONTAINER EV (126011, DCM, “Derived imaging measurements”) 1 C IF row 6 and 12 are absent
10b >> HAS CONCEPT MOD INCLUDE DTID 4019 “Algorithm identification” 1 U
11 >> CONTAINS INCLUDE DTID 1420 “Measurements derived from multiple ROI measurements” 1-n U
12 > CONTAINS CONTAINER EV (C0034375, UMLS, “Qualitative evaluations”) 1 C IF row 6 and 10 are absent
12b >> HAS CONCEPT MOD INCLUDE DTID 4019 “Algorithm identification” 1 U
13 >> CONTAINS CODE 1-n U
13b >> HAS CONCEPT MOD CODE BCID 210 “Qualitative evaluation modifier Types” 1-n U BCID 211 “Qualitative Evaluation Modifier Values”
14 >> CONTAINS TEXT 1-n U

NL, nesting level, defining the tree structure and depth; VT, value type; BCID, baseline context group identifier; DCID, defined context group identifier; VM, value
multiplicity, defining if a tree node may appear only once or can be repeated; RT, requirement type, defining if a tree node is mandatory or optional; EV, enumerated
value

• Species: We considered only humans, whereas there were few databases with rodents, small animals or phantom images.

Step 2:
Among selected databases providing images in DICOM format, we excluded those not including any information beyond the acquired images. Specifically, we considered the repositories having at least one of the following additional resources, not necessarily provided in DICOM format:

• annotations/segmentations;
• radiologist reports;
• any clinical information.

In this step, the databases that provided clinical information, in any format, were included. The latter criterion was adopted since, in this step, we aimed to verify how many databases had sufficient information to reconstruct dedicated DICOM-SR. Specifically, we considered this condition verified in the following cases:

1. traceability of the referenced images;
2. clinical information related to a single imaging modality.

Step 3:
From the repositories identified in step 2, we selected databases that provided additional information according to the DICOM standard. In particular, we excluded all databases not including at least DICOM-SEG or DICOM-SR. In the case of the TCIA repository, we used the TCIA portal (https://nbia.cancerimagingarchive.net/nbia-search) for selection and filtering of the collections. Moreover, the collections containing the TCIA third-party analysis results provided in DICOM-SEG or DICOM-SR modalities were also included. In particular, the aforementioned derived data corresponded to contributions generated by researchers who were not part of the group which originally submitted the related TCIA collection.
In order to test the actual presence of available DICOM images, and any associated additional information in DICOM format, each dataset was inspected in its entire content. Specifically, the first patient folder containing DICOM images was opened by using PostDicom software (https://www.postdicom.com/), which allows the reading of DICOM series. Exploiting PostDicom’s ability to read information encoded in both DICOM-SEG and DICOM-SR, the same procedure was performed for the associated additional information in DICOM format (if present).
Furthermore, to complement the analysis, it was considered appropriate to investigate the sharing of imaging data during the spread of coronavirus disease 2019 (COVID-19). The COVID-19 pandemic required a joint effort for the urgent development of tools and data sharing to better face the emergency. In this situation, diagnostic imaging also plays a crucial role since the virus can generate a viral lung infection with a typical pattern in the chest computed tomography (CT), such as ground glass opacity, crazy-paving pattern and consolidation [56].
In particular, initiatives for the development of automatic diagnostic support tools based on AI techniques for diagnosis and clinical prediction have proliferated [57]. For the development of these tools, it is essential to make available large and curated datasets that include, in addition to imaging scans, information relating to the segmentation of regions of interest (i.e., lesions generated by viral pneumonia) and information related to the clinical status of the subject and the clinical outcome [58–61]. It is clear that, in this context, the use of standards plays a key role and, therefore, the analysis of CT COVID-19 datasets sharing deserves particular attention.
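
The three-step database selection described above can be read as three successive filters. The following is a minimal sketch under stated assumptions: the `Repository` record, its field names and the toy entries are hypothetical and are not taken from the study; only the logic of steps 1 to 3 follows the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Repository:
    """Hypothetical record describing one candidate repository."""
    name: str
    open_access: bool            # step 1: no application or private access needed
    dicom_images: bool           # step 1: scans released in DICOM format
    human_subjects: bool         # step 1: not animal or phantom data
    annotations: bool = False    # step 2: annotations/segmentations, any format
    reports: bool = False        # step 2: radiologist reports, any format
    clinical_info: bool = False  # step 2: clinical information, any format
    dicom_seg: bool = False      # step 3: segmentations released as DICOM-SEG
    dicom_sr: bool = False       # step 3: reports released as DICOM-SR


def step1(r: Repository) -> bool:
    return r.open_access and r.dicom_images and r.human_subjects


def step2(r: Repository) -> bool:
    return r.annotations or r.reports or r.clinical_info


def step3(r: Repository) -> bool:
    return r.dicom_seg or r.dicom_sr


def screen(repos: List[Repository]) -> Tuple[List[Repository], List[Repository], List[Repository]]:
    """Apply the three steps in order and return the survivors of each step."""
    after1 = [r for r in repos if step1(r)]
    after2 = [r for r in after1 if step2(r)]
    after3 = [r for r in after2 if step3(r)]
    return after1, after2, after3


# Toy input with made-up entries, only to exercise the three filters.
candidates = [
    Repository("toy-A", True, True, True, annotations=True, dicom_seg=True),
    Repository("toy-B", True, True, True, clinical_info=True),
    Repository("toy-C", True, True, True),
    Repository("toy-D", False, True, True, reports=True, dicom_sr=True),
]
after1, after2, after3 = screen(candidates)
print([r.name for r in after1], [r.name for r in after2], [r.name for r in after3])
```

Keeping each step as its own predicate makes it easy to report how many repositories survive each stage, which is how the results are summarized later in the flow diagram.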
Fig. 2 DICOM SR measurement report template structure (template ID 1500) and its sub-templates (from DICOM PS3.21). TID, template ID; MR, magnetic resonance; CT, computed tomography; PET, positron emission tomography
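
To make the tree model of a structured report more concrete, the sketch below builds a tiny content tree in plain Python and walks it depth first. It is only an illustration: the `ContentItem` class and the placeholder values are assumptions introduced here, the concept names are borrowed from Table 3, and the code does not read or write actual DICOM-SR objects.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class ContentItem:
    """Hypothetical stand-in for one node of an SR content tree."""
    value_type: str                     # e.g. CONTAINER, CODE, TEXT, NUM
    concept_name: str                   # the "name" of the name-value pair
    relationship: Optional[str] = None  # relationship with the parent node
    value: Optional[str] = None         # the "value" of the name-value pair
    children: List["ContentItem"] = field(default_factory=list)

    def add(self, child: "ContentItem") -> "ContentItem":
        self.children.append(child)
        return child


def walk(item: ContentItem, depth: int = 0) -> Iterator[Tuple[int, ContentItem]]:
    """Depth-first traversal yielding (nesting level, node) pairs."""
    yield depth, item
    for child in item.children:
        yield from walk(child, depth + 1)


# Placeholder values only; a real report would carry coded concepts and units.
root = ContentItem("CONTAINER", "Measurement report")
root.add(ContentItem("CODE", "Procedure reported", "HAS CONCEPT MOD", "placeholder procedure"))
measurements = root.add(ContentItem("CONTAINER", "Imaging measurements", "CONTAINS"))
group = measurements.add(ContentItem("CONTAINER", "Measurement group", "CONTAINS"))
group.add(ContentItem("NUM", "Mean", "CONTAINS", "placeholder value"))

for depth, item in walk(root):
    print(">" * depth, item.relationship or "", item.value_type, item.concept_name)
```

The nesting level printed for each node plays the same role as the NL column of Table 3.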
For this purpose, publicly accessible datasets including chest CT images with COVID-19 lesions, available during the first pandemic year (up to March 2021), have been identified in leading scientific research repositories. For each dataset, a sample of the data was downloaded to verify the format used for the release of CT scans and associated segmentations and clinical information.

Imaging software evaluation
In this section, we specifically consider the analysis of the software tools suitable for the diagnostic workflow. For this reason, we focused on the tools allowing at least the direct visualization of the DICOM images and analyzed their ability to manage the other data curation actions.
The software and tools included in our benchmark have been collected by a quasi-systematic Internet survey using the I Do Imaging (IDI) [62] initiative. Moreover, all software tools listed in the DICOM/PACS viewer webpage of Radiology Cafè [63] have been included to account for relevant software tools intended for clinical use.
The former includes all free software tools released and reviewed for research purposes, whereas the latter includes all software tested by a consultant radiologist and thought to represent the best currently available online, including core functionalities required by a radiologist for reviewing studies and/or teaching. In order to meet our evaluation criteria, the IDI database has been filtered for DICOM support, display of images and rank greater than four stars. Note that we have not included software or tools with no rating. Each software tool has been evaluated according to the following inclusion criteria:

• DICOM format: We have considered only software tools that could at least read a DICOM series.
• Display: This is a mandatory requirement; each software tool should, at least, display the image contained in a DICOM series.
• License: We considered free or open-source software. Commercial software or tools were included only if a free trial version was available.

After selecting software tools that fulfilled the inclusion criteria, we considered the following features for the software evaluation procedure:

• De-identification: Software that could modify a DICOM header, e.g., editing or removing tag values, and save back all the de-identified personal information;
• Segmentation and annotation: Software and tools that could read and/or write a DICOM file with segmentations or annotations, using the dedicated DICOM tags;
• Structured report: Reading and writing DICOM structured reports, i.e., enclosed documents or data.

To evaluate whether a specific software tool was able to read a DICOM-SR file, we checked if the DICOM-SR content was successfully read by the software and displayed to the operator.
In order to test the functionality of the selected software in the management of the DICOM fields, specific “probe” datasets have been created. Two exemplary publicly available DICOM folders, downloaded from TCIA Prostate-Diagnosis [64] and CMET-MRhead [65], served this purpose. For the segmentation and annotation functionalities, the data were already embedded as DICOM-SEG modality in the CMET-MRhead repository. Conversely, since the NRRD format was used for these data types in the Prostate-Diagnosis repository, the dcmqi tool [10] was used to convert NRRD segmentations to DICOM-SEG. To generate DICOM-SR from the probe datasets, the template TID1500 “Measurement Report” was used (from DICOM PS3.16, http://dicom.nema.org/medical/dicom/current/output/html/part16.html#sect_TID_1500). In particular, for both probe datasets, basic measurements (e.g., mean, standard deviation, median, range, volume) were evaluated on the original DICOM images by means of the associated DICOM-SEG and encapsulated in DICOM-SR by using the dcmqi package [10].
All software and tools ran on a Windows V10 operating system, except for the Horos DICOM Viewer, which was tested on Mac OS Catalina V10.15.6.

Results
Imaging database evaluation
Regarding the selection of databases to be analyzed, 210 databases were identified after the quasi-systematic web survey. 100 repositories were collected after the first selection criteria (step 1), among which 83 databases were selected by applying the criteria of step 2.
In step 3, we identified which of the 83 detected databases provided additional information according to the DICOM standard. In particular, we excluded all databases not including at least DICOM-SEG or DICOM-SR. After the exclusion of 49 items, 34 datasets were finally selected (Fig. 3).
Table 4 shows the 83 databases identified after step 2, with rows set in bold corresponding to the 34 datasets selected after applying step 3 criteria, including their characteristics (e.g., dataset name, collection, pathology)
and additional information other than DICOM images (e.g., CI, clinical info; A/S, annotations/segmentations; CR, clinical report). Focusing on the seventh, eighth and ninth columns (CI, A/S, CR), it is evident that, respectively:

• No clinical information, when available, is reported in DICOM format. Of the 67 databases including clinical information, only 29 provided clinical information that is sufficient to reconstruct DICOM-SR;
• Twenty-seven out of 56 databases that contain the definition of regions of interest use the DICOM-SEG format.
• Twenty-four datasets providing clinical reports, out of 83, use the DICOM-SR format. Furthermore, it was verified that all datasets comply with the correct DICOM format for data de-identification.

Of note, the highest result variability was found for A/S data (eighth column in Table 4). Indeed, excluding databases that provided segmentation or annotation in DICOM-SEG, ten databases released segmentation data in DICOM-RT, ten databases in image format (eight in NIfTI, one in mhd and one in NRRD data type), nine stored in tabular format (XLS and CSV data type) and nine stored in structured data format (four in XML and five in JSON data type). Except for image format, the remaining data formats require dedicated routines to manage A/S information by associating file content to positional information.
Focusing on the COVID-19 databases, 12 datasets have been identified [66–77]. Of note, we found that only two of the reviewed COVID-19 datasets were released in DICOM format [76, 77].
Five datasets use image formats that are not specific to the medical domain, such as portable network graphics [66–68], tagged image file format [69] and hierarchical data format
Identification
Databases identified after pseudo-systematic web revision (n = 210)

Screening (STEP I)
Databases excluded, with reasons (n = 110):
• not DICOM raw data (n = 50)
• not human subjects (n = 3)
• phantoms (n = 10)
• private databases (n = 11)
• required application (n = 26)
Databases with DICOM raw data (n = 100)

Screening (STEP II)
Databases excluded, with reasons (n = 17):
• not including additional info other than DICOM images
Databases with DICOM raw data and additional info (not necessarily in DICOM format) (n = 83):
• clinical info + annotations/segmentations + report (n = 20)
• clinical info + annotations/segmentations (n = 20)
• annotations/segmentations + report (n = 2)
• clinical info + report (n = 2)
• annotations/segmentations (n = 14)
• clinical info (n = 25)

Screening (STEP III)
Databases excluded, with reasons (n = 49):
• not including DICOM-SEG or DICOM-SR

Final Selection
Databases with DICOM raw data and DICOM-SEG or DICOM-SR (n = 34):
• DICOM-SEG (n = 10)
• DICOM-SR (n = 7)
• DICOM-SEG and DICOM-SR (n = 17)

Fig. 3 Flow diagram describing the process of imaging database evaluation and selection. DICOM, Digital Imaging and Communications in Medicine; DICOM-SEG, DICOM segmentation object; DICOM-SR, DICOM structured report object
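
As a quick arithmetic check, the stage totals in the flow diagram can be recomputed from the reported exclusions. The short sketch below uses only the figures stated above; reading the final "DICOM-SEG" and "DICOM-SR" entries as "SEG only" and "SR only" is an assumption, made because the three final groups then add up to 34.

```python
# Stage totals and exclusions as reported in the flow diagram.
identified = 210
excluded = {"step1": 110, "step2": 17, "step3": 49}

after_step1 = identified - excluded["step1"]
after_step2 = after_step1 - excluded["step2"]
after_step3 = after_step2 - excluded["step3"]

# Breakdown of the databases retained after step 2.
step2_groups = {
    "clinical info + annotations/segmentations + report": 20,
    "clinical info + annotations/segmentations": 20,
    "annotations/segmentations + report": 2,
    "clinical info + report": 2,
    "annotations/segmentations": 14,
    "clinical info": 25,
}

# Final selection; "only" is an assumed reading of the diagram labels.
final_groups = {"DICOM-SEG only": 10, "DICOM-SR only": 7, "both": 17}

seg_users = final_groups["DICOM-SEG only"] + final_groups["both"]
sr_users = final_groups["DICOM-SR only"] + final_groups["both"]

print(after_step1, after_step2, after_step3)
print(sum(step2_groups.values()), sum(final_groups.values()))
print(seg_users, sr_users)
```

Under that reading, the derived counts of databases using DICOM-SEG and DICOM-SR agree with the twenty-seven and twenty-four reported in the text.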
Table 4 Selected databases (n = 83) providing additional information other than DICOM images
Dataset name Dataset Pathology Region Modality Number CI A/S CR TCIA analysis
collection of results
samples

PNEUMONIA RSNA Pneumonia Lung RX 30,000 N Y (JSON) N

CT Lymph TCIA Lymphad- Abdomen, CT 176 N Y (NIfTI) N
Nodes enopathy mediastinum
Pancreas-CT TCIA Healthy con- Pancreas CT 82 N Y (NIfTI) N
trols
Prostate-3 T TCIA Prostate cancer Prostate MR 64 N Y (.mhd) N
RIDER Lung TCIA Lung cancer Chest CT 32 N Y (XLS), Y*a, YR N [78, 79]
CT
Brain-Tumor- TCIA Brain cancer Brain MR 20 N Y* N
Progression
CBIS-DDSM TCIA Breast cancer Breast MG 1566 N Y* N
QIN LUNG CT TCIA Non-small cell Lung CT 47 N Ya (NIfTI), Y*a N [78–82]
lung cancer
4D-Lung TCIA Non-small cell Lung CT 20 N YR N
lung cancer
AAPM RT-MAC TCIA Head and neck Head–neck MR 55 N YR N
Grand Chal- cancer
lenge 2019
Head–Neck TCIA Head and neck Head–neck CT, PT 111 N YR N
Cetuximab carcinomas
(RTOG 0522)
LCTSC TCIA Lung cancer Lung CT 60 N YR N
MRI-DIR TCIA Squamous cell Head and Neck MR, CT 9 N YR N
carcinoma
Anti-PD-1 TCIA Lung cancer Lung CT, PT, SC 46 N Y*a Y*a [83]
Lung
Anti-PD-1_ TCIA Melanoma Skin CT, MR, PT 47 N Y*a Y*a [83]
MELANOMA
CT Colonog- TCIA Colon cancer Colon CT 825 Yb N N
raphy (ACRIN
6664)
LDCT-and-Pro- TCIA Various Head, chest, CT 300 Yb N N
jection-data abdomen
NSCLC- TCIA Lung cancer Lung CT 89 Yb N N
Radiomics-
Genomics
REMBRANDT TCIA Low- and high- Brain MR 130 Yb N N
grade glioma
Acrin-FLT- TCIA Breast cancer Breast PET, CT, OT 83 Y N N
Breast (ACRIN
6688)
Acrin-FMISO- TCIA Glioblastoma Brain CT, MR, PT 45 Y N N
Brain (ACRIN
6684)
ACRIN-NSCLC- TCIA Non-small cell Lung PT, CT, MR, CR, 242 Y N N
FDG-PET lung cancer DX, SC, NM
(ACRIN 6668)
CPTAC-CM TCIA Cutaneous Skin MR, CT, CR, PT, 94 Y N N
melanoma pathology
CPTAC-LSCC TCIA Squamous cell Lung CT, CR, DX, 212 Y N N
carcinoma NM, PT, pathol-
ogy
CPTAC-LUAD TCIA Adenocarci- Lung CT, MR, PT, CR, 244 Y N N
noma pathology
TCGA-CESC TCIA Cervical Cervix MR, pathology 54 Yb N N

squamous cell
carcinoma and
endocervical
adenocarci-
noma
TCGA-ESCA TCIA Esophageal Esophagus CT, pathology 16 Yb N N
carcinoma
TCGA-KICH TCIA Kidney chro- Kidney CT, MR, pathol- 15 Y N N
mophobe ogy
TCGA-KIRP TCIA Kidney renal Renal CT, MR, PT, 33 Y N N
papillary cell pathology
carcinoma
TCGA-PRAD TCIA Prostate cancer Prostate CT, PT, MR, 14 Y N N
Pathology
TCGA-READ TCIA Rectum Rectum CT, MR, pathol- 3 Y N N
adenocarci- ogy
noma
TCGA-SARC TCIA Sarcomas Chest, abdo- CT, MR, pathol- 5 Y N N
men, pelvis, ogy
leg, TSpine
TCGA-STAD TCIA Stomach Stomach CT, pathology 46 Yb N N
adenocarci-
noma
TCGA-THCA TCIA Thyroid cancer Thyroid CT, PT, pathol- 6 Y N N
ogy
LGG-1p19qDe- TCIA Low grade Brain MR 159 Yb Y (NIfTI) N
letion glioma
Lung Fused- TCIA Lung cancer Lung CT, pathology 6 Yb N N
CT-Pathology
LungCT-Diag- TCIA Lung cancer Lung CT 61 Yb Y (XLS), Ya N [80–82]
nosis (NIfTI)
Prostate- TCIA Prostate cancer Prostate MR 92 Yb Y (.NRRD) N
Diagnosis
PROSTATEx TCIA Prostate cancer Prostate MR 346 Yb Y (CSV) N
SPIE-AAPM TCIA Lung cancer Lung CT 70 Yb Y (XLS) N
Lung CT Chal-
lenge
Head-Neck- TCIA Head and Head–neck CT, PT, SEG 137 Yb Y*,YR N
Radiomics- neck cancer
HN1
NSCLC-Radi‑ TCIA Lung cancer Lung CT, SEG 422 Yb Y*, Ya (NIfTI), N [84–86]
omics YR
NSCLC-Radi‑ TCIA Non-small cell Lung CT, SEG 22 Yb Y*, YR N
omics-Inter‑ lung cancer
observer1
TCGA-GBM TCIA Glioblastoma Brain MR, CT, DX, 262 Y Y*a, Ya (NIfTI, N [87–90]
multiforme pathology XML)
TCGA-LGG TCIA Low-grade Brain MR, CT, 199 Y Y*a, Ya (NIfTI) N [88–90]
glioma pathology
Head-Neck- TCIA Head and neck Head–neck PT, CT 298 Yb YR N
PET-CT cancer
HNSCC TCIA Head and neck Head–neck CT, PT, MR 627 Y YR N
squamous cell
carcinoma
HNSCC-3DCT- TCIA Head and neck Head–neck CT, DICOM-RT, 31 Yb YR N

RT squamous cell RTDOSE
carcinoma
OPC-Radi- TCIA Oropharyngeal Head-and- CT, DICOM-RT, 606 Yb YR N
omics neck Clinical
Soft-tissue- TCIA Soft-tissue Extremities FDG-PET/CT, 51 Y YR N
Sarcoma sarcoma MR
CPTAC-CCRCC TCIA Clear cell Kidney CT, MR, 222 Y Y*a Y*a [83]
carcinoma pathology
CPTAC-GBM TCIA Glioblastoma Brain CT, CR, SC, 189 Y Y*a Y*a [83]
multiforme MR, pathol‑
ogy
CPTAC- TCIA Head and Head–neck CT, MR, SC, 112 Y Y*a Y*a [83]
HNSCC neck cancer pathology
CPTAC-PDA TCIA Ductal adeno‑ Pancreas CT, MR, DX, 168 Y Y*a Y*a [83]
carcinoma PT, XA, CR,
US, pathol‑
ogy
TCGA-BLCA TCIA Bladder Bladder CT, CR, MR, 120 Y Y*a Y*a [83]
endothelial PT, DX,
carcinoma pathology
TCGA-COAD TCIA Colon adeno‑ Colon CT, pathology 25 Y Y*a Y*a [83]
carcinoma
TCGA-LUSC TCIA Lung squa‑ Lung CT, NM, PT, 37 Y Y*a Y*a [83]
mous cell pathology
carcinoma
TCGA-UCEC TCIA Uterine Uterus CT, CR, MR, 65 Y Y*a Y*a [83]
corpus PT, pathology
endometrial
carcinoma
Breast-MRI- TCIA Breast cancer Breast MR, SEG 64 Yb Y* Y*a [91]
NACT-Pilot
C4KC-KiTS TCIA Kidney cancer Kidney CT, SEG 210 Yb Y* N
ISPY1 (ACRIN TCIA Breast cancer Breast MR, SEG 222 Yb Y* Y*a [91]
6657)
NSCLC- TCIA Non-small cell Chest PT, CT, SEG, 211 Yb Y (XML), Y*, Y*a [83]
Radiogenom‑ lung cancer SR Y*a
ics
TCGA-HNSC TCIA Head and Head-neck CT, MR, PT, 227 Y Y*a, YR Y*a [83]
neck squa‑ Pathology
mous cell
carcinoma
LIDC-IDRI TCIA Lung cancer Chest CT, CR, DX 1010 Yb Y (XML), Y*a Y*a [78, 79, 92–95]
QIN-Head‑ TCIA Head and Head-neck PT, CT, SR, 156 Yb Y* Y*
Neck neck carcino‑ SEG, RWV
mas
CPTAC-UCEC TCIA Corpus Uterus CT, MR, PT, 250 Y Y*a Y* (incom‑ [83]
endometrial CR, DX, SR, plete SR), Y*a
carcinoma pathology
TCGA-KIRC TCIA Kidney renal Renal CT, MR, CR, 267 Y Ya (XLS) Y*a [96]
clear cell pathology
carcinoma
TCGA-BRCA TCIA Breast cancer Breast MR, MG, 139 Y Ya (XLS) Y*a [91, 97–102]
pathology
TCGA-LIHC TCIA Liver hepa‑ Liver MR, CT, PT, 97 Y Ya (XML) Y*a [96]
tocellular pathology
carcinoma
TCGA-LUAD TCIA Lung adeno‑ Chest CT, PT, NM, 69 Y Ya (XLS) Y*a [96]
carcinoma pathology
TCGA-OV TCIA Ovarian Ovary CT, MR, 143 Y Ya (XLS) Y*a [96]
serous pathology
cystadenocar‑
cinoma
CPTAC-SAR TCIA Sarcomas Various (11 CT, MR, PT, 94 Y N Y* (incom‑
locations) SR, pathology plete SR)
Breast Diag‑ TCIA Breast cancer Breast MR, PT, CT, 88 Yb N Y (.XLS), Y*a [91]
nosis MG
COVID-19-AR TCIA COVID-19 Chest CT, CR, DX 105 Y N N
MIDRC- TCIA COVID-19 Chest CT 110 Yb Y (.JSON) N
RICORD-1a
MIDRC- TCIA COVID-19 Chest CT 117 Yb N N
RICORD-1b
MIDRC- TCIA COVID-19 Chest CR, DX 361 Yb Y (.JSON) N
RICORD-1c
ELCAP Public NA Lung cancer Lung CT 50 N Y (.CSV) N
Lung Image
Database
I2CVB Prostate NA Prostate cancer Prostate MR 12 Y N N
MIMBCD-UI MIMBCD Breast cancer Breast US, MG, MRI 3 Y N N
UTA4
MIMBCD-UI MIMBCD Breast cancer Breast US, MG 6 Y Y (.JSON) N
UTA7
MIMBCD-UI MIMBCD Breast cancer Breast US, MG 6 Y Y (.JSON) N
UTA10
eNKI_RS_TRT NKI_RS No Brain MRI 24 Y N N
TCIA, The Cancer Imaging Archive; CT, computed tomography; MR, magnetic resonance; PT (or PET), positron emission tomography; SR, structured report; CR,
computed radiography; MG, mammography; DX, digital radiography; XA, X-ray angiography; SC, secondary capture; NM, nuclear medicine; OT, other modality; RWV,
real-world value; CI, clinical info; No, number of cases; A/S, annotations/segmentations; CR, clinical report; JSON, JavaScript object notation; NIfTI, Neuroimaging
Informatics Technology Initiative; CSV, comma separated value; XLS, excel spreadsheet; XML, extensible markup language; NRRD, nearly raw raster data; mhd,
MetaImage; Y, yes; N, no; NA, not applicable
Lines with bold correspond to the finally selected datasets (n = 34)
* Additional information provided according to DICOM format (DICOM-SEG for annotations/segmentations and DICOM-SR for clinical reports)
a Annotations/segmentations and/or clinical reports provided by researchers who were not part of the group which originally submitted the related TCIA collection, with related references reported in “TCIA analysis results” column
b Clinical info supposed to be enough to reconstruct DICOM-SR (namely if related imaging series is provided and/or if the clinical information refers to a single imaging modality)
R Annotations/segmentations provided as DICOM-RT

[70]. The remaining datasets [71–75] use the NIfTI format for CT scans and, when available, segmentation images.

Imaging software evaluation
Although the following software met the inclusion criteria, CollectiveMinds (www.cmrad.com) was excluded since it turned out to be a web platform for collaborative reporting, accessible only to licensed medical doctors, and Papaya (http://mangoviewer.com/papaya.html) was excluded since it is based on the same API as Mango.
Finally, ten software tools resulted from the quasi-systematic selection, as listed in Table 5.
All software tools proved to be suitable for basic operations such as loading, selecting, viewing and manipulating the raw images from the probe datasets.
The third and fourth columns of Table 5 show the results, in dichotomous representation, of the software evaluation with respect to annotation/segmentation and structured report, respectively.
For each selected software tool, the aforementioned procedures were tested for both reading and writing
operations. The following results summarize the software evaluation procedure:

• Four out of ten analyzed software tools were able to modify the DICOM header by editing DICOM tag values.
• None of the selected software tools allows the writing of regions of interest in the DICOM-SEG format; at the same time, two software tools (PostDicom and 3D Slicer) allow DICOM-SEG reading.
• Three software tools (RadiAnt, PostDicom and Horos Viewer) allow the reading of information encoded in the DICOM-SR format, and none of the analyzed software tools allows DICOM-SR writing.

Of note, 3D Slicer software allows both writing DICOM-SEG and reading DICOM-SR by the use of the “QuantitativeReporting” extension (https://qiicr.gitbook.io/quantitativereporting-guide/).
In addition, only two software tools (Horos Viewer and MRIcroGL) allow the de-identification of DICOM folders in full compliance with the DICOM standard.

Discussion
In the era of big data, it is increasingly important to pay attention and care to data management in order to fully exploit the potential of modern analytical techniques. The definition and the proper use of standards certainly play a leading role in this perspective.
The DICOM standard has been described and evaluated for a series of key actions involved in the radiological workflow. The results of our research first show that the DICOM format fully supports them, allowing much of the information necessary for subsequent analytical phases to be encapsulated in a single format.
Considering a typical radiomic workflow [98, 99], an artificial system can find appropriately de-identified data, information related to the patient’s clinical status (DICOM-SR) and information on the localization of the region of interest within a single DICOM folder. For example, the details concerning the ROI (DICOM-SEG) can be useful to calculate the radiomic descriptors, thus favoring the aggregation of suitable data to develop reliable systems for classification or prediction of clinical outcomes.
It is interesting to note that although the DICOM format has considerable potential to foster big data analytics, it is only partially exploited in the sharing of imaging data for research purposes.
The results of the imaging database evaluation show that the DICOM-SR format is rarely used to contain clinical information. It should be pointed out that all the 24 datasets providing clinical reports released such information in DICOM-SR format according to guidelines drawn up for challenge tasks or specific initiatives [83, 91, 92, 96]. Moreover, regarding DICOM-SEG, one-third of the analyzed datasets use different and not fully standardized formats to share information on regions of interest. On this topic, the “DICOM4QI” initiative is worthy of note; it aims at evaluating interoperability of the image analysis tools and workstations, applied to exchange of the

Table 5 Evaluation of DICOM viewers included in the study

Software DE-ID DICOM-SEG reading DICOM-SEG writing DICOM-SR reading DICOM-SR writing License Release date Link

RadiAnt N N* N Y N Free trial + commercial 29/04/2020 https://www.radiantviewer.com/
ProSurgical3D N N N N N Free trial + commercial 25/06/2019 https://www.stratovan.com/products/pro-surgical-3d
PostDicom N Y N Y N Free trial + commercial N/A https://www.postdicom.com/
Horos Viewer Y N N Y N LGPL-3.0 19/12/2019 https://horosproject.org/
3D Slicer P Y Y** Y** N BSD-style 22/05/2019 https://www.slicer.org/
Mango P N N N N RII-UTHSCSA 24/03/2019 http://ric.uthscsa.edu/mango/
ITK-SNAP N N N N N GNU General Public License 12/06/2019 http://www.itksnap.org/pmwiki/pmwiki.php
medInria N N N N N BSD 4-Clause 11/06/2020
MRIcroGL Y N N N N BSD 2-Clause 31/3/2020 https://www.nitrc.org/projects/mricrogl
BrainVISA Anatomist N N N N N CeCILL License V 2 25/09/2018 http://brainvisa.info/web/index.html
Y, yes; N, no; P, partial
* DICOM-SEG successfully loaded but misinterpreted by the software
** Including “QuantitativeReporting” extension (https://qiicr.gitbook.io/quantitativereporting-guide/)
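
The summary counts quoted for the software evaluation can be recomputed from Table 5. The sketch below transcribes the five feature columns of the table into a dictionary (the `TABLE5`, `COLUMNS` and `tools_with` names are introduced here for illustration) and tallies, for each feature, which tools support it natively and which only through an extension.

```python
# Feature columns copied from Table 5, in this order:
# DE-ID, DICOM-SEG reading, DICOM-SEG writing, DICOM-SR reading, DICOM-SR writing.
# "P" = partial, "N*" = loaded but misinterpreted, "Y**" = via QuantitativeReporting.
COLUMNS = ("de_id", "seg_read", "seg_write", "sr_read", "sr_write")
TABLE5 = {
    "RadiAnt": ("N", "N*", "N", "Y", "N"),
    "ProSurgical3D": ("N", "N", "N", "N", "N"),
    "PostDicom": ("N", "Y", "N", "Y", "N"),
    "Horos Viewer": ("Y", "N", "N", "Y", "N"),
    "3D Slicer": ("P", "Y", "Y**", "Y**", "N"),
    "Mango": ("P", "N", "N", "N", "N"),
    "ITK-SNAP": ("N", "N", "N", "N", "N"),
    "medInria": ("N", "N", "N", "N", "N"),
    "MRIcroGL": ("Y", "N", "N", "N", "N"),
    "BrainVISA Anatomist": ("N", "N", "N", "N", "N"),
}


def tools_with(column: str, accepted: set) -> list:
    """Return the tools whose entry in the given column is one of the accepted marks."""
    i = COLUMNS.index(column)
    return [name for name, row in TABLE5.items() if row[i] in accepted]


print("header editing (full or partial):", tools_with("de_id", {"Y", "P"}))
print("full de-identification:", tools_with("de_id", {"Y"}))
print("DICOM-SEG reading:", tools_with("seg_read", {"Y"}))
print("DICOM-SEG writing, native:", tools_with("seg_write", {"Y"}))
print("DICOM-SR reading, native:", tools_with("sr_read", {"Y"}))
print("DICOM-SR reading, incl. extension:", tools_with("sr_read", {"Y", "Y**"}))
print("DICOM-SR writing:", tools_with("sr_write", {"Y", "Y**"}))
```

Separating native support from extension-based support keeps the 3D Slicer entries, which depend on the QuantitativeReporting extension, from being counted together with built-in features.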

quantitative image analysis results using DICOM standard [7, 9, 10].
The imaging software evaluation shows that the support provided by software to fully exploit the potential of the DICOM format is still considerably limited, reducing the possibility for researchers and clinicians to create and make available suitable DICOM datasets. Therefore, there is a need to spur the development of initiatives that increase the attention on radiological software not only for visualization and reporting, but also for preparing datasets suitable for big data analytics. Interesting and helpful initiatives, such as the OHIF viewer initiative [103] and the NCI Imaging Data Commons (IDC), should be highlighted. Although not included in the present study, the former aimed to deliver an extensible platform to support site-specific workflows and accommodate evolving research requirements, according to DICOM specifications. IDC, instead, highlighted the role of the DICOM format as a cornerstone for sharing data and harmonizing analyses [104].
The need to promote the DICOM standard is additionally demonstrated by the analysis of the CT COVID-19 datasets. Indeed, only the two most recent COVID-19 collections released CT data in DICOM format. Although DICOM is the format used in the clinical acquisition routine, its limited adoption may indicate that the current emergency conditions increase the difficulty of finding adequate tools to manage the standard and, therefore, of promptly proceeding to the de-identification, sharing and care of data. This difficulty may therefore lead researchers to use alternative formats that are easier to manage and disseminate, such as textual tables and non-medical image data.
It is important to note that this work includes databases mainly oriented to oncological studies, although this was not a prerequisite. Indeed, neurological studies, in which diagnostic images play a decisive role [105–107], have not been evaluated because ad hoc standards and tools have been defined to manage and share neuroimaging data. In this field, the dedicated standard named Brain Imaging Data Structure (BIDS) [108] has been developed to organize and describe neuroimaging data using file formats other than DICOM (e.g., NIfTI, JSON, text files). Similar to DICOM, the BIDS standard allows the management of both metadata and derived quantitative measurements, opening the way to automated data analysis workflows.
It should be considered that, to enhance the feasibility of this work, narrow inclusion criteria that may not allow a comprehensive analysis were chosen and, therefore, it is not possible to exclude that some useful data have been neglected. For example, software tools and databases whose release requires the submission and approval of specific research projects were not considered.

Conclusions
In conclusion, this work provides an overview of the potential, not always exploited, of the DICOM format for capitalizing the radiological workflow from a big data perspective. The analysis of both the databases and the software shows that further efforts are needed by researchers, clinicians and companies to promote and facilitate the use of standards to increase the value of imaging data, according to FAIR principles.

Abbreviations
A/S: Annotations/segmentations; AI: Artificial intelligence; CI: Clinical info; COVID-19: Coronavirus disease 2019; CR: Clinical report; CR: Computed radiography; CSV: Comma separated value; CT: Computed tomography; DICOM: Digital Imaging and Communications in Medicine; DICOM-RT: DICOM radiation therapy; JSON: JavaScript Object Notation; mhd: MetaImage; MR: Magnetic resonance; NEMA: National Electrical Manufacturers Association; NIfTI: Neuroimaging Informatics Technology Initiative; NRRD: Nearly raw raster data; PT (or PET): Positron emission tomography; ROI: Region of interest; RTDOSE: DICOM-RT dose; RTPLAN: DICOM-RT plan; SR: Structured report; TCIA: The Cancer Imaging Archive; TID: Template identifier; XLS: Excel spreadsheet; XML: Extensible markup language.

Authors’ contributions
Guarantor of integrity of the entire study was MA. Study ideation and design were contributed by MA and MS. Databases and software evaluation were contributed by VB, GE and PB. Manuscript editing was contributed by MA, GP and MS. All authors contributed to the writing and read and approved the final manuscript.

Funding
This work was partially funded by the Italian Ministry for Education, University and Research—Project “MOLIM ONCOBRAIN LAB—Metodi innovativi di imaging molecolare per lo studio di malattie oncologiche e neurodegenerative” ARS0100144 Prot. U.001378531-08-2018 and by the Italian Ministry of Health: “Ricerca Corrente” project and by POR CAMPANIA FESR 2014–2020, AP1-OS1.3 (DGR no. 140/2020) for project “Protocolli TC del torace a bassissima dose e tecniche di intelligenza artificiale per la diagnosi precoce e quantificazione della malattia da COVID-19” CUP D54I20001410002.

Availability of data and materials
This work is based on data and information already available with open access.

Declarations

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Author details
1 IRCCS SDN, Via Emanuele Gianturco 113, 80143 Naples, Italy. 2 Bio Check Up S.R.L, Naples, Italy.

Received: 27 May 2021 Accepted: 25 August 2021

References
1. Aiello M, Cavaliere C, D’Albore A, Salvatore M (2019) The challenges of diagnostic imaging in the era of big data. J Clin Med 8:316
2. Cirillo D, Valencia A (2019) Big data analytics for personalized medicine. Curr Opin Biotechnol 58:161–167. https://doi.org/10.1016/j.copbio.2019.03.004
3. Dinov ID (2016) Volume and value of big healthcare data. J Med Stat Inform. https://doi.org/10.7243/2053-7662-4-3
4. Peterson CB, Hamilton C, Hasvold P (2016) From innovation to implementation: eHealth in the WHO European region. WHO Regional Office for Europe, Copenhagen
5. Mildenberger P, Eichelberg M, Martin E (2002) Introduction to the DICOM standard. Eur Radiol 12:920–927. https://doi.org/10.1007/s003300101100
6. Larobina M, Murino L (2014) Medical image file formats. J Digit Imaging 27:200–206. https://doi.org/10.1007/s10278-013-9657-9
7. Fedorov A, Clunie D, Ulrich E et al (2016) DICOM for quantitative imaging biomarker development: a standards based approach to sharing clinical data and structured PET/CT analysis results in head and neck cancer research. PeerJ 4:e2057. https://doi.org/10.7717/peerj.2057
8. Fedorov A, Schwier M, Clunie D et al (2018) An annotated test-retest collection of prostate multiparametric MRI. Sci Data 5:1–13. https://doi.org/10.1038/sdata.2018.281
9. Fedorov A, Rubin D, Kalpathy-Cramer J et al (2015) Interoperable communication of quantitative image analysis results using DICOM standard. figshare
10. Herz C, Fillion-Robin J-C, Onken M et al (2017) DCMQI: an open source library for standardized communication of quantitative image analysis results using DICOM. Cancer Res 77:e87–e90. https://doi.org/10.1158/0008-5472.CAN-17-0336
11. Aryanto KYE, Oudkerk M, van Ooijen PMA (2015) Free DICOM de-identification tools in clinical research: functioning and safety of patient privacy. Eur Radiol 25:3685–3695. https://doi.org/10.1007/s00330-015-3794-0
12. Medema J Security and Privacy in DICOM. 2
13. Simpson AL, Antonelli M, Bakas S et al (2019) A large annotated medical image dataset for the development and evaluation of segmentation algorithms. arXiv:1902.09063 [cs, eess]
14. Hodson S, Jones S, Collins S et al (2018) Turning FAIR data into reality: interim report from the European Commission Expert Group on FAIR data. https://doi.org/10.5281/zenodo.1285272
15. Vesteghem C, Brøndum RF, Sønderkær M et al (2019) Implementing the FAIR Data Principles in precision oncology: review of supporting initiatives. Brief Bioinform. https://doi.org/10.1093/bib/bbz044
16. Caspers J (2021) Translation of predictive modeling and AI into clinics: a question of trust. Eur Radiol 31:4947–4948. https://doi.org/10.1007/s00330-021-07977-9
17. Moore SM, Maffitt DR, Smith KE et al (2015) De-identification of medical images with retention of scientific research value. Radiographics 35:727–735. https://doi.org/10.1148/rg.2015140244
18. Vcelak P, Kryl M, Kratochvil M, Kleckova J (2019) Identification and classification of DICOM files with burned-in text content. Int J Med Inform 126:128–137. https://doi.org/10.1016/j.ijmedinf.2019.02.011
19. Monteiro E, Costa C, Oliveira JL (2017) A de-identification pipeline for ultrasound medical images in DICOM format. J Med Syst 41:89. https://doi.org/10.1007/s10916-017-0736-1
20. DicomCleaner™. http://www.dclunie.com/pixelmed/software/webstart/DicomCleanerUsage.html. Accessed 6 Aug 2021
21. PyDicom. DICOM in Python
22. Parks CL, Monson KL (2017) Automated facial recognition of computed tomography-derived facial images: patient privacy implications. J Digit Imaging 30:204–214. https://doi.org/10.1007/s10278-016-9932-7
23. Schwarz CG, Kremers WK, Therneau TM et al (2019) Identification of anonymous MRI research participants with face-recognition software. N Engl J Med 381:1684–1686. https://doi.org/10.1056/NEJMc1908881
24. Chen JJ-S, Juluru K, Morgan T et al (2014) Implications of surface-rendered facial CT images in patient privacy. AJR Am J Roentgenol 202:1267–1271. https://doi.org/10.2214/AJR.13.10608
25. Temal L, Dojat M, Kassel G, Gibaud B (2008) Towards an ontology for sharing medical images and regions of interest in neuroimaging. J Biomed Inform 41:766–778. https://doi.org/10.1016/j.jbi.2008.03.002
26. Kohli MD, Summers RM, Geis JR (2017) Medical image data and datasets in the era of machine learning—whitepaper from the 2016 C-MIMI meeting dataset session. J Digit Imaging 30:392–399. https://doi.org/10.1007/s10278-017-9976-3
27. Channin DS, Mongkolwat P, Kleper V, Rubin DL (2009) The annotation and image mark-up project. Radiology 253:590–592. https://doi.org/10.1148/radiol.2533090135
28. Tommasi T, Orabona F, Caputo B (2008) Discriminative cue integration for medical image annotation. Pattern Recogn Lett 29:1996–2002. https://doi.org/10.1016/j.patrec.2008.03.009
29. Philbrick KA, Weston AD, Akkus Z et al (2019) RIL-contour: a medical imaging dataset annotation tool for and with deep learning. J Digit Imaging. https://doi.org/10.1007/s10278-019-00232-0
30. Law MYY, Liu B (2009) DICOM-RT and its utilization in radiation therapy. Radiographics 29:655–667. https://doi.org/10.1148/rg.293075172
31. A.51 Segmentation IOD. http://dicom.nema.org/medical/dicom/current/output/chtml/part03/sect_A.51.html
32. DCMTK - DICOM Toolkit. https://dicom.offis.de/dcmtk.php.en
33. Insight Toolkit: ITK. https://itk.org/
34. McCormick MM, Liu X, Ibanez L, Jomier J, Marion C (2014) ITK: enabling reproducible research and open science. Front Neuroinform. https://doi.org/10.3389/fninf.2014.00013
35. pydicom-seg. https://github.com/razorx89/pydicom-seg
36. Marcovici PA, Taylor GA (2014) JOURNAL CLUB: structured radiology reports are more complete and more effective than unstructured reports. AJR Am J Roentgenol 203:1265–1271. https://doi.org/10.2214/AJR.14.12636
37. Weiss DL, Langlotz CP (2008) Structured reporting: patient care enhancement or productivity nightmare? Radiology 249:739–747. https://doi.org/10.1148/radiol.2493080988
38. Ganeshan D, Duong P-AT, Probyn L et al (2018) Structured reporting in radiology. Acad Radiol 25:66–73. https://doi.org/10.1016/j.acra.2017.08.005
39. Gul P, Gul P (2019) Education in radiology structured reporting in radiology. Are we ready to implement it? PJR 29(1):49–53
40. European Society of Radiology (ESR) (2018) ESR paper on structured reporting in radiology. Insights Imaging 9:1–7. https://doi.org/10.1007/s13244-017-0588-8
41. Pinto dos Santos D, Baeßler B (2018) Big data, artificial intelligence, and structured reporting. Eur Radiol Exp 2:42. https://doi.org/10.1186/s41747-018-0071-4
42. Marcheschi P (2017) Relevance of eHealth standards for big data interoperability in radiology and beyond. Radiol Med 122:437–443. https://doi.org/10.1007/s11547-016-0691-9
43. Pinto dos Santos D, Kotter E (2018) Structured radiology reporting on an institutional level—benefit or new administrative burden? Ann NY Acad Sci 1434:274–281. https://doi.org/10.1111/nyas.13741
44. Langlotz CP (2006) RadLex: a new method for indexing online educational materials. Radiographics 26:1595–1597. https://doi.org/10.1148/rg.266065168
45. Morgan TA, Helibrun ME, Kahn CE (2014) Reporting initiative of the radiological society of North America: progress and new directions. Radiology 273:642–645. https://doi.org/10.1148/radiol.14141227
46. Kahn CE, Genereaux B, Langlotz CP (2015) Conversion of radiology reporting templates to the MRRT standard. J Digit Imaging 28:528–536. https://doi.org/10.1007/s10278-015-9787-3
47. Clunie DA (2000) DICOM structured reporting. PixelMed Pub, Bangor
48. Hussein R, Engelmann U, Schroeter A, Meinzer H-P (2004) DICOM structured reporting: part 1. Overview and characteristics. Radiographics 24:891–896. https://doi.org/10.1148/rg.243035710
49. Hussein R, Engelmann U, Schroeter A, Meinzer H-P (2004) DICOM structured reporting: part 2. Problems and challenges in implementation for PACS workstations. Radiographics 24:897–909. https://doi.org/10.1148/rg.243035722
50. Torres JS, Damian SegrellesQuilis J, Espert IB, García VH (2012) Improving knowledge management through the support of image examination and data annotation using DICOM structured reporting. J Biomed Inform 45:1066–1074. https://doi.org/10.1016/j.jbi.2012.07.004
51. Noumeir R (2003) DICOM structured report document type definition. IEEE Trans Inf Technol Biomed 7:318–328. https://doi.org/10.1109/TITB.2003.821334
52. Open-Access Medical Image Repositories-aylward.org (2020). http://www.aylward.org/notes/open-access-medical-image-repositories. Accessed 30 Sept 2020
53. Sfikas G (2020) sfikas/medical-imaging-datasets
54. FAIRsharing. https://fairsharing.org/biodbcore/?q=dicom
55. Medical data for machine learning. https://github.com/beamandrew/medical-data
56. Pan F, Ye T, Sun P et al (2020) Time course of lung changes at chest CT during recovery from coronavirus disease 2019 (COVID-19). Radiology 295:715–721. https://doi.org/10.1148/radiol.2020200370
57. Computer‐aided diagnosis in the era of deep learning-Chan-2020-Medical Physics-Wiley Online Library (2020). https://aapm.onlinelibrary.wiley.com/doi/10.1002/mp.13764. Accessed 21 July 2020
58. Gieraerts C, Dangis A, Janssen L et al (2020) Prognostic value and reproducibility of AI-assisted analysis of lung involvement in COVID-19 on low-dose submillisievert chest CT: sample size implications for clinical trials. Radiol Cardiothorac Imaging 2:e200441. https://doi.org/10.1148/ryct.2020200441
59. Wang M, Xia C, Huang L et al (2020) Deep learning-based triage and analysis of lesion burden for COVID-19: a retrospective study with external validation. Lancet Digital Health 2:e506–e515. https://doi.org/10.1016/S2589-7500(20)30199-0
60. Lessmann N, Sánchez CI, Beenen L et al (2021) Automated assessment of COVID-19 reporting and data system and chest CT severity scores in patients suspected of having COVID-19 using artificial intelligence. Radiology 298:E18–E28. https://doi.org/10.1148/radiol.2020202439
61. Harmon SA, Sanford TH, Xu S et al (2020) Artificial intelligence for the detection of COVID-19 pneumonia on chest CT using multinational datasets. Nat Commun 11:4080. https://doi.org/10.1038/s41467-020-17971-2
62. Free medical imaging software: I do imaging (2020). https://idoimaging.com/. Accessed 30 Sept 2020
63. Clarke C (2020) DICOM viewers. In: Radiology cafe. https://www.radiologycafe.com/radiology-trainees/dicom-viewers. Accessed 30 Sept 2020
64. PROSTATE-DIAGNOSIS-The Cancer Imaging Archive (TCIA) Public Access-Cancer Imaging Archive Wiki. https://wiki.cancerimagingarchive.net/display/Public/PROSTATE-DIAGNOSIS
65. Slicer Server by Kitware-CMET-MRhead (2020). http://slicer.kitware.com/midas3/folder/4433. Accessed 30 Sept 2020
66. Peng Y, Tang Y, Lee S, Zhu Y, Summers RM, Lu Z (2020) COVID-19-CT-CXR: a freely accessible and weakly labeled chest X-ray and CT image collection on COVID-19 from biomedical literature. arXiv
67. Zhao J, Zhang Y, He X, Xie P (2020) COVID-CT-dataset: a CT scan dataset about COVID-19. arXiv:2003.13865 [cs, eess, stat]
68. Angelov P, Soares E (2020) Explainable-by-design approach for COVID-19 classification via CT-scan. Health Informatics
69. Rahimzadeh M, Attar A, Sakhaei SM (2020) A fully automated deep learning-based network for detecting COVID-19 from a new and large lung CT scan dataset. Biomed Signal Process Control 68:102588
70. Song J, Wang H, Liu Y et al (2020) End-to-end automatic differentiation of the coronavirus disease 2019 (COVID-19) from viral pneumonia based on chest CT. Eur J Nucl Med Mol Imaging 47:2516–2524. https://doi.org/10.1007/s00259-020-04929-1
71. Vayá M de la I, Saborit JM, Montell JA et al (2020) BIMCV COVID-19+: a large annotated dataset of RX and CT images from COVID-19 patients. arXiv:2006.01174 [cs, eess]
72. Zhang K, Liu X, Shen J et al (2020) Clinically applicable AI system for accurate diagnosis, quantitative measurements, and prognosis of COVID-19 pneumonia using computed tomography. Cell 181:1423–1433.e11. https://doi.org/10.1016/j.cell.2020.04.045
73. An P, Xu S, Harmon SA et al (2020) CT images in COVID-19
74. Jun M, Cheng G, Yixin W et al (2020) COVID-19 CT lung and infection segmentation dataset
75. Morozov SP, Andreychenko AE, Pavlov NA et al (2020) MosMedData: chest CT scans with COVID-19 related findings dataset. arXiv:2005.06465 [cs, eess]
76. Desai S, Baghal A, Wongsurawat T et al (2020) Chest imaging representing a COVID-19 positive rural U.S. population. Sci Data 7:414. https://doi.org/10.1038/s41597-020-00741-6
77. Tsai EB, Simpson S, Lungren MP et al (2021) The RSNA international COVID-19 open radiology database (RICORD). Radiology 299:E204–E213. https://doi.org/10.1148/radiol.2021203957
78. Kalpathy-Cramer J, Zhao B, Goldgof D et al (2016) A comparison of lung nodule segmentation algorithms: methods and results from a multi-institutional study. J Digit Imaging 29:476–487. https://doi.org/10.1007/s10278-016-9859-z
79. Jayashree Kalpathy-Cramer SN (2015) Multi-site collection of lung CT data with nodule segmentations. The Cancer Imaging Archive
80. Goldgof D, Hall L, Hawkins S et al (2017) Long and short survival in adenocarcinoma lung CTs DICOM. Available via https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=24284406
81. Paul R, Hawkins S, Yoganand B, Goldgof D (2016) Deep feature transfer learning in combination with traditional features predicts survival among patients with lung adenocarcinoma. Tomography 2:388–395. https://doi.org/10.18383/j.tom.2016.00211
82. Hawkins SH, Korecki JN, Balagurunathan Y et al (2014) Predicting outcomes of nonsmall cell lung cancer using CT image features. IEEE Access 2:1418–1426. https://doi.org/10.1109/ACCESS.2014.2373335
83. Urban T, Ziegler E, Pieper S et al (2019) Crowds cure cancer: crowdsourced data collected at the RSNA 2018 annual meeting. Available via https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=52757630
84. Kiser K, Ahmed S, Stieb SM et al (2021) Thoracic volume and pleural effusion segmentations in diseased lungs for benchmarking chest CT processing pipelines. Available via https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=68551327
85. Kiser KJ, Barman A, Stieb S, Fuller CD, Giancardo L (2020) Novel autosegmentation spatial similarity metrics capture the time required to correct segmentations better than traditional metrics in a thoracic cavity segmentation workflow. J Digit Imaging 34(3):541–553
86. Aerts HJWL, Wee L, Rios Velazquez E et al (2019) Data from NSCLC-radiomics
87. Gevaert O, Mitchell LA, Achrol AS et al (2014) Glioblastoma multiforme: exploratory radiogenomic analysis by using quantitative image features. Radiology 273:168–174. https://doi.org/10.1148/radiol.14131731
88. Bakas S, Akbari H, Sotiras A et al (2017) Segmentation labels for the preoperative scans of the TCGA-GBM collection. Available via https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=24282666
89. Bakas S, Akbari H, Sotiras A et al (2017) Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic features. Sci Data 4:170117. https://doi.org/10.1038/sdata.2017.117
90. Beers A, Gerstner E, Rosen B et al (2018) DICOM-SEG conversions for TCGA-LGG and TCGA-GBM segmentation datasets. Available via https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=41517733
91. Clunie DA, Hickman H, Ver Hoef W et al (2019) DICOM SR of clinical data and measurement for breast cancer collections to TCIA. Available via https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=50135479
92. Fedorov A, Hancock M, Clunie D et al (2018) Standardized representation of the TCIA LIDC-IDRI annotations using DICOM. Available via https://peerj.com/preprints/27378.pdf
93. Fedorov A, Hancock M, Clunie D et al (2019) Standardized representation of the LIDC annotations using DICOM. Available via https://peerj.com/preprints/27378/
94. Armato SG, McLennan G, Bidaut L et al (2011) The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): a completed reference database of lung nodules on CT scans: the LIDC/IDRI thoracic CT database of lung nodules. Med Phys 38:915–931. https://doi.org/10.1118/1.3528204
95. Armato SG, McLennan G, Bidaut L et al (2015) Data From LIDC-IDRI
96. Kalpathy-Cramer J, Beers A, Mamonov A et al (2019) Crowds cure cancer: crowdsourced data collected at the RSNA 2017 annual meeting. Available via https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=33948774
97. Burnside ES, Drukker K, Li H et al (2016) Using computer-extracted image phenotypes from tumors on breast magnetic resonance imaging to predict breast cancer pathologic stage: breast MRI phenotypes predict stage. Cancer 122:748–757. https://doi.org/10.1002/cncr.29791
98. Guo W, Li H, Zhu Y et al (2015) Prediction of clinical phenotypes in invasive breast carcinomas from the integration of radiomics and genomics data. J Med Imag 2:041007. https://doi.org/10.1117/1.JMI.2.4.041007
99. Zhu Y, Li H, Guo W et al (2015) Deciphering genomic underpinnings of quantitative MRI-based radiomic phenotypes of invasive breast carcinoma. Sci Rep 5:17787. https://doi.org/10.1038/srep17787
100. Li H, Zhu Y, Burnside ES et al (2016) MR imaging radiomics signatures for predicting the risk of breast cancer recurrence as given by research versions of MammaPrint, Oncotype DX, and PAM50 gene assays. Radiology 281:382–391. https://doi.org/10.1148/radiol.2016152110
101. Li H, Zhu Y, Burnside ES et al (2016) Quantitative MRI radiomics in the prediction of molecular classifications of breast cancer subtypes in the TCGA/TCIA data set. NPJ Breast Cancer. https://doi.org/10.1038/npjbcancer.2016.12
102. Morris E, Burnside E, Whitman G et al (2014) Using computer-extracted image phenotypes from tumors on breast MRI to predict stage
103. Ziegler E, Urban T, Brown D et al (2020) Open health imaging foundation viewer: an extensible open-source framework for building web-based imaging applications to support cancer research. JCO Clin Cancer Inform. https://doi.org/10.1200/CCI.19.00131
104. Fedorov A, Beichel R, Kalpathy-Cramer J et al (2020) Quantitative imaging informatics for cancer research. JCO Clin Cancer Inform 4:444–453. https://doi.org/10.1200/CCI.19.00165
105. Aiello M, Cavaliere C, Salvatore M (2016) Hybrid PET/MR imaging and brain connectivity. Front Neurosci 10:64
106. Marchitelli R, Aiello M, Cachia A et al (2018) Simultaneous resting-state FDG-PET/fMRI in Alzheimer disease: relationship between glucose metabolism and intrinsic activity. Neuroimage 176:246–258. https://doi.org/10.1016/j.neuroimage.2018.04.048
107. Aiello M, Cavaliere C, Fiorenza D, Duggento A, Passamonti L, Toschi N (2018) Neuroinflammation in neurodegenerative diseases: current multi-modal imaging studies and future opportunities for hybrid PET/MRI. Neuroscience. https://doi.org/10.1016/j.neuroscience.2018.07.033
108. Gorgolewski KJ, Alfaro-Almagro F, Auer T et al (2017) BIDS apps: improving ease of use, accessibility, and reproducibility of neuroimaging data analysis methods. PLoS Comput Biol 13:e1005209. https://doi.org/10.1371/journal.pcbi.1005209

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

2 pages
Criteria and Mechanics
No ratings yet
Criteria and Mechanics
5 pages
Using Report SDVBUK00: Symptom
No ratings yet
Using Report SDVBUK00: Symptom
4 pages
Manual - RouterOS Features - MikroTik Wiki
No ratings yet
Manual - RouterOS Features - MikroTik Wiki
5 pages
Griffin Cyber Motor Vehicle Ecommerce Website Proposal
No ratings yet
Griffin Cyber Motor Vehicle Ecommerce Website Proposal
5 pages



Key points • There is need to fully promote DICOM in data shar-

[Figure: workflow diagram. Recoverable labels: PATIENT/CLINICAL ISSUE, IMAGING SCAN, CONVENTIONAL WORKFLOW, Measurement]

Table 2 Most pertinent/specific DICOM tags related to DICOM-SEG modality

1 > CONTAINER DCID 7021 “Measurement report 1 M Root node

[Figure: template diagram. Recoverable labels: TID 300, TID 310, TID 315, TID 1006, TID 1007, TID 1009, TID 1419, TID 1501, TID 1502, TID 1600, TID 1602, TID 1603, TID 1607]

[Figure: database selection flowchart. Recoverable labels: Databases identified after pseudo-…; Databases excluded, with reasons (private databases, n = 11); Databases with DICOM raw data; Databases excluded, with reasons; Databases with DICOM raw data and DICOM-SEG or DICOM-SR]

PNEUMONIA RSNA Pneumonia Lung RX 30,000 N Y (JSON) N

TCGA-CESC TCIA Cervical Cervix MR, pathology 54 Yb N N

HNSCC-3DCT- TCIA Head and neck Head–neck CT, DICOM-RT, 31 Yb YR N

Table 5 Evaluation of DICOM viewers included in the study

RadiAnt N N* N Y N Free trial + commercial 29/04/2020 https://www.radiantviewer.com/
