02 A Biomedical - Data.and - Databases
BIOMEDICAL DATA
AND DATABASES
Biomedical Informatics
Assoc. Prof. Tomaž Vrtovec, Ph.D.
BIOMEDICAL DATA
What are biomedical data?
- John Doe
(ID: 0110975500213)
- body weight
- 74 kg
- 12.9.2012
University of Ljubljana, Faculty of Electrical Engineering BIOMEDICAL INFORMATICS Electrical Engineering, level 2
Laboratory of Imaging Technologies Assoc. Prof. Tomaž Vrtovec, Ph.D. International course
2. Biomedical data and databases 3 / 50
BIOMEDICAL DATA
What are biomedical data?
74,5 kg =
= 74 kg
4:22
?
74,5 kg =
= 75 kg
19:52
BIOMEDICAL DATA
DIKW model
Wisdom
Knowledge
(active information)
Information
(formed data)
Data
Source: H. Cleveland: Information as resource. The Futurist, December 1982, Page 34
Source: J. Rowley: The wisdom hierarchy: representations of the DIKW hierarchy. Journal of Information Science 33(2):163-180, 2007
BIOMEDICAL DATA
DIKW model
[Figure: DIKW levels plotted against connectivity (vertical axis) and understanding (horizontal axis)]
- Data: formation of parts
- Information: connection of parts
- Knowledge: forming of wholes
- Wisdom: connection of wholes
- Questions: Who? What? When? Where? How? Why?
- Experience (doing things right), Novelty (doing the right things)
- Understanding: Research, Absorption, Action, Influence, Judgement
DATA CLASSIFICATION
According to the structure
Data
- Descriptive (qualitative)
  - nominal, e.g. A, B, C
  - ordinal, e.g. A > C > B
- Numerical (quantitative)
  - interval, e.g. 2, 5, 8
  - ratio, e.g. 0, 2, 5, 8
Source: S.S. Stevens: On the theory of scales of measurement. Science 103(2684):677-680, 1946
DATA STRUCTURE
Nominal biomedical data
Operations:
- equality / inequality
- grouping
Gender:
- male
- female
Citizenship:
- Slovenian
- Italian
- Austrian
- …
DATA STRUCTURE
Ordinal biomedical data
Operations:
- everything that was enabled for nominal data
- greater / less
Age:
- younger
- older
Opinion:
- completely agree
- partially agree
- partially disagree
- completely disagree
DATA STRUCTURE
Interval biomedical data
The data is quantitative in nature, can be ordered according to size
and can be added or subtracted, but ratios cannot be computed. The
mean value can be computed.
Operations:
- everything that was enabled for ordinal data
- addition / subtraction
Example:
- temperature on the Celsius (°C) scale
- date
DATA STRUCTURE
Ratio biomedical data
The data is quantitative in nature and ratios can be computed, because
a reference value (the absolute zero) exists.
Operations:
- everything that was enabled for interval data
- multiplication / division
Example:
- temperature on the Kelvin (K) scale
- body height
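The difference between the interval and the ratio scale can be illustrated with a short Python sketch that compares two temperatures on both scales. The values (10 and 20 degrees Celsius) and the helper name are illustrative assumptions, not taken from the slides.

```python
# Illustrative sketch: differences are meaningful on an interval scale,
# ratios only on a ratio scale. Values and names are hypothetical.

def celsius_to_kelvin(t_celsius: float) -> float:
    """Convert a temperature from the Celsius to the Kelvin scale."""
    return t_celsius + 273.15

t1_c, t2_c = 10.0, 20.0
t1_k, t2_k = celsius_to_kelvin(t1_c), celsius_to_kelvin(t2_c)

# Interval property: the difference is the same on both scales.
diff_c = t2_c - t1_c
diff_k = t2_k - t1_k

# Ratio property: only the Kelvin ratio has a physical meaning,
# because 0 K is an absolute zero while 0 °C is an arbitrary reference.
ratio_c = t2_c / t1_c   # numerically 2, but 20 °C is not "twice as warm"
ratio_k = t2_k / t1_k   # close to 1, the physically meaningful ratio

print(f"difference: {diff_c:.2f} °C = {diff_k:.2f} K")
print(f"ratio (Celsius): {ratio_c:.3f}, ratio (Kelvin): {ratio_k:.3f}")
```

The Celsius ratio changes if the zero point is moved, whereas the Kelvin ratio does not, which is why multiplication and division are reserved for ratio data.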
DATA STRUCTURE
Summary
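The operations listed on the nominal, ordinal, interval and ratio slides are cumulative: each scale inherits everything enabled for the previous one. A minimal Python sketch of this rule follows; the dictionary layout and the function name are illustrative assumptions.

```python
# Sketch: cumulative operations permitted on each measurement scale,
# following the nominal/ordinal/interval/ratio slides.
# The data layout and function name are illustrative assumptions.

SCALES = ["nominal", "ordinal", "interval", "ratio"]

NEW_OPERATIONS = {
    "nominal": ["equality / inequality", "grouping"],
    "ordinal": ["greater / less"],
    "interval": ["addition / subtraction"],
    "ratio": ["multiplication / division"],
}

def permitted_operations(scale: str) -> list:
    """Return all operations allowed on a scale, including inherited ones."""
    position = SCALES.index(scale)
    operations = []
    for name in SCALES[: position + 1]:
        operations.extend(NEW_OPERATIONS[name])
    return operations

for scale in SCALES:
    print(f"{scale:9s}: {', '.join(permitted_operations(scale))}")
```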
DATA CLASSIFICATION
According to dimensionality
Data
DATA DIMENSIONALITY
0-D biomedical data
Example:
- measurement: body weight, e.g. 74 kg
- measurement: blood pressure, e.g. 120/80 mmHg (systolic/diastolic)
DATA DIMENSIONALITY
1-D biomedical data
Example:
- 1D signal: body weight depending on body height
- 0D video: heart beat (0D + time)
DATA DIMENSIONALITY
2-D biomedical data
Example:
- 2D images: radiographic (X-ray) image of the chest
- 1D video: levels of sound pressure against the resonance frequency of the Helmholtz resonator (1D + time)
DATA DIMENSIONALITY
3-D biomedical data
Example:
- 3D images: magnetic resonance (MR) images of the head at different locations
- 2D video: modelling of the electric waves in the heart muscle (2D + time)
DATA DIMENSIONALITY …
4-D biomedical data
Example:
- 4D data: 3D computed tomography (CT) image of the chest with superimposed lung movements
- 3D video: 4D ultrasound (3D + time)
DATA DIMENSIONALITY
n-D biomedical data
Example:
- genetic code
The term “curse of dimensionality” refers to various phenomena that arise
when analyzing and organizing data in high-dimensional spaces (e.g. noise,
poor evaluation) that do not occur in low-dimensional spaces.
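One such phenomenon, data sparsity, can be sketched in a few lines of Python: if every dimension is divided into a fixed number of intervals, the number of cells grows exponentially and a data set of fixed size quickly leaves almost all cells empty. The number of intervals and the sample count below are illustrative assumptions.

```python
# Sketch of one "curse of dimensionality" effect: with a fixed number of
# samples, a regular grid over the feature space becomes almost empty as
# the number of dimensions grows. All numbers are illustrative assumptions.

BINS_PER_AXIS = 10        # each dimension is split into 10 intervals
N_SAMPLES = 1_000_000     # fixed size of the hypothetical data set

def samples_per_cell(n_dimensions: int) -> float:
    """Average number of samples that fall into one grid cell."""
    n_cells = BINS_PER_AXIS ** n_dimensions
    return N_SAMPLES / n_cells

for n in (1, 2, 3, 6, 10):
    print(f"{n:2d}-D: {BINS_PER_AXIS ** n:>14,d} cells, "
          f"{samples_per_cell(n):.4g} samples per cell")
```

With these assumed numbers, six dimensions already leave on average one sample per cell, and at ten dimensions most cells contain no sample at all.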
DATA CLASSIFICATION
According to type
Data
DATA TYPE
Biomedical descriptions
DATA TYPE
Biomedical measurements
- blood analysis (mmol/l, mg/l)
- urinalysis (mmol/24h, mg/24h)
- body weight (kg)
- …
Precision
Accuracy
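Assuming the usual definitions (accuracy as closeness of the measurements to the true value, precision as repeatability of the measurements), the two properties can be told apart numerically. The sketch below uses made-up readings of two hypothetical scales and a made-up reference weight.

```python
# Sketch: accuracy as the bias of repeated measurements with respect to a
# reference value, precision as their spread. The readings and the
# reference value are made-up illustrative numbers.
from statistics import mean, stdev

TRUE_WEIGHT_KG = 74.0

scale_a = [74.1, 73.9, 74.0, 74.2, 73.8]   # accurate and precise
scale_b = [75.1, 74.9, 75.0, 75.2, 74.8]   # precise but not accurate

def bias(readings, reference):
    """Accuracy: how far the mean reading lies from the reference value."""
    return mean(readings) - reference

def spread(readings):
    """Precision: sample standard deviation of repeated readings."""
    return stdev(readings)

for name, readings in (("A", scale_a), ("B", scale_b)):
    print(f"scale {name}: bias = {bias(readings, TRUE_WEIGHT_KG):+.2f} kg, "
          f"spread = {spread(readings):.2f} kg")
```

Both hypothetical scales have the same spread, but scale B is systematically offset by about one kilogram, so it is equally precise yet less accurate.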
DATA TYPE
Biomedical signals
DATA TYPE
Biomedical images
DATA TYPE
Biomedical videos
DATABASES
What are biomedical databases?
DATABASES
Metadata
Metadata are essentially “data about data”, “data about data carriers”
or “data about data contents”.
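As a minimal sketch of the idea, a single measured value can be stored together with the metadata needed to interpret it. The class and field names are illustrative assumptions; the values reuse the body-weight example from the beginning of the lecture.

```python
# Sketch: a measurement (data) together with metadata describing it.
# Class and field names are illustrative assumptions; the values reuse
# the body-weight example from the beginning of the lecture.
from dataclasses import dataclass

@dataclass
class Measurement:
    value: float          # the data item itself
    metadata: dict        # "data about data"

weight = Measurement(
    value=74.0,
    metadata={
        "quantity": "body weight",
        "unit": "kg",
        "patient_name": "John Doe",
        "patient_id": "0110975500213",
        "date": "12.9.2012",
    },
)

# Without the metadata the bare number 74.0 cannot be interpreted.
print(f"{weight.metadata['quantity']}: {weight.value} {weight.metadata['unit']}")
```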
DATABASES
Metadata (2)
DATABASES
Properties
Database
- Data content: bibliographic, full-text, annotated, aggregated
- Data model: hierarchical, network, relational, associative
- Data indexing: manual, automated
- Data retrieval: exact-match, partial-match
DATABASE CONTENT
Bibliographic content
1. Bibliographic databases
Links in the form of citations to medical (and other) literature.
- MEDLINE (http://www.pubmed.gov): biomedicine, > 5500 publications, > 22 million entries, 1950 – today
- EMBASE: biomedicine, > 7000 publications, > 20 million entries, 1947 – today
- CINAHL: health care, > 3000 publications, > 2.6 million entries, 1937 – today
DATABASE CONTENT
Bibliographic content (2)
These three bibliographic databases are not limited to biomedicine and
health care.
DATABASE CONTENT
Bibliographic content (3)
2. Online catalogues
Web pages that do not display the actual content but rather links to
other web pages.
3. Specialized lists
These contain not only links to literature and web pages but also more
diverse content.
- National Guideline Clearinghouse (NGC)
http://guideline.gov
DATABASE CONTENT
Full-text content
Full-text content refers to databases that, besides links to the literature, also
provide access to the actual text.
Publishing companies in the field of biomedical literature have the leading role
through their online interfaces, where subscribers can access the full text
(HTML or PDF) and other content (e.g. multimedia).
- SpringerLink (Springer)
http://www.springerlink.com
- OvidSP (Wolters Kluwer)
http://ovidsp.ovid.com
DATABASE CONTENT
Annotated content
DATABASE CONTENT
Annotated content (2)
- CiteSeerX
http://citeseerx.ist.psu.edu
- Clinical Evidence
http://www.clinicalevidence.com
- Essential Evidence Plus
http://www.essentialevidenceplus.com
6. Other data
- ClinicalTrials.gov
http://clinicaltrials.gov
DATABASE CONTENT
Aggregated content
Aggregated content refers to the aggregation of content from the first three
categories: bibliographic, full-text and annotated content.
- MedlinePlus
http://www.nlm.nih.gov/medlineplus/
- MedWeaver
http://www.unboundmedicine.com
DATA MODEL
Hierarchical model
Data is organized in a tree-like structure, therefore
enabling parent/child relationships of the form 1-N:
- each parent can have an arbitrary number of children
- each child has exactly one parent
Example: X: Diseases of the respiratory system [J00-J99]
Source: International Statistical Classification of Diseases and Related Health Problems (ICD-10)
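The 1-N rule can be sketched as a small tree in Python, in which every node keeps a reference to exactly one parent and a list of any number of children. The class and method names are illustrative assumptions; the codes are the chapter, one block and one category from ICD-10 chapter X.

```python
# Sketch of a hierarchical (1-N) data model: every node stores a single
# parent, so a child can never belong to two parents. Class and method
# names are illustrative; the codes come from ICD-10 chapter X.

class Node:
    def __init__(self, code, title, parent=None):
        self.code = code
        self.title = title
        self.parent = parent      # exactly one parent (None for the root)
        self.children = []        # arbitrary number of children

    def add_child(self, code, title):
        child = Node(code, title, parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Codes from the root down to this node."""
        node, codes = self, []
        while node is not None:
            codes.append(node.code)
            node = node.parent
        return codes[::-1]

chapter = Node("J00-J99", "Diseases of the respiratory system")
block = chapter.add_child("J00-J06", "Acute upper respiratory infections")
category = block.add_child("J00", "Acute nasopharyngitis [common cold]")

print(" > ".join(category.path()))
```

Because each node has a single parent, the path from any entry back to the root is unique, which is the defining property of the hierarchical model.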
DATA MODEL
Network model
[Figure nodes: Urine, Computed tomography (CT), Bronchoscopy, …]
DATA MODEL
Relational model
Table: Results
Key: Examination code = 4
Examination name     Service code   Date (YYYY-MM-DD)   Patient code
Radiological exam.   067564         2012-04-23          00249
Radiological exam.   067566         2012-07-19          12765
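A relational organization of the rows above can be sketched with the sqlite3 module from the Python standard library. The table and column names are assumptions derived from the slide, and the Examinations lookup table is hypothetical: it only illustrates how the key "Examination code = 4" could link two tables.

```python
# Sketch of the relational model with the standard-library sqlite3 module.
# Table and column names are assumptions derived from the slide; the
# Examinations lookup table is hypothetical.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Examinations (
        examination_code INTEGER PRIMARY KEY,
        examination_name TEXT NOT NULL
    );
    CREATE TABLE Results (
        service_code     TEXT PRIMARY KEY,
        examination_code INTEGER REFERENCES Examinations(examination_code),
        exam_date        TEXT,   -- YYYY-MM-DD
        patient_code     TEXT
    );
""")
con.execute("INSERT INTO Examinations VALUES (4, 'Radiological exam.')")
con.executemany(
    "INSERT INTO Results VALUES (?, ?, ?, ?)",
    [("067564", 4, "2012-04-23", "00249"),
     ("067566", 4, "2012-07-19", "12765")],
)

# Join the two tables through the shared key.
rows = con.execute("""
    SELECT e.examination_name, r.service_code, r.exam_date, r.patient_code
    FROM Results AS r
    JOIN Examinations AS e ON r.examination_code = e.examination_code
    WHERE r.examination_code = 4
    ORDER BY r.exam_date
""").fetchall()

for row in rows:
    print(row)
```

The codes are stored as text so that leading zeros such as in 00249 are preserved.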
DATA MODEL
Associative model
Data is organized as individual parts, and the links among the content
elements are defined in the form of associations.
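One possible way to picture this organization is to keep the data items in one collection and every association as a separate (source, verb, target) link. The following Python sketch is only an illustration under that assumption; all item and verb names are hypothetical.

```python
# Sketch of an associative organization: data items are stored on their
# own and every relationship is a separate (source, verb, target) link.
# All item and verb names are hypothetical illustrations.

items = {"John Doe", "body weight", "74 kg", "12.9.2012"}

links = [
    ("John Doe", "has measurement", "body weight"),
    ("body weight", "has value", "74 kg"),
    ("body weight", "measured on", "12.9.2012"),
]

def associations_of(item):
    """Return every link in which the item appears as source or target."""
    return [link for link in links if item in (link[0], link[2])]

for source, verb, target in associations_of("body weight"):
    print(f"{source} --{verb}--> {target}")
```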
DATA INDEXING
Controlled terminologies
DATA INDEXING
Manual indexing
Concept: Atrial fibrillation
Terms: Atrial fibrillation, Auricular fibrillation
Abbreviations: af, afib, a fib
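The atrial fibrillation example suggests how a controlled terminology maps several surface forms to one concept. A minimal Python sketch of such a lookup follows; the function name and the normalization rule (lower-casing and collapsing whitespace) are assumptions.

```python
# Sketch: a controlled vocabulary maps many surface terms to one concept.
# The lookup table reuses the atrial fibrillation example from the slide;
# the function name and the normalization rule are assumptions.
from typing import Optional

CONCEPT = "Atrial fibrillation"

TERM_TO_CONCEPT = {
    "atrial fibrillation": CONCEPT,
    "auricular fibrillation": CONCEPT,
    "af": CONCEPT,
    "afib": CONCEPT,
    "a fib": CONCEPT,
}

def index_term(term: str) -> Optional[str]:
    """Return the concept for a free-text term, or None if it is unknown."""
    normalized = " ".join(term.lower().split())
    return TERM_TO_CONCEPT.get(normalized)

for term in ("AFib", "Auricular  Fibrillation", "flutter"):
    print(f"{term!r} -> {index_term(term)}")
```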
DATA INDEXING
Manual indexing (2)
DATA INDEXING
Automated indexing
DATA INDEXING
Automated indexing (2)
DATA RETRIEVAL
Exact-match
In exact-match searching, the retrieval system gives the user all data that
exactly match the criteria specified in the search query.
As data (e.g. documents) are often represented by sets of elements (e.g. words),
set-based (Boolean) searching is commonly used. The similarity between the
data and the search query is defined by the Boolean logical operations:
- conjunction (AND)
- disjunction (OR)
- negation (NOT)
This kind of matching is usually applied to bibliographic content. For
efficient data retrieval, insight into how Boolean operators work
as well as into the structure of the database in question is required.
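Set-based searching maps directly onto set operations. The Python sketch below builds a small inverted index (word to set of document identifiers) and evaluates AND, OR and NOT as intersection, union and difference. The three tiny documents and all names are made-up illustrations.

```python
# Sketch of exact-match (Boolean) retrieval over an inverted index.
# The three tiny "documents" and all names are made-up illustrations.

documents = {
    1: "atrial fibrillation treatment",
    2: "atrial flutter diagnosis",
    3: "fibrillation diagnosis and treatment",
}

# Inverted index: word -> set of identifiers of documents containing it.
index = {}
for doc_id, text in documents.items():
    for word in text.split():
        index.setdefault(word, set()).add(doc_id)

all_ids = set(documents)

def docs(word):
    """Identifiers of all documents that contain the word exactly."""
    return index.get(word, set())

atrial_and_fibrillation = docs("atrial") & docs("fibrillation")   # AND
flutter_or_treatment = docs("flutter") | docs("treatment")        # OR
not_diagnosis = all_ids - docs("diagnosis")                       # NOT

print("atrial AND fibrillation:", sorted(atrial_and_fibrillation))
print("flutter OR treatment:   ", sorted(flutter_or_treatment))
print("NOT diagnosis:          ", sorted(not_diagnosis))
```

A document is either in the result set or not; there is no ranking, which is what distinguishes exact-match from partial-match retrieval.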
DATA RETRIEVAL
Partial-match
DATA RETRIEVAL
Model comparison
DATA RETRIEVAL
Success evaluation
BIG DATA
The next milestone of innovation, competitiveness
and productivity
“Big data” are data that, due to their size and complexity, cannot be efficiently
acquired, stored and analyzed, and cannot be managed using currently
established database management systems.
Source: J. Manyika et al.: Big data: the next frontier for innovation, competition, and productivity. McKinsey Global Institute, 2011
CONCLUSION
Discussion, comments, questions…