computer vision
COMPUTER VISION
INTRODUCTION
3. Emerging Techniques
METHODOLOGY
1. Dataset Collection
Medical image datasets are primarily collected in clinical settings using
various imaging modalities, such as magnetic resonance imaging (MRI),
computed tomography (CT), nuclear medicine imaging (NMI), and ultrasound
imaging (USI). These datasets consist of 3D images of different organs and
tissues, captured either as static images or as dynamic sequences (e.g.,
MRI for brain activity, USI for cardiac motion). In addition to anatomical
data, functional data such as blood flow or metabolic activity can be
captured through modalities like functional MRI (fMRI) and positron
emission tomography (PET). For example, MRI datasets of the brain can
include high-resolution images segmented into tissue types such as gray
matter, white matter, and cerebrospinal fluid. Some datasets also carry
temporal data that track how structures change over time, which is
especially useful for studying dynamic phenomena like heartbeats or
breathing patterns. Moreover, multimodal datasets, in which multiple
imaging techniques (e.g., MRI and CT) are used together, help overcome the
limitations of individual modalities by providing complementary
information. Large and diverse datasets of this kind are crucial for
training computer vision algorithms, helping them remain robust across a
wide variety of medical imaging scenarios.
| Modality | Primary Use | Advantages | Limitations |
|---|---|---|---|
| MRI | Imaging of soft tissues (e.g., brain, muscles) | High contrast for soft tissue, non-invasive, no radiation | Expensive, long scan times, contraindicated with metal implants |
| CT | Imaging of bone and dense tissues (e.g., skull) | Excellent for bone imaging, fast scan time, detailed images | Uses ionizing radiation, limited soft tissue contrast |
| NMI (PET/SPECT) | Functional imaging (e.g., glucose metabolism) | Provides metabolic and functional data, non-invasive | Poor spatial resolution, radioactive tracers required |
| USI | Real-time imaging of organs and motion (e.g., heart) | Real-time, inexpensive, non-invasive | Limited resolution, operator-dependent |
| fMRI | Brain activity visualization | Real-time functional brain imaging, non-invasive | Low spatial resolution, sensitive to motion |
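$$P(i, j, d, \theta) = \sum_{m,n} \delta\big(I(m,n),\, i\big)\, \delta\big(I(m + d\cos\theta,\; n + d\sin\theta),\, j\big)$$
where I(m, n) is the image intensity at pixel (m, n), δ(a, b) equals 1 when
a = b and 0 otherwise, d is the offset distance, and θ is the offset
direction, so that P(i, j, d, θ) counts the pixel pairs with intensities i
and j separated by that offset.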
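The co-occurrence count P(i, j, d, θ) can be illustrated with a minimal NumPy sketch. The function name `glcm`, the `levels` parameter, and the toy 3x3 image are assumptions made for illustration only. The sketch also assumes that the offset is rounded to whole pixels and that pairs falling outside the image are skipped; the formula itself does not specify either choice.

```python
import math
import numpy as np

def glcm(image, d, theta, levels):
    """Count pixel pairs (i, j) separated by the offset (d, theta).

    Hypothetical helper: follows the index convention of the formula,
    where the first index m moves by d*cos(theta) and the second index n
    moves by d*sin(theta).
    """
    image = np.asarray(image)
    # Assumption: round the offset to the nearest whole pixel.
    dm = int(round(d * math.cos(theta)))
    dn = int(round(d * math.sin(theta)))
    rows, cols = image.shape
    P = np.zeros((levels, levels), dtype=np.int64)
    for m in range(rows):
        for n in range(cols):
            m2, n2 = m + dm, n + dn
            # Assumption: pairs that leave the image are ignored.
            if 0 <= m2 < rows and 0 <= n2 < cols:
                P[image[m, n], image[m2, n2]] += 1
    return P

# Toy image with three gray levels (0, 1, 2).
image = np.array([[0, 0, 1],
                  [1, 2, 2],
                  [0, 1, 2]])

# d = 1, theta = pi/2 pairs each pixel with its neighbor along the second index.
P = glcm(image, d=1, theta=math.pi / 2, levels=3)
print(P)
```

In practice the counts are often normalized to probabilities before texture features are derived from them; that step is left out here to stay close to the formula.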
For rigid registration, the cost function to minimize is typically defined as:
$$E(R, t) = \sum_{i} \lVert R\,p_i + t - q_i \rVert^2$$
where p_i and q_i are corresponding points in two 3D images, R is the
rotation matrix, and t is the translation vector that aligns the two images.
Minimizing E over R and t yields the rigid transformation that brings the
corresponding points of the two images into the closest agreement in the
least-squares sense.
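When the point correspondences are already known, the cost E(R, t) has a closed-form minimizer based on the singular value decomposition (the Kabsch algorithm). The sketch below shows that approach with NumPy; the text does not state which solver is actually used, and the function name `rigid_align` and the test points are assumptions for illustration.

```python
import numpy as np

def rigid_align(p, q):
    """Return R, t minimizing sum_i ||R p_i + t - q_i||^2 (Kabsch/SVD).

    Hypothetical helper: p and q are (N, 3) arrays of corresponding points.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p_mean, q_mean = p.mean(axis=0), q.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (p - p_mean).T @ (q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    sign = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, sign]) @ U.T
    t = q_mean - R @ p_mean
    return R, t

# Synthetic check: rotate by 30 degrees about z and translate.
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([1.0, -2.0, 0.5])

p = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
q = p @ R_true.T + t_true

R, t = rigid_align(p, q)
energy = float(np.sum((p @ R.T + t - q) ** 2))  # E(R, t) at the solution
print(R, t, energy)
```

With real images the correspondences are usually not known in advance, so a solver of this kind is typically embedded in an iterative scheme that alternates between matching points and re-estimating R and t.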
5. Evaluation Metrics