
Immersive Visualisation In Medical Imaging

Aditya Salian Vvyom Shah Shlok Shah Parin Shah


BTech Computer Eng. BTech Computer Eng. BTech Computer Eng. BTech Computer Eng.
NMIMS University NMIMS University NMIMS University NMIMS University
[email protected] [email protected] [email protected] [email protected]

Kamal Mistry Krupalu Mehta Vivek Surve


Faculty, Computer Eng. Co-founder Co-founder
NMIMS University Parallax Labs Parallax Labs
[email protected] [email protected] [email protected]

Abstract— The advent of technologies emphasizing 3-D model reconstruction shows tremendous potential for application in medical imaging, specifically in anomaly detection for target organs. Bearing this in mind, the objective of our research is to identify extant anomalies within an organ, to reproduce the approximate size and relative location of each anomaly, and to generate a 3-Dimensional model of the target organ with its anomaly, based on medical scans, for visualisation in an immersive environment through Augmented Reality. In this paper, we discuss the chronological order of detecting, segmenting and rendering the anomalies onto our target organ, the brain. We have employed a CNN based on the U-Net architecture, trained on the MICCAI BraTS 2018 dataset, to segment brain tumours, particularly high-grade gliomas. Post segmentation, we discuss the pipeline of our research, with the end results being GLTF and USDZ files for viewing the 3-D model of the brain, along with the tumour, in Augmented Reality. We conclude the paper by discussing the future scope of research on our current work.

Keywords— augmented reality; volumetric rendering; medical imaging; brain tumour segmentation; deep learning

I. INTRODUCTION

Our motivation for this project stems from a dire need to bridge the semantic gap between medical practitioners and their patients. We observed that most medical documents, such as DICOM, CT and MRI scans, contain esoteric jargon that can only be elaborated upon by doctors. Dedicated to tackling this problem, we deliberated upon the potential of volumetric rendering in the form of Augmented Reality to overcome this barrier, and on how this technology has the potential to revolutionize the medical industry because of the immersive and interactive experience it can provide to any user, be it the doctor or the patient.

By rendering a volume of the organ based on medical reports, Augmented Reality can replace the often elusive verbal explanation provided by doctors with a much more graphically interactive presentation that patients can easily understand. From the doctor's perspective, we realised that surgeons are restricted to imagining a patient's anatomy from two-dimensional views even though most medical images are acquired as three-dimensional volumes. Reasoning about a patient's anatomy and the critical 3-D interactions of a surgery from 2-D images can lead to a non-optimal surgical approach and poor surgical results. To address this problem faced by doctors, we plan on using this intuitive and interactive environment to provide a platform where medical practitioners can interact with patient-specific anatomical features for improved diagnosis and more thorough surgical planning.

Our project addresses the visualisation of organs and their corresponding anomalies by making use of Augmented Reality. Augmented reality as a platform provides an intuitive and immersive experience, as it enables users to interact with the environment. Doctors will be able to view reports in 3-D instead of relying on two-dimensional sliced images to visualise the target organ. Patients will also be able to interact with the generated model and thus understand the cause-and-effect relationship between the anomaly and the target organ.

We have employed a Convolutional Neural Network (CNN) based on the U-Net architecture to detect and segment the anomaly on the target organ, i.e. the brain. Using the segmented anomaly (a tumour in this case), we render a 3-D volume of the tumour as well as the target organ (the brain) for each patient in the form of STL files. These files are later converted into GLTF files for Android and USDZ files for iOS, to be viewed in augmented reality. The interface of this research work is a web application in which the user can upload MRI scans in the form of MHA files and view the resultant 3-D model of the brain in augmented reality. A unique shareable link is generated for every report, which enables doctors to share the 3-D report with their patients and vice versa. With these functionalities, we encourage the dissemination of information through the interactive and immersive medium of augmented reality.
Fig. 1. Pipeline of the proposed system.

II. REVIEW OF EXISTING LITERATURE

One method that was studied utilises Multilayer Perceptrons (MLP), which have proven to be an efficient way of classifying brain images, combined with anisotropic diffusion filters (ADF) for the segmentation of tumours. The algorithm proposed by this method involves four steps: extracting features, classifying them, segmenting them, and modelling the result into a 3-Dimensional (3D) object. For feature extraction, a Discrete Wavelet Transform (DWT) is applied to the image at different frequencies to obtain coefficients and details. The MLP then performs supervised learning on these features to classify images into tumour and non-tumour classes. Images classified as containing tumours are segmented with ADF before their volumes, relative to the brain, are generated. To do so, the entire brain is divided into eight regions, and the region containing the centre point of the tumour is picked as the relative position of the tumour within the brain. This method produced an accuracy of over 90% when trained and tested on the 2015 MICCAI BraTS data. [1]

Another method that utilises artificial neural networks for segmentation involves an 11-layer deep 3D Convolutional Neural Network (CNN). A new training scheme used in this work connects adjacent image patches processed in one pass through the network while adjusting for the implicit class imbalance in the image data. The issue of requiring large training batch sizes is overcome by using an input larger than the receptive field of the CNN, which then outputs posterior probabilities for multiple voxels. Furthermore, the paper suggests implementing parallel paths for multi-scale processing, providing a solution that takes both local and contextual data into consideration and thereby enhances the segmentation output. To integrate local and global contextual information into the CNN, a second pathway is added that operates on down-sampled images; the dual-pathway 3D CNN thus processes the input image at multiple scales simultaneously. The detailed local appearance of structures is captured in the first pathway, whereas higher-level features, such as the location within the brain, are learned in the second pathway. [2]

Another method of detecting and segmenting brain tumours from MRI proposes a hybrid CNN architecture that uses a patch-based approach while considering local as well as contextual information to determine the output class. The proposed network combines a two-path parallel network with three-path networks to form a hybrid network. In the two-path network, the first stream has small kernels to extract local information from the images, while the other stream has kernels with a large receptive field that focus more on contextual information. This four-layer CNN is comparatively faster than the three-path CNN, which uses five layers. Taking both local and global information into account when determining the class label of each pixel makes the output of this model depend on contextual and local features alike. [3]

Another method of segmenting tumour portions from brain scans applies morphological filters to the T1 FLAIR and T2 modalities of the MRI scans, followed by passing the segmented tumour through a software package, 3D-DOCTOR, to generate accurate 3D models of the tumour. To generate a 3D model, an empty space is chalked out in which each tumour pixel's (x, y) coordinate pair is marked; for the z coordinate, the slice number, representing the distance between two slices, is assumed. Adjoining pixels in the 3D space are connected together, and this process repeats until all slices are traversed. The generated model is then verified using 3D-DOCTOR. An advantage of this method is that it preserves the shape and grey levels of the original image scans, which helps in texture analysis and classification. Another advantage is that it is an automatic process which still allows interactive user control, unlike the manual processes followed earlier by doctors. [4]

A method that not only segments the brain tumour but also calculates the volume of the tumour, along with its shape and size in 3D, uses a sequence of image processing operations. After the images are read in, they are passed through a high-pass filter created with the fspecial function. The histogram is then equalised, and thresholding is performed to distinguish objects from the background. Morphological operations such as erosion are then applied, leading to segmentation through connected-component labelling. The 3D volume render object is created from the input data using the vol3d() function, which is based on the texture-mapping technique in OpenGL. [5]
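As an illustration of this classical pipeline, the sketch below applies sharpening, histogram equalisation, thresholding, erosion and connected-component labelling to a single slice with OpenCV and NumPy. It is only a rough analogue of the MATLAB-style workflow summarised above; the kernel, threshold strategy and file name are assumptions rather than values reported in [5].

```python
import cv2
import numpy as np

slice_img = cv2.imread("slice.png", cv2.IMREAD_GRAYSCALE)   # assumed input slice

# Sharpening kernel as a stand-in for fspecial + filtering
kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=np.float32)
sharp = cv2.filter2D(slice_img, -1, kernel)

equalised = cv2.equalizeHist(sharp)                          # histogram equalisation
_, binary = cv2.threshold(equalised, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # thresholding
eroded = cv2.erode(binary, np.ones((3, 3), np.uint8))        # morphological erosion

# Connected-component labelling; keep the largest non-background component
n, labels, stats, _ = cv2.connectedComponentsWithStats(eroded)
largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
tumour_mask = (labels == largest).astype(np.uint8) * 255
```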
Moving on to visualising segmented and reconstructed brain tumours in augmented reality, one method uses a marker-based image processing technique to render 3D models onto the body. The system under consideration has four modules: it accepts video as input and generates images to overlay on the original video by rendering the reconstructed 3D model. The steps followed are pre-processing, marker detection, pose estimation, and graphics generation. Pre-processing converts the input image into a binary image. A marker is needed to trigger placement of the model in the scene; locating each closed contour in the binary image, in order to find the markers, is carried out by a topological structural analysis algorithm. To detect the markers, the front view of the rectangle is obtained by eliminating the perspective projection and searching for the marker pattern in the front view of the polygon. To estimate the pose, the angle at which the camera views the scene must be determined, as this helps render the generated model correctly; this raises the problem of recovering 3D data from 2D data, for which the method prefers to work with a homogeneous coordinate system rather than a Cartesian one. [6]

A 3D brain tumour visualisation method that functions in real time uses the facial features of the subject in the scene as markers. A new method of camera calibration based on the size of the subject's face is implemented, and the pose is computed by processing 3D data and its 2D projections. Using these computations, the reconstructed brain tumour is displayed on the subject's face without the need for any marker.

In marker-less tracking, the camera pose is determined from naturally occurring features such as edges and textures of an anatomical object (e.g. eye corners, nose tip). A pre-constructed model of the brain is imposed on the patient's body. The proposed model of the system, as shown in Fig. 2, consists of three components: camera calibration, pose estimation, and augmentation. Camera calibration is based on the Tsai algorithm and the Viola-Jones face detector. The most challenging task for pose estimation is reference-point recognition. The five reference points used are the ends of the eyes, the ends of the lips, the tip of the nose, and the tip of the chin of the subject's face, detected using Dlib's facial landmark detection. The 3D points of the corresponding reference points are used to compute their 2D equivalents on the face for pose estimation. The 3D model of the brain is rendered onto the scene using Unity, and a difference in colour and material is used to distinguish parts of the skull from the soft tissue of the brain. [7]
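For context, a minimal sketch of this style of landmark-based pose estimation with Dlib and OpenCV is shown below. The 68-point predictor file, the six landmarks (the common OpenCV head-pose set rather than the five points used in [7]), the generic 3D face coordinates and the naive camera intrinsics are all assumptions made for illustration.

```python
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # assumed model file

frame = cv2.imread("frame.png")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
face = detector(gray)[0]
shape = predictor(gray, face)

# Nose tip, chin, eye corners and mouth corners from the 68-point scheme
idx = [30, 8, 36, 45, 48, 54]
image_pts = np.array([[shape.part(i).x, shape.part(i).y] for i in idx], dtype=np.float64)

# Rough 3D coordinates of the same points on a generic face model (assumed, nose tip at origin)
model_pts = np.array([[0, 0, 0], [0, -330, -65], [-225, 170, -135],
                      [225, 170, -135], [-150, -150, -125], [150, -150, -125]], dtype=np.float64)

h, w = gray.shape
camera = np.array([[w, 0, w / 2], [0, w, h / 2], [0, 0, 1]], dtype=np.float64)  # naive intrinsics
ok, rvec, tvec = cv2.solvePnP(model_pts, image_pts, camera, None)  # camera pose for the AR overlay
```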

Fig. 2. Proposed model of the system [7]

III. IMPLEMENTATION

A. Dataset Procurement

Our vision of building immersive and interactive reports for medical imaging was channelled into developing an initial proof-of-concept model. The brain was chosen as the target organ after due deliberation, and the MICCAI BraTS 2018 dataset was selected for segmenting tumours and constructing models. The dataset contains clinically acquired pre-operative multimodal MRI scans of glioblastoma (HGG) and low-grade glioma (LGG), divided into training, validation and testing sets. The MRI scans are provided as NIfTI files (.nii.gz) and describe four modalities: a) native (T1), b) post-contrast T1-weighted (T1Gd), c) T2-weighted (T2), and d) T2 Fluid Attenuated Inversion Recovery (FLAIR) volumes. Moreover, the overall survival data, defined in days, is included in a .csv file with correspondences to the pseudo-identifiers of the imaging data, which also records the age of the patients as well as the resection status. The four modalities can be seen in Fig. 3.

Fig. 3. Depiction of the four modalities of a patient from the dataset.
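For reference, one common way to read such NIfTI volumes into NumPy arrays is via nibabel, as sketched below; the file name, the normalisation step and the use of nibabel itself are assumptions, since the paper does not state which I/O library was used.

```python
import nibabel as nib
import numpy as np

# Load one modality of one patient (file name is illustrative only)
flair = nib.load("BraTS2018_patient_flair.nii.gz")
volume = flair.get_fdata()                      # 3D array, e.g. 240 x 240 x 155 voxels

# Simple per-volume intensity normalisation, a typical preprocessing step
volume = (volume - volume.mean()) / (volume.std() + 1e-8)
print(volume.shape, volume.dtype)
```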
B. Tumour segmentation

A CNN has been used for tumour segmentation because the large variability in the location, size, shape and frequency of tumours makes it difficult to devise explicit, hand-crafted segmentation rules. Due to this high variability, the most accurate segmentation of these anomalies is achieved when they are segmented manually by humans; however, a CNN can learn to recognise the relevant patterns in the data, and has therefore been trained on the MICCAI BraTS 2018 dataset to detect the anomalies automatically.

The network developed is based on the U-Net architecture, which consists of two major parts: the encoding or contracting path, and the decoding or expansive path, as seen in Fig. 4. The contracting path consists of conventional convolutional processing, whereas the expansive path consists of transposed 2D convolutional layers. Batch normalisation has been added after each convolution layer.

Fig. 4. U-Net Architecture

The contracting path comprises several contraction blocks, each taking an input and applying two 3x3 convolution layers followed by a 2x2 max pooling. The number of feature maps doubles with every block, which enables the network to learn more complex structures effectively. The bottommost layer of the contracting path is the intermediate stage between the contracting and the expansive paths; it uses a 3x3 convolutional layer followed by a 2x2 up-convolution layer.

The expansive path, similar to the contracting path, has multiple expansion blocks, each passing its input through two 3x3 convolutional layers followed by a 2x2 up-sampling layer. In contrast to the contracting path, the number of feature maps in this path is halved after every block. Furthermore, the input of each block is concatenated with the feature maps of the corresponding contraction layer, which ensures that the features learned while contracting the image are used to reconstruct it.
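The sketch below illustrates this structure with one contracting block, a bottleneck and one expansive block in Keras. It is a simplified, assumed layout (2D slices, two input channels, sigmoid output) meant only to mirror the description above, not the exact network trained in this work.

```python
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # Two 3x3 convolutions, each followed by batch normalisation
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.BatchNormalization()(x)
    return x

inputs = layers.Input(shape=(240, 240, 2))           # T2 + FLAIR channels (assumed slice size)

c1 = conv_block(inputs, 32)                           # contracting block
p1 = layers.MaxPooling2D(2)(c1)                       # 2x2 max pooling

b = conv_block(p1, 64)                                # bottleneck (feature maps doubled)

u1 = layers.Conv2DTranspose(32, 2, strides=2, padding="same")(b)  # 2x2 up-convolution
u1 = layers.concatenate([u1, c1])                     # skip connection from the contracting path
c2 = conv_block(u1, 32)                               # expansive block (feature maps halved)

outputs = layers.Conv2D(1, 1, activation="sigmoid")(c2)  # per-pixel tumour probability
model = Model(inputs, outputs)
```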

Initially, the U-Net model was used to perform segmentation of the full tumour, the tumour core and the enhancing tumour. Results for the full tumour were promising, but results for the tumour core and the enhancing tumour were not satisfactory. This is because the tumour core and the enhancing tumour are very small in comparison to the entire brain (roughly 0.75% and 0.45% of a slice, respectively). The situation was addressed by calculating the centre point of the full tumour and then cropping the training data for the tumour core and the enhancing tumour around that point. Fig. 5 depicts the shape and size of our predicted tumour alongside the ground truth. Only the T2 and FLAIR modalities have been used for full-tumour segmentation instead of all four modalities, which significantly reduced our training time; the network therefore has only two input channels.
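A minimal sketch of this centre-and-crop step is given below; the fixed 64x64 patch size and the clipping logic are assumptions for illustration rather than the exact values used in the paper.

```python
import numpy as np

def crop_around_tumour(image, full_tumour_mask, size=64):
    """Crop a square patch of `image` centred on the full-tumour mask."""
    coords = np.argwhere(full_tumour_mask > 0)
    cy, cx = coords.mean(axis=0).astype(int)          # centre point of the full tumour
    half = size // 2
    # Clip so the patch stays inside the slice
    y0 = int(np.clip(cy - half, 0, image.shape[0] - size))
    x0 = int(np.clip(cx - half, 0, image.shape[1] - size))
    return image[y0:y0 + size, x0:x0 + size]
```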

Fig. 5. Comparison of the ground truth with the images predicted by our network.

Fig. 6. Results on the test set produced by our network.

The next step after the generation of the tumour, as described in the previous section, is to create a 3D model that is unique to each patient. This 3D model needs to be produced in two different formats, namely GLTF (for Android users) and USDZ (for iOS users). The detailed steps performed for the creation of the model are listed below.

IV. PROPOSED WORKFLOW

Our final AR deployment phase comprises a pipeline in which we receive an MHA file from the user, render STL files representing the 3D models derived from it, create GLTF and USDZ files from those STL files for Android and iOS devices respectively, and provide a universal shareable link associated with both AR models.

A. NumPy array to MHA

The U-Net model performs segmentation on the given input file and produces a NumPy array as output. This NumPy array is converted into an MHA file using the SimpleITK (SITK) image-writing module. The result is an MHA file containing the segmented tumour.
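A minimal sketch of this conversion with SimpleITK is shown below; the geometry handling and file names are assumptions, since the paper only states that the SITK image writer is used.

```python
import SimpleITK as sitk
import numpy as np

def save_segmentation_as_mha(seg_array: np.ndarray, reference_path: str, out_path: str):
    """Write a U-Net output array to .mha, copying geometry from the input scan."""
    seg_img = sitk.GetImageFromArray(seg_array.astype(np.uint8))
    reference = sitk.ReadImage(reference_path)     # the scan that the network segmented
    seg_img.CopyInformation(reference)             # keep spacing, origin and direction
    sitk.WriteImage(seg_img, out_path)

# Example (hypothetical file names):
# save_segmentation_as_mha(prediction, "patient_flair.mha", "tumour_seg.mha")
```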
B. MHA to STL

After generating the MHA file, we create surface meshes that store the boundary information of the 3D models of the detected tumour and of the patient's brain. This information is stored as STL files, ensuring that the models are unique to every patient. The end result of this step is two STL files, the first being that of the tumour and the second that of the patient's brain. The meshes are extracted using FlyingEdges3D, an implementation of the 3D version of the flying edges algorithm; it is a four-pass algorithm designed to be highly scalable for large data. A smoothing filter is applied after the creation of the surface mesh to remove the rough edges generated by the interlocking polygons of the mesh.

Fig. 7. STL files of the segmented tumour and the brain.
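The snippet below sketches this step with VTK's vtkFlyingEdges3D followed by a smoothing filter and an STL writer. The specific reader, smoothing filter, iso-value and iteration count are illustrative assumptions; the paper only names the flying edges algorithm and a smoothing step.

```python
import vtk

reader = vtk.vtkMetaImageReader()                 # read the segmented .mha volume
reader.SetFileName("tumour_seg.mha")

surface = vtk.vtkFlyingEdges3D()                  # extract the iso-surface (4-pass flying edges)
surface.SetInputConnection(reader.GetOutputPort())
surface.SetValue(0, 0.5)                          # iso-value between background (0) and tumour (1)

smoother = vtk.vtkWindowedSincPolyDataFilter()    # soften the blocky, interlocking edges
smoother.SetInputConnection(surface.GetOutputPort())
smoother.SetNumberOfIterations(20)

writer = vtk.vtkSTLWriter()
writer.SetFileName("tumour.stl")
writer.SetInputConnection(smoother.GetOutputPort())
writer.Write()                                    # triggers the whole pipeline
```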
C. STL to Blender

In this step, the two STL files are combined into a single file using the Blender Python API. Placing the tumour at its exact location inside the brain, for an accurate representation, posed a challenge. Consequently, a novel algorithm was developed to solve this issue, wherein we added reference planes to the MHA files and superimposed the two planes to combine the STL files into a single file. The resulting model accurately represents the location and orientation of the tumour in the patient's brain. As the surface meshes created exceed 600,000 polygons, it was imperative to decimate them to about 10% of the original count, since rendering more than 60,000 polygons caused noticeable jitter. The brain is decimated to about 15,000 polygons, while the tumour is not decimated, as it is the main focus from a doctor's point of view. The combined file is then exported to GLTF/USDZ format for viewing in AR.

Fig. 8. GLTF file of the patient before editing
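A condensed sketch of this step using Blender's bpy API is given below. The object handling, decimation ratio and export options are assumptions for illustration (operator names also vary between Blender versions), and the plane-based alignment described above is not reproduced here.

```python
import bpy

# Import the brain surface mesh (paths are placeholders)
bpy.ops.import_mesh.stl(filepath="brain.stl")
brain = bpy.context.selected_objects[0]

# Decimate only the brain mesh down to roughly 15,000 polygons
bpy.context.view_layer.objects.active = brain
decimate = brain.modifiers.new(name="Decimate", type='DECIMATE')
decimate.ratio = 15000 / max(len(brain.data.polygons), 1)
bpy.ops.object.modifier_apply(modifier="Decimate")

# Import the (undecimated) tumour mesh and export the whole scene as one glTF file
bpy.ops.import_mesh.stl(filepath="tumour.stl")
bpy.ops.export_scene.gltf(filepath="report.gltf", export_format='GLTF_SEPARATE')
```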

D. GLTF Editing

As seen in Fig. 8, the GLTF file exported from Blender does not retain its transparency and therefore had to be edited manually. The goal of this module is to make the outer brain mesh near-transparent so that the tumour can be seen inside the brain's outer structure. The alpha value of the brain's surface is set to 0.1, and the tumour's colour is changed to red to enhance visibility and contrast for a better viewing experience.

Fig. 9. GLTF file of the model
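Since a .gltf file (the JSON variant of the format) stores materials as plain JSON, one way to apply these edits programmatically is sketched below. The material names and the direct JSON manipulation are assumptions; the paper does not state how the edit was performed.

```python
import json

with open("report.gltf") as f:
    gltf = json.load(f)

for material in gltf.get("materials", []):
    pbr = material.setdefault("pbrMetallicRoughness", {})
    if material.get("name") == "brain":                 # assumed material name
        pbr["baseColorFactor"] = [0.8, 0.8, 0.8, 0.1]   # alpha 0.1 -> near-transparent surface
        material["alphaMode"] = "BLEND"                 # required for the alpha to take effect
    elif material.get("name") == "tumour":              # assumed material name
        pbr["baseColorFactor"] = [1.0, 0.0, 0.0, 1.0]   # opaque red for contrast

with open("report_edited.gltf", "w") as f:
    json.dump(gltf, f, indent=2)
```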
E. GLTF to USDZ

GLTF is a transmission format for 3D assets suited to web and mobile devices, as it removes data that is not required for efficient display of the assets. iOS does not support the GLTF format, so it was imperative to convert the file into USDZ format. This was performed using Google's usd_from_gltf tool, and the conversion emulates the behaviour of the GLTF file on iOS.

Fig. 10. USDZ file of the model
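The conversion itself amounts to a single command-line invocation of the usd_from_gltf tool; a small wrapper is sketched below, assuming the tool has been built and is available on the PATH.

```python
import subprocess

def gltf_to_usdz(gltf_path: str, usdz_path: str) -> None:
    """Convert a glTF file to USDZ with Google's usd_from_gltf tool (assumed to be on PATH)."""
    subprocess.run(["usd_from_gltf", gltf_path, usdz_path], check=True)

gltf_to_usdz("report_edited.gltf", "report.usdz")
```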

F. Web Application

A web application was developed to serve three purposes. To begin with, it helps the end user understand what the product is and how to use it. Secondly, it provides the end user with a graphical user interface through which the files can be uploaded and the shareable link received. Lastly, it showcases some of the models generated by the system, for additional context and understanding.

The frontend was designed using HTML, CSS and JavaScript, and the backend was written in Flask. The entire pipeline described above was wrapped in a single function that takes the files of the respective modalities as input and produces a model-viewer scene containing the generated 3D model as output.
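A minimal Flask sketch of the upload-and-share flow is shown below; the route names, the report-ID scheme and the run_pipeline helper are hypothetical, intended only to illustrate how the backend could tie the pipeline to a shareable link.

```python
import os
import uuid
from flask import Flask, request, url_for

app = Flask(__name__)
os.makedirs("uploads", exist_ok=True)

@app.route("/upload", methods=["POST"])
def upload_scan():
    scan = request.files["scan"]                  # MHA file submitted through the form
    report_id = uuid.uuid4().hex                  # unique identifier for this report
    scan.save(f"uploads/{report_id}.mha")
    # run_pipeline(f"uploads/{report_id}.mha", report_id)  # hypothetical: segment + export GLTF/USDZ
    link = url_for("view_report", report_id=report_id, _external=True)
    return {"share_link": link}                   # shareable link for doctor and patient

@app.route("/report/<report_id>")
def view_report(report_id):
    # A template would embed a model-viewer scene pointing at the generated GLTF/USDZ files
    return f"Report {report_id}: model-viewer scene would be rendered here"

if __name__ == "__main__":
    app.run(debug=True)
```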
G. Augmented Reality Deployment

The final output of the pipeline is a USDZ and a GLTF file unique to the patient. These files are fed into the model-viewer scene, which allows them to be viewed in Augmented Reality. The model-viewer scene displays the 3D model inside the web app and can also be used on devices that do not have AR support. In the model-viewer scene there is a button in the bottom-right corner that allows the user to view the model in AR. When clicked, the user is taken to a new page where they can press the "view in your space" button. On clicking it, the user is prompted to move the phone around the physical space so that the ground plane can be calibrated using the phone's camera. Once calibrated, the 3D model appears in the physical space, as shown in Fig. 11. The user can then move, rotate and scale the model in the physical space, and can also take a screenshot of the scene and share it instantly. To allow another user to view the report, only the link needs to be shared; the same process is then followed on that user's phone.

Fig. 11. (a) Screenshot of the interface that allows the user to view the scene in AR; (b) screenshot of a model placed in physical space.

V. RESULTS

Table I

                     Mean Dice Score    Median Dice Score
Full Tumour          0.87               0.90
Tumour Core          0.76               0.84
Enhancing Tumour     0.71               0.80

Table I displays the Dice scores of the U-Net architecture used in this project; these can be improved further with newer and more accurate image segmentation algorithms. The full-tumour mask is what we use during 3D modelling in the pipeline, so it is the most important of the three metrics displayed above. A technique to position the tumour and accurately calibrate it with respect to the source brain was developed, and an automated pipeline was created for ease of use, where the input is a set of MRI files and the output is a shareable 3D model.
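For completeness, the Dice coefficient reported above can be computed per case as sketched below (a standard formulation, not code from the paper).

```python
import numpy as np

def dice_score(prediction: np.ndarray, ground_truth: np.ndarray) -> float:
    """Dice coefficient between two binary segmentation masks."""
    prediction = prediction.astype(bool)
    ground_truth = ground_truth.astype(bool)
    intersection = np.logical_and(prediction, ground_truth).sum()
    total = prediction.sum() + ground_truth.sum()
    return 1.0 if total == 0 else 2.0 * intersection / total
```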
VI. CONCLUSION

This project was aimed at creating 3D models of tumorous brains, making them easier to visualise, and making them shareable using AR technology. We have successfully created an automated pipeline for this task. The same approach can be extended to various other organs of the human body and to a variety of imaging technologies.

Further consultations with doctors and other medical professionals can shed light on their requirements and enable us to concentrate our efforts on those needs.

VII. FUTURE WORKS

The points mentioned below are some of the ideas that can be experimented with and developed further; they are just a few of the many possible use cases.

1. Real-time sharing of, and interaction with, the same model across multiple devices, to serve as a tool for training purposes.

2. Use of a solid 3D model instead of a surface mesh, which would give us the ability to show cross-sectional views of the organ in question.

3. For organs such as the heart, simulations of internal functions could be displayed intuitively and interactively.

4. Planning of surgeries among doctors in a collaborative AR environment with the use of headsets.

VIII. ACKNOWLEDGMENT

This paper and the supporting research would not have been possible without the exceptional support and profound insights of our mentors, Prof. Kamal Mistry and Mr. Krupalu Mehta. We are also deeply grateful to the Perelman School of Medicine at the University of Pennsylvania for providing us with the BraTS 2018 dataset.

IX. REFERENCES

[1] G. Latif, M. Mohsin Butt, A. H. Khan, M. Omair Butt and J. F. Al-Asad, "Automatic Multimodal Brain Image Classification Using MLP and 3D Glioma Tumour Reconstruction," 2017 9th IEEE-GCC Conference and Exhibition (GCCCE), Manama, 2017, pp. 1-9.
[2] K. Kamnitsas, C. Ledig, V. F. J. Newcombe, J. P. Simpson, A. D. Kane, D. K. Menon, D. Rueckert and B. Glocker, "Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation," Medical Image Analysis, vol. 36, pp. 61-78, 2017.
[3] S. Sajid, S. Hussain and A. Sarwar, "Brain Tumor Detection and Segmentation in MR Images Using Deep Learning," Arabian Journal for Science and Engineering, vol. 44, pp. 9249-9261, 2019.
[4] S. Resmi and T. Thomas, "A Semi-automatic Method for Segmentation and 3D Modeling of Glioma Tumours from Brain MRI," Journal of Biomedical Science and Engineering, vol. 5, pp. 378-383, 2012.
[5] K. S. Sindhushree, T. R. Manjula and K. Ramesha, "Detection and 3D Reconstruction of Brain Tumour from Brain MRI Images," International Journal of Engineering Research & Technology (IJERT), vol. 02, no. 08, August 2013.
[6] Q. Shan, T. E. Doyle, R. Samavi and M. Al-Rei, "Augmented Reality Based Brain Tumour 3D Visualization," Procedia Computer Science, vol. 113, 2017.
[7] M. A. Ghaderi, M. Heydarzadeh, M. Nourani, G. Gupta and L. Tamil, "Augmented reality for breast tumours visualization," 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Orlando, FL, 2016, pp. 4391-4394.
[8] B. H. Menze, A. Jakab, S. Bauer, J. Kalpathy-Cramer, K. Farahani, J. Kirby, et al., "The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS)," IEEE Transactions on Medical Imaging, vol. 34, no. 10, pp. 1993-2024, 2015. DOI: 10.1109/TMI.2014.2377694
[9] S. Bakas, H. Akbari, A. Sotiras, M. Bilello, M. Rozycki, J. S. Kirby, et al., "Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features," Nature Scientific Data, 4:170117, 2017. DOI: 10.1038/sdata.2017.117
[10] S. Bakas, M. Reyes, A. Jakab, S. Bauer, M. Rempfler, A. Crimi, et al., "Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge," arXiv preprint arXiv:1811.02629, 2018.
