Immersive Visualisation in Medical Imaging
Abstract— The advent of technologies emphasizing 3-D model reconstruction shows tremendous potential for application in medical imaging, specifically in anomaly detection for target organs by medical practitioners. Bearing this in mind, the objective of our research is to identify extant anomalies within the organ, with an aim to reproduce the approximate size and relative location of the anomaly, and to generate a 3-Dimensional model of the target organ with the anomaly, based on medical report scans, to be visualized in an immersive environment through Augmented Reality. In this paper, we discuss the chronological order of detecting, segmenting and rendering the anomalies onto our target organ, the brain. We have employed the U-Net architecture for the CNN, trained on the MICCAI BraTS 2018 dataset, for segmenting brain tumours, particularly high grade gliomas. Post segmentation, we discuss the pipeline of our research, with the end results being GLTF and USDZ file formats to view the 3-D model of the brain along with the tumour in Augmented Reality. We conclude the paper by discussing the future scope of research on our current work.

Keywords— augmented reality; volumetric rendering; medical imaging; brain tumour segmentation; deep learning

I. INTRODUCTION

Our motivation for this project stems from a dire need to abridge the semantic gap between medical practitioners and their patients. We inferred that most medical documents, such as DICOM, CT and MRI scans, constitute esoteric jargon which can only be elaborated upon by doctors. Dedicated to tackling this problem, we deliberated upon the potential of using volumetric rendering in the form of Augmented Reality to overcome this existing barrier, and how this technology has the potential to revolutionize the medical industry because of the immersive and interactive experience that it can provide to any user, be it the doctor or the patient.

To render a volume of the organ based on medical reports, Augmented Reality can be used to overcome the currently elusive linguistic explanation provided by doctors and move to a much more graphically interactive solution which patients can easily understand. From the doctor's perspective, we realized that surgeons are restricted to imagining a patient's anatomy from two dimensional views, even though most medical images are acquired as three dimensional volumes. Perusing a patient's anatomy and the critical 3-D interactions of a surgery using 2-D images can lead to a non-optimal surgical approach and poor surgical results. Thus, to address this problem faced by doctors, we plan on making use of this intuitive and interactive environment to provide a platform where medical practitioners can interact with patient-specific anatomical features for improved diagnosis and more thorough surgical planning.

Our project addresses the visualization of organs and their corresponding anomalies by making use of Augmented Reality. The use of augmented reality as our platform provides an intuitive and immersive experience to the user, as it enables them to interact with the environment. This will enable doctors to view reports in 3-D, as they will not have to rely on two dimensional sliced images to visualize the target organ. Patients will also be able to interact with the generated model and thus understand the cause-and-effect relationship between the presence of the anomaly and the target organ.

We have employed a Convolutional Neural Network (CNN) based on the U-Net architecture to detect and segment the anomaly on the target organ, i.e., the brain. Using the segmented anomaly (the tumour in this case), we have rendered a 3-D volume of the tumour as well as of the target organ (brain) of each patient in the form of STL files. These files are later converted into GLTF files for Android and USDZ files for iOS, to be viewed in augmented reality. The interface of this research work is designed as a web application, wherein the user can upload their MRI scans in the form of MHA files and view the resultant 3-D model of their brain in augmented reality. A unique shareable link is generated for every report, which enables doctors to share the 3-D report with their patients and vice versa. With these functionalities, we encourage the dissemination of information through the interactive and immersive medium provided by augmented reality.

Fig. 1. Pipeline of the proposed system.
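As a rough orientation before the detailed sections, the sketch below outlines how the stages of Fig. 1 could be chained together behind the web application. It is a hypothetical skeleton only: the function names, stubs and link format are placeholders standing in for the components described later in the paper, not the actual implementation.

```python
# Hypothetical skeleton of the Fig. 1 pipeline: MHA upload -> tumour segmentation
# -> STL surface meshes -> GLTF (Android/web) and USDZ (iOS) assets -> shareable link.
# All names are illustrative placeholders.
import uuid
from pathlib import Path

def segment_tumour(mha_path: Path):
    """U-Net inference on the uploaded MHA volume (segmentation step)."""
    raise NotImplementedError

def build_stl_meshes(mha_path: Path, mask, out_dir: Path):
    """Extract brain and tumour surface meshes as STL files (MHA-to-STL step)."""
    raise NotImplementedError

def export_ar_assets(brain_stl: Path, tumour_stl: Path, out_dir: Path):
    """Produce the GLTF and USDZ assets used for AR viewing."""
    raise NotImplementedError

def process_report(mha_path: Path, out_dir: Path) -> str:
    """Run one uploaded scan through the pipeline and return a shareable link."""
    mask = segment_tumour(mha_path)
    brain_stl, tumour_stl = build_stl_meshes(mha_path, mask, out_dir)
    export_ar_assets(brain_stl, tumour_stl, out_dir)
    return f"/report/{uuid.uuid4().hex}"  # unique link generated per report
```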
II. REVIEW OF EXISTING LITERATURE

One method that was studied was to utilise Multilayer Perceptrons (MLP), which have proven to be an efficient way of classifying brain images, together with anisotropic diffusion filters (ADF) for the segmentation of tumours. The algorithm proposed by this method involves four steps: extracting features, classifying them, segmenting them and modelling them into a 3-Dimensional (3D) object. For feature extraction, the Discrete Wavelet Transform (DWT) is applied at different frequencies on the image to produce coefficients and details. On these, the MLP performs supervised learning to classify images into tumour or non-tumour classes. Images classified as containing tumours are then segmented with ADF before their volumes, relative to the brain, are generated. To do so, the entire brain is divided into eight regions, and the region wherein the centre point of the tumour lies is picked as the relative position of the tumour in the brain. This method produced an accuracy of over 90% when trained and tested on 2015 data from MICCAI BraTS [1].

Another method that utilises artificial neural networks for segmentation involves an 11-layer deep 3D Convolutional Neural Network (CNN). A new training scheme used in that work was to connect adjacent image patches processed in one pass through the network while adjusting for the implicit class imbalance in the image data. The issue of requiring large training batch sizes is overcome by using an image greater than the receptive field of the CNN, which outputs a posterior probability for multiple voxels. Furthermore, the paper suggests implementing parallel paths for multi-scale processing. This provides a solution that takes into consideration both local and contextual data, which enhances the output of the segmentation. To integrate both local and global contextual information into the CNN, a second pathway is added that operates on down-sampled images. Thus, the dual-pathway 3D CNN simultaneously processes the input image at multiple scales. The detailed local appearance of structures is captured in the first pathway, whereas higher level features, such as the location within the brain, are learned in the second pathway [2].

Another method of detecting and segmenting brain tumours from MRI is proposed with a hybrid CNN architecture that uses a patch-based approach while considering local as well as contextual information to determine the output class. The proposed network combines a two-path parallel network with three-path networks to form a hybrid network. In the two-path network, the first stream has small kernels to capture local information from the images, while the other stream has kernels with a large receptive field that focus more on contextual information. This four-layered CNN is comparatively faster than the three-path CNN, which uses five layers. Taking into account both local and global information while determining the class label for each pixel makes the output of this model depend on both contextual and local features [3].

Another method of segmenting tumour portions from brain scans is to apply morphological filters on the T1 Flair and T2 modalities of the MRI scans, followed by passing the segmented tumour through a software package, 3D-DOCTOR, to generate accurate 3D models of the tumour. To generate a 3D model, an empty space is chalked out in which each tumour pixel's (x, y) coordinate pair is marked. For the z coordinate, the slice number representing the distance between two slices is assumed. Adjoining pixels in the 3D space are connected together, and this process repeats until all slices are traversed. The generated model is then verified using 3D-DOCTOR. An advantage of this method is that it preserves the shape and gray levels of the original image scans, which helps in texture analysis and classification. Another advantage is that it is an automatic process which allows interactive user control, unlike the manual processes that were followed earlier by doctors [4].

A method to not only segment the brain tumour but also calculate its volume, shape and size in 3D is to utilise different image processing methods. After inputting the images, they are passed through a high pass filter using the fspecial function. The histogram is then equalized, and thresholding is performed to distinguish objects from their background. Morphological functions such as erosion are then performed, leading to segmentation through connected component labelling. The volume-rendered object is created from the input data using the vol3d() function, which is based on a texture mapping technique in OpenGL [5].

Moving on to visualising segmented and reconstructed brain tumours in augmented reality, one method is to use a marker-based image processing technique to render 3D models onto the body. The system under consideration has four modules; it accepts video as input and generates images to overlay on top of the original video by rendering the reconstructed 3D model. The steps that are followed are pre-processing, finding the marker, estimating the pose and generating the graphic. Pre-processing converts the input image into a binary image. A marker is important to trigger placing the model in the scene. Locating each closed contour in the binary image, in order to find the markers, is carried out by a topological structural analysis algorithm. To detect the markers, the front view of the rectangle is obtained by eliminating the perspective projection and looking for the marker pattern in the front view of the polygon. To estimate the pose, the angle of the camera looking at the scene needs to be estimated, as this helps in rendering the generated model correctly. To do this, there is the issue of calculating 3D data from 2D data; the method prefers to work with a homogeneous coordinate system rather than a Cartesian one [6].

Another 3D brain tumour visualization method functions in real time by using the facial features of the subject in the scene as markers. A new method of camera calibration using the size of the subject's face is implemented, and the pose is computed by processing 3D data and its 2D projections. Using these computations, the reconstructed brain tumour is displayed on the subject's face without the need for any marker.

In marker-less tracking, the camera pose is determined from naturally occurring features, such as edges and textures of an anatomical object (e.g. eye corners, nose-tip). A pre-constructed model of the brain is imposed on the patient's body. The proposed model of the system, as shown in Fig. 2, consists of three components: camera calibration, pose estimation and augmentation. Camera calibration is based on the Tsai algorithm and the Viola-Jones face detector. The most challenging task for pose estimation is reference point recognition. The five reference points looked up are the ends of the eyes, the ends of the lips, the tip of the nose and the tip of the chin of the subject's face, using Dlib's facial landmark detection.
The 3D points of the corresponding reference points are used to compute their 2D equivalent values on the face for the face pose estimation. The 3D model of the brain is rendered onto the scene using Unity. A difference in colour and in material is used to identify parts of the skull and the soft tissue of the brain [7].

…accurate segmentation of these anomalies can be achieved if they are segmented manually by humans. However, a CNN can recognize patterns in the data, and thus CNNs have been used to detect anomalies by training them on the MICCAI BraTS 2018 dataset.
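To make the segmentation setup concrete, the following is a minimal sketch of a U-Net-style network of the kind referred to here, written in PyTorch as an assumption about tooling. It uses two input channels, matching the T2 and Flair inputs discussed next; the depth, filter counts and 2D (slice-wise) formulation are illustrative and do not reproduce the exact architecture trained in this work.

```python
# Minimal 2D U-Net-style sketch: two input channels (T2 + Flair), one output
# channel (full-tumour probability). Depth and filter counts are illustrative.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class SmallUNet(nn.Module):
    def __init__(self, in_channels=2, out_channels=1):
        super().__init__()
        self.enc1 = conv_block(in_channels, 32)
        self.enc2 = conv_block(32, 64)
        self.bottleneck = conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)      # 64 (skip) + 64 (upsampled)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)       # 32 (skip) + 32 (upsampled)
        self.head = nn.Conv2d(32, out_channels, 1)

    def forward(self, x):
        e1 = self.enc1(x)                    # full resolution
        e2 = self.enc2(self.pool(e1))        # 1/2 resolution
        b = self.bottleneck(self.pool(e2))   # 1/4 resolution
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return torch.sigmoid(self.head(d1))  # per-pixel tumour probability

# Example: a batch of 240x240 slices with stacked T2 and Flair channels.
model = SmallUNet()
probs = model(torch.randn(1, 2, 240, 240))   # -> shape (1, 1, 240, 240)
```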
The T2 and Flair modalities have been used for full tumour segmentation instead of all the modalities while training, which has significantly reduced our training time. There are only two input channels, as only the T2 and Flair modalities are used for full tumour segmentation.

B. MHA to STL
After generating the MHA file, we create surface meshes to store the boundary information of the 3D models of the detected tumour and the patient's brain. This information is stored in the form of an STL file, ensuring that the models are unique to every patient. The end result of this step is two STL files, the first being that of the tumour and the second being that of the patient's brain. This is done using FlyingEdges3D, an implementation of the 3D version of the flying edges algorithm. It is a four-pass algorithm designed to be highly scalable for large data. A smoothing filter was applied after the creation of the surface mesh to remove rough edges generated by interlocking polygons of the mesh.
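A minimal sketch of this MHA-to-STL step using VTK's Python bindings is given below. The file names, iso-value and smoothing parameters are assumptions for illustration; only the tumour mesh is shown, and the brain surface follows the same pattern.

```python
# Sketch: MHA volume -> iso-surface (flying edges) -> smoothing -> STL file.
# File names, the iso-value and smoothing parameters are assumptions.
import vtk

reader = vtk.vtkMetaImageReader()                 # reads .mha/.mhd volumes
reader.SetFileName("tumour_mask.mha")

surface = vtk.vtkFlyingEdges3D()                  # 3D flying edges surface extraction
surface.SetInputConnection(reader.GetOutputPort())
surface.SetValue(0, 0.5)                          # iso-value for a binary mask

smoother = vtk.vtkWindowedSincPolyDataFilter()    # remove rough, jagged edges
smoother.SetInputConnection(surface.GetOutputPort())
smoother.SetNumberOfIterations(20)
smoother.Update()

writer = vtk.vtkSTLWriter()                       # per-patient boundary mesh
writer.SetFileName("tumour.stl")
writer.SetInputConnection(smoother.GetOutputPort())
writer.Write()
```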
D. GLTF Editing
As seen in Fig. 8, the GLTF file exported from Blender does not retain its transparency and thus had to be edited manually. The goal in this module was to make the outer brain mesh near-transparent in order to view the tumour inside the brain's outer structure. The alpha value of the brain's surface is set to 0.1, and the tumour's colour was changed to red to enhance the visibility and contrast for a better viewing experience.
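Because a .gltf file is plain JSON, this material edit can also be scripted instead of being done by hand. The snippet below is a sketch using standard glTF 2.0 material fields; the file name and the material names ("brain" and "tumour") are assumptions about how the meshes were exported from Blender.

```python
# Sketch: make the brain material near-transparent (alpha 0.1) and the tumour
# opaque red by editing standard glTF 2.0 material fields. Names are assumptions.
import json

with open("report.gltf") as f:
    gltf = json.load(f)

for material in gltf.get("materials", []):
    pbr = material.setdefault("pbrMetallicRoughness", {})
    if material.get("name") == "brain":
        pbr["baseColorFactor"] = [0.8, 0.8, 0.8, 0.1]   # alpha 0.1 -> near-transparent
        material["alphaMode"] = "BLEND"                  # required for transparency
    elif material.get("name") == "tumour":
        pbr["baseColorFactor"] = [1.0, 0.0, 0.0, 1.0]    # solid red for contrast

with open("report_edited.gltf", "w") as f:
    json.dump(gltf, f, indent=2)
```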
Fig. 9. GLTF file of the model
E. GLTF to USDZ
GLTF is a transmission format for 3D assets that is suited to web and mobile devices, as it removes data that is not important for the efficient display of assets. iOS does not support the GLTF format, and hence it was imperative to convert the file into the USDZ format. This was performed using Google's usd_from_gltf tool. This conversion emulates the functionality of the GLTF files on iOS.
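The conversion itself is a single invocation of the usd_from_gltf command-line tool; the sketch below simply wraps that call in Python so it can sit inside the web application's pipeline. The file paths are illustrative, and the tool is assumed to be available on the server's PATH.

```python
# Sketch: convert the edited GLTF asset to USDZ for iOS AR Quick Look using
# Google's usd_from_gltf command-line tool. Paths are illustrative.
import subprocess

def gltf_to_usdz(gltf_path: str, usdz_path: str) -> None:
    """Invoke usd_from_gltf with the source .gltf and destination .usdz."""
    subprocess.run(["usd_from_gltf", gltf_path, usdz_path], check=True)

gltf_to_usdz("report_edited.gltf", "report.usdz")
```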
Fig. 10. USDZ file of the model

…scene which allows for them to be viewed in Augmented Reality. The modelviewer scene depicted below contains the 3D model, which can be viewed in the web app. This scene can be used for viewing on devices that do not have AR support. In the modelviewer scene, there is a button in the bottom right corner that allows the user to view the model in AR. When it is clicked, the user is taken to a new page, where the user can proceed to click the "view in your space" button. On clicking, the user is prompted to move the phone around the physical space to allow the calibration of the ground plane using the phone's camera. Once calibrated, the 3D model will appear in the physical space, as shown in the figure below. The user can then move, rotate and scale the model in the physical space. The user can also take a screenshot of the scene and share it instantly. To allow another user to view the report, only the link needs to be shared, and the same process is followed on the other user's phone.

Fig. 11. (a) Screenshot displaying the interface that allows the user to view the scene in AR, (b) Screenshot of a model in physical space

V. RESULTS

Table I
                   Mean Dice Score    Median Dice Score
Full Tumour             0.87                0.90
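For reference, the Dice score in Table I measures the overlap between the predicted tumour mask and the ground-truth mask, with the mean and median taken over the evaluated scans. A minimal sketch of the per-scan computation (using NumPy as an assumed tool) is shown below.

```python
# Sketch: Dice score between a predicted binary tumour mask and the ground truth.
# Dice = 2|P ∩ G| / (|P| + |G|); mean/median over patients give Table I's numbers.
import numpy as np

def dice_score(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7) -> float:
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return (2.0 * intersection + eps) / (pred.sum() + truth.sum() + eps)

# Example on toy masks: overlap of 2 pixels, sizes 3 and 2 -> 2*2 / (3+2) = 0.8
p = np.array([[1, 1, 0], [0, 1, 0]])
g = np.array([[1, 0, 0], [0, 1, 0]])
print(dice_score(p, g))
```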
…extended to various other organs of the human body and a variety of imaging technologies.

Further consultations with doctors and other medical professionals can shed light upon their requirements and enable us to concentrate our efforts towards those needs.

VII. FUTURE WORKS

The points mentioned below are some of the ideas that can be further experimented with and developed upon. These are just a few of the many possible use-cases.

1. Real-time sharing of, and interaction with, the same model across multiple devices, to serve as a tool which can be utilised for training purposes.
2. Utilising a solid 3D model instead of a surface mesh, which would give us the ability to produce cross-sectional views of the organ in question.
3. In organs such as the heart, simulations of internal functions can be displayed intuitively and interactively.
4. Planning of surgeries amongst doctors in a collaborative AR environment with the use of headsets.

VIII. ACKNOWLEDGMENT

This paper and the supporting research would not have been possible without the exceptional support and profound insights of our mentors, Prof. Kamal Mistry and Mr. Krupalu Mehta. We are also deeply grateful to the Perelman School of Medicine at the University of Pennsylvania for providing us with the BraTS 2018 dataset.

IX. REFERENCES

[1] G. Latif, M. Mohsin Butt, A. H. Khan, M. Omair Butt and J. F. Al-Asad, "Automatic Multimodal Brain Image Classification Using MLP and 3D Glioma Tumour Reconstruction," 2017 9th IEEE-GCC Conference and Exhibition (GCCCE), Manama, 2017, pp. 1-9.
[2] K. Kamnitsas, C. Ledig, V. F. J. Newcombe, J. P. Simpson, A. D. Kane, D. K. Menon, D. Rueckert and B. Glocker, "Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation," Medical Image Analysis, vol. 36, pp. 61-78, 2017.
[3] S. Sajid, S. Hussain and A. Sarwar, "Brain Tumor Detection and Segmentation in MR Images Using Deep Learning," Arabian Journal for Science and Engineering, vol. 44, pp. 9249-9261, 2019.
[4] S. Resmi and T. Thomas, "A Semi-automatic Method for Segmentation and 3D Modeling of Glioma Tumours from Brain MRI," Journal of Biomedical Science and Engineering, vol. 5, pp. 378-383, 2012.
[5] K. S. Sindhushree, T. R. Manjula and K. Ramesha, "Detection and 3D Reconstruction of Brain Tumour from Brain MRI Images," International Journal of Engineering Research & Technology (IJERT), vol. 02, no. 08, August 2013.
[6] Q. Shan, T. E. Doyle, R. Samavi and M. Al-Rei, "Augmented Reality Based Brain Tumour 3D Visualization," Procedia Computer Science, vol. 113, 2017.
[7] M. A. Ghaderi, M. Heydarzadeh, M. Nourani, G. Gupta and L. Tamil, "Augmented reality for breast tumours visualization," 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Orlando, FL, 2016, pp. 4391-4394.
[8] B. H. Menze, A. Jakab, S. Bauer, J. Kalpathy-Cramer, K. Farahani, J. Kirby, et al., "The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS)," IEEE Transactions on Medical Imaging, vol. 34, no. 10, pp. 1993-2024, 2015. DOI: 10.1109/TMI.2014.2377694.
[9] S. Bakas, H. Akbari, A. Sotiras, M. Bilello, M. Rozycki, J. S. Kirby, et al., "Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features," Nature Scientific Data, 4:170117, 2017. DOI: 10.1038/sdata.2017.117.
[10] S. Bakas, M. Reyes, A. Jakab, S. Bauer, M. Rempfler, A. Crimi, et al., "Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge," arXiv preprint arXiv:1811.02629, 2018.