
Understanding The Competition Rsna

The document summarizes work done on the RSNA 2023 Abdominal Trauma Detection competition dataset. It includes: 1. Loading and exploring the train CSV file containing patient IDs and labels. 2. Visualizing the distribution of injuries in the data using a bar plot. 3. Reading DICOM image files, displaying an animation of sample images. 4. Preprocessing a subset of the DICOM images to save them in NPY format for further analysis. The dataset contains CT images of patients labeled for abdominal trauma and injury location. The document explores the provided CSV metadata, visualizes injury distributions, reads DICOM files into NumPy arrays, and preprocesses a subset of images for

Uploaded by asoedjfanush

RSNA: Radiological Society of North America

import warnings
warnings.filterwarnings("ignore")

The RSNA 2023 Abdominal Trauma Detection competition is a machine learning challenge that aims to develop algorithms to detect abdominal trauma in Computed Tomography (CT) images. The competition is hosted by the Radiological Society of North America (RSNA).

I couldn't find a good meme related to RSNA, so here is Lightning McQueen to cheer us on: KaChoww!

Thanks to Theo Viel for providing small chunks of data to experiment with.

KUDOS!!!

1 | Data 📊
import pandas as pd
import pydicom

Our main data is stored in a CSV file: train.csv

train_csv = pd.read_csv("/kaggle/input/rsna-2023-abdominal-trauma-detection/train.csv")
print(train_csv.shape)
train_csv.head()

(3147, 15)

   patient_id  bowel_healthy  bowel_injury  extravasation_healthy  \
0       10004              1             0                      0
1       10005              1             0                      1
2       10007              1             0                      1
3       10026              1             0                      1
4       10051              1             0                      1

   extravasation_injury  kidney_healthy  kidney_low  kidney_high  \
0                     1               0           1            0
1                     0               1           0            0
2                     0               1           0            0
3                     0               1           0            0
4                     0               1           0            0

   liver_healthy  liver_low  liver_high  spleen_healthy  spleen_low  \
0              1          0           0               0           0
1              1          0           0               1           0
2              1          0           0               1           0
3              1          0           0               1           0
4              1          0           0               0           1

   spleen_high  any_injury
0            1           1
1            0           0
2            0           0
3            0           0
4            0           1

This CSV file contains the patient_id and further explanatory columns.

• Patient ID (patient_id): this column contains the IDs of 3,147 patients.

The remaining columns denote which injuries were recorded for each patient. We can use the patient_id to look up that patient's scans in the train_images folder.
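To make the label columns easier to read, the one-hot columns of each organ can be collapsed into a single label. The snippet below is only a sketch: it re-types the five rows printed above (kidney and spleen columns only), the helper name collapse_organ is made up for illustration, and it assumes that exactly one column per organ is set to 1 in every row.

```python
import pandas as pd

# The first five rows of train.csv as printed above (kidney and spleen only).
sample = pd.DataFrame({
    "patient_id":     [10004, 10005, 10007, 10026, 10051],
    "kidney_healthy": [0, 1, 1, 1, 1],
    "kidney_low":     [1, 0, 0, 0, 0],
    "kidney_high":    [0, 0, 0, 0, 0],
    "spleen_healthy": [0, 1, 1, 1, 0],
    "spleen_low":     [0, 0, 0, 0, 1],
    "spleen_high":    [1, 0, 0, 0, 0],
})

def collapse_organ(df, organ):
    """Hypothetical helper: turn one organ's one-hot columns into one label column."""
    cols = [c for c in df.columns if c.startswith(organ + "_")]
    # idxmax returns, for each row, the name of the column that holds the 1.
    return df[cols].idxmax(axis=1).str.replace(organ + "_", "", regex=False)

labels = sample[["patient_id"]].copy()
for organ in ["kidney", "spleen"]:
    labels[organ] = collapse_organ(sample, organ)

print(labels)
```

With the real file, the same helper could be run on train_csv for all five organ prefixes.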

Train Images (train_images) is a large, three-level nested directory containing the images for each patient.

The first-level folders are named after the patient_id. Each patient folder holds one folder per scan series, and each series folder holds the .dcm files.
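The folder layout can be walked with a few lines of standard-library code. This is a rough sketch rather than part of the original notebook: list_series is a made-up helper, and the temporary directory only mimics the patient / series / file layout so the code runs without the real dataset. It assumes file names have a numeric stem such as 1001.dcm.

```python
import os
import tempfile

def list_series(images_root, patient_id):
    """Hypothetical helper: map each series folder of a patient to its sorted file names."""
    patient_dir = os.path.join(images_root, str(patient_id))
    series = {}
    for series_id in sorted(os.listdir(patient_dir)):
        files = os.listdir(os.path.join(patient_dir, series_id))
        # Sort by the numeric stem so that 999.dcm comes before 1001.dcm.
        series[series_id] = sorted(files, key=lambda f: int(os.path.splitext(f)[0]))
    return series

# Build a tiny stand-in tree (empty files) that mimics train_images.
demo_root = tempfile.mkdtemp()
demo_layout = {"21057": ["1001.dcm", "999.dcm"], "51033": ["5.dcm"]}
for series_id, names in demo_layout.items():
    os.makedirs(os.path.join(demo_root, "10004", series_id))
    for name in names:
        open(os.path.join(demo_root, "10004", series_id, name), "w").close()

demo_series = list_series(demo_root, 10004)
print(demo_series)
```

On Kaggle, images_root would be the train_images path used throughout this notebook.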

The competition dataset consists of CT images of patients with abdominal trauma. Each patient is labeled with the presence or absence of abdominal trauma, as well as the location of any injuries. The data is provided in DICOM format, which is a standard format for medical images.

We can read a DICOM file with pydicom

image_file = "/kaggle/input/rsna-2023-abdominal-trauma-detection/train_images/10004/21057/1001.dcm"
# dcmread is the current name of the reader; read_file is its older alias.
ds = pydicom.dcmread(image_file)

ds

Dataset.file_meta -------------------------------
(0002, 0001) File Meta Information Version       OB: b'\x00\x01'
(0002, 0002) Media Storage SOP Class UID         UI: CT Image Storage
(0002, 0003) Media Storage SOP Instance UID      UI: 1.2.123.12345.1.2.3.10004.1.1001
(0002, 0010) Transfer Syntax UID                 UI: RLE Lossless
(0002, 0012) Implementation Class UID            UI: 1.2.3.123456.4.5.1234.1.12.0
(0002, 0013) Implementation Version Name         SH: 'PYDICOM 2.4.0'
-------------------------------------------------
(0008, 0018) SOP Instance UID                    UI: 1.2.123.12345.1.2.3.10004.1.1001
(0008, 0023) Content Date                        DA: '20230721'
(0008, 0033) Content Time                        TM: '232531.439438'
(0010, 0020) Patient ID                          LO: '10004'
(0018, 0050) Slice Thickness                     DS: '1.0'
(0018, 0060) KVP                                 DS: '90.0'
(0018, 5100) Patient Position                    CS: 'HFS'
(0020, 000d) Study Instance UID                  UI: 1.2.123.12345.1.2.3.10004
(0020, 000e) Series Instance UID                 UI: 1.2.123.12345.1.2.3.10004.21057
(0020, 0011) Series Number                       IS: '16'
(0020, 0013) Instance Number                     IS: '1001'
(0020, 0032) Image Position (Patient)            DS: [-240.55273, -378.05273, -1601.4]
(0020, 0037) Image Orientation (Patient)         DS: [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
(0020, 0052) Frame of Reference UID              UI: 1.2.826.0.1.3680043.8.498.61841901354930484747532163888112046276
(0028, 0002) Samples per Pixel                   US: 1
(0028, 0004) Photometric Interpretation          CS: 'MONOCHROME2'
(0028, 0010) Rows                                US: 512
(0028, 0011) Columns                             US: 512
(0028, 0030) Pixel Spacing                       DS: [0.89453125, 0.89453125]
(0028, 0100) Bits Allocated                      US: 16
(0028, 0101) Bits Stored                         US: 12
(0028, 0102) High Bit                            US: 11
(0028, 0103) Pixel Representation                US: 0
(0028, 1050) Window Center                       DS: '50.0'
(0028, 1051) Window Width                        DS: '400.0'
(0028, 1052) Rescale Intercept                   DS: '-1024.0'
(0028, 1053) Rescale Slope                       DS: '1.0'
(0028, 1054) Rescale Type                        LO: 'HU'
(7fe0, 0010) Pixel Data                          OB: Array of 267568 elements
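The Rescale and Window tags in this header describe how stored pixel values relate to Hounsfield Units (HU) and which HU range is meant to be displayed. The sketch below is not part of the original notebook. It hard-codes the values printed above (slope 1.0, intercept -1024.0, center 50.0, width 400.0) as defaults, uses a small made-up array instead of real pixel data, and applies the common simplified window formula rather than the exact one from the DICOM standard. With a real file these values would typically be read from ds.RescaleSlope, ds.RescaleIntercept, ds.WindowCenter and ds.WindowWidth.

```python
import numpy as np

def to_hounsfield(raw, slope=1.0, intercept=-1024.0):
    """Apply the DICOM rescale: HU = raw * slope + intercept."""
    return raw.astype(np.float32) * slope + intercept

def apply_window(hu, center=50.0, width=400.0):
    """Clip HU values to [center - width/2, center + width/2] and scale to [0, 1]."""
    low, high = center - width / 2, center + width / 2
    return (np.clip(hu, low, high) - low) / (high - low)

# Made-up 12-bit stored values standing in for ds.pixel_array.
raw = np.array([[0, 874], [1074, 1274], [1500, 4095]], dtype=np.uint16)

hu = to_hounsfield(raw)      # 0 -> -1024 HU, 1074 -> 50 HU, ...
windowed = apply_window(hu)  # values below -150 HU map to 0, above 250 HU map to 1

print(hu)
print(windowed)
```

Windowing before plotting or training keeps the soft-tissue range visible instead of spending most of the intensity scale on air and bone.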

2 | Visualization 🔬
import numpy as np
import tqdm
import os

import matplotlib.pyplot as plt
import matplotlib.animation as animation
import seaborn as sns
from IPython.display import HTML

Thanks to Yuanjian Li (EDA with Animation BeginnerFriendly) and Jocelyn Dumlao (Unleashing the Healing Potential: Abdominal Trauma) for providing great CSV EDA. Do check them out.

organ_columns = ['bowel', 'extravasation', 'kidney', 'liver', 'spleen']

# Count the positive rows of every label column (all columns except patient_id).
# Building the frame in one step avoids chained assignment on organ_counts['count'].
label_columns = train_csv.columns[1:]
organ_counts = pd.DataFrame({
    'Organ': label_columns,
    'count': [train_csv[column].sum() for column in label_columns],
})

plt.figure(figsize=(10, 3))
sns.barplot(data=organ_counts.sort_values(by=['count']), x='Organ', y='count')
plt.xticks(rotation=90)
plt.title("Distribution of Injury")
plt.xlabel("Injury --->")
plt.ylabel("Count --->")
plt.show()

Thanks to Franklin Shih0617 (RSNA Abdominal Trauma Detect EDA animation) for providing great animations. Do check it out.

# The two scan series of patient 10004.
series_dirs = [
    '/kaggle/input/rsna-2023-abdominal-trauma-detection/train_images/10004/21057/',
    '/kaggle/input/rsna-2023-abdominal-trauma-detection/train_images/10004/51033/',
]

# os.listdir returns names in arbitrary order, so sort by the numeric file stem
# (e.g. 1001.dcm) to keep the animation frames in slice order.
sample_files = [series_dir + file
                for series_dir in series_dirs
                for file in sorted(os.listdir(series_dir), key=lambda f: int(f.split('.')[0]))]

sample_vid = [pydicom.dcmread(file).pixel_array
              for file in tqdm.tqdm(sample_files, total=len(sample_files))]

fig, ax = plt.subplots()
im = ax.imshow(sample_vid[0], cmap=plt.cm.bone)

update = lambda i : im.set_array(sample_vid[i])

ani = animation.FuncAnimation(fig, update, frames=range(len(sample_vid)), repeat=True)

HTML(ani.to_jshtml())

100%|██████████| 2066/2066 [00:45<00:00, 45.79it/s]

<IPython.core.display.HTML object>
3 | Preprocessing 🔨
Our data is in DICOM files, but we want it in NPY format. Converting everything can take a lot of time, so let's try it for the first 79 patients.

images_root = '/kaggle/input/rsna-2023-abdominal-trauma-detection/train_images/'
output_root = '/kaggle/working/Inputs/'

# First 79 patient folders.
train_list = sorted(os.listdir(images_root))[:79]

for folder_1 in tqdm.tqdm(train_list, total=len(train_list)):
    # Series folders of this patient.
    folder_1_list = sorted(os.listdir(images_root + folder_1))

    # exist_ok=True lets the cell be re-run without raising FileExistsError.
    os.makedirs(output_root + folder_1 + '/', exist_ok=True)

    lis = list()

    for folder_2 in folder_1_list:
        folder_2_list = sorted(os.listdir(images_root + folder_1 + '/' + folder_2))

        for files in folder_2_list:
            file = pydicom.dcmread(images_root + folder_1 + '/' + folder_2 + '/' + files)

            arr = file.pixel_array
            # Note: np.resize does not interpolate; only 512 x 512 slices pass through unchanged.
            arr = np.resize(arr, new_shape=(512, 512))
            lis.append(arr)

    # One array per patient: all slices of all series stacked along the last axis.
    np.save(output_root + folder_1 + '/' + 'file', np.stack(lis, -1))

100%|██████████| 79/79 [21:17<00:00, 16.17s/it]
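One caveat about the np.resize call in the preprocessing loop: it does not rescale an image, it flattens the array and repeats or truncates the values to fill the new shape. For slices that are already 512 x 512 this changes nothing, but any slice of a different size would be scrambled. The toy example below (a 4 x 4 array, not real scan data) shows the difference; the strided subsampling is only one simple alternative chosen for illustration, and an interpolating resize from an image library would be another.

```python
import numpy as np

image = np.arange(16).reshape(4, 4)

# np.resize flattens the input and keeps the first 4 values: no interpolation.
flat_copy = np.resize(image, (2, 2))

# Taking every second row and column preserves the spatial layout.
subsampled = image[::2, ::2]

print(flat_copy)   # top-left values of the flattened array
print(subsampled)  # one value from each 2 x 2 neighbourhood
```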

4 | Ending ☑️
THAT'S IT FOR TODAY, GUYS

WE WILL GO DEEPER INTO THE DATA IN THE UPCOMING VERSIONS

PLEASE COMMENT YOUR THOUGHTS, HIGHLY APPRECIATED

DON'T FORGET TO UPVOTE IF YOU LIKED MY WORK :)

PEACE OUT !!!! :)
