Understanding The Competition Rsna
import warnings
warnings.filterwarnings("ignore")
1 | Data 📊
import pandas as pd
import pydicom
train_csv = pd.read_csv("/kaggle/input/rsna-2023-abdominal-trauma-detection/train.csv")
print(train_csv.shape)
train_csv.head()
(3147, 15)
spleen_high any_injury
0 1 1
1 0 0
2 0 0
3 0 0
4 0 1
This CSV file contains the patient_id and further explanatory columns.
The competition dataset consists of CT images of patients with abdominal trauma. The
images are labeled with the presence or absence of abdominal trauma, as well as the
location of any injuries. The data is provided in DICOM format, which is a standard
format for medical images.
image_file = "/kaggle/input/rsna-2023-abdominal-trauma-detection/train_images/10004/21057/1001.dcm"
# dcmread is the current name of the reader; read_file is deprecated in newer pydicom releases.
ds = pydicom.dcmread(image_file)
ds
Dataset.file_meta -------------------------------
(0002, 0001) File Meta Information Version OB: b'\x00\x01'
(0002, 0002) Media Storage SOP Class UID UI: CT Image Storage
(0002, 0003) Media Storage SOP Instance UID UI: 1.2.123.12345.1.2.3.10004.1.1001
(0002, 0010) Transfer Syntax UID UI: RLE Lossless
(0002, 0012) Implementation Class UID UI: 1.2.3.123456.4.5.1234.1.12.0
(0002, 0013) Implementation Version Name SH: 'PYDICOM 2.4.0'
-------------------------------------------------
(0008, 0018) SOP Instance UID UI: 1.2.123.12345.1.2.3.10004.1.1001
(0008, 0023) Content Date DA: '20230721'
(0008, 0033) Content Time TM: '232531.439438'
(0010, 0020) Patient ID LO: '10004'
(0018, 0050) Slice Thickness DS: '1.0'
(0018, 0060) KVP DS: '90.0'
(0018, 5100) Patient Position CS: 'HFS'
(0020, 000d) Study Instance UID UI: 1.2.123.12345.1.2.3.10004
(0020, 000e) Series Instance UID UI: 1.2.123.12345.1.2.3.10004.21057
(0020, 0011) Series Number IS: '16'
(0020, 0013) Instance Number IS: '1001'
(0020, 0032) Image Position (Patient) DS: [-240.55273, -378.05273, -1601.4]
(0020, 0037) Image Orientation (Patient) DS: [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
(0020, 0052) Frame of Reference UID UI: 1.2.826.0.1.3680043.8.498.61841901354930484747532163888112046276
(0028, 0002) Samples per Pixel US: 1
(0028, 0004) Photometric Interpretation CS: 'MONOCHROME2'
(0028, 0010) Rows US: 512
(0028, 0011) Columns US: 512
(0028, 0030) Pixel Spacing DS: [0.89453125, 0.89453125]
(0028, 0100) Bits Allocated US: 16
(0028, 0101) Bits Stored US: 12
(0028, 0102) High Bit US: 11
(0028, 0103) Pixel Representation US: 0
(0028, 1050) Window Center DS: '50.0'
(0028, 1051) Window Width DS: '400.0'
(0028, 1052) Rescale Intercept DS: '-1024.0'
(0028, 1053) Rescale Slope DS: '1.0'
(0028, 1054) Rescale Type LO: 'HU'
(7fe0, 0010) Pixel Data OB: Array of 267568 elements
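The header above lists Rescale Slope, Rescale Intercept, Window Center and Window Width. As a rough sketch of how these values are commonly used, the snippet below converts stored pixel values to Hounsfield Units (HU) and then applies a simple linear window. The function names to_hounsfield and apply_window and the toy array are hypothetical, and the window is a simplified clip-and-scale rather than the exact DICOM windowing formula.

```python
import numpy as np


def to_hounsfield(raw, slope=1.0, intercept=-1024.0):
    """Map stored pixel values to HU: hu = raw * slope + intercept."""
    return raw.astype(np.float32) * slope + intercept


def apply_window(hu, center=50.0, width=400.0):
    """Clip HU values to [center - width/2, center + width/2] and scale to [0, 1]."""
    low = center - width / 2.0
    high = center + width / 2.0
    clipped = np.clip(hu, low, high)
    return (clipped - low) / (high - low)


# Toy stand-in for ds.pixel_array (assumption: unsigned 12-bit values, as in the header).
raw = np.array([[0, 874], [1074, 1274], [4095, 1024]], dtype=np.uint16)

hu = to_hounsfield(raw)       # defaults taken from the header shown above
windowed = apply_window(hu)   # values outside the window saturate at 0 or 1
print(hu)
print(windowed)
```

With a real slice, the same functions could be called as to_hounsfield(ds.pixel_array, float(ds.RescaleSlope), float(ds.RescaleIntercept)), assuming those tags are present in the file.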
2 | Visualization 🔬
import numpy as np
import tqdm
import os
import matplotlib.pyplot as plt
import seaborn as sns

# Count the positive labels in every column except patient_id.
organ_counts = pd.DataFrame()
organ_counts['Organ'] = train_csv.columns[1:]
organ_counts['count'] = 0
for index, column in enumerate(train_csv.columns[1:]):
    # .loc avoids the chained assignment organ_counts['count'][index] = ...
    organ_counts.loc[index, 'count'] = train_csv[column].sum()

plt.figure(figsize=(10, 3))
sns.barplot(data=organ_counts.sort_values(by=['count']), x='Organ', y='count')
plt.xticks(rotation=90)
plt.title("Distribution of Injury")
plt.xlabel("Injury --->")
plt.ylabel("Count --->")
plt.show()
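The per-column counts behind the bar plot can also be computed without an explicit loop. The sketch below uses a tiny hand-made frame: the two label columns reuse the five rows printed by train_csv.head(), while the patient_id values are made up for illustration.

```python
import pandas as pd

# Toy frame: label values copied from the head() output, patient_id values hypothetical.
toy = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5],
    "spleen_high": [1, 0, 0, 0, 0],
    "any_injury": [1, 0, 0, 0, 1],
})

# Sum every binary label column at once, then sort for plotting.
label_counts = toy.drop(columns="patient_id").sum().sort_values()

# Share of patients with each label.
label_rates = label_counts / len(toy)
print(label_counts)
print(label_rates)
```

On the real data, the same two lines would be applied to train_csv instead of toy.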
Thanks to Franklin Shih0617 (RSNA Abdominal Trauma Detect EDA animation) for providing
great animations. Do check it out.
file_1 = ['/kaggle/input/rsna-2023-abdominal-trauma-detection/train_images/10004/21057/' + file
          for file in os.listdir('/kaggle/input/rsna-2023-abdominal-trauma-detection/train_images/10004/21057')]
file_2 = ['/kaggle/input/rsna-2023-abdominal-trauma-detection/train_images/10004/51033/' + file
          for file in os.listdir('/kaggle/input/rsna-2023-abdominal-trauma-detection/train_images/10004/51033')]
sample_files = file_1 + file_2
fig, ax = plt.subplots()
im = ax.imshow(sample_vid[0], cmap=plt.cm.bone)
HTML(ani.to_jshtml())
<IPython.core.display.HTML object>
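The cell above uses sample_vid and ani without showing how they are built. The sketch below is one plausible way to do it with matplotlib's FuncAnimation, not the notebook's original code. It uses random arrays in place of DICOM slices, and make_slice_animation is a hypothetical helper name.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import animation


def make_slice_animation(frames, interval_ms=50):
    """Show the first frame, then swap in each following frame in turn."""
    fig, ax = plt.subplots()
    ax.axis("off")
    im = ax.imshow(frames[0], cmap=plt.cm.bone)

    def update(frame_index):
        im.set_array(frames[frame_index])
        return [im]

    ani = animation.FuncAnimation(fig, update, frames=len(frames), interval=interval_ms)
    return fig, ani


# Synthetic stand-in for a list of slices (assumption: real code would use pixel arrays).
rng = np.random.default_rng(0)
sample_vid = [rng.integers(0, 4096, size=(64, 64)) for _ in range(5)]

fig, ani = make_slice_animation(sample_vid)
html_text = ani.to_jshtml()  # in a notebook: HTML(html_text) from IPython.display
plt.close(fig)
```

With real data, sample_vid could be a list of pixel_array values read from sample_files. Sorting the slices (for example by Instance Number) before animating would likely be needed, since os.listdir does not guarantee an order.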
3 | Preprocessing 🔨
Our data is stored as DICOM files, but we want it in NPY format. Converting everything can
take a long time, so let's try the first 80 inputs.
base_dir = '/kaggle/input/rsna-2023-abdominal-trauma-detection/train_images'
train_list = sorted(os.listdir(base_dir))[:80]  # first 80 entries, matching the text above

lis = []
for folder_1 in train_list:
    folder_1_list = sorted(os.listdir(base_dir + '/' + folder_1))
    for folder_2 in folder_1_list:
        folder_2_list = sorted(os.listdir(base_dir + '/' + folder_1 + '/' + folder_2))
        for files in folder_2_list:
            file = pydicom.dcmread(base_dir + '/' + folder_1 + '/' + folder_2 + '/' + files)
            arr = file.pixel_array
            # np.resize repeats or truncates values; it does not interpolate.
            # For slices that are already 512 x 512 it leaves the data unchanged.
            arr = np.resize(arr, new_shape=(512, 512))
            lis.append(arr)
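The loop above only collects arrays in memory. A minimal sketch of the remaining step, writing them out as NPY and reading them back, could look like this. The helper save_slices_as_npy, the file name and the constant-valued slices are all hypothetical, and a temporary directory stands in for the real output folder.

```python
import os
import tempfile

import numpy as np


def save_slices_as_npy(slices, out_dir, name):
    """Stack 2D slices into one (num_slices, rows, cols) volume and save it as .npy."""
    volume = np.stack(slices, axis=0)
    out_path = os.path.join(out_dir, name + ".npy")
    np.save(out_path, volume)
    return out_path


# Synthetic stand-in for the arrays collected in lis.
slices = [np.full((512, 512), i, dtype=np.uint16) for i in range(3)]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = save_slices_as_npy(slices, tmp_dir, "10004_21057")
    restored = np.load(path)  # loaded fully into memory before the directory is removed

print(restored.shape, restored.dtype)
```

Saving one file per series, rather than one file for all patients, keeps each array small enough to load on demand. That is a design choice of this sketch, not something the notebook specifies.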
4 | Ending ☑️
That's it for today, guys!