Data Mining 2

- The document discusses hyperparameter optimization of a neural network model for MNIST digit classification using Optuna.
- It loads and preprocesses the MNIST dataset, then defines models with hyperparameters like number of layers and units per layer.
- Optuna is used to run trials of each model by varying the hyperparameters, training the models, and returning the validation accuracy. The best trials are identified to find the optimal hyperparameters.

Uploaded by

21800768


Week 14

Neural Network
# Data handling
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt

# Model
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.initializers import GlorotNormal
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.datasets import mnist

# Hyperparameter optimization
import optuna

# Layers and models modules (used for the CNN in the bonus question)
from tensorflow.keras import layers, models

Q1. Use the keras package to perform the tasks below. (The keras package may be used from Python as well as from R.)
# Implement the neural net architecture modeled in the lecture notes
# Load the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Preprocess the data: flatten the images and scale the pixel values to [0, 1]
train_images = train_images.reshape((60000, 28 * 28)).astype('float32') / 255
test_images = test_images.reshape((10000, 28 * 28)).astype('float32') / 255

# Convert labels to one-hot encoding
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Initialize the model
model = Sequential([
    # Two hidden layers
    Dense(400, activation='relu',
          kernel_initializer=GlorotNormal(), input_shape=(28 * 28,)),
    Dropout(0.3),
    Dense(200, activation='relu',
          kernel_initializer=GlorotNormal()),
    Dropout(0.3),
    # Output layer
    Dense(10, activation='softmax',
          kernel_initializer=GlorotNormal())
])

# Compile the model
model.compile(
    loss='categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)

# Model fitting
history = model.fit(
    train_images, train_labels,
    epochs=100,
    validation_split=0.1
)

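- Load the MNIST dataset and preprocess the image data into a form suitable for the model
- Build an artificial neural network using the Sequential model
- Compile the model to set the loss function, optimizer, and evaluation metric
- Fit the compiled model to the actual data (epochs: number of passes over the data). Here the number of epochs is set to 100.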
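As a rough check on the size of the network described above, the sketch below counts the trainable parameters of a fully connected 784-400-200-10 network. The helper name `dense_param_count` is made up for this illustration; dropout layers contribute no parameters.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the input width followed by each Dense layer's width.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total


# 28 * 28 inputs, hidden layers of 400 and 200 units, 10 output classes
q1_params = dense_param_count([28 * 28, 400, 200, 10])
print(q1_params)  # 396210
```

The printed total should agree with the trainable-parameter count that `model.summary()` reports for the Q1 model.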

# Result
# Assuming 'history' is the object returned from the 'fit' method
# Plot training & validation accuracy values
plt.figure(figsize=(14, 5))

plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')

# Plot training & validation loss values
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')

plt.show()

# train_loss and train_accuracy are not defined in the original listing;
# evaluating the model on the training data is assumed here
train_loss, train_accuracy = model.evaluate(train_images, train_labels)
print(f"Training Loss: {round(train_loss, 3)}")
print(f"Training Accuracy: {round(train_accuracy, 3)}")
print()
# Evaluate the model on the test data
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
print()
print(f"Test Loss: {round(test_loss, 3)}")
print(f"Test Accuracy: {round(test_accuracy, 3)}")

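- The results after training were visualized, and they show that the model trained well. The training loss is very low and the training accuracy is very high at 99.5%. The test loss is slightly higher than the training loss but is still very low. The test accuracy is also high at 97%.
- The gap between training and test accuracy is small, and the model has high accuracy and low loss.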

Q2. Optimize the Hyper parameters

- Use the Optuna library to vary the parameters and check the validation accuracy
- epochs: fixed at 20
- trials: 15~20 (20 trials when there are many combinations, 15 when there are few)

Split the data set

- train: 50,000
- validation: 10,000
- test: 10,000

# Create a data frame for comparing the test results
result_df = pd.DataFrame({'Hyper parameter': [], 'Test Accuracy': []})
# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Preprocess the data: flatten the images and scale the pixel values to [0, 1]
x_train = x_train.reshape((60000, 28 * 28)).astype('float32') / 255
x_test = x_test.reshape((10000, 28 * 28)).astype('float32') / 255

# Create a validation set from the last 10000 images of the training set
x_val = x_train[50000:]
y_val = y_train[50000:]

# Use the first 50000 images for training
x_train = x_train[:50000]
y_train = y_train[:50000]

# Convert labels to one-hot encoding
y_train = to_categorical(y_train, 10)
y_val = to_categorical(y_val, 10)
y_test = to_categorical(y_test, 10)
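- Load the MNIST dataset, preprocess the image data, and build the datasets used for model training
  o Load the MNIST dataset, flatten each image from a 28x28 2D array into a 1D array (784 pixels), and scale the pixel values to the range [0, 1]
  o Split off the last 10,000 images of the training set to use as the validation set
  o Use the first 50,000 images for training. Convert the labels of each dataset to one-hot encoding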
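The preprocessing steps listed above can be sketched with NumPy alone. The arrays below are random stand-ins for MNIST (60 images instead of 60,000), and the `*_demo` names are invented for this illustration; `np.eye(10)[labels]` is one common way to one-hot encode and is assumed here to give the same encoding as `to_categorical(labels, 10)`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for MNIST: 60 "images" of 28x28 uint8 pixels, labels 0-9
images = rng.integers(0, 256, size=(60, 28, 28), dtype=np.uint8)
labels = rng.integers(0, 10, size=60)

# Flatten each image to 784 values and scale to [0, 1]
x = images.reshape((images.shape[0], 28 * 28)).astype('float32') / 255

# One-hot encode: row i of the identity matrix is the encoding of class i
y = np.eye(10, dtype='float32')[labels]

# Hold out the last sixth for validation (10,000 of 60,000 in the real data)
n_val = images.shape[0] // 6
x_train_demo, x_val_demo = x[:-n_val], x[-n_val:]
y_train_demo, y_val_demo = y[:-n_val], y[-n_val:]

print(x_train_demo.shape, x_val_demo.shape, y_train_demo.shape)
```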

def predict(model_func, best_trial):
    # Build the model with the best hyperparameters
    best_model = model_func(best_trial)

    # Compile the model
    best_model.compile(optimizer='adam',
                       loss='categorical_crossentropy',
                       metrics=['accuracy'])

    # Train the model; reuse the tuned batch size when the study searched
    # over one, otherwise fall back to the Keras default of 32
    best_model.fit(x_train, y_train,
                   epochs=20,
                   batch_size=best_trial.params.get('batch_size', 32),
                   validation_data=(x_val, y_val),
                   verbose=1)

    # Evaluate the model on the test data
    test_loss, test_accuracy = best_model.evaluate(x_test, y_test, verbose=1)

    print("Hyper parameter of best model: ", best_trial.params)
    print(f'Test Accuracy: {round(test_accuracy, 3)}')
    return test_accuracy
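- Define a function that builds a model from a model function and the best hyperparameters, trains it, and evaluates it on the test data
  o Build the model using the best hyperparameters
  o Compile the model with the Adam optimizer and the categorical crossentropy loss, tracking accuracy
  o Train the model on the training data and evaluate it on the validation data, with epochs set to 20
  o Evaluate the model on the test data and print the test accuracy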

1). Vary the number of layers

- Evaluate performance while varying the number of hidden layers from 2 to 4
- The first layer has 256 units; each layer after it has 128

def model_by_layers(trial):
    # Hyperparameters to be tuned by Optuna
    num_layers = trial.suggest_int('num_layers', 1, 3)

    # Model architecture
    model = Sequential()
    model.add(Dense(units=256, activation='relu', input_shape=(28 * 28,)))
    model.add(Dropout(rate=0.3))
    # Hidden layer
    for i in range(num_layers):
        model.add(Dense(units=128, activation='relu'))
        model.add(Dropout(rate=0.3))
    model.add(Dense(10, activation='softmax'))

    return model

def objective(trial):
    # Create a model for this trial
    model = model_by_layers(trial)

    # Compile the model
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    # Train the model
    history = model.fit(
        x_train,
        y_train,
        validation_data=(x_val, y_val),
        shuffle=True,
        batch_size=32,
        epochs=20,
        verbose=1
    )

    # Evaluate the model on the validation set
    val_accuracy = history.history['val_accuracy'][-1]

    # Return the validation accuracy
    return val_accuracy

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=10)

# Get the best trial
best_trial_by_layers = study.best_trial
# Convert the study trials into a DataFrame
result_by_layers = study.trials_dataframe()
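- Functions that vary the number of layers of the neural network and use Optuna to find the optimal hyperparameter
  o The model_by_layers(trial) function builds the layers from the given hyperparameter
  o The objective function compiles and trains a model with the hyperparameter suggested by Optuna. This is repeated 10 times to find the best structure, and the result of each trial is stored in a DataFrame
  o study.optimize(objective, n_trials=10) trains the model the given number of times and finds the optimal hyperparameter.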
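The cycle that `study.optimize` drives (suggest a value, score it, keep the best) can be pictured with a few lines of plain Python. This is a random-search sketch, not Optuna's own sampler (Optuna's default is a TPE sampler), and `toy_objective` is a made-up stand-in for training a network, so the numbers it returns mean nothing.

```python
import random


def toy_objective(num_layers):
    """Hypothetical stand-in for 'train a model and return validation accuracy'."""
    # Pretend accuracy peaks at one extra layer and falls off slightly after that
    return 0.981 - 0.001 * (num_layers - 1)


def random_search(objective, n_trials, seed=0):
    """Try random num_layers values in [1, 3] and keep the best-scoring trial."""
    rng = random.Random(seed)
    trials = []
    for number in range(n_trials):
        # Plays the role of trial.suggest_int('num_layers', 1, 3)
        params = {'num_layers': rng.randint(1, 3)}
        value = objective(params['num_layers'])
        trials.append({'number': number, 'params': params, 'value': value})
    best = max(trials, key=lambda t: t['value'])  # direction='maximize'
    return best, trials


best, trials = random_search(toy_objective, n_trials=10)
print(best['params'], round(best['value'], 3))
```

With only three possible values for `num_layers`, ten trials will usually revisit the same value several times, which is why the results are later averaged per value.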

result_by_layers.groupby('params_num_layers')[['value']].mean().sort_values('value', ascending=False)

- When params_num_layers is 1 (two hidden layers in total), the mean validation accuracy is the highest at 0.981000. Performance drops slightly as more layers are added, which suggests that overfitting occurred.
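The groupby call above ranks `num_layers` by the mean over all trials that used it, whereas `study.best_trial` is the single highest-scoring trial, and the two need not point to the same value. The sketch below uses invented accuracy values, not results from this assignment, to show how they can differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (num_layers, validation accuracy) pairs, invented for illustration
trials = [(1, 0.9810), (1, 0.9808), (2, 0.9815), (2, 0.9790), (3, 0.9795)]

by_layers = defaultdict(list)
for num_layers, value in trials:
    by_layers[num_layers].append(value)

# Counterpart of groupby('params_num_layers')[['value']].mean()
mean_by_layers = {k: mean(v) for k, v in by_layers.items()}
best_by_mean = max(mean_by_layers, key=mean_by_layers.get)

# Counterpart of study.best_trial: the single best-scoring trial
best_single = max(trials, key=lambda t: t[1])[0]

print(best_by_mean, best_single)  # 1 2
```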

# Predict
test_accuracy = predict(model_by_layers, best_trial_by_layers)
layer_result_df = pd.DataFrame({'Hyper parameter': ['Layers'], 'Test Accuracy': [test_accuracy]})
result_df = pd.concat([result_df, layer_result_df], ignore_index=True)

- With num_layers set to 2, the test accuracy is 0.98, the best performance.

2). Vary the number of units per layer

- Fix the number of hidden layers at 2 and check the accuracy while varying the units of the first hidden layer over [64, 128, 256, 512]
- The second hidden layer has half as many units as the first

def model_by_units(trial):
    # Hyperparameters to be tuned by Optuna
    num_units = trial.suggest_categorical('num_units', [64, 128, 256, 512])
    # Model architecture
    model = Sequential()
    # Hidden layer 1
    model.add(Dense(units=num_units, activation='relu', input_shape=(28 * 28,)))
    model.add(Dropout(rate=0.3))
    # Hidden layer 2
    model.add(Dense(units=num_units // 2, activation='relu'))
    # Output layer
    model.add(Dense(10, activation='softmax'))

    return model

def objective(trial):
    # Create a model for this trial
    model = model_by_units(trial)

    # Compile the model
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    # Train the model
    history = model.fit(
        x_train,
        y_train,
        validation_data=(x_val, y_val),
        shuffle=True,
        batch_size=32,
        epochs=20,
        verbose=0
    )

    # Evaluate the model on the validation set
    val_accuracy = history.history['val_accuracy'][-1]

    # Return the validation accuracy
    return val_accuracy

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=15)

# Get the best trial
best_trial_by_units = study.best_trial
# Convert the study trials into a DataFrame
result_by_units = study.trials_dataframe()

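- Define a function that changes the network structure by varying num_units, and use Optuna to find the optimal hyperparameter.
  o The model_by_units(trial) function builds the network from the given hyperparameter (num_units). It defines the model structure, including the number of units per layer and the dropout layer
  o The objective(trial) function compiles and trains that model, then returns the validation accuracy. It is used to evaluate the model for a given hyperparameter value and to find the optimal one
  o study.optimize(objective, n_trials=15) searches for the optimal hyperparameter over 15 trials and thereby determines the best model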

result_by_units.groupby('params_num_units')[['value']].mean().sort_values('value', ascending=False)
# Predict
test_accuracy = predict(model_by_units, best_trial_by_units)
unit_result_df = pd.DataFrame({'Hyper parameter': ['Units'], 'Test Accuracy': [test_accuracy]})
result_df = pd.concat([result_df, unit_result_df], ignore_index=True)

- From the table above and the prediction result, the value is largest at 0.981786 when params_num_units is 512. The smaller params_num_units is, the more underfitting appears to occur.

3). Add dropout and vary the dropout rate

- Fix the hidden layers at two (256, 128) and check performance while varying the dropout rate after each layer over [0, 0.1, 0.2, 0.3, 0.4, 0.5]
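For reference, dropout at rate `p` zeroes each activation with probability `p` during training and rescales the surviving ones by `1 / (1 - p)` so that the expected activation is unchanged. The sketch below is a NumPy illustration of that idea, not the Keras implementation; the function name `dropout_forward` is hypothetical.

```python
import numpy as np


def dropout_forward(activations, rate, rng):
    """Apply inverted dropout to an array of activations (training-time behavior)."""
    if rate == 0:
        return activations.copy()  # nothing is dropped
    keep_mask = rng.random(activations.shape) >= rate  # True where the unit survives
    return activations * keep_mask / (1.0 - rate)


rng = np.random.default_rng(0)
acts = np.ones((1000, 256), dtype='float32')

# Same candidate rates as in the search above
for rate in [0, 0.1, 0.2, 0.3, 0.4, 0.5]:
    out = dropout_forward(acts, rate, rng)
    dropped = float((out == 0).mean())
    print(rate, round(dropped, 2), round(float(out.mean()), 2))
```

The fraction of zeroed units should track the rate while the mean activation stays close to 1, which is the point of the rescaling.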

def model_by_dropout(trial):
    # Hyperparameters to be tuned by Optuna
    dropout_rate = trial.suggest_categorical('dropout_rate',
                                             [0, 0.1, 0.2, 0.3, 0.4, 0.5])

    # Model architecture
    model = Sequential()
    # Hidden layer
    model.add(Dense(units=256, activation='relu', input_shape=(28 * 28,)))
    model.add(Dropout(rate=dropout_rate))
    model.add(Dense(units=128, activation='relu'))
    model.add(Dropout(rate=dropout_rate))
    # Output layer
    model.add(Dense(10, activation='softmax'))

    return model

def objective(trial):
    # Create a model for this trial
    model = model_by_dropout(trial)

    # Compile the model
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    # Train the model
    history = model.fit(
        x_train,
        y_train,
        validation_data=(x_val, y_val),
        shuffle=True,
        batch_size=32,
        epochs=20,
        verbose=0
    )

    # Evaluate the model on the validation set
    val_accuracy = history.history['val_accuracy'][-1]

    # Return the validation accuracy
    return val_accuracy

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=20)

# Get the best trial
best_trial_by_dropouts = study.best_trial
# Convert the study trials into a DataFrame
result_by_dropouts = study.trials_dataframe()

- Build the model while varying dropout_rate, and use Optuna to search for the optimal dropout rate

result_by_dropouts.groupby('params_dropout_rate')[['value']].mean().sort_values('value', ascending=False)

# Predict
test_accuracy = predict(model_by_dropout, best_trial_by_dropouts)
dropout_result_df = pd.DataFrame({'Hyper parameter': ['Dropouts'], 'Test Accuracy': [test_accuracy]})
result_df = pd.concat([result_df, dropout_result_df], ignore_index=True)

- From the table above and the prediction result, performance is highest at 0.981350 when dropout_rate is 0.1.

4). Vary the batch size

- Fix the hidden layers at two (256, 128) and vary the batch size over [16, 32, 64, 128]

def model_by_batch(trial):
    # No architecture hyperparameter is tuned here;
    # the batch size is suggested inside objective()
    # Model architecture
    model = Sequential()
    # Hidden layer
    model.add(Dense(units=256, activation='relu', input_shape=(28 * 28,)))
    model.add(Dropout(rate=0.3))
    model.add(Dense(units=128, activation='relu'))
    model.add(Dropout(rate=0.3))
    # Output layer
    model.add(Dense(10, activation='softmax'))

    return model

def objective(trial):
    # Create the fixed-architecture model for this trial
    model = model_by_batch(trial)

    # Compile the model
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    # Suggest a batch size
    batch_size = trial.suggest_categorical('batch_size', [16, 32, 64, 128])

    # Train the model with the suggested batch size
    history = model.fit(
        x_train,
        y_train,
        validation_data=(x_val, y_val),
        shuffle=True,
        batch_size=batch_size,
        epochs=20,
        verbose=0
    )

    # Evaluate the model on the validation set
    val_accuracy = history.history['val_accuracy'][-1]
    # Return the validation accuracy
    return val_accuracy

# The Optuna study code is the same as before
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=20)

# Get the best trial
best_trial_by_batch = study.best_trial
# Convert the study trials into a DataFrame
result_by_batch = study.trials_dataframe()

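- Train the model while varying the batch size (batch_size), and use Optuna to search for the optimal batch size
  o The model_by_batch function builds a neural network with a fixed architecture (256 and 128 units, dropout rate 0.3)
  o The objective function trains and validates the model using the batch size that Optuna suggests from the candidates
  o study.optimize repeats this 20 times to find the optimal batch size, and the result of each trial is stored in a DataFrame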

# The larger the batch size, the more robust the model
result_by_batch.groupby('params_batch_size')[['value']].mean().sort_values('value', ascending=False)

# Predict
test_accuracy = predict(model_by_batch, best_trial_by_batch)
batch_result_df = pd.DataFrame({'Hyper parameter': ['Batch size'], 'Test Accuracy': [test_accuracy]})
result_df = pd.concat([result_df, batch_result_df], ignore_index=True)

- From the table above and the prediction result, accuracy is highest at 0.98156 when batch_size is 128.

5). Optimize the Hyper parameters

- Optimize the parameters by combining the search spaces of the parameters explored above

def model_by_all(trial):
    # Hyperparameters to be tuned by Optuna
    num_layers = trial.suggest_int('num_layers', 1, 3)
    dropout_rate = trial.suggest_float('dropout_rate', 0.0, 0.5)
    # Note: num_units is sampled and recorded in the trial, but the layer
    # widths below come from the units_layer_* parameters instead
    num_units = trial.suggest_categorical('num_units', [64, 128, 256, 512])

    # Model architecture
    model = Sequential()
    # Hidden layer
    model.add(Dense(units=trial.suggest_int('units_layer_0', 256, 512),
                    activation='relu', input_shape=(28 * 28,)))
    model.add(Dropout(rate=dropout_rate))
    for i in range(num_layers):
        model.add(Dense(units=trial.suggest_int(f'units_layer_{i+1}', 32, 128),
                        activation='relu'))
        model.add(Dropout(rate=dropout_rate))
    model.add(Dense(10, activation='softmax'))

    return model

def objective(trial):
    # Create a model for this trial
    model = model_by_all(trial)

    # Compile the model
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy', metrics=['accuracy'])
    batch_size = trial.suggest_categorical('batch_size', [16, 32, 64, 128])
    # Train the model; validate on the validation split rather than the
    # test set (the original listing passed x_test, y_test here), so the
    # test data stays unseen during tuning
    history = model.fit(
        x_train,
        y_train,
        validation_data=(x_val, y_val),
        shuffle=True,
        batch_size=batch_size,
        epochs=20,
        verbose=0
    )

    # Evaluate the model on the validation set
    val_accuracy = history.history['val_accuracy'][-1]

    # Return the validation accuracy
    return val_accuracy

# The Optuna study code is the same as before
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=40)

# Get the best trial
best_trial_by_all = study.best_trial
# Convert the study trials into a DataFrame
result_by_hp = study.trials_dataframe()

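- Build the model while varying four hyperparameters (num_layers, dropout_rate, num_units, batch_size), and use Optuna to search for the optimal hyperparameters
  o The model_by_all function builds the network from several hyperparameters such as the dropout rate, the number of layers, and the number of neurons
  o The objective function trains the model with the hyperparameters suggested by Optuna and evaluates its accuracy on the validation data
  o study.optimize repeats this 40 times to find the optimal hyperparameters, and the result of each trial is stored in a DataFrame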

result_df.sort_values('Test Accuracy', ascending=False)

# Predict
test_accuracy = predict(model_by_all, best_trial_by_all)
all_result_df = pd.DataFrame({'Hyper parameter': ['All'], 'Test Accuracy': [test_accuracy]})
result_df = pd.concat([result_df, all_result_df], ignore_index=True)

- In the prediction result, accuracy was highest at 0.982 when all parameters were tuned (All).

# Visualization
tmp_df = result_df.sort_values('Test Accuracy', ascending=False)
plt.figure(figsize=(8, 6))
plt.bar(tmp_df['Hyper parameter'], tmp_df['Test Accuracy'], color='navy')
plt.xlabel('Hyper parameter Tuning Method')
plt.ylabel('Test Accuracy')
plt.title('Test Accuracy for Different Hyperparameter Tuning Methods', size=16)
plt.ylim(0.98, 0.985)  # Set the y-axis range
plt.show()
- The results are shown as a plot. All of the test accuracies are similar, but All performs best, at close to 0.982.

Bonus Q1
MNIST classification with a CNN
# MNIST
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Preprocess the data: reshape the images to 28x28x1 and scale the pixel values to [0, 1]
x_train = x_train.reshape((60000, 28, 28, 1)).astype('float32') / 255
x_test = x_test.reshape((10000, 28, 28, 1)).astype('float32') / 255

# Use the last 10000 images of the training set as the validation set
x_val = x_train[50000:]
y_val = y_train[50000:]
x_train = x_train[:50000]
y_train = y_train[:50000]

# Convert labels to one-hot encoding
y_train = to_categorical(y_train, 10)
y_val = to_categorical(y_val, 10)
y_test = to_categorical(y_test, 10)

# Convolution layers
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(28, 28, 1)))  # grayscale, so one channel
model.add(layers.MaxPooling2D((2, 2)))  # pooling
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))  # pooling
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

# Add Dense layers, similar to the earlier models
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

# Compile the model
# rmsprop: keeps an exponential moving average of squared gradients, so recent
# values carry more weight and the influence of older values drops off sharply
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=20, batch_size=64,
          validation_data=(x_val, y_val))

# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test accuracy:', test_acc)

CNN_result_df = pd.DataFrame({'Hyper parameter': ['CNN'], 'Test Accuracy': [test_acc]})
result_df = pd.concat([result_df, CNN_result_df], ignore_index=True)

- Build and evaluate an image classification model using a CNN.

# Keep the two best models and show them as a table
tmp_df = result_df.sort_values('Test Accuracy', ascending=False)
tmp_df = tmp_df.iloc[:2,]
tmp_df

# Visualize the data frame
plt.figure(figsize=(6, 6))
plt.bar(tmp_df['Hyper parameter'], tmp_df['Test Accuracy'], color='navy')
plt.xlabel('Hyper parameter Tuning Method')
plt.ylabel('Test Accuracy')
plt.title('Test Accuracy for Different Model', size=16)
plt.ylim(0.96, 1)  # Set the y-axis range
plt.show()

The results are shown as a table and a plot. The CNN achieves higher accuracy than the model with all parameters tuned: 0.9907 for the CNN versus 0.9819 for All, a difference in accuracy of about 0.01.

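Deep learning terms

Source: https://www.slideshare.net/w0ong/ss-82372826

Epoch: One epoch means the entire dataset has gone through one forward pass/backward pass, i.e., training over the whole dataset has been completed once.

batch size: The unit into which the samples are divided during training (= mini batch).

iteration: The number of training steps over the samples divided by batch_size (the unit at which parameters are updated).

Example)
train data: 54,000 samples
epochs: 100
batch size: 32
With these settings the entire dataset is trained on 100 times, and one epoch requires 54,000 / 32 = 1687.5, rounded up to 1688 iterations. This can be confirmed in the training progress output when the code is actually run.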
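The arithmetic in the example can be checked with a short sketch. The function name `iterations_per_epoch` is made up for this illustration; the rounding up reflects the final, smaller batch that still triggers one parameter update.

```python
import math


def iterations_per_epoch(n_samples, batch_size):
    """Number of parameter updates needed to see every sample once."""
    # The last batch may be smaller than batch_size, hence the ceiling
    return math.ceil(n_samples / batch_size)


# 60,000 MNIST training images with validation_split=0.1 leave 54,000 for training
steps = iterations_per_epoch(54_000, 32)
print(steps)        # 1688
print(steps * 100)  # 168800 updates over 100 epochs
```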

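CNN
What is the problem with processing images with a fully connected Neural Network?
→ The flatten step destroys the spatial relationships between pixels!
→ The background is learned as well, which increases the risk of overfitting

Background of CNNs
"Let's train the model by extracting the patterns and features of an image, the way humans recognize images."
[Figure: Convolution Layer, Fully connected Layer]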

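Convolution layer

The features of the image are extracted by multiplying the image matrix by a filter (kernel, mask) and sliding the filter across the image. (The filter weights are updated as the model trains.)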
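The multiply-and-slide operation can be written out directly in NumPy. This is a minimal single-channel sketch with stride 1 and no padding; as in deep learning frameworks, the filter is applied without flipping (strictly, a cross-correlation). The function name `conv2d_valid` and the hand-picked edge filter are illustrative only, since in a real CNN the filter values are learned.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide kernel over image (stride 1, no padding), summing elementwise products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# A 6x6 image whose right half is bright, and a hand-picked vertical-edge filter
image = np.zeros((6, 6))
image[:, 3:] = 1.0
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (4, 4)
print(feature_map)        # every row is [0, 3, 3, 0]
```

The feature map responds only where the window straddles the dark-to-bright boundary, which is the sense in which a filter "extracts" a feature.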

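Padding

Each convolution makes the image smaller, and the information at the edges is lost. If important features lie at the edges of the image, learning becomes inefficient, so the matrix is surrounded with zeros, as in the figure on the right, so that the image size is preserved after the convolution.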
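The shrinkage described above follows the usual size formula, floor((n + 2p - k) / s) + 1 for an input side n, filter size k, padding p, and stride s. The sketch below applies it to a 28-pixel side and then traces the unpadded layer stack of the bonus CNN; the helper name `conv_output_size` is hypothetical.

```python
def conv_output_size(n, kernel, padding=0, stride=1):
    """Spatial size after a convolution or pooling window along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


# Without padding, a 3x3 filter shrinks a 28-pixel side to 26
print(conv_output_size(28, 3))             # 26
# One ring of zeros (padding=1) keeps the size at 28
print(conv_output_size(28, 3, padding=1))  # 28

# Trace the unpadded stack used in the bonus CNN: conv, pool, conv, pool, conv
size = 28
for kernel, stride in [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]:
    size = conv_output_size(size, kernel, stride=stride)
print(size, size * size * 64)              # 3 576
```

So the last convolution of that model outputs 3x3 maps with 64 channels, and the Flatten layer hands 576 values to the Dense layers.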

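Pooling

Pooling compresses the image in order to reduce computation, and it is used optionally depending on the model. There are max pooling and average pooling, and CNNs mainly use max pooling. Pooling emphasizes the main features and reduces the amount of computation. It also makes the model more robust to noise, which improves generalization.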
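A 2x2 max pooling step can be sketched in NumPy as follows. This is a single-channel illustration with stride 2, the default behavior of a (2, 2) max pooling layer; the function name `max_pool_2x2` and the sample values are made up.

```python
import numpy as np


def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2; a trailing odd row or column is dropped."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2]
    # Group pixels into 2x2 blocks, then take the maximum of each block
    blocks = trimmed.reshape(h2, 2, w2, 2)
    return blocks.max(axis=(1, 3))


fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 0, 5, 6],
               [1, 2, 7, 8]])
pooled = max_pool_2x2(fm)
print(pooled)  # [[4 2]
               #  [2 8]]
```

Each output value keeps only the strongest response in its block, so the map shrinks to a quarter of its area while the dominant features survive.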
Flatten

Fully connected layer

The features extracted by the convolution layers are then fed into fully connected layers, as in the Neural Network implemented earlier, to learn the features of the image.
