KT 01 Intro2Keras
CS-F441
● Keras is widely used in industry, including at Netflix, Uber, Yelp, Instacart, Zocdoc, Square, and many others, so most people have already interacted with products built on it.
Advantages:
○ It offers consistent and simple APIs, i.e. it minimizes the number of user actions required for common use cases.
○ It also provides clear and actionable feedback upon user error.
○ This ease of use does not come at the cost of reduced flexibility, because Keras integrates with lower-level deep learning
frameworks (in particular TensorFlow).
○ Runs seamlessly on CPUs and GPUs.
Installation
1. Install Engine
a. Keras backend engines: TensorFlow, Theano, or CNTK.
b. Installing TensorFlow is recommended because models can then be deployed in production via TensorFlow Serving.
(See the instructions at https://www.tensorflow.org/install/)
2. To install Keras on Linux/Mac, use:
a. $ sudo pip install keras
Source: https://keras.io/
Keras Models
1. The core data structure of Keras is a model, a way to organize layers.
2. The simplest type of model is the Sequential model, a linear stack of layers.
3. For more complex architectures, use the Keras functional API, which allows you to build arbitrary graphs of layers.
The sequential API allows you to create models layer-by-layer for most problems. It is limited in that it does not allow you to
create models that share layers or have multiple inputs or outputs.
Alternatively, the functional API allows you to create models that have a lot more flexibility as you can easily define models
where layers connect to more than just the previous and next layers. In fact, you can connect layers to any other layer. As a
result, creating complex networks such as siamese networks and residual networks becomes possible.
Keras Sequential Model
● The Sequential model is a linear stack of layers.
● A Sequential model can be created by passing a list of layer instances to the constructor:
from keras.models import Sequential
from keras.layers import Dense, Activation
model = Sequential([
    Dense(32, input_shape=(784,)),
    Activation('relu'),
    Dense(10),
    Activation('softmax'),
])
● Layers can also be added one at a time with the add() method:
model = Sequential()
model.add(Dense(32, input_shape=(784,)))
model.add(Activation('relu'))
1. Specifying the input shape
The first layer in a Sequential model expects information regarding the shape of the input.
This can be done by passing a value to the input_shape argument of the first layer. The batch size is not included in this argument.
model = Sequential()
model.add(Dense(32, input_shape=(784,)))
Note: The model will take as input arrays of shape (*, 784) and output arrays of shape (*, 32).
2. Compilation
The learning process is configured via the compile method, which takes an optimizer, a loss function, and a list of metrics.
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
3. Training
Keras models are trained on NumPy arrays of input data and labels. To train a model, use the fit method.
● x is the input data. A NumPy array (or array-like), or a list of arrays (in case the model has multiple inputs).
● y is the target data. A NumPy array (or array-like), or a list of arrays (in case the model has multiple outputs).
● batch_size: Integer or None. Number of samples per gradient update. If unspecified, batch_size will default to 32.
● epochs: Integer. Number of epochs to train the model. An epoch is an iteration over the entire x and y data provided.
● verbose: Integer. 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch.
● callbacks: List of callback instances. A callback is a set of functions applied during training to inspect the internal states and
statistics of the model, such as History() or ModelCheckpoint().
● validation_split: Float between 0 and 1. Fraction of the training data to be used as validation data.
● shuffle: Boolean (whether to shuffle the training data before each epoch).
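import numpy as np
# Dummy data: 1000 samples with 100 features each, and integer labels in [0, 10).
data = np.random.random((1000, 100))
labels = np.random.randint(10, size=(1000, 1))
# With a compiled model whose input shape and loss match this data:
# model.fit(data, labels, epochs=10, batch_size=32)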
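To make the interaction of batch_size, epochs and validation_split concrete, the sketch below does the same bookkeeping by hand with NumPy. It does no training. It assumes, as the Keras documentation describes, that the validation samples are taken from the end of the arrays before any shuffling. The helper name split_and_count is hypothetical and not part of Keras.

```python
import math
import numpy as np

def split_and_count(x, y, batch_size=32, epochs=10, validation_split=0.2):
    """Hypothetical helper: mimic the bookkeeping of fit(), not the training."""
    n_val = int(len(x) * validation_split)
    n_train = len(x) - n_val
    # Validation data is the last n_val samples, taken before any shuffling.
    x_train, y_train = x[:n_train], y[:n_train]
    x_val, y_val = x[n_train:], y[n_train:]
    # One gradient update per batch; the last batch may be smaller.
    steps_per_epoch = math.ceil(n_train / batch_size)
    total_updates = steps_per_epoch * epochs
    return (x_train, y_train), (x_val, y_val), steps_per_epoch, total_updates

data = np.random.random((1000, 100))
labels = np.random.randint(10, size=(1000, 1))
(train_x, train_y), (val_x, val_y), steps, updates = split_and_count(data, labels)
print(train_x.shape, val_x.shape, steps, updates)
```

With 1000 samples and validation_split=0.2, 800 samples are used for training, so a batch size of 32 gives 25 gradient updates per epoch.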
5. Evaluate
Returns the loss value & metrics values for the model in test mode.
evaluate(x=None, y=None, batch_size=None, verbose=1, sample_weight=None, steps=None, callbacks=None)
● batch_size: Integer or None. Number of samples per evaluation step. If unspecified, batch_size will default to 32.
● verbose: Verbosity mode. 0 = silent, 1 = progress bar.
● steps: Total number of steps (batches of samples) before declaring the evaluation round finished. Ignored with the default value of None.
Keras functional API
● The Keras functional API is the way to go for defining complex models, such as multi-output models, directed acyclic graphs, or
models with shared layers.
● In the following example, the model includes all layers required to compute b from a.
from keras.layers import Input, Dense
from keras.models import Model
a = Input(shape=(32,))
b = Dense(32)(a)
model = Model(inputs=a, outputs=b)
● For multi-input or multi-output models, pass lists of tensors instead, e.g. Model(inputs=[a1, a2], outputs=[b1, b2, b3]).
● Like the Sequential model, a functional model provides compile(), fit(), predict(), and evaluate().
Keras Layers
1. Conv2D
This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs.
2. Conv2DTranspose
This refers to the transposed convolution layer (sometimes called Deconvolution). The need for transposed convolutions
generally arises from the desire to use a transformation going in the opposite direction of a normal convolution.
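The convolution that Conv2D performs can be sketched in plain NumPy. This is a deliberately simplified illustration, not how Keras implements it: one channel, one filter, stride 1, no padding, and no bias. The function name conv2d_valid is made up for this example. As in most deep learning libraries, the kernel is not flipped (strictly speaking, this is a cross-correlation).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a 2D kernel over a 2D image without padding (stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the kernel and the patch under it, summed.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3))
feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)
print(feature_map)
```

Here a 4x4 input shrinks to a 2x2 feature map. A transposed convolution with the same kernel size maps in the opposite direction, from a 2x2 input to a 4x4 output.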
3. MaxPooling2D
This layer performs a max pooling operation on spatial data.
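A minimal NumPy sketch of 2x2 max pooling with stride 2 on a single 2D feature map, assuming even height and width. The function name max_pool_2x2 is hypothetical. The Keras layer additionally handles batches, channels, and padding.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2D array with even height and width."""
    h, w = feature_map.shape
    # Group pixels into non-overlapping 2x2 blocks, then take each block's max.
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 5],
               [0, 1, 7, 2],
               [6, 2, 3, 4]])
pooled = max_pool_2x2(fm)
print(pooled)
```

Each output value is the largest entry of one 2x2 block, so the spatial size is halved in both directions.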
4. Dense
Dense implements the operation: output = activation(dot(input, kernel) + bias) where activation is the element-wise
activation function passed as the activation argument, kernel is a weights matrix created by the layer, and bias is a bias vector
created by the layer (only applicable if use_bias is True).
model.add(Dense(32, input_shape=(16,)))
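The Dense operation above can be written out directly in NumPy. In this sketch the weights are random stand-ins for what the layer would create and learn, and ReLU is chosen as the activation purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = rng.random((5, 16))       # 5 samples, 16 features each
kernel = rng.random((16, 32))     # weights matrix: (input_dim, units)
bias = np.zeros(32)               # one bias value per unit

def relu(z):
    return np.maximum(z, 0.0)

# output = activation(dot(input, kernel) + bias)
output = relu(np.dot(batch, kernel) + bias)
print(output.shape)
```

An input of shape (*, 16) yields an output of shape (*, 32), which matches Dense(32, input_shape=(16,)).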
5. Dropout
Dropout randomly sets a fraction (rate) of the input units to 0 at each update during training, which helps prevent
overfitting.
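A NumPy sketch of training-time dropout. It assumes the "inverted dropout" variant, in which the surviving units are scaled up by 1 / (1 - rate) so that the expected activation stays the same, which is how TensorFlow's dropout behaves. The function below is illustrative, not the Keras implementation.

```python
import numpy as np

def dropout(inputs, rate, rng):
    """Training-time dropout: zero a random subset of units, rescale the rest."""
    keep_mask = rng.random(inputs.shape) >= rate
    # Scaling by 1 / (1 - rate) keeps the expected activation unchanged.
    return inputs * keep_mask / (1.0 - rate)

rng = np.random.default_rng(42)
activations = np.ones((4, 1000))
dropped = dropout(activations, rate=0.25, rng=rng)
print((dropped == 0).mean())   # fraction of units set to 0, roughly the rate
```

At test time no units are dropped and the inputs pass through unchanged.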
6. Flatten
Flattens the input. Does not affect the batch size.
The data_format argument gives the ordering of the dimensions in the inputs and can be either channels_last (default) or
channels_first. Its purpose is to preserve weight ordering when switching a model from one data format to another.
keras.layers.Flatten(data_format=None)
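In NumPy terms, flattening amounts to a reshape that keeps the batch axis and merges all remaining axes. The shapes below are arbitrary example values.

```python
import numpy as np

batch = np.zeros((64, 4, 4, 128))          # (batch, height, width, channels)
flat = batch.reshape(batch.shape[0], -1)   # keep the batch axis, merge the rest
print(flat.shape)
```

The batch size of 64 is untouched, while the 4 x 4 x 128 values of each sample become one vector of length 2048.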
7. Batch Normalization
Normalizes the activations of the previous layer at each batch, i.e. applies a transformation that maintains the mean activation close
to 0 and the activation standard deviation close to 1.
keras.layers.BatchNormalization(axis=-1)
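A simplified NumPy sketch of what batch normalization computes in training mode for a 2D input of shape (batch, features). It omits the moving averages that the Keras layer keeps for use at inference time. The parameters gamma and beta stand in for the learned scale and offset, fixed here at 1 and 0.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, epsilon=1e-3):
    """Normalize each feature (last axis) over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + epsilon)
    return gamma * x_hat + beta

rng = np.random.default_rng(1)
activations = rng.normal(loc=5.0, scale=3.0, size=(256, 8))
normalized = batch_norm(activations)
print(normalized.mean(axis=0).round(3), normalized.std(axis=0).round(3))
```

After the transformation each feature has a mean close to 0 and a standard deviation close to 1, regardless of the original scale.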
Coding Example
The Problem: Fashion MNIST classification
1. Importing Libraries
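import numpy as np
import matplotlib.pyplot as plt
# %matplotlib inline  (uncomment in a Jupyter notebook)
import keras
from keras.datasets import fashion_mnist
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, LeakyReLU
from sklearn.model_selection import train_test_split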
2. Load the Data
3. Data Preprocessing
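The preprocessing code itself is not included in this text, so the following is only a guess at typical steps, based on the imports (to_categorical, train_test_split) and the model's input shape of (28, 28, 1). It uses random arrays as stand-ins for the real images, NumPy equivalents of the imported helpers, and an assumed 20% validation fraction. The variable names valid_X, train_label and valid_label are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for the real data: 100 grayscale 28x28 images, 10 classes.
train_X = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
train_Y = rng.integers(0, 10, size=100)

# Add a channel axis and scale pixel values from [0, 255] to [0, 1].
train_X = train_X.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# One-hot encode the integer labels (the encoding to_categorical produces).
num_classes = 10
train_Y_one_hot = np.eye(num_classes, dtype='float32')[train_Y]

# Hold out the last 20% of samples for validation (assumed split fraction).
n_val = int(0.2 * len(train_X))
train_X, valid_X = train_X[:-n_val], train_X[-n_val:]
train_label, valid_label = train_Y_one_hot[:-n_val], train_Y_one_hot[-n_val:]
print(train_X.shape, valid_X.shape, train_label.shape)
```

Unlike train_test_split, this sketch does not shuffle before splitting. With real data, shuffling first is usually preferable.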
4. Model
batch_size = 64
epochs = 20
num_classes = 10
#Architecture
fashion_model = Sequential()
fashion_model.add(Conv2D(32, kernel_size=(3,3),activation='linear',input_shape=(28,28,1),padding='same'))
fashion_model.add(LeakyReLU(alpha=0.1))
fashion_model.add(MaxPooling2D((2, 2),padding='same'))
fashion_model.add(Conv2D(64, (3, 3), activation='linear',padding='same'))
fashion_model.add(LeakyReLU(alpha=0.1))
fashion_model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
fashion_model.add(Conv2D(128, (3, 3), activation='linear',padding='same'))
fashion_model.add(LeakyReLU(alpha=0.1))
fashion_model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
fashion_model.add(Flatten())
fashion_model.add(Dense(128, activation='linear'))
fashion_model.add(LeakyReLU(alpha=0.1))
fashion_model.add(Dense(num_classes, activation='softmax'))
#Compilation of model
fashion_model.compile(loss=keras.losses.categorical_crossentropy,
                      optimizer=keras.optimizers.Adam(), metrics=['accuracy'])
fashion_model.summary()
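The shapes and parameter counts that summary() reports for this architecture can be checked by hand. The sketch below uses only the standard library. It relies on two facts: a 3x3 convolution with 'same' padding keeps the spatial size, and 2x2 pooling with 'same' padding halves it, rounding up. The helper names are made up for this example. LeakyReLU, pooling and Flatten have no trainable parameters.

```python
import math

def conv_params(kernel, in_channels, filters):
    # One kernel x kernel x in_channels weight block per filter, plus one bias each.
    return kernel * kernel * in_channels * filters + filters

def dense_params(in_units, units):
    # Weights matrix plus one bias per unit.
    return in_units * units + units

size, channels, total = 28, 1, 0
for filters in (32, 64, 128):
    total += conv_params(3, channels, filters)   # 'same' padding keeps the size
    size = math.ceil(size / 2)                    # 2x2 pooling, 'same' padding
    channels = filters

flat_units = size * size * channels               # length of the Flatten output
total += dense_params(flat_units, 128) + dense_params(128, 10)
print(size, flat_units, total)
```

The spatial size goes 28, 14, 7, 4, so Flatten produces 4 x 4 x 128 = 2048 values, and the model has 356,234 trainable parameters in total.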
5. Model Evaluation
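# fashion_train is the History object returned by fashion_model.fit(...).
# Newer Keras versions name these keys 'accuracy' and 'val_accuracy'.
accuracy = fashion_train.history['acc']
val_accuracy = fashion_train.history['val_acc']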
loss = fashion_train.history['loss']
val_loss = fashion_train.history['val_loss']
epochs = range(len(accuracy))
plt.plot(epochs, accuracy, 'bo', label='Training accuracy')
plt.plot(epochs, val_accuracy, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
5. Predict Labels
predicted_classes = fashion_model.predict(test_X)
# Take argmax on the raw probabilities; rounding first would map low-confidence rows to class 0.
predicted_classes = np.argmax(predicted_classes, axis=1)
print(predicted_classes.shape, test_Y.shape)
# test_Y must hold integer class labels (not one-hot vectors) for this comparison.
correct = np.where(predicted_classes == test_Y)[0]
print('Found %d correct labels' % len(correct))
for i, idx in enumerate(correct[:9]):
    plt.subplot(3, 3, i + 1)
    plt.imshow(test_X[idx].reshape(28, 28), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[idx], test_Y[idx]))
plt.tight_layout()
plt.show()
incorrect = np.where(predicted_classes != test_Y)[0]
print('Found %d incorrect labels' % len(incorrect))
for i, idx in enumerate(incorrect[:9]):
    plt.subplot(3, 3, i + 1)
    plt.imshow(test_X[idx].reshape(28, 28), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[idx], test_Y[idx]))
plt.tight_layout()
plt.show()
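The step from softmax outputs to class labels can be tried on a toy example without a trained model. The probabilities and labels below are invented for illustration.

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples over 3 classes.
probabilities = np.array([[0.70, 0.20, 0.10],
                          [0.10, 0.30, 0.60],
                          [0.40, 0.35, 0.25],
                          [0.20, 0.50, 0.30]])
true_labels = np.array([0, 2, 1, 1])   # integer class labels

predicted = np.argmax(probabilities, axis=1)        # most probable class per row
correct = np.where(predicted == true_labels)[0]     # indices of hits
incorrect = np.where(predicted != true_labels)[0]   # indices of misses
print(predicted, correct, incorrect)
```

The comparison only works when the true labels are integers. One-hot encoded labels would first need np.argmax(labels, axis=1).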
Prediction
[Figure: sample images with correct predictions and sample images with incorrect predictions]
6. Generating Classification Report
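The code for this step is not included in this text. A classification report usually lists precision, recall and F1 score per class, as scikit-learn's classification_report does. The following NumPy sketch computes these three quantities on invented labels. The function per_class_report is hypothetical and is not the code from the lecture.

```python
import numpy as np

def per_class_report(y_true, y_pred, num_classes):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    rows = []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))   # true positives
        fp = np.sum((y_pred == c) & (y_true != c))   # false positives
        fn = np.sum((y_pred != c) & (y_true == c))   # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        rows.append((precision, recall, f1))
    return rows

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
report = per_class_report(y_true, y_pred, 3)
for c, (p, r, f) in enumerate(report):
    print("class %d: precision %.2f, recall %.2f, f1 %.2f" % (c, p, r, f))
```

Per-class numbers like these show which classes the model confuses, which a single overall accuracy figure hides.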
Homework
1. Code the model discussed in class and get results for different architectural settings:
- Change layers
- Add dropout
- Change optimizers and parameters
2. Apply the same to the MNIST dataset.
Report the changes you observe in accuracy. What is the best accuracy you achieved?
Thank You !