Handwritten Character Recognition With Neural Network
Project Prerequisites
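Below are the prerequisites for this project: the code in this article relies on Python with NumPy, pandas, scikit-learn, Matplotlib, OpenCV (cv2), and Keras.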
Download Dataset
The dataset for this project contains 372450 images of alphabets, each 28×28
pixels, all present in the form of a CSV file:
Handwritten character recognition dataset
We split the data that was read into the images and their corresponding labels.
The '0' column contains the labels, so we drop it from the data dataframe to
form the image data X, and use it as y to form the labels.
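The split described above can be sketched on a tiny in-memory dataframe. The names `data`, `X`, and `y` and the toy values are assumptions for illustration; the real data comes from the CSV file, whose loading code is not shown here.

```python
import pandas as pd

# Toy stand-in for the CSV: column '0' holds the label, the remaining
# columns hold pixel values (only 4 here instead of 784).
data = pd.DataFrame(
    {
        "0": [0.0, 1.0, 25.0],
        "0.1": [0.0, 12.0, 255.0],
        "0.2": [5.0, 0.0, 3.0],
        "0.3": [0.0, 0.0, 0.0],
        "0.4": [9.0, 8.0, 7.0],
    }
)

X = data.drop("0", axis=1)  # pixel columns only
y = data["0"]               # label column

print(X.shape)  # (3, 4)
print(list(y))  # [0.0, 1.0, 25.0]
```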
Reshaping the data in the csv file so that it can be displayed as an image
train_x, test_x, train_y, test_y = train_test_split(X, y, test_size = 0.2)
train_x = np.reshape(train_x.values, (train_x.shape[0], 28,28))
test_x = np.reshape(test_x.values, (test_x.shape[0], 28,28))
print("Train data shape: ", train_x.shape)
print("Test data shape: ", test_x.shape)
In the above segment, we split the data into training and testing
datasets using train_test_split(), holding out 20% of the samples
for testing (test_size = 0.2).
We also reshape the train and test image data so that each sample can
be displayed as an image. In the CSV file every image is stored as
784 columns of pixel data, so we convert each row to 28×28 pixels.
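To make the 784-to-28×28 conversion concrete, here is a minimal sketch on fake data. The array `flat` is a hypothetical stand-in for the pixel columns of the dataframe.

```python
import numpy as np

# Three fake "images", each stored as one flat row of 784 pixel values.
flat = np.arange(3 * 784, dtype="float32").reshape(3, 784)

# Turn every row into a 28x28 grid; the number of images is unchanged.
images = np.reshape(flat, (flat.shape[0], 28, 28))

print(images.shape)  # (3, 28, 28)

# Row-major order: pixel 28 of a flat row becomes row 1, column 0 of the grid.
print(images[0, 1, 0] == flat[0, 28])  # True
```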
word_dict = {0:'A',1:'B',2:'C',3:'D',4:'E',5:'F',6:'G',7:'H',8:'I',9:'J',10:'K',11:'L',12:'M',
             13:'N',14:'O',15:'P',16:'Q',17:'R',18:'S',19:'T',20:'U',21:'V',22:'W',23:'X',24:'Y',25:'Z'}
All the labels are present as floating point values, which we
convert to integers, so we create a dictionary word_dict that
maps the integer values to the characters.
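As a small illustration of this lookup, the sketch below builds an equivalent mapping from the alphabet and converts a few float labels to letters. The `labels` list is made up for the example.

```python
import string

# Same mapping as word_dict, built from the alphabet instead of typed out.
word_dict = dict(enumerate(string.ascii_uppercase))

labels = [0.0, 7.0, 25.0]  # labels as stored in the CSV (floats)

# Cast each float to int before using it as a dictionary key.
letters = [word_dict[int(v)] for v in labels]

print(letters)  # ['A', 'H', 'Z']
```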
Plotting the number of alphabets in the dataset
y_int = np.asarray(y, dtype=int)  # labels are floats; np.int0 was removed in NumPy 2.0
count = np.zeros(26, dtype='int')
for i in y_int:
    count[i] += 1

alphabets = list(word_dict.values())

fig, ax = plt.subplots(1,1, figsize=(10,10))
ax.barh(alphabets, count)

plt.xlabel("Number of elements ")
plt.ylabel("Alphabets")
plt.grid()
plt.show()
(The above image depicts the grayscale images that we got from the
dataset)
Data Reshaping
Reshaping the training & test dataset so that it can be put in the model
train_X = train_x.reshape(train_x.shape[0],train_x.shape[1],train_x.shape[2],1)
print("New shape of train data: ", train_X.shape)
test_X = test_x.reshape(test_x.shape[0], test_x.shape[1], test_x.shape[2],1)
print("New shape of test data: ", test_X.shape)
New shape of train data: (297960, 28, 28, 1)
New shape of test data: (74490, 28, 28, 1)
Now we reshape the train and test image datasets so that they can be fed to the
model: a trailing channel dimension of size 1 is added, because the images are
grayscale and Conv2D expects input of shape (height, width, channels).
Here we convert the single float label values to categorical (one-hot) vectors.
This is done because the CNN outputs a vector of 26 probabilities, one per
letter, and the labels must have the same form to be compared with it during
training.
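The conversion code itself is not shown in this section. Keras provides `to_categorical` for this purpose; the sketch below uses plain NumPy instead, only to show what the resulting one-hot vectors look like. The helper name `one_hot` is hypothetical.

```python
import numpy as np

def one_hot(labels, num_classes=26):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    idx = np.asarray(labels).astype(int)
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes, dtype="float32")[idx]

encoded = one_hot([0.0, 2.0, 25.0])

print(encoded.shape)                        # (3, 26)
print(np.argmax(encoded, axis=1).tolist())  # [0, 2, 25]
```

Taking np.argmax of a one-hot row recovers the original integer label, which is the same operation used later to read the predicted class from the model's output.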
What is CNN?
CNN stands for Convolutional Neural Network, a type of network that extracts
features from images using several layers of filters.
The convolution layers are generally followed by maxpool layers, which
reduce the number of features extracted. Ultimately the output of the
convolution and maxpool layers is flattened into a one-dimensional vector
and given as input to the Dense layers (the fully connected network).
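To make the convolution, maxpool, and flatten steps concrete, here is a minimal NumPy sketch on a single random 28×28 image with one filter. The function names and the averaging filter are assumptions for illustration; a real CNN learns its filter values and applies many filters per layer.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with no padding and stride 1."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    h = feature_map.shape[0] // 2 * 2
    w = feature_map.shape[1] // 2 * 2
    trimmed = feature_map[:h, :w]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

image = np.random.default_rng(0).random((28, 28))
kernel = np.ones((3, 3)) / 9.0           # a simple averaging filter

features = conv2d_valid(image, kernel)   # 28x28 -> 26x26
pooled = max_pool_2x2(features)          # 26x26 -> 13x13
flat = pooled.flatten()                  # 13x13 -> 169 values

print(features.shape, pooled.shape, flat.shape)  # (26, 26) (13, 13) (169,)
```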
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(3, 3), activation='relu', input_shape=(28,28,1)))
model.add(MaxPool2D(pool_size=(2, 2), strides=2))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu', padding = 'same'))
model.add(MaxPool2D(pool_size=(2, 2), strides=2))
model.add(Conv2D(filters=128, kernel_size=(3, 3), activation='relu', padding = 'valid'))
model.add(MaxPool2D(pool_size=(2, 2), strides=2))
model.add(Flatten())
model.add(Dense(64,activation ="relu"))
model.add(Dense(128,activation ="relu"))
model.add(Dense(26,activation ="softmax"))
Above we have the CNN model that we designed for training on the training
dataset.
Next we print the model summary, which lists the layers defined in the
model, and we save the trained model using the model.save() function.
(Summary of the defined model)
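The summary output is not reproduced here. As a rough cross-check, the sketch below derives the layer output sizes and parameter counts by hand from the layer definitions above. The helper functions are hypothetical, and the numbers are computed from the layer arithmetic, not copied from the original summary.

```python
def conv_out(size, kernel=3, padding="valid"):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1

def pool_out(size, pool=2, stride=2):
    """Spatial size after max pooling without padding."""
    return (size - pool) // stride + 1

def conv_params(k, in_ch, out_ch):
    """Weights plus one bias per output filter."""
    return k * k * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out

s = 28
s = pool_out(conv_out(s, padding="valid"))  # 28 -> 26 -> 13
s = pool_out(conv_out(s, padding="same"))   # 13 -> 13 -> 6
s = pool_out(conv_out(s, padding="valid"))  # 6 -> 4 -> 2
flat = s * s * 128                          # 2 * 2 * 128 = 512

total = (
    conv_params(3, 1, 32)
    + conv_params(3, 32, 64)
    + conv_params(3, 64, 128)
    + dense_params(flat, 64)
    + dense_params(64, 128)
    + dense_params(128, 26)
)

print(s, flat, total)  # 2 512 137178
```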
In the above code segment, we print out the training & validation
accuracies along with the training & validation losses for character
recognition.
Doing Some Predictions on Test Data
fig, axes = plt.subplots(3,3, figsize=(8,9))
axes = axes.flatten()

# Predict the first nine test images; test_yOHE holds the true labels, not predictions
probs = model.predict(test_X[:9])

for i,ax in enumerate(axes):
    img = np.reshape(test_X[i], (28,28))
    ax.imshow(img, cmap="Greys")

    pred = word_dict[np.argmax(probs[i])]
    ax.set_title("Prediction: "+pred)
    ax.grid()
Now we make a prediction using the processed image & use the
np.argmax() function to get the index of the class with the highest
predicted probability. Using this we get to know the exact
character through the word_dict dictionary.
This predicted character is then displayed on the frame.
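The argmax-then-lookup step can be sketched without a trained model. Here `probs` is a made-up stand-in for the 26-value probability vector that the model would return for one image.

```python
import string
import numpy as np

word_dict = dict(enumerate(string.ascii_uppercase))

# Stand-in for a model output on one image: 26 class probabilities.
probs = np.full(26, 0.01)
probs[18] = 0.75             # pretend the model favours class 18
probs = probs / probs.sum()  # normalise so the vector sums to 1

best = int(np.argmax(probs))  # index of the most probable class

print(best, word_dict[best])  # 18 S
```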
while True:
    k = cv2.waitKey(1) & 0xFF
    if k == 27:  # Esc key closes the window
        break
cv2.destroyAllWindows()
Handwritten characters have been recognized with more than 97% test
accuracy. The same approach can be extended to recognize the handwritten
characters of other languages.