This document presents research on handwritten text recognition with translation and audio output. The researchers used convolutional neural networks (CNNs) to classify handwritten words and characters, exploring two approaches: word-level classification, which classifies whole word images directly, and character-level classification, which first segments words into individual characters using an LSTM model and then classifies each character. Models were trained on the IAM Handwriting Database, which contains over 100,000 words. Pre-processing steps such as noise removal, skew correction, and line segmentation prepared the images for classification. The character-level approach performed better, largely because its softmax output layer is much smaller: it covers only tens of character classes rather than an entire word vocabulary. The recognized text could then be translated and rendered as audio.
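The source does not specify which noise-removal technique was used; a common choice for scanned handwriting is a median filter, which suppresses isolated "salt-and-pepper" pixels while preserving stroke edges. The sketch below is a minimal pure-Python illustration of that idea (the function name and image representation are assumptions, not the authors' implementation):

```python
from statistics import median

def median_filter_3x3(img):
    """Replace each interior pixel with the median of its 3x3 neighborhood.

    `img` is a list of lists of grayscale values; borders are left unchanged.
    This is an illustrative sketch, not the paper's actual pre-processing code.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]  # copy so borders are preserved
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out

# A lone bright pixel (scanner noise) in an otherwise dark region is removed,
# because eight of the nine values in its window are 0.
noisy = [[0] * 5 for _ in range(5)]
noisy[2][2] = 255
cleaned = median_filter_3x3(noisy)
print(cleaned[2][2])  # → 0
```

In practice a library routine (e.g. a median-blur filter from an image-processing package) would be used instead, but the principle is the same.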
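The advantage of character-level classification can be made concrete by counting the parameters in the final softmax layer. A word-level classifier needs one output unit per word in the vocabulary, while a character-level classifier needs only one per character class. The numbers below (feature dimension, vocabulary size) are illustrative assumptions, not figures from the paper:

```python
def softmax_layer_params(feature_dim: int, num_classes: int) -> int:
    """Parameter count of a dense softmax output layer: weights + biases."""
    return feature_dim * num_classes + num_classes

FEATURE_DIM = 512     # assumed size of the penultimate CNN feature vector
WORD_VOCAB = 10_000   # assumed distinct word classes (illustrative)
CHAR_CLASSES = 62     # a-z, A-Z, 0-9: a common character-set choice

word_params = softmax_layer_params(FEATURE_DIM, WORD_VOCAB)
char_params = softmax_layer_params(FEATURE_DIM, CHAR_CLASSES)
print(word_params)  # → 5130000
print(char_params)  # → 31806
```

Under these assumptions the word-level output layer has over 150 times more parameters than the character-level one, so the character model has a far smaller, easier-to-train output distribution, at the cost of requiring a separate segmentation step.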