Physically examining each plant to assess its health and, if it is affected, to identify the disease is challenging. An encoder-decoder approach is proposed for describing the health of cauliflower plants in English, Hindi, and Marathi from aerial images. Experiments are performed with different combinations of convolutional neural network (CNN) models and long short-term memory (LSTM) networks. The multilanguage cauliflower captions dataset (MCCD) is developed to evaluate the performance of the model. The dataset contains 1213 images, each described in three languages. It includes images of cauliflower plants affected by bacterial spot rot, black rot, and downy mildew, as well as images of healthy plants. Objective metrics such as bilingual evaluation understudy (BLEU) scores, together with subjective criteria, are used to assess the quality of the generated descriptions.
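
To illustrate how a BLEU score can be computed for a generated caption, the following is a minimal sketch of sentence-level BLEU (clipped n-gram precision combined with a brevity penalty). It is not the evaluation code used in this work: the function names, the whitespace tokenization, the absence of smoothing, and the example captions are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against one or more references."""
    cand_counts = ngrams(candidate, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # For each n-gram, the largest count observed in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / total


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed BLEU with uniform weights over 1..max_n grams."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    c = len(candidate)
    # Reference length closest to the candidate length (ties go to the shorter).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


# Hypothetical captions, not taken from the MCCD dataset.
reference = "the cauliflower plant is healthy".split()
generated = "the plant is healthy".split()
score = sentence_bleu(generated, [reference], max_n=2)
```

In this sketch a caption identical to its reference scores 1.0, and a caption sharing no words with the reference scores 0.0. Hindi and Marathi captions would need a tokenizer suited to those languages rather than a plain whitespace split.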