Resume Recommendation Using RNN Classification and Cosine Similarity
Abstract. One way to save time and resources in the recruitment and hiring process is to post open job positions on the Internet, but the resulting overload of applications makes it challenging for hiring managers and companies to select a suitable candidate. One solution is to apply intelligent tools such as deep learning and recommender system algorithms to speed up the hiring process and identify the right candidates. In this paper, we propose a two-fold algorithmic approach: 1) building an RNN classifier for resume classification; 2) using cosine similarity for resume recommendation to find the candidate that best fits the job requirements after selecting all the resumes that belong to the right category. The performance of the proposed RNN classifier is evaluated in terms of accuracy, precision, recall, F1-score, and confusion matrix criteria. The experimental results show that the RNN classifier performs better than other classifiers such as GNB, Linear SVM, RF, and BERT on the same dataset.
1 Introduction
The evolution of technology has led to the widespread use of big data analytics and
artificial intelligence in business management. Still, there have been few success stories
concerning the application of such methods and techniques to effective Human Resource
management, mainly to provide support for recruiting and corporate population man-
agement in large organizations [1]. One area where these tools can be particularly useful
is the process of selecting candidates for job positions. Reviewing resumes and identi-
fying the most qualified candidates can be a time-consuming and labour-intensive task
for employers. Developing a system for automatically classifying resumes based on job
qualifications and other relevant information can make the process more efficient.
In this paper, we propose a two-fold algorithmic approach: 1) building a recurrent
neural network (RNN) classifier for resume classification; 2) using cosine similarity
for resume recommendation to rank the candidates that best fit the job requirements after
their resumes have passed the categorical test (i.e., they are in the same category as the
job description) with the RNN model. The performance of the proposed classifier is
evaluated in terms of accuracy, precision, recall, F1-score, and confusion matrix criteria.
The experimental results show that the RNN classifier performs better than other
classifiers such as GNB, Linear SVM, RF, and BERT on the same dataset.
The paper is organized as follows: Sect. 2 reviews research works related to resume
classification and recommender systems. In Sect. 3, we present the proposed two-fold
algorithm to build an RNN classifier and filter the resumes in the designated class to
recommend the most qualified candidate; we also present experimental results and
evaluate the performance of the proposed model in terms of accuracy, precision,
recall, F1-score, and the confusion matrix. Section 4 presents the similarity computation
between the filtered resumes and the job description. Finally, Sect. 5 is devoted to the
discussion of the results of the paper, as well as further research issues.
2 Related Works
In this section, we first review research papers related to resume classifiers, then papers
related to resume recommenders.
The paper [2] proposes a two-phase process of resume classification and content-based
ranking using cosine similarity. That approach is similar to the one presented in our
paper, but we propose an RNN classifier and compare it with the classifiers given
in [2]; the RNN classifier outperformed the classifiers presented in [2].
This paper [3] proposes a block-level bidirectional recurrent neural network
(BBRNN) model for resume block classification, which considers the contextual order
of different blocks within a resume. The authors argue that existing text classification
algorithms fail to consider the structural features of resumes, such as the order of blocks,
resulting in suboptimal classification results. The performance of the proposed model is
6% to 9% higher than existing methods. The paper also discusses related work on resume
information extraction and block classification methods. However, the paper could
benefit from further discussion of limitations and potential future research directions. We
chose a simple RNN because of its efficient computation, simple architecture, and ease
of training.
The article [4] presents the use of machine learning algorithms such as Naive Bayes,
Random Forest, and SVM for resume classification and skill extraction; in addition,
the TF-IDF NLP method is used for vectorization.
A resume classification system using natural language processing and machine
learning techniques is presented in [5]. The paper [6] is devoted to the classification
problem of research papers. A comprehensive survey on applications of deep learning
techniques using NLP is given in [7]. In the paper [8], resume information extraction
with a cascaded hybrid model is presented.
The paper [9] proposes a two-stage strategy: first, all the information is extracted
from the candidates' resumes; then BERT sentence-pair classification is used to rank
them based on the job description. BERT, the transformer model used in [9], has shown
good results in the NLP world. Bidirectional Encoder Representations from Transformers
(BERT) [10] was introduced in a landmark paper by Google that raised the state-of-the-art
performance on various NLP tasks and was the stepping stone for many other revolutionary
architectures.
Data Preprocessing
Preprocessing is the process of transforming unstructured data into a form that can be
used to create and train machine learning models; it is important since preprocessing
steps directly affect the task success rate [13]. The dataset used in this study was downloaded from Kaggle.com [14]
and is in CSV format. The dataset contains two columns: Category, which contains the
different classes of resumes, and Resume, which contains the full description of the
resumes. There are 25 categories in the dataset, as shown in Fig. 2 and the data contains
962 rows. We observed that the most common category in the dataset is Java Developer.
There were no missing values in the dataset.
To preprocess the data, we used natural language processing (NLP) techniques from
libraries such as spaCy and NLTK. We performed the following transformations on the
data: stop word removal, conversion of text to lower case, URL removal, replacement
of consecutive non-ASCII characters with a space, punctuation removal, mention
removal, and extra whitespace removal. An example of the data before and after
preprocessing is shown in Table 1.
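A minimal sketch of such a cleaning step, assuming NLTK's English stop-word list and illustrative regular expressions (the exact patterns in our implementation may differ), is given below:

import re
from nltk.corpus import stopwords  # requires a prior nltk.download("stopwords")

STOP_WORDS = set(stopwords.words("english"))

def clean_resume(text):
    text = text.lower()                            # conversion to lower case
    text = re.sub(r"http\S+|www\.\S+", " ", text)  # URL removal
    text = re.sub(r"@\S+", " ", text)              # mention removal
    text = re.sub(r"[^\x00-\x7f]+", " ", text)     # consecutive non-ASCII -> one space
    text = re.sub(r"[^\w\s]", " ", text)           # punctuation removal
    tokens = [w for w in text.split() if w not in STOP_WORDS]  # stop-word removal
    return " ".join(tokens)                        # also removes extra whitespace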
The resulting data is cleaner and more meaningful, allowing us to gain insights into
the characteristics of the different resume categories.
Feature Extraction
Feature extraction improves the efficiency and accuracy of machine learning models so
that they better serve their intended purpose [15]. Our data contains only one main
feature: the cleaned text of the resume descriptions obtained after data preprocessing.
To build a more accurate predictive model, we need to extract additional features such as
the skills, education, and work experience of the job applicants. To extract these features,
we developed a custom function that utilizes NLP techniques. The function first identifies
the sections of the resume that are likely to contain information about skills, education,
and work experience, and then extracts these features from the corresponding sections.
For skills, the function identifies relevant keywords and phrases using techniques such
as part-of-speech tagging and named entity recognition. For education, it identifies
the highest degree attained and the field of study. For work experience, it extracts the
different domains in which the candidates have worked, because those domains may or
may not be related to the candidate's main field; in the future, we will improve the
function to also identify the number of years of experience and the job titles held (Fig. 3).
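A simplified sketch of how such a function might be structured, assuming spaCy's small English model and an illustrative list of section headings (the actual heading patterns and entity handling in our implementation are more elaborate), is shown below:

import spacy

nlp = spacy.load("en_core_web_sm")  # small English model, an assumption for this sketch

SECTION_HEADINGS = {"skills", "education", "work experience"}  # illustrative headings

def split_into_sections(resume_text):
    """Group resume lines under the most recent recognized section heading."""
    sections, current = {}, "other"
    for line in resume_text.splitlines():
        key = line.strip().lower()
        if key in SECTION_HEADINGS:
            current = key
        else:
            sections.setdefault(current, []).append(line)
    return {name: " ".join(lines) for name, lines in sections.items()}

def extract_entities(section_text):
    """Use named entity recognition to pull candidate keywords from a section."""
    doc = nlp(section_text)
    return [(ent.text, ent.label_) for ent in doc.ents]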
Feature Vectorization
We used the TextVectorization layer from the tf.keras.layers module for feature vectorization
in our deep learning model. In particular, we used the max_tokens parameter to set
the maximum number of unique tokens (words) to consider in the vocabulary, which was
set to 20,000 in our case. This layer is commonly used in natural language processing
tasks to convert text inputs into numerical vectors that can be fed into a deep learning
model. TF-IDF (Term Frequency-Inverse Document Frequency) weighting is utilized for
the text vectorization.
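A minimal sketch of configuring this layer, where resume_texts is a hypothetical array of preprocessed resume strings, might look as follows:

import tensorflow as tf

# Cap the vocabulary at 20,000 tokens and apply TF-IDF weighting, as described above.
encoder = tf.keras.layers.TextVectorization(max_tokens=20000, output_mode="tf_idf")
encoder.adapt(resume_texts)      # learn the vocabulary and IDF weights from the corpus
vectors = encoder(resume_texts)  # numerical vectors that can be fed into the model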
RNN Parameters
The model parameters are as follows: number of SimpleRNN layers: 1; number of units in
each layer: 128; activation function: ReLU in the dense layers and the default activation
function (tanh) in the SimpleRNN layer; learning rate: 0.001 (Adam optimizer).
Functions
The input function processes the input by applying the encoder layer to convert text to
numerical form; the output function generates the output via the final dense layer with
a softmax activation function, which predicts the probability distribution across the 25
possible categories; and the hidden function maintains the internal state of the model
by updating the current state based on the previous state and the current input using the
SimpleRNN layer.
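A minimal sketch of this architecture with the stated parameters is given below; since a SimpleRNN layer consumes token sequences, the sketch assumes an integer-mode encoder followed by an Embedding layer, whose dimension (64) is an assumption rather than a parameter reported above:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense

model = Sequential([
    Embedding(input_dim=20000, output_dim=64),  # token ids -> dense vectors (size 64 assumed)
    SimpleRNN(128),                             # one recurrent layer, 128 units, default tanh
    Dense(64, activation="relu"),               # dense layer with ReLU activation
    Dense(25, activation="softmax"),            # probability distribution over 25 categories
])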
Training
The RNN model is trained to classify resumes into one of 25 categories using a categorical
cross-entropy loss function and the Adam optimizer. The training data is represented
as numerical vectors obtained from the encoder layer, and the labels are converted
into integer labels using the LabelEncoder class from the sklearn.preprocessing
module. Early stopping is employed to prevent overfitting, and a learning rate scheduler
is used to adjust the learning rate during training. The model is trained for a maximum
of 50 epochs with a batch size of 32.
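A sketch of this training setup is shown below, where X_train is the encoded training data and y_labels the raw category names; the validation split, decay schedule, and early-stopping patience are illustrative assumptions rather than reported settings:

import tensorflow as tf
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.callbacks import EarlyStopping, LearningRateScheduler

y_train = LabelEncoder().fit_transform(y_labels)  # category names -> integer labels

def schedule(epoch, lr):
    # Illustrative schedule: hold the rate for 10 epochs, then decay it.
    return lr if epoch < 10 else lr * 0.9

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="sparse_categorical_crossentropy",  # integer-label form of categorical cross-entropy
              metrics=["accuracy"])
model.fit(X_train, y_train, validation_split=0.2,
          epochs=50, batch_size=32,
          callbacks=[EarlyStopping(patience=3, restore_best_weights=True),  # patience assumed
                     LearningRateScheduler(schedule)])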
Tools
We implemented our proposed deep learning model using Python and the TensorFlow
library. The components imported from TensorFlow are Sequential (from
tensorflow.keras.models); Dense, SimpleRNN, and Flatten (from tensorflow.keras.layers);
and EarlyStopping and LearningRateScheduler (from tensorflow.keras.callbacks).
Accuracy
To evaluate the performance of our RNN model, we compared it to four other commonly
used machine learning models: Gaussian Naive Bayes, Linear Support Vector Machines,
Random Forest classifiers, and the transformer-based model BERT. We used the same
dataset to train and test each of these models and evaluated their accuracy, precision,
recall, and F1-score. The results are presented in the following table (Table 2).
Our model achieves performance almost identical to BERT's even though the BERT model
was trained for 500 epochs while our model was trained for only 50 epochs. This shows
that our RNN architecture is efficient in terms of computational resources.
Confusion Matrix
The confusion matrix is a crucial tool that helps us see where the model made errors. It
is a table in which the rows represent the actual categories the outcomes should have
been, and the columns represent the predicted categories. It can easily be visualized
with a heatmap plot from the Seaborn library; it is given in Fig. 5.
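A sketch of how such a heatmap can be produced, where y_test and y_pred are hypothetical arrays of true and predicted class indices, is:

import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_test, y_pred)  # rows: actual categories, columns: predicted
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.xlabel("Predicted category")
plt.ylabel("Actual category")
plt.show()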
As seen in Fig. 5, most of the classes have been classified correctly, with only one
class (DevOps Engineer) misclassified, and only a single time. This is a positive result
that indicates our model's high level of accuracy and reliability in predicting most of
the classes.
4 Resume Recommendation
4.1 Similarity Function
As seen from the workflow diagram (see Fig. 1), the last step in the resume recommendation
is the similarity test. This is the similarity between two documents: a resume that
passed the categorical test with the RNN and the job description. A cosine similarity
function is used because it is a widely implemented metric in information retrieval and
related studies. This metric models a text document as a vector of terms. As a result,
the similarity between two documents can be derived by calculating the cosine of the
angle between the two documents' term vectors [17, 18]. Its value ranges from -1 to 1,
where 1 indicates the two vectors are identical, 0 indicates the two vectors are
orthogonal, and -1 indicates the two vectors are opposed. We calculated the cosine
similarity between each pair of documents in the corpus and used it to identify the
documents that are most similar to each other. To calculate the cosine distance, we
first calculated the cosine similarity between the two vectors as described above and
then subtracted it from 1 to obtain the cosine distance, which ranges from 0 to 2,
where 0 indicates the two vectors are identical and 2 indicates the two vectors are
opposed. We used the cosine distance to measure the dissimilarity between documents
and to identify the documents that are most dissimilar to each other.
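A minimal sketch of this computation, assuming job_text is the job description and filtered_resumes are the resumes that passed the categorical test (sklearn's TF-IDF term vectors are used here for illustration), is given below:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform([job_text] + filtered_resumes)   # term vectors
similarities = cosine_similarity(matrix[0:1], matrix[1:]).ravel()  # cosine to the job description
distances = 1 - similarities                                       # cosine distance, 0 = identical
ranking = similarities.argsort()[::-1]                             # best-fitting resumes first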
and career counselling. However, there is still room for improvement, and future work
can explore the use of more advanced deep learning techniques such as Transformers
or hybrid models that combine RNNs with Convolutional Neural Networks (CNNs) for
better performance.
References
1. Celsi, L.R., Moreno, J.F.C., Kieffer, F., Paduano, V.: HR-specific NLP for the homogeneous
classification of declared and inferred skills. Appl. Artific. Intell. 36(1), 3943–3963 (2022)
2. Roy, P.K., Chowdhary, S.S., Bhatia, R.: A machine learning approach for automation of
resume recommendation system. In: International Conference on Computational Intelligence
and Data Science, Procedia Computer Science, vol. 167, pp. 2318–2327 (2020)
3. Xu, Q., Zhang, J., Zhu, Y., Li, B., Guan, D., Wang, X.: A block-level RNN model for
resume block classification. In: IEEE International Conference on Big Data (Big Data), IEEE,
pp. 5855–5857, Atlanta, GA, USA, December 10–13 (2020)
4. Riya, P., Shahrukh, S., Swaraj, S., Sumedha, B.: Resume classification using various machine
learning algorithms. In: International Conference on Automation, Computing and Communication,
ITM Web of Conferences, vol. 44, pp. 1–7, Nerul, Navi Mumbai, India, April 7–8
(2022)
5. Ali, I., Mughal, N., Khand, Z. H., Ahmed, J., Mujtaba, G.: Resume classification system
using natural language processing and machine learning techniques. Mehran Univ. Res. J.
Eng. Technol. 41(1), 65–69 (2022)
6. Kim, S.W., Gil, J.M.: Research paper classification systems based on TF-IDF and LDA
schemes. Hum. Cent. Comput. Inf. Sci. 9(30) (2019)
7. Lavanya, P., Sasikala, E.: Deep learning techniques on text classification using natural
language processing (NLP) in social healthcare networks: a comprehensive survey. In: 3rd
International Conference on Signal Processing and Communication (ICPSC), pp. 603–609, IEEE,
Coimbatore, India (2021)
8. Yu, K., Guan, G., Zhou, M.: Resume information extraction with a cascaded hybrid model.
In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics
(ACL 2005), pp. 499–506, Ann Arbor, Michigan (2005)
9. Bhatia, V., Rawat, P., Kumar, A., Shah, R.R.: End-to-end resume parsing and finding
candidates for a job description using BERT. https://doi.org/10.48550/arXiv.1910.03089.
Accessed 21 Nov 2022
10. Articles. https://wandb.ai/mukilan/BERT_Sentiment_Analysis/reports/An-Introduction-to-
BERT-And-How-To-Use-It--VmlldzoyNTIyOTA1. Accessed 21 May 2023
11. Zhu, C., et al.: Person-job fit: adapting the right talent for the right job with joint
representation learning (2018). arXiv:1810.04040. Accessed 11 Feb 2023
12. Varshini, S.V.P., Kannan, S., Suresh, S., Ramesh, H., Mahadevan, R., Raman, R.C.: Turtle
Score – similarity based developer analyzer (2022). arXiv:2205.04876. Accessed 11
March 2023
13. Huseyinov, I., Okocha, O.: A machine learning approach to the prediction of bank customer
churn problem. In: 3rd International Informatics and Software Engineering Conference (IISEC),
pp. 1–5. IEEE, Ankara, Turkey, December 15–16 (2022)
14. Data card. https://www.kaggle.com/datasets/gauravduttakiit/resume-dataset. Accessed 17 Jan
2023
15. https://www.snowflake.com/guides/featureextraction-machine-learning. Accessed 21 May
2023
16. https://www.ibm.com/topics/recurrent-neural-networks. Accessed 21 May 2023