
JOB RECOMMENDATION USING AI

GUIDING CAREERS THROUGH SKILL-BASED INSIGHTS

A PROJECT REPORT

Submitted by

JEFFRIN RIJO V C 113320243012
SANTHOSH T 113320243036
ARUNESHWAR V 113320243047

In partial fulfillment for the award of the degree


of
BACHELOR OF TECHNOLOGY
in
ARTIFICIAL INTELLIGENCE & DATA SCIENCE


VELAMMAL INSTITUTE OF TECHNOLOGY
CHENNAI 601204
ANNA UNIVERSITY :: CHENNAI 600 025
MAY 2024
ANNA UNIVERSITY: CHENNAI 600 025

BONAFIDE CERTIFICATE

Certified that this project report “JOB RECOMMENDATION USING AI -
GUIDING CAREERS THROUGH SKILL-BASED INSIGHTS” is the bonafide
work of “SANTHOSH T (113320243036), JEFFRIN RIJO V.C. (113320243012)
and ARUNESHWAR V (113320243047)”, who carried out the project work
under my supervision.

SIGNATURE

Dr. S. PADMA PRIYA,
PROFESSOR AND HEAD OF THE DEPARTMENT,
Artificial Intelligence & Data Science,
Velammal Institute of Technology,
Velammal Gardens, Panchetti,
Chennai - 601 204.

SIGNATURE

Dr. S. PADMA PRIYA,
SUPERVISOR,
Artificial Intelligence & Data Science,
Velammal Institute of Technology,
Velammal Gardens, Panchetti,
Chennai - 601 204.
VIVA-VOCE EXAMINATION

Submitted for the Project Viva-Voce held on __________ at VELAMMAL
INSTITUTE OF TECHNOLOGY, CHENNAI 601 204.

JEFFRIN RIJO V C 113320243012
SANTHOSH T 113320243036
ARUNESHWAR V 113320243047

INTERNAL EXAMINER EXTERNAL EXAMINER


ACKNOWLEDGEMENT

We are personally indebted to the many who helped us during the course of
this project work. Our deepest gratitude goes to God Almighty.

We are greatly and profoundly thankful to our beloved Chairman
Thiru. M.V. Muthuramalingam for providing us with this opportunity. Our sincere
thanks to our respected Director Thiru. M.V.M. Sasi Kumar for his consent to take up
this project work and make it a great success.

We are also thankful to our Advisors Shri. K. Razak and Shri. M. Vaasu, our
Principal Dr. N. Balaji, and our Vice Principal Dr. S. Soundararajan for their never-
ending encouragement that drives us towards innovation.

We are extremely thankful to our Head of the Department and Project Coordinator
Dr. S. Padma Priya for her valuable teachings and suggestions.

With profound reverence and high regards, we would like to thank our
Supervisor Dr. S. Padma Priya, who has been the pillar of this project and without
whom we would not have been able to complete it successfully.

This acknowledgement would be incomplete without a word of thanks to our
Parents, Teaching and Non-Teaching Staff, Administrative Staff and Friends for
their motivation and support throughout the project. Thank you one and all.

ABSTRACT

The system ingests resume data, harnessing the power of NLP to extract and
analyze nuanced information embedded within the text. By deciphering the
language of resumes, the system gains deep insights into users' professional
competencies and preferences. The algorithm then employs this enriched data to
provide personalized job role recommendations, transforming the job-seeking
experience into a tailored and efficient process. This fusion of NLP and machine
learning not only enhances the accuracy of skill assessment but also ensures that
users receive targeted and relevant career suggestions, marking a significant
advancement in the field of data-driven career guidance. The system emerges as a
pioneering solution, empowering individuals to make informed decisions about
their professional futures with confidence.

TABLE OF CONTENTS

CHAPTER NO TITLE PAGE NO

ABSTRACT v
LIST OF FIGURES ix
LIST OF ABBREVIATIONS x

1. INTRODUCTION 1
1.1 OVERVIEW 2
1.2 OBJECTIVE 3
1.3 LITERATURE SURVEY 3

2. SYSTEM ANALYSIS 6
2.1 EXISTING SYSTEM 7
2.1.1 DISADVANTAGES 7
2.2 PROPOSED SYSTEM 8
2.2.1 ADVANTAGES 9

3. SYSTEM REQUIREMENTS 11
3.1 HARDWARE REQUIREMENTS 12
3.2 HARDWARE DESCRIPTION 12
3.2.1 PROCESSOR 12
3.2.2 RANDOM ACCESS MEMORY 13
3.2.3 GRAPHICS PROCESSING UNIT 13
3.2.4 STORAGE 13

3.3 SOFTWARE REQUIREMENTS 13
3.4 SOFTWARE DESCRIPTION 13
3.4.1 HTML 14
3.4.2 CSS 14
3.4.3 PYTHON 3.X 14
3.4.4 SPACY 15
3.4.5 MACHINE LEARNING LIBRARIES 15
3.4.6 PYMUPDF 16

4 SYSTEM DESIGN 17
4.1 ARCHITECTURE DIAGRAM 18
4.2 UML DIAGRAM 19
4.2.1 CLASS DIAGRAM 19
4.2.2 USE CASE DIAGRAM 20
4.2.3 ACTIVITY DIAGRAM 21
4.2.4 DATA FLOW DIAGRAM 23

5 SYSTEM IMPLEMENTATION 24
5.1 LIST OF MODULES 25
5.2 MODULE DESCRIPTION 25
5.2.1 DATA PRE-PROCESSING 25
5.2.2 CONVOLUTIONAL NEURAL NETWORK 26
5.2.3 JOB ROLE RECOMMENDATION 26

6 TESTING 27
6.1 UNIT TESTING 28
6.2 INTEGRATION TESTING 28
6.3 SYSTEM TESTING 28
6.4 TEST CASES 30

7 RESULT AND DISCUSSION 32


7.1 RESULT 33
7.2 DISCUSSION 34

8 CONCLUSION AND FUTURE ENHANCEMENT 36
8.1 CONCLUSION 37
8.2 FUTURE ENHANCEMENT 37

ANNEXURE 39
APPENDIX 1: SOURCE CODE 40
APPENDIX 2: SAMPLE OUTPUT 46

REFERENCES 47

LIST OF FIGURES

FIGURE NO FIGURE NAME PAGE NO


4.1 Architecture Diagram 18
4.2 Class Diagram 20
4.3 Use Case Diagram 21
4.4 Activity Diagram 22
4.5 Data Flow Diagram 23

LIST OF ABBREVIATIONS

CNN Convolutional Neural Network


CPU Central Processing Unit
CSS Cascading Style Sheets
DFD Data Flow Diagram
GPU Graphics Processing Unit
HDD Hard Disk Drive
HTML Hyper Text Markup Language
ML Machine Learning
NLP Natural Language Processing
NER Named Entity Recognition
PDF Portable Document Format
RAM Random Access Memory
RNN Recurrent Neural Network
SSD Solid State Drive
UML Unified Modeling Language

CHAPTER 1
INTRODUCTION

1.1 OVERVIEW

In the contemporary landscape of career exploration and development, individuals often


encounter the challenge of aligning their unique skill sets with the diverse array of job
opportunities available. To address this issue, the project "Guiding Careers Through Skill-Based
Insights" was conceived. This project employs cutting-edge Natural Language Processing (NLP)
techniques to revolutionize the traditional methods of resume analysis. By delving into the subtle
nuances of language, the system extracts explicit skill information and captures the essence of
professional experiences and preferences embedded in resumes.

At its core, the project boasts a sophisticated algorithm meticulously crafted to decipher the
language of resumes and transform it into actionable insights. Through an iterative process of
learning and adaptation, the platform refines its understanding of industry-specific terms,
evolving to provide increasingly accurate and personalized job role recommendations. Unlike
conventional approaches, the system considers not only technical skills but also soft skills and
industry-specific language, ensuring a holistic understanding of the user's professional journey.

By offering users personalized job role recommendations based on their unique skill profiles, the
platform aims to streamline the job-seeking experience, alleviating the overwhelming burden of
sifting through vast job listings.

This fusion of NLP and machine learning not only addresses the immediate challenge of job
matching but also signifies a stride towards a more nuanced and empathetic approach to career
guidance. In an era where personalization and efficiency are paramount in the realm of career
development, this project signifies a paradigm shift, presenting a comprehensive and tailored
solution.

1.2 OBJECTIVE

The primary objective of the project "Guiding Careers Through Skill-Based Insights" is to
enhance the job application process by leveraging Convolutional Neural Network (CNN)
algorithms and Natural Language Processing (NLP) techniques.

This objective is driven by the recognition of the evolving landscape of job recruitment and the
potential to leverage advanced technologies for a more efficient and personalized process.
Traditionally, job matching relies on manual reviews of resumes, which can be time-consuming
and prone to oversight, especially with the increasing volume of job applications. The project
aims to streamline this process for both candidates and recruiters by providing accurate and
personalized job role recommendations.

By training the model on a dataset of job descriptions and their corresponding roles, the CNN
algorithm learns to recognize patterns and associations, enabling it to predict the most suitable
job roles for the given skills. This data-driven approach enhances the accuracy and efficiency of
job matching, addressing the inefficiencies associated with manual processes and keyword-based
matching.

Furthermore, the incorporation of NLP techniques ensures a nuanced analysis of textual content
within resumes, capturing not only technical skills but also soft skills and industry-specific
language. By offering users tailored recommendations based on their unique skill profiles, the
project aims to foster better matches between skills and job opportunities, ultimately improving
the job-seeking experience for both candidates and recruiters.
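As a simplified, hypothetical illustration of the matching idea described above (the system itself trains a CNN; this sketch substitutes TF-IDF vectors and cosine similarity as a transparent baseline, with invented job descriptions):

```python
# Hypothetical sketch: ranking job roles against an extracted skill profile.
# TF-IDF + cosine similarity stands in for the trained CNN; the job
# descriptions below are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

jobs = {
    "Data Scientist": "python machine learning statistics pandas model evaluation",
    "Web Developer": "html css javascript responsive design frontend",
    "Data Engineer": "python sql etl pipelines spark data warehousing",
}

def rank_roles(resume_skills):
    """Return job roles sorted by similarity to the candidate's skill text."""
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(list(jobs.values()) + [resume_skills])
    n = len(jobs)
    # Compare the resume vector (last row) against every job vector.
    scores = cosine_similarity(matrix[n], matrix[:n]).ravel()
    return sorted(zip(jobs, scores), key=lambda p: p[1], reverse=True)

ranking = rank_roles("python pandas machine learning model evaluation")
print(ranking[0][0])  # best-matching role: Data Scientist
```

The same interface could later be backed by a trained neural model without changing the calling code.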

1.3 LITERATURE SURVEY

[1] Kwieciński, R., Melniczak, G., & Górecki, T. (2023). Comparison of Real-Time and
Batch Job Recommendations. In Proceedings of the International Conference on Artificial
Intelligence and Data Science (ICAIDS), 2023, 78-85.

Kwieciński et al. (2023) present a comparative study between real-time and batch job
recommendation systems. Utilizing the RP3Beta model as an example, the research evaluates the
performance of both approaches in a real-world scenario of a job recommendation task. The
study reports a significant increase in user engagement when employing a real-time
recommendation system, suggesting its potential for enhancing the job application process.

[2] Agung, M., Watanabe, Y., Weber, H., Egawa, R., & Takizawa, H. (2021). Preemptive
Parallel Job Scheduling for Heterogeneous Systems Supporting Urgent Computing.
Journal of Parallel and Distributed Computing, 45(3), 212-227.
Agung et al. (2021) propose a parallel job scheduling method for heterogeneous systems to
support urgent computations. The research introduces an in-memory process swapping
mechanism to preempt regular jobs running on coprocessor devices, enabling the execution of
urgent jobs without substantial delays. Simulations demonstrate the effectiveness of the proposed
method in reducing response times and slowdowns of regular jobs while prioritizing urgent
computations.

[3] Khaouja, I., Kassou, I., & Ghogho, M. (2021). A Survey on Skill Identification from
Online Job Ads. International Journal of Human-Computer Studies, 30(4), 389-402.
Khaouja et al. (2021) conduct a comprehensive survey on skill identification from online job ads.
The study systematically reviews existing research articles, categorizing the methods used for
skill identification, the types of skills extracted, and the sectors studied. The research also
discusses the applications, goals, challenges, and recent trends in skill identification from job
postings, offering valuable insights for future research directions.

[4] Ha, T., Lee, M., Yun, B., & Coh, B. (2022). Job Forecasting Based on Patent
Information: A Word Embedding-Based Approach. Journal of Artificial Intelligence and
Data Science, 18(1), 56-71.
Ha et al. (2022) propose a word embedding-based approach for job forecasting based on patent
information. The research matches jobs with patents to forecast future job trends, leveraging
changes in the number of patents over time. A word embedding model trained on patent
classification codes and job description data facilitates the identification of promising jobs with
high technical demands, providing insights into the evolving job market.

[5] Van Dongen, G., & Van Den Poel, D. (2021). Influencing Factors in the Scalability of
Distributed Stream Processing Jobs. Proceedings of the ACM Symposium on Cloud
Computing (SoCC), 2021, 102-115.
Van Dongen and Van Den Poel (2021) investigate the scalability of distributed stream
processing jobs in popular frameworks such as Flink, Kafka Streams, Spark Streaming, and
Structured Streaming. The research identifies factors influencing scalability, including cluster
layout, pipeline design, framework design, resource allocation, and data characteristics.
Recommendations are provided for practitioners to effectively scale their clusters and optimize
performance.

CHAPTER 2
SYSTEM ANALYSIS

2.1 EXISTING SYSTEM

In the existing job recruitment system, the process of resume screening and job matching
primarily relies on manual efforts, making it time-consuming and susceptible to human errors.
Recruiters often face challenges in handling the increasing influx of resumes, leading to potential
delays in the hiring process. The absence of a systematic and automated approach to skill
identification and job role prediction contributes to inefficiencies and suboptimal matches
between candidate profiles and job requirements. The traditional system lacks the sophistication
needed to adapt to the evolving demands of the job market. It relies on keyword-based matching,
which may not capture the nuances of candidate skills or the dynamic nature of job roles.
Furthermore, the absence of advanced technologies, such as machine learning and natural
language processing, hinders the system's ability to extract meaningful insights from the textual
content of resumes.

2.1.1 DISADVANTAGES

1. Manual Processes:
The existing job recruitment system heavily relies on manual efforts for resume screening and
job matching, making it time-consuming and prone to human errors. This manual approach not
only increases the workload for recruiters but also introduces the possibility of overlooking
qualified candidates or mismatches between candidate skills and job requirements.

2. Lack of Automation:
The system lacks automation in skill identification and job role prediction, leading to
inefficiencies in the recruitment process. Without automated mechanisms, recruiters may
struggle to handle the increasing influx of resumes, resulting in delays in the hiring process and
missed opportunities for both candidates and employers.

3. Keyword-Based Matching:
The existing system predominantly uses keyword-based matching for job recommendations,
which may not capture the nuances of candidate skills effectively. This simplistic approach
overlooks the context in which skills are presented in resumes, leading to suboptimal job
matches and potentially missing out on qualified candidates who possess relevant but differently
expressed skills.

4. Limited Adaptability:
Traditional methods of resume screening and job matching lack adaptability to the evolving
demands of the job market. The system may struggle to keep pace with changes in job
requirements or industry trends, resulting in outdated recommendations and mismatches between
candidate profiles and job roles.

5. Absence of Advanced Technologies:


The existing system operates without leveraging advanced technologies such as machine
learning and natural language processing (NLP). This limits the system's ability to extract
meaningful insights from resume data and provide accurate job recommendations based on
comprehensive skill analysis. As a result, the system may fail to meet the evolving needs of
recruiters and candidates in a dynamic job market.

2.2 PROPOSED SYSTEM

The proposed system aims to overcome the limitations of the existing job recruitment
process by introducing an advanced and automated solution. Leveraging state-of-the-art
technologies such as Convolutional Neural Network (CNN) algorithms and Natural Language
Processing (NLP) techniques, the system offers a more sophisticated approach to resume
screening and job role prediction. In the proposed system, the integration of CNN algorithms
allows for data-driven predictions of suitable job roles based on historical patterns and
associations within a vast dataset of job descriptions.

This approach enhances the accuracy and efficiency of job matching, addressing the
inefficiencies associated with manual processes and keyword-based matching. Furthermore, the
incorporation of NLP techniques enables a nuanced analysis of textual content within resumes.
Through methods such as tokenization, Named Entity Recognition (NER), and keyword
extraction, the system can accurately identify and extract relevant skills from resumes. This not
only improves the precision of skill recognition but also ensures a deeper understanding of the
context in which these skills are presented.
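The tokenization and keyword-extraction steps named above can be sketched with the standard library alone; the skill vocabulary here is a tiny illustrative stand-in, and NER (handled by spaCy in the proposed system) is omitted:

```python
# Minimal sketch of tokenization + keyword extraction over resume text.
# SKILL_VOCAB is an invented placeholder vocabulary, not the system's real
# skill list; NER would be layered on top of this in the full pipeline.
import re

SKILL_VOCAB = {"python", "sql", "machine learning", "html", "css", "communication"}

def extract_skills(resume_text):
    """Tokenize the resume and return vocabulary skills found in it."""
    text = resume_text.lower()
    # Single-word skills are matched against the token set; multi-word
    # skills are matched as phrases in the lowercased text.
    tokens = set(re.findall(r"[a-z+#.]+", text))
    found = {s for s in SKILL_VOCAB if " " in s and s in text}
    found |= {s for s in SKILL_VOCAB if " " not in s and s in tokens}
    return found

skills = extract_skills("Experienced in Python and SQL; applied machine learning at scale.")
print(sorted(skills))  # ['machine learning', 'python', 'sql']
```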

2.2.1 ADVANTAGES

1. Enhanced Efficiency:
The proposed system introduces advanced technologies such as Convolutional Neural Network
(CNN) algorithms and Natural Language Processing (NLP) techniques, which streamline the job
application process. By automating tasks such as resume parsing and skill identification, the
system significantly reduces the time and effort required for recruiters to screen resumes and
match candidates with suitable job roles.

2. Personalized Recommendations:
Leveraging CNN algorithms and NLP techniques, the system offers personalized job role
recommendations tailored to each candidate's unique skill profile. By analyzing the contextual
nuances of candidate resumes, the system can provide more accurate and relevant job
suggestions, increasing the likelihood of successful matches between candidates and job
opportunities.

3. Improved Accuracy:
The integration of CNN algorithms enables the system to make data-driven predictions of
suitable job roles based on historical patterns and associations within a vast dataset of job
descriptions. This approach enhances the accuracy of job matching, ensuring that candidates are
presented with job opportunities that closely align with their qualifications and preferences.

4. Reduction of Bias:
Automation reduces the potential for biases in the recruitment process, promoting fairness and
objectivity. By standardizing the evaluation criteria and removing human subjectivity from the
screening process, the system helps mitigate unconscious biases that may influence decision-
making in traditional recruiting methods.

5. Scalability and Adaptability:


The proposed system is designed to scale according to the needs of recruiters and candidates,
accommodating a growing volume of job applications and evolving job market trends. With the
ability to continuously learn and adapt through iterative processes, the system remains
responsive to changes in job requirements and industry dynamics, ensuring its relevance and
effectiveness over time.

CHAPTER 3
SYSTEM REQUIREMENTS

3.1 HARDWARE REQUIREMENTS

 Processor (CPU) - Multi-core processor (dual-core minimum).


 Random Access Memory (RAM) - Minimum 8 GB RAM.
 Graphics Processing Unit (GPU) - NVIDIA CUDA-enabled recommended.
 Storage - Adequate storage space, preferably SSD.

3.2 HARDWARE DESCRIPTION


3.2.1 PROCESSOR

The processor, or Central Processing Unit (CPU), serves as the brain of the computer,
responsible for executing instructions and computations. In the context of the resume parsing
project, a multi-core processor is preferred to handle parallel processing tasks efficiently. Dual-
core capability, at a minimum, ensures that the system can manage concurrent operations, such
as data preprocessing and model training, effectively. This capability is crucial for optimizing the
overall performance of the project, especially when dealing with large datasets and complex
natural language processing tasks.

3.2.2 RANDOM ACCESS MEMORY

Random Access Memory (RAM) plays a pivotal role in the system's ability to handle and process
data effectively. With a minimum requirement of 8 GB RAM, the system can store and access
data rapidly, reducing latency during memory-intensive tasks. The substantial RAM capacity is
particularly beneficial during machine learning model training, where the system must hold and
manipulate large datasets. Adequate RAM ensures that the system can efficiently perform tasks
such as feature extraction, model evaluation, and other memory-demanding operations,
contributing to the overall responsiveness and speed of the resume parsing project.

3.2.3 GRAPHICS PROCESSING UNIT

A Graphics Processing Unit (GPU) can significantly enhance the project's performance,
especially during machine learning model training. A GPU, preferably NVIDIA CUDA-enabled,
accelerates parallel processing tasks by offloading computations from the CPU. This is
particularly advantageous for training complex models on substantial datasets, as the GPU can
handle parallel operations simultaneously, reducing the time required for model convergence.
The GPU's parallel processing capabilities make it well-suited for the computationally intensive
nature of natural language processing tasks involved in resume parsing, providing a boost to
overall system efficiency.

3.2.4 STORAGE

Adequate storage space, preferably Solid State Drive (SSD), is essential for storing the various
components of the resume parsing project. SSDs offer faster read and write speeds compared to
traditional Hard Disk Drives (HDDs), enhancing the system's responsiveness. Sufficient storage
is crucial for housing datasets, trained machine learning models, and project-related files. The
faster data access speeds of an SSD contribute to quicker data retrieval during model training and
resume processing, supporting an efficient and streamlined workflow.

3.3 SOFTWARE REQUIREMENTS

 Front End - HTML, CSS.


 Python 3.x.
 spaCy - Latest version of spaCy with required language models.
 Machine Learning Libraries - scikit-learn or relevant machine learning libraries.
 PyMuPDF - If used for PDF text extraction.

3.4 SOFTWARE DESCRIPTION


3.4.1 HTML

HTML (Hypertext Markup Language) is the backbone of web development, serving as the
primary language for creating the structure and content of web pages. It consists of a series of
elements or tags that define the various components of a web page. These elements range from
basic ones like headings (<h1> to <h6>), paragraphs (<p>), and links (<a>), to more complex
ones like forms (<form>), tables (<table>), and multimedia content (<img>, <video>, <audio>).
Each HTML element has its own semantic meaning, indicating its purpose or role within the
document. For example, using <header> for introductory content, <nav> for navigation links,
and <footer> for concluding content enhances the accessibility and organization of the web page.
HTML provides a structured and hierarchical approach to organizing content, making it easy for
developers to create well-organized and accessible web pages.

3.4.2 CSS

CSS (Cascading Style Sheets) complements HTML by providing the means to control the
presentation and layout of HTML elements on a web page. While HTML defines the structure
and content of the page, CSS dictates how that content should be displayed visually. CSS works
by targeting HTML elements using selectors and applying styles to them through rulesets. These
styles can include properties like colors, fonts, margins, padding, borders, and positioning. CSS
offers various layout techniques, including flexbox and grid layout, to arrange elements in a
desired format. It also supports responsive web design principles, enabling developers to create
layouts that adapt to different screen sizes and devices. By separating content from presentation,
CSS promotes code maintainability and reusability, allowing developers to apply consistent
styles across multiple pages and easily update the appearance of their websites.

3.4.3 PYTHON 3.X

Python is a core software requirement for the resume parsing project, serving as the primary
programming language for development. The project specifically requires Python 3.x to
leverage the language's newest features and improvements. Python's popularity
in the field of data science, machine learning, and natural language processing makes it an ideal
choice for developing the system. Its extensive ecosystem of libraries and frameworks, including
spaCy, scikit-learn, and PyMuPDF, provides the necessary tools for implementing advanced
functionalities. Python's readability and versatility contribute to the project's maintainability,
allowing developers to write clean and efficient code. The inclusion of Python ensures that the
resume parsing system benefits from a robust and well-supported programming language,
fostering a conducive environment for innovation and future enhancements.

3.4.4 spaCy

spaCy is a pivotal software requirement for the resume parsing project, representing a state-of-
the-art natural language processing (NLP) library in Python. The latest version of spaCy, with its
advanced accuracy and speed levels, is essential for developing applications that process and
understand large amounts of text efficiently. The library provides pre-trained models for various
languages, making it suitable for diverse language processing tasks. In the context of the resume
parsing project, spaCy's capabilities are harnessed for tasks such as named entity recognition and
information extraction. The active open-source community surrounding spaCy ensures ongoing
support, updates, and a wealth of resources for developers working on language-related projects.
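As an illustrative sketch of how spaCy can tag skills as custom entities, the snippet below uses a blank English pipeline with an EntityRuler, so no pre-trained model download is needed; the patterns are invented examples, and the real system may instead rely on a pre-trained model's NER:

```python
# Illustrative spaCy sketch: tagging skills as custom SKILL entities with an
# EntityRuler on a blank pipeline. The two patterns below are invented.
import spacy

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "SKILL", "pattern": "Python"},  # exact-phrase pattern
    {"label": "SKILL", "pattern": [{"LOWER": "machine"}, {"LOWER": "learning"}]},
])

doc = nlp("Built machine learning models in Python for resume screening.")
skills = [(ent.text, ent.label_) for ent in doc.ents]
print(skills)  # [('machine learning', 'SKILL'), ('Python', 'SKILL')]
```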

3.4.5 MACHINE LEARNING LIBRARIES

The inclusion of machine learning libraries, such as scikit-learn, is a critical software
requirement for the project. These libraries provide essential tools for tasks like data
preprocessing, model training, and evaluation. Scikit-learn, a popular machine learning library in
Python, offers a wide range of algorithms and utilities that streamline the development of
machine learning models. Leveraging these libraries enhances the project's ability to handle
complex tasks, such as feature extraction and model evaluation, contributing to the overall
effectiveness of the resume parsing system. The utilization of machine learning libraries aligns
with best practices in data science and ensures that the project benefits from well-established
methodologies for building, training, and evaluating machine learning models.
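A minimal scikit-learn sketch of the training step described above, with a TF-IDF + logistic regression pipeline standing in for the report's CNN; the toy dataset of (skill text, job role) pairs is invented purely for illustration:

```python
# Hedged sketch of model training with scikit-learn. The six training
# examples below are invented; a real system would train on a large
# labelled corpus of job descriptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "python pandas statistics model evaluation",
    "python sql etl spark pipelines",
    "html css javascript responsive design",
    "python scikit-learn machine learning statistics",
    "sql data warehousing airflow etl",
    "javascript react css frontend html",
]
roles = ["Data Scientist", "Data Engineer", "Web Developer",
         "Data Scientist", "Data Engineer", "Web Developer"]

# Vectorization and classification chained into one estimator.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(texts, roles)

print(model.predict(["css html react frontend"])[0])  # Web Developer
```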

3.4.6 PyMuPDF

PyMuPDF serves as an important software requirement for the resume parsing project, providing
capabilities for effective text extraction from PDF files. This Python package facilitates the
handling of resumes stored in the widely used PDF format, adding versatility to the system's data
source compatibility. By incorporating PyMuPDF, the project ensures comprehensive text
extraction from PDF documents, a common format for professional resumes. The seamless
integration of PyMuPDF enhances the system's ability to process diverse resume sources,
contributing to a more inclusive and thorough resume parsing solution.

CHAPTER 4
SYSTEM DESIGN

4.1 Architecture Diagram

Figure 4.1 Architecture Diagram

The system's architecture commences with user authentication, ensuring secure access to the
platform's features. Once logged in, users can seamlessly upload their resumes, typically in PDF
or Word formats, providing the system with their professional information. Natural Language
Processing (NLP) techniques are then employed to comprehensively analyze the textual content
of the resumes, extracting valuable information like technical and soft skills through tasks such
as tokenization and Named Entity Recognition (NER).

Simultaneously, the system leverages a dataset containing job descriptions and corresponding
skill requirements for model training. This dataset, encompassing diverse job roles and their skill
profiles, serves as the foundational data source for the subsequent machine learning (ML)
algorithm. Prior to model training, preprocessing steps are undertaken to ensure data quality and
consistency, including handling missing information and standardizing formats.

With the preprocessed data and the dataset, the ML algorithm is trained to predict suitable job
roles based on the extracted features. Techniques like Convolutional Neural Networks (CNNs)
may be utilized for feature extraction and pattern recognition, enabling the algorithm to learn
from the dataset to accurately match skills extracted from user resumes with job requirements.

During the ML algorithm's operation, relevant features are extracted from the input data,
encompassing both explicit skills from resumes and implicit patterns identified during model
training. This feature extraction process is crucial for identifying the most relevant attributes for
job role matching. Following feature extraction, the ML algorithm builds a predictive model
based on the extracted features and the dataset. This model forms the foundation for generating
personalized job recommendations for users. Leveraging machine learning techniques, the
system offers tailored suggestions that align with users' skill profiles and career aspirations.

Finally, with the model in place, the system generates personalized job recommendations for
users based on their uploaded resumes. By analyzing the user's skills and qualifications and
matching them with job requirements from the dataset, the system provides curated job
opportunities that best suit the user's profile. This comprehensive approach streamlines the job-
seeking process, facilitating better matches between candidates and job opportunities.
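The end-to-end flow above can be sketched in a few lines; plain skill-set overlap stands in for the trained ML model here, and all job requirements are invented for illustration:

```python
# Simplified end-to-end pipeline sketch: extract a skill set from resume
# text, then rank jobs by skill overlap. Set intersection stands in for
# the trained model; the requirement data is invented.
JOB_REQUIREMENTS = {
    "Data Scientist": {"python", "statistics", "machine learning"},
    "Web Developer": {"html", "css", "javascript"},
}

# Vocabulary derived from the job data itself.
SKILL_VOCAB = {s for req in JOB_REQUIREMENTS.values() for s in req}

def recommend(resume_text, top_n=1):
    """Return the top_n roles whose requirements best overlap the resume."""
    text = resume_text.lower()
    skills = {s for s in SKILL_VOCAB if s in text}
    scored = sorted(
        JOB_REQUIREMENTS,
        key=lambda role: len(skills & JOB_REQUIREMENTS[role]),
        reverse=True,
    )
    return scored[:top_n]

print(recommend("Python developer with statistics background"))  # ['Data Scientist']
```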

4.2 UML DIAGRAM


4.2.1 CLASS DIAGRAM

A class diagram is a fundamental component of Unified Modeling Language (UML) used in
software engineering to visualize and represent the structure and relationships within a system. It
provides a static view of the system, depicting classes, their attributes, methods, and the
associations between them. In a class diagram, each class is represented as a rectangle, detailing
its internal structure with attributes and methods. Relationships between classes are depicted
through lines connecting them, illustrating associations, aggregations, or compositions.
Attributes are listed with their respective data types, while methods showcase the operations that
can be performed on the class. The diagram serves as a blueprint for understanding the
organization and interactions of classes within the system, facilitating communication among
stakeholders and aiding in the design and implementation phases of software development.

Figure 4.2 Class Diagram

4.2.2 USE CASE DIAGRAM

The use case diagram offers a comprehensive visualization of the system's functionalities from
the user's perspective, encapsulating key interactions between users and the platform. At the core
of the diagram is the "User," initiating various actions represented as distinct use cases. These
include essential functions like "Upload Resume," enabling users to submit their resumes for
skill extraction, and "View Recommended Jobs," facilitating access to personalized job
suggestions. Additional use cases such as "User Registration/Login," "Update Profile," and
"Provide Feedback" enrich the user experience by offering account management, profile
modification, and feedback submission functionalities, respectively. Together, these use cases
provide a holistic view of the system's capabilities, empowering users to efficiently navigate the
platform and access tailored job recommendations aligned with their skill profiles and career
goals.

Figure 4.3 Use Case Diagram


4.2.3 ACTIVITY DIAGRAM

The activity diagram illustrates the systematic flow of operations within the job recommendation
system, encapsulating user interactions and system processes. Beginning with user authentication
through "User Login," the diagram delineates the progression to "Upload a Resume," initiating
the extraction of skills from the provided resumes. The subsequent step involves the application
of Natural Language Processing (NLP) techniques for skill extraction, followed by "Model
Building" using machine learning algorithms to generate personalized job recommendations.
Upon completion of model training, the system transitions to "Job Recommendation," where it
analyzes user profiles and suggests relevant job roles. This sequential depiction offers a clear
understanding of how users engage with the system and how various components collaborate to
deliver tailored job suggestions, streamlining the user experience and facilitating informed career
decisions.

Figure 4.4 Activity Diagram

4.2.4 DATA FLOW DIAGRAM

The Data Flow Diagram (DFD) offers a detailed depiction of how data traverses through the job
recommendation system, outlining the journey from input to output. At its core, the DFD
encapsulates the flow of data between various system components, portraying entities like the
user, resume data, and the job database. It commences with the user uploading a resume, serving
as the primary input source. Subsequently, the resume data undergoes a series of processing
stages, including the application of Natural Language Processing (NLP) techniques for skill
extraction and the utilization of machine learning algorithms for model building. Throughout
these phases, data undergoes transformations and manipulations to extract meaningful insights.
Once the model is trained and the job recommendation process is initiated, the system generates
personalized job recommendations tailored to the user's skills and qualifications. These
recommendations constitute the output of the system, presented to the user for consideration. The
DFD illustrates this entire data flow, providing a comprehensive overview of how information
moves from its source to its destination within the system, elucidating the intricacies of data
processing and utilization in the recommendation process.

Figure 4.5 Data Flow Diagram

CHAPTER 5
SYSTEM IMPLEMENTATION


5.1 LIST OF MODULES

 Resume Parsing and NLP Techniques
 Data Pre-processing
 Convolutional Neural Network
 Job Role Recommendation

5.2 MODULE DESCRIPTION


5.2.1 RESUME PARSING AND NLP TECHNIQUES

The module plays a pivotal role in extracting valuable insights from candidate resumes. It begins
with a user-friendly interface enabling easy resume uploads. The module utilizes sophisticated
Natural Language Processing (NLP) techniques, including tokenization, Named Entity
Recognition (NER), and keyword extraction, to thoroughly analyse the textual content of
resumes. This process ensures accurate identification and extraction of skills, providing a
structured representation of the candidate's qualifications. By seamlessly integrating these NLP
techniques into the parsing process, the module enhances the system's ability to understand the
context and nuances of the skills presented in the resumes.
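The keyword-extraction step can be sketched without the full spaCy pipeline. In the sketch below, a plain regex tokenizer and a small hypothetical skill vocabulary stand in for the tokenization and NER stages described above.

```python
import re

# Minimal, dependency-free sketch of keyword extraction. The project's real
# pipeline uses spaCy tokenization and NER; SKILL_VOCAB here is invented.
SKILL_VOCAB = {"python", "sql", "machine learning", "flask"}

def extract_skills(resume_text):
    text = resume_text.lower()
    found = set()
    # Multi-word skills are matched as phrases in the raw text.
    for skill in SKILL_VOCAB:
        if " " in skill and skill in text:
            found.add(skill)
    # Single-word skills are matched against individual tokens.
    tokens = set(re.findall(r"[a-z+#.]+", text))
    found |= SKILL_VOCAB & tokens
    return sorted(found)

print(extract_skills("Built Flask apps in Python; applied machine learning to SQL logs."))
# → ['flask', 'machine learning', 'python', 'sql']
```

The annexure code follows the same pattern, substituting spaCy tokens and noun chunks for the regex tokenizer.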

5.2.2 DATA PRE-PROCESSING:

The module serves as a foundational step to ensure the quality and relevance of the data
used in subsequent stages. It involves cleaning and transforming raw data into a format suitable
for analysis. In the context of our project, data pre-processing encompasses tasks such as
handling missing information, standardizing formats, and removing redundancies. This module
is crucial for maintaining data integrity, improving the efficiency of downstream processes, and
ultimately enhancing the accuracy of job role recommendations.
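A minimal pandas sketch of the three tasks named above (handling missing values, standardizing formats, removing redundancies); the column names mirror the project's dataset, but the values are invented.

```python
import pandas as pd

# Invented sample rows with the kinds of defects pre-processing must fix.
raw = pd.DataFrame({
    "Skills Known": ["Python, SQL", "python,sql", None, "Java"],
    "department":   ["CSE", "cse", "IT", "IT"],
})

df = raw.copy()
df["Skills Known"] = df["Skills Known"].fillna("")            # handle missing info
df["Skills Known"] = (df["Skills Known"].str.lower()          # standardize formats
                        .str.replace(r",\s*", ",", regex=True))
df["department"] = df["department"].str.upper()
df = df.drop_duplicates()                                     # remove redundancies
print(len(df))  # the first two rows collapse into one after standardization
```

Note that de-duplication only becomes effective after standardization, which is why the ordering of these steps matters.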

5.2.3 CONVOLUTIONAL NEURAL NETWORK

In this module, the CNN model is designed and implemented. The architecture typically includes
convolutional layers to capture spatial patterns in the textual data, followed by pooling layers to
reduce dimensionality and highlight essential features. The output is then flattened and connected
to one or more fully connected layers, allowing the model to learn intricate relationships between
the input features. Training the CNN involves optimizing the model's parameters using the
dataset. This includes feeding batches of labeled job descriptions into the network, adjusting
weights and biases through backpropagation, and minimizing a defined loss function. Training
continues iteratively until the model converges and accurately captures patterns in the data.
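As a toy numeric illustration of what these two layer types compute (the real model stacks Keras Conv1D and MaxPooling1D layers, shown in the annexure; the numbers here are invented):

```python
import numpy as np

# Toy 1-D convolution: slide a small kernel over a feature sequence to
# detect local patterns, then max-pool to downsample while keeping the
# strongest responses.

def conv1d(x, kernel):
    n = len(x) - len(kernel) + 1
    return np.array([np.dot(x[i:i + len(kernel)], kernel) for i in range(n)])

def max_pool(x, size=2):
    return np.array([x[i:i + size].max() for i in range(0, len(x) - size + 1, size)])

signal = np.array([0., 1., 0., 0., 1., 1.])
edge_kernel = np.array([1., -1.])       # difference kernel: responds where values change
feat = conv1d(signal, edge_kernel)      # → [-1, 1, 0, -1, 0]
print(max_pool(feat))                   # → [1, 0]
```

A real Conv1D layer learns many such kernels during training instead of using a fixed one, but the sliding-window arithmetic is the same.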

5.2.4 JOB ROLE RECOMMENDATION

The module synthesizes outputs from the Resume Parsing, NLP, Data Preprocessing, and
CNN modules to provide tailored recommendations to users. By integrating the identified skills
from resumes and the predictions made by the CNN algorithm, the system generates a ranked list
of job roles that closely align with the candidate's qualifications. This module ensures that users
receive personalized and relevant job suggestions, optimizing the overall user experience. The
collaborative efforts of these modules contribute to a comprehensive and intelligent job
recommendation system, streamlining the recruitment process and fostering better matches
between candidates and job opportunities.
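The final ranking step can be sketched as follows. The role names and probabilities are hypothetical, but the top-k selection via argsort matches the approach used in the annexure code.

```python
import numpy as np

# Turn the classifier's class probabilities into an ordered shortlist of
# job roles. Role names and probabilities below are invented.
class_names = {0: "Data Analyst", 1: "Web Developer", 2: "ML Engineer"}
probs = np.array([0.25, 0.10, 0.65])    # e.g. softmax output for one candidate

top = np.argsort(probs)[::-1][:2]       # indices of the two highest scores
shortlist = [class_names[i] for i in top]
print(shortlist)
# → ['ML Engineer', 'Data Analyst']
```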

CHAPTER 6
TESTING

6.1 UNIT TESTING

Unit testing involves the design of test cases that validate that the internal program logic is
functioning properly and that program inputs produce valid outputs. All decision branches and
internal code flow should be validated. It is the testing of individual software units of the
application and is done after the completion of an individual unit, before integration. This is
structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests
perform basic tests at the component level and exercise a specific business process, application,
or system configuration. They ensure that each unique path of a business process performs
accurately to the documented specifications and contains clearly defined inputs and expected results.
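A minimal example in this spirit, using Python's built-in unittest module; the helper under test (has_skill) is hypothetical, chosen only to show one unit exercised in isolation with both decision branches covered.

```python
import unittest

def has_skill(resume_skills, skill):
    """True if the (case-insensitive) skill appears in the extracted skill set."""
    return skill.lower() in {s.lower() for s in resume_skills}

class HasSkillTest(unittest.TestCase):
    def test_skill_present_case_insensitive(self):
        self.assertTrue(has_skill(["Python", "SQL"], "python"))

    def test_skill_absent(self):
        self.assertFalse(has_skill(["Python"], "java"))

# Run the tests programmatically (avoids the sys.exit of unittest.main()).
suite = unittest.defaultTestLoader.loadTestsFromTestCase(HasSkillTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```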

6.2 INTEGRATION TESTING

Integration tests are designed to test integrated software components to determine whether they
actually run as one program. Testing is event driven and is more concerned with the basic outcome
of screens or fields. Integration tests demonstrate that although the components were individually
satisfactory, as shown by successful unit testing, the combination of components is correct and
consistent. Integration testing is specifically aimed at exposing the problems that arise from the
combination of components.

6.3 SYSTEM TESTING

System testing ensures that the entire integrated software system meets requirements. It tests a
configuration to ensure known and predictable results. An example of system testing is the
configuration-oriented system integration test. System testing is based on process descriptions
and flows, emphasizing pre-driven process links and integration points.

1. Functional testing: Functional tests provide systematic demonstrations that the functions
tested are available as specified by the business and technical requirements, system
documentation, and user manuals. Organization and preparation of functional tests focuses on
requirements, key functions, or special test cases. In addition, systematic coverage of business
process flows, data fields, predefined processes, and successive processes must be considered for
testing. Before functional testing is complete, additional tests are identified and the effective
value of current tests is determined.

2. White box testing: White box testing is testing in which the software tester has knowledge of
the inner workings, structure, and language of the software, or at least its purpose. It is used to
test areas that cannot be reached from a black-box level.

3. Black box testing: Black box testing is testing the software without any knowledge of the inner
workings, structure, or language of the module being tested. Black box tests, like most other
kinds of tests, must be written from a definitive source document, such as a specification or
requirements document. The software under test is treated as a black box: the tester cannot "see"
into it. The test provides inputs and responds to outputs without considering how the software
works.

4. Compatibility testing : Compatibility testing verifies that the system operates seamlessly
across different environments and configurations. This involves testing on various operating
systems, validating compatibility with different Python versions and dependencies, and ensuring
adaptability to changes in third-party libraries or frameworks.

5. Reliability testing : Reliability testing aims to confirm the consistent and accurate
performance of the system. It involves executing the system over an extended period to identify
memory leaks or performance degradation, simulating unexpected failures, and validating the
system's ability to consistently deliver reliable outputs.

6. Regression testing : Regression testing ensures that new changes or updates do not adversely
impact existing functionalities. By re-running previous tests after implementing modifications,
developers verify that changes do not introduce errors or compromise existing features,
maintaining the system's stability.

7. Scalability testing : Scalability testing, if applicable, evaluates the system's capacity to scale
with increased load or data volume. It involves testing performance with a growing number of
resumes in the dataset and assessing scalability under varying levels of computational resources,
such as CPU and memory. This testing ensures the system's resilience and effectiveness in
handling increased demands.

6.4 TEST CASES

TC001: Data Preparation Accuracy
Pre-conditions: The system is installed and configured with the necessary dependencies; a sample
dataset of resumes in varied formats (PDF, DOCX) is available.
Expected result: The system successfully prepares the data, and labeled entities are generated for
each resume.
Actual result: All resumes are processed without errors, and labeled entities are correctly
identified.
Status: PASS

TC002: Model Training Efficiency
Pre-conditions: Data preparation is completed successfully; the system is configured with the
necessary training parameters.
Expected result: The trained model accurately identifies entities in the test dataset.
Actual result: The model demonstrates high accuracy in identifying entities in the test dataset.
Status: PASS

TC003: PDF Text Extraction
Pre-conditions: The system is operational with PyMuPDF for PDF text extraction; a set of resumes
in PDF format is available for testing.
Expected result: The system accurately extracts text from PDF resumes without loss or distortion.
Actual result: Text extraction from PDF resumes is successful, preserving the content.
Status: PASS

TC004: Job Recommendation
Pre-conditions: The model is already built so that it can recommend jobs according to the skills
the user has.
Expected result: Jobs based on the user's skills are recommended.
Actual result: Jobs based on the user's skills are recommended.
Status: PASS

TC005: End-to-End Processing
Pre-conditions: The system is configured with the trained model and PDF text extraction
functionality; a diverse set of resumes in different formats is available for processing.
Expected result: The system processes resumes accurately, extracting relevant information and
generating summaries.
Actual result: Resumes are processed successfully, and the system produces accurate summaries.
Status: PASS

CHAPTER 7
RESULTS & DISCUSSION

7.1 RESULTS

The results section provides a detailed analysis of the performance and effectiveness of the job
recommendation system across various dimensions. This includes both quantitative measurements
and qualitative assessments aimed at evaluating different aspects of the system's functionality.

Quantitative analysis involves the measurement of specific metrics to quantify the system's
performance objectively. For instance, accuracy metrics assess the correctness of job
recommendations made by the system compared to ground truth data or user feedback. Precision
and recall metrics provide insights into the system's ability to generate relevant recommendations
while minimizing false positives and false negatives. The F1 score offers a balanced measure of
the system's precision and recall, providing a single metric to evaluate overall performance.
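A worked example of these metrics on invented numbers makes the definitions concrete. Suppose, for one job role, the system recommends four resumes of which two truly fit, and one fitting resume is missed:

```python
# Hypothetical recommendation outcome for a single job role.
recommended = {"r1", "r2", "r3", "r4"}   # resumes the system matched to the role
relevant = {"r2", "r3", "r5"}            # resumes that actually fit the role

tp = len(recommended & relevant)         # true positives = 2
precision = tp / len(recommended)        # 2/4 = 0.50
recall = tp / len(relevant)              # 2/3 ≈ 0.67
f1 = 2 * precision * recall / (precision + recall)
print(round(precision, 2), round(recall, 2), round(f1, 2))
# → 0.5 0.67 0.57
```

The F1 score is the harmonic mean of precision and recall, so it is pulled toward whichever of the two is lower.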

In addition to quantitative metrics, qualitative analysis delves into the subjective aspects of the
system's performance. This may involve gathering user feedback through surveys, interviews, or
usability testing sessions. Qualitative assessments aim to understand user satisfaction, perception
of recommendation relevance, ease of use, and overall utility of the system in the context of job
searching and career exploration.

Furthermore, the results section may present findings from specific use cases or scenarios to
illustrate the system's performance under different conditions. This could involve analyzing the
effectiveness of the system across various industries, job roles, or skill sets. By examining
performance in diverse contexts, the report provides a nuanced understanding of the system's
capabilities and limitations.

Overall, the results section serves as a comprehensive evaluation of the job recommendation
system, combining quantitative measurements with qualitative insights to validate its
effectiveness and inform future improvements. It offers a detailed assessment of how well the
system meets the needs of users and stakeholders, paving the way for informed conclusions and
recommendations in the report.

7.2 DISCUSSION

In the discussion section, the project report critically analyzes the results presented in the
previous section, providing insights, interpretations, and implications derived from the findings.
This section serves as a platform to reflect on the effectiveness of the job recommendation
system, address any limitations or challenges encountered during the project, and propose
recommendations for future improvements or research directions.

One key aspect of the discussion involves comparing the observed results with the initial
objectives and expectations outlined in the project's scope and objectives. This comparison helps
assess the extent to which the system has achieved its intended goals and whether any deviations
or discrepancies exist. Additionally, the discussion explores the reasons behind any observed
discrepancies and considers potential factors that may have influenced the outcomes.

Furthermore, the discussion section delves into the implications of the results for both theoretical
understanding and practical applications. It may explore how the findings contribute to existing
knowledge in the field of job recommendation systems, highlighting any novel insights or
contributions. Moreover, the discussion considers the practical implications of the results for
stakeholders, such as recruiters, job seekers, and system developers, outlining potential benefits,
challenges, and recommendations for implementation or adoption.

The discussion also provides a platform to address any limitations or constraints encountered
during the project. This may include limitations in the data used for training and evaluation,
constraints in computational resources or technology, as well as any methodological limitations
or assumptions made during the project. Acknowledging these limitations helps contextualize the
results and provides guidance for future research or development efforts.

Finally, the discussion section may conclude with recommendations for future research,
highlighting areas for further investigation or refinement of the job recommendation system.
These recommendations may include suggestions for improving system performance, addressing
identified limitations, exploring new avenues for research, or extending the application of the
system to different domains or contexts.

Overall, the discussion section synthesizes the project's findings, interprets their significance,
and offers insights and recommendations for advancing the field of job recommendation systems.
It serves as a critical reflection on the project's outcomes and provides guidance for future
endeavors in this area of research and development.

CHAPTER 8
CONCLUSION AND FUTURE ENHANCEMENT

8.1 CONCLUSION

In conclusion, this project represents a significant leap forward in the optimization of the job
application and recruitment process. By harnessing the power of advanced technologies like
Natural Language Processing (NLP) and Convolutional Neural Network (CNN) algorithms, the
system introduces a paradigm shift in how resumes are parsed and job roles are predicted. The
integration of the Resume Parsing and NLP modules marks a pivotal advancement, ensuring
meticulous extraction of skills from resumes, thereby enabling a nuanced understanding of
candidates' qualifications. This in-depth analysis is further enhanced by the CNN algorithm,
which, when applied to a meticulously curated CSV dataset, adopts a data-driven approach to
accurately forecast job roles based on historical data patterns and correlations. The culmination
of these efforts manifests in the creation of a refined Job Role Recommendation system,
characterized by its ability to furnish users with personalized and precise suggestions tailored to
their skill sets and career aspirations. Beyond merely enhancing technological capabilities within
the recruitment domain, this project aspires to redefine the dynamics of candidate-recruiter
interactions in the job market, promising not only increased efficiency but also a more seamless
and effective matching process between candidates and job opportunities.

8.2 FUTURE ENHANCEMENT

For future enhancements, several promising avenues can be explored to further elevate the
capabilities and impact of this project. Firstly, the integration of additional advanced machine
learning models, such as recurrent neural networks (RNNs) or transformer models, could
significantly enhance the system's understanding of context within job descriptions. This could
lead to more nuanced skill extraction and improved job role predictions, especially in scenarios
with complex language or evolving job requirements. Furthermore, incorporating feedback loops
from both recruiters and candidates could contribute to the creation of a more dynamic learning
system. By allowing users to provide feedback on the accuracy and relevance of job role
recommendations, the system could continuously refine its algorithms, resulting in a more
adaptive and user-centric platform. Additionally, developing mechanisms for proactive updates
based on evolving trends in the job market and industry demands would ensure that the system
remains responsive to changing dynamics. This could involve regularly updating the dataset with
new job descriptions and skill requirements to ensure that the recommendations provided remain
relevant and up-to-date. Finally, exploring opportunities for integration with emerging
technologies such as blockchain for enhanced security and transparency in job matching
processes could further enhance the overall efficacy and trustworthiness of the platform.

ANNEXURE
APPENDIX I
DATASET:
Name: Job Recommendation DataBase
Link:
https://drive.google.com/drive/folders/1m53TTnpB3uEA_2ZLRXUFgNqnMU
TVZHDX?usp=sharing

SOURCE CODE:
import re
import sqlite3

import numpy as np
import pandas as pd
import spacy
from flask import Flask, flash, render_template, request
from nltk.corpus import stopwords
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from tensorflow.keras.layers import (BatchNormalization, Conv1D, Dense,
                                     Dropout, Flatten, MaxPooling1D)
from tensorflow.keras.models import Sequential

app = Flask(__name__)
app.secret_key = 'jndjsahdjxasudhas-09vzx2223'
database = "new.db"

# Create the user table once at startup.
conn = sqlite3.connect(database)
cursor = conn.cursor()
cursor.execute('''
    CREATE TABLE IF NOT EXISTS register (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        user_name TEXT, user_email TEXT, password TEXT
    )
''')
conn.commit()


@app.route('/')
def index():
    return render_template('index.html')


@app.route('/register', methods=["GET", "POST"])
def register():
    if request.method == "POST":
        user_name = request.form['user_name']
        user_email = request.form['user_email']
        password = request.form['password']
        conn = sqlite3.connect(database)
        cursor = conn.cursor()
        cursor.execute(
            "INSERT INTO register (user_name, user_email, password) VALUES (?, ?, ?)",
            (user_name, user_email, password))
        conn.commit()
        flash('Registration successful!', 'success')
    return render_template('index.html')


u = []
name = []
email = []


@app.route('/login', methods=["GET", "POST"])
def login():
    if request.method == "POST":
        conn = sqlite3.connect(database)
        cursor = conn.cursor()
        user_email = request.form['user_email']
        password = request.form['password']
        cursor.execute("SELECT * FROM register WHERE user_email=? AND password=?",
                       (user_email, password))
        user = cursor.fetchone()
        if user:
            u.append(user_email)
            name.append(user[1])
            email.append(user[2])
            return render_template('upload.html', name=user[1], email=user[2])
        return "password mismatch"
    return render_template('register.html')


# Load the spaCy pipeline and an external skill vocabulary (one skill per line).
nlp = spacy.load('en_core_web_sm')
result = []
with open('linkedin skill', encoding='utf-8') as f:
    external_source = list(f)
for element in external_source:
    result.append(element.strip().lower())


def extract_skill_1(resume_text):
    """Match resume tokens and noun chunks against the skill vocabulary."""
    nlp_text = nlp(resume_text)
    tokens = [token.text for token in nlp_text if not token.is_stop]
    skills = result
    skillset = []
    for i in tokens:
        if i.lower() in skills:
            skillset.append(i)
    for i in nlp_text.noun_chunks:
        i = i.text.lower().strip()
        if i in skills:
            skillset.append(i)
    return [word.capitalize() for word in set([word.lower() for word in skillset])]


STOPWORDS = set(stopwords.words('english'))
EDUCATION = ['CSE', 'EEE', 'ECE', 'IT', 'MCA']


def extract_education(resume_text):
    """Find department keywords and, where present, the associated year."""
    nlp_text = nlp(resume_text)
    nlp_text = [sent.text.strip() for sent in nlp_text.sents]
    edu = {}
    for index, text in enumerate(nlp_text):
        for tex in text.split():
            tex = re.sub(r'[?|$|.|!|,]', r'', tex)
            if tex.upper() in EDUCATION and tex not in STOPWORDS:
                edu[tex] = text + nlp_text[index + 1]
    education = []
    for key in edu.keys():
        year = re.search(re.compile(r'(((20|19)(\d{2})))'), edu[key])
        if year:
            education.append((key, ''.join(year[0])))
        else:
            education.append(key)
    return education


def predict(mark, skill):
    """Train the CNN on the placement dataset and return the top-3 companies."""
    class_names = {
        0: 'Birlasoft', 1: 'Cognizant', 2: 'Hexaware Technologies',
        3: 'Infosys', 4: 'KPIT Technologies', 5: 'L&T Infotech',
        6: 'Tech Mahindra', 7: 'Wipro Technologies', 8: 'css corp', 9: 'TCS'
    }
    train_data = pd.read_csv("Book2.csv", encoding='latin-1')
    le_Skill = LabelEncoder()
    le_depart = LabelEncoder()
    le_Company = LabelEncoder()
    train_data['skill'] = le_Skill.fit_transform(train_data['Skills Known'])
    train_data['dept'] = le_depart.fit_transform(train_data['department'])
    train_data['target'] = le_Company.fit_transform(train_data['Company Placed'])
    x = train_data.drop(['Full Name', "12th Mark", "10th Mark", "dept",
                         'Company Placed', "Skills Known", "Projects Done",
                         'target', 'department', "Certifications/Internships"], axis=1)
    y = train_data['target']
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2,
                                                        random_state=0)
    num_classes = 10
    y_train_processed = np.clip(y_train, 0, num_classes - 1)
    y_test_processed = np.clip(y_test, 0, num_classes - 1)

    scaler = MinMaxScaler()
    X_train_scaled = scaler.fit_transform(x_train)
    X_test_scaled = scaler.transform(x_test)
    X_train_reshaped = X_train_scaled.reshape((X_train_scaled.shape[0],
                                               X_train_scaled.shape[1], 1))
    X_test_reshaped = X_test_scaled.reshape((X_test_scaled.shape[0],
                                             X_test_scaled.shape[1], 1))

    # 1-D CNN: two convolution/pooling stages followed by dense layers.
    cnn_model = Sequential()
    cnn_model.add(Conv1D(filters=64, kernel_size=5, activation='relu',
                         input_shape=(X_train_reshaped.shape[1],
                                      X_train_reshaped.shape[2]),
                         padding='same'))
    cnn_model.add(BatchNormalization())
    cnn_model.add(MaxPooling1D(pool_size=1))
    cnn_model.add(Dropout(0.5))
    cnn_model.add(Conv1D(filters=128, kernel_size=5, activation='relu',
                         padding='same'))
    cnn_model.add(BatchNormalization())
    cnn_model.add(MaxPooling1D(pool_size=2))
    cnn_model.add(Dropout(0.5))
    cnn_model.add(Flatten())
    cnn_model.add(Dense(256, activation='relu'))
    cnn_model.add(Dropout(0.5))
    cnn_model.add(Dense(num_classes, activation='softmax'))
    cnn_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
    cnn_model.fit(X_train_reshaped, y_train_processed, epochs=5, batch_size=64,
                  validation_data=(X_test_reshaped, y_test_processed), verbose=0)

    # The query must follow the same feature order and shape as the training
    # data; `skill` is the label-encoded skill value.
    query = np.array([[mark, 0, skill]], dtype=float).reshape((1, 3, 1))
    predicted_probs = cnn_model.predict(query)
    top_indices = np.argsort(predicted_probs[0])[::-1][:3]
    top_companies = [class_names[i] for i in top_indices]
    return top_companies

ANNEXURE
APPENDIX II

SAMPLE OUTPUT:

REFERENCES


[1] R. Kwieciński, G. Melniczak and T. Górecki, "Comparison of Real-Time and Batch Job
Recommendations," IEEE Access, vol. 11, pp. 20553-20559, 2023, doi: 10.1109/ACCESS.2023.3249356.

[2] M. Agung, Y. Watanabe, H. Weber, R. Egawa and H. Takizawa, "Preemptive Parallel Job
Scheduling for Heterogeneous Systems Supporting Urgent Computing," IEEE Access, vol. 9,
pp. 17557-17571, 2021, doi: 10.1109/ACCESS.2021.3053162.

[3] I. Khaouja, I. Kassou and M. Ghogho, "A Survey on Skill Identification From Online Job
Ads," IEEE Access, vol. 9, pp. 118134-118153, 2021, doi: 10.1109/ACCESS.2021.3106120.

[4] T. Ha, M. Lee, B. Yun and B.-Y. Coh, "Job Forecasting Based on the Patent Information:
A Word Embedding-Based Approach," IEEE Access, vol. 10, pp. 7223-7233, 2022, doi:
10.1109/ACCESS.2022.3141910.

[5] G. Van Dongen and D. Van Den Poel, "Influencing Factors in the Scalability of Distributed
Stream Processing Jobs," IEEE Access, vol. 9, pp. 109413-109431, 2021, doi:
10.1109/ACCESS.2021.3102645.

[6] T. Danişan, E. Özcan and T. Eren, "Personnel selection with multi-criteria decision making
methods in the ready-to-wear sector," Tehnički vjesnik, vol. 29, no. 4, pp. 1339-1347, 2022.

[7] S. G. Abbasi, M. S. Tahir, M. Abbas and M. S. Shabbir, "Examining the relationship between
recruitment & selection practices and business growth: An exploratory study," J. Public Affairs,
vol. 22, no. 2, 2022, Art. no. e2438.

[8] A. B. Raj, "Impact of employee value proposition on employees' intention to stay: Moderating
role of psychological contract and social identity," South Asian J. Bus. Stud., vol. 10, no. 2,
pp. 203-226, Apr. 2021.

[9] A. Malik, P. Thevisuthan and T. De Sliva, "Artificial intelligence, employee engagement,
experience, and HRM," in Strategic Human Resource Management and Employment Relations:
An International Perspective. Cham, Switzerland: Springer, 2022, pp. 171-184.

[10] P. Budhwar, A. Malik, M. T. De Silva and P. Thevisuthan, "Artificial intelligence:
challenges and opportunities for international HRM: A review and research agenda," Int. J.
Hum. Resour. Manage., vol. 33, no. 6, pp. 1065-1097, 2022.

49
Job Recommendation Using AI
GUIDING CAREERS THROUGH SKILL-BASED INSIGHTS

SANTHOSH T, JEFFRIN RIJO V C, ARUNESHWAR V , Dr. S. PADMA PRIYA,


Department of artificial Department of artificial Department of artificial Professor, Head of the Department
intelligence and data science, intelligence and data science, intelligence and data science, of artificial intelligence and data
Velammal institute Of Velammal institute Of Velammal institute Of science,
technology, technology, technology, Velammal institute Of technology,
Chennai ,india. Chennai ,india. Chennai ,india. Chennai ,india.
[email protected] [email protected] [email protected] [email protected]

significance of project extends beyond its technological


prowess. By offering users personalized job role
ABSTRACT: recommendations based on their unique skill profiles, the
platform aims to streamline the job-seeking experience,
The system ingests resume data, harnessing the power of reducing the often overwhelming burden of sifting through
NLP to extract and analyze nuanced information vast job listings. The fusion of NLP and machine learning in
embedded within the text. By deciphering the language of this project not only addresses the immediate challenge of job
resumes, skills gains deep insights into users' professional matching but also marks a stride towards a more nuanced and
competencies and preferences. The algorithm then empathetic approach to career guidance. Project signifies a
employs this enriched data to provide personalized job paradigm shift in the way individuals navigate their
role recommendations, transforming the job-seeking professional futures, presenting a comprehensive and tailored
experience into a tailored and efficient process. This solution in an era where personalization and efficiency are
paramount in the realm of career development. The fusion of NLP and machine learning not only enhances the accuracy of skill assessment but also ensures that users receive targeted and relevant career suggestions, marking a significant advancement in the field of data-driven career guidance. Skill-based insights emerge as a pioneering solution, empowering individuals to make informed decisions about their professional futures with confidence.

I. INTRODUCTION

In the ever-evolving landscape of career exploration and development, individuals often grapple with the daunting task of aligning their unique skill sets with the diverse array of job opportunities available. Skill-based job recommendation is a project designed to tackle this challenge head-on. Leveraging cutting-edge natural language processing (NLP) techniques, the system goes beyond conventional methods of resume analysis. It not only extracts explicit skill information but delves into the subtle nuances of language, capturing the essence of professional experiences and preferences embedded in the resumes. The heart of the project lies in its sophisticated algorithm, meticulously crafted to decipher the language of resumes and transform it into actionable insights. Through an iterative process of learning and adaptation, the platform refines its understanding of industry-specific terms, evolving to provide increasingly accurate and personalized job role recommendations. The use of NLP ensures a holistic understanding of the user's professional journey, considering not just technical skills but also soft skills and industry-specific language that might otherwise go unnoticed.

II. LITERATURE SURVEY

Collaborative filtering systems for recommendations are often developed to work in batches, catering to many users simultaneously with personalized suggestions. Yet, the need for real-time recommendations, which incorporate the latest user activities, is becoming evident in various industrial applications. Our methodology enables any system capable of generating user-to-item suggestions from item-to-item data to apply our approach. We've conducted real-world A/B testing in the context of job recommendations with nearly 200,000 OLX platform users, revealing that real-time recommendations lead to a minimum of 10% increase in job application rates over batch-processed suggestions. These findings offer valuable insights for organizations contemplating the shift from batch to real-time recommendation systems.

In the realm of computing, especially for time-sensitive tasks, relying on dedicated infrastructures can be impractical due to cost concerns. These resources are shared among numerous tasks, and high-priority tasks may need to interrupt ongoing ones. Traditional scheduling techniques primarily aim to minimize job wait times, potentially sidelining urgent tasks. This can be particularly challenging in diverse environments that include coprocessors, as preemption involves complex interactions with the host processor's system software. We introduce a novel approach for scheduling parallel jobs in shared, heterogeneous environments, incorporating an in-memory job swapping strategy for efficiently prioritizing urgent tasks. Our simulation outcomes demonstrate this method's effectiveness in reducing wait times and minimizing disruption for urgent tasks.
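The in-memory job swapping idea surveyed above can be illustrated with a toy scheduler: when an urgent job arrives and all slots are busy, the least urgent running job is swapped out and resumed later. This is a minimal sketch under stated assumptions, not the surveyed authors' implementation; the `Scheduler` class, job names, and priority scheme are all hypothetical.

```python
import heapq

class Scheduler:
    """Toy preemptive scheduler: urgent jobs may swap out a running job.

    Illustrative stand-in for an in-memory job swapping strategy,
    not the actual code of the surveyed system.
    """

    def __init__(self, slots):
        self.slots = slots    # number of jobs that can run at once
        self.running = []     # min-heap of (priority, name); least urgent first
        self.swapped = []     # jobs moved to memory, to be resumed later

    def submit(self, name, priority):
        """Higher priority number means more urgent."""
        if len(self.running) < self.slots:
            heapq.heappush(self.running, (priority, name))
            return "run"
        lowest_prio, lowest_name = self.running[0]
        if priority > lowest_prio:
            # Swap out the least urgent running job instead of queueing
            # the urgent one behind it.
            heapq.heapreplace(self.running, (priority, name))
            self.swapped.append((lowest_prio, lowest_name))
            return "run (swapped out %s)" % lowest_name
        self.swapped.append((priority, name))
        return "queued"

sched = Scheduler(slots=1)
print(sched.submit("batch_training", priority=1))    # runs immediately
print(sched.submit("urgent_inference", priority=9))  # preempts the batch job
```

The design choice here mirrors the survey's point: minimizing wait time alone would queue the urgent job, while priority-based swapping lets it start immediately at the cost of suspending a long-running task.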
The job market is continuously evolving, driven by globalization, demographic changes, and the digital revolution, necessitating vigilant observation. With job advertisements increasingly available online, there's a rich vein of data for analyzing market demands and identifying necessary skills. This survey aims to synthesize existing research on extracting skills from job ads, offering a comprehensive overview of the methods, types of skills identified, sectors studied, and the granularity of these analyses. Through a review of 108 articles, we assess the landscape of skill identification research, highlighting application areas, challenges, and emerging trends. This synthesis not only categorizes existing efforts but also charts a course for future investigation in this vital area.

III. DESIGN

Fig. 1. Activity diagram

The proposed system aims to overcome the limitations of the existing job recruitment process by introducing an advanced and automated solution. Leveraging state-of-the-art technologies such as CNN algorithms and NLP techniques, the system offers a more sophisticated approach to resume screening and job role prediction. In the proposed system, the integration of CNN algorithms allows for data-driven predictions of suitable job roles based on historical patterns and associations within a vast dataset of job descriptions. This approach enhances the accuracy and efficiency of job matching, addressing the inefficiencies associated with manual processes and keyword-based matching. Furthermore, the incorporation of NLP techniques enables a nuanced analysis of textual content within resumes. Through methods such as tokenization, Named Entity Recognition (NER), and keyword extraction, the system can accurately identify and extract relevant skills from resumes. This not only improves the precision of skill recognition but also ensures a deeper understanding of the context in which these skills are presented.

IV. METHODOLOGY

RESUME PARSING AND NLP TECHNIQUE:

The module plays a pivotal role in extracting valuable insights from candidate resumes. It begins with a user-friendly interface enabling easy resume uploads. The module utilizes sophisticated Natural Language Processing (NLP) techniques, including tokenization, Named Entity Recognition (NER), and keyword extraction, to thoroughly analyse the textual content of resumes. This process ensures accurate identification and extraction of skills, providing a structured representation of the candidate's qualifications. By seamlessly integrating these NLP techniques into the parsing process, the module enhances the system's ability to understand the context and nuances of the skills presented in the resumes.

DATA PRE-PROCESSING:

The module serves as a foundational step to ensure the quality and relevance of the data used in subsequent stages. It involves cleaning and transforming raw data into a format suitable for analysis. In the context of our project, data pre-processing encompasses tasks such as handling missing information, standardizing formats, and removing redundancies. This module is crucial for maintaining data integrity, improving the efficiency of downstream processes, and ultimately enhancing the accuracy of job role recommendations.

CONVOLUTIONAL NEURAL NETWORK:

In this module, the CNN model is designed and implemented. The architecture typically includes convolutional layers to capture spatial patterns in the textual data, followed by pooling layers to reduce dimensionality and highlight essential features. The output is then flattened and connected to one or more fully connected layers, allowing the model to learn intricate relationships between the input features. Training the CNN involves optimizing the model's parameters using the dataset. This includes feeding batches of labeled job descriptions into the network, adjusting weights and biases through backpropagation, and minimizing a defined loss function. Training continues iteratively until the model converges and accurately captures patterns in the data.

JOB ROLE RECOMMENDATION:

The module synthesizes outputs from the Resume Parsing, NLP, Data Preprocessing, and CNN modules to provide tailored recommendations to users. By integrating the identified skills from resumes and the predictions made by the CNN algorithm, the system generates a ranked list of job roles that closely align with the candidate's qualifications. This module ensures that users receive personalized and relevant job suggestions, optimizing the overall user experience. The collaborative efforts of these modules contribute to a comprehensive and intelligent job recommendation system, streamlining the recruitment process and fostering better matches between candidates and job opportunities.
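The parsing-to-recommendation flow described above can be sketched in miniature. The sketch below substitutes a hard-coded skill vocabulary and simple overlap scoring for the system's NER and CNN components, so it illustrates only the shape of the pipeline; the skill list, role profiles, and function names are all hypothetical, not taken from the project's implementation.

```python
import re

# Hypothetical skill vocabulary and job-role profiles (illustrative only;
# the described system learns these from data rather than hard-coding them).
SKILLS = {"python", "sql", "machine learning", "communication", "excel"}

JOB_ROLES = {
    "Data Analyst": {"sql", "excel", "python"},
    "ML Engineer": {"python", "machine learning", "sql"},
    "HR Executive": {"communication", "excel"},
}

def extract_skills(resume_text):
    """Tokenize the resume and match known skills, including two-word phrases."""
    text = re.sub(r"[^a-z\s]", " ", resume_text.lower())
    tokens = text.split()
    phrases = set(tokens) | {" ".join(p) for p in zip(tokens, tokens[1:])}
    return SKILLS & phrases

def recommend_roles(resume_text, k=2):
    """Rank job roles by the fraction of required skills found in the resume."""
    found = extract_skills(resume_text)
    scored = [(len(found & req) / len(req), role) for role, req in JOB_ROLES.items()]
    scored.sort(reverse=True)
    return [role for score, role in scored[:k] if score > 0]

resume = "Experienced in Python, SQL and machine learning projects."
print(recommend_roles(resume))  # ['ML Engineer', 'Data Analyst']
```

In the full system, `extract_skills` would be replaced by the NLP module's tokenization and NER, and the overlap score by the trained CNN's predicted role probabilities; the final ranked-list step is the same.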
V. RESULTS

Fig. 2. Website Home Page

Fig. 3. Output Page

VI. CONCLUSION

In conclusion, this project presents a comprehensive and innovative solution to streamline the job application and recruitment process. By integrating advanced technologies such as Natural Language Processing (NLP) and Convolutional Neural Network (CNN) algorithms, the system enhances the traditional resume screening and job role prediction methods. The Resume Parsing and NLP modules enable precise extraction of skills from resumes, ensuring a thorough understanding of candidates' qualifications. The CNN algorithm, applied to a CSV dataset, provides a data-driven approach to predict job roles based on historical patterns and associations. The collaborative efforts of these modules culminate in a sophisticated Job Role Recommendation system, offering users personalized and accurate suggestions. This project not only contributes to the technological advancements in the recruitment domain but also seeks to revolutionize how candidates and recruiters interact with the job market.

VII. FUTURE ENHANCEMENT

For future enhancements, several avenues can be explored to further elevate the capabilities and impact of this project. Firstly, the integration of additional advanced machine learning models, such as recurrent neural networks (RNNs) or transformer models, could enhance the system's understanding of context within job descriptions, potentially leading to more nuanced skill extraction and improved job role predictions. Furthermore, incorporating feedback loops from recruiters and candidates could contribute to a dynamic learning system. Developing mechanisms for users to provide feedback on the accuracy of job role recommendations would enable continuous improvement and refinement of the algorithm, creating a more adaptive and user-centric platform.

REFERENCES

[1] R. Kwieciński, G. Melniczak and T. Górecki, "Comparison of Real-Time and Batch Job Recommendations," in IEEE Access, vol. 11, pp. 20553-20559, 2023, doi: 10.1109/ACCESS.2023.3249356.

[2] M. Agung, Y. Watanabe, H. Weber, R. Egawa and H. Takizawa, "Preemptive Parallel Job Scheduling for Heterogeneous Systems Supporting Urgent Computing," in IEEE Access, vol. 9, pp. 17557-17571, 2021, doi: 10.1109/ACCESS.2021.3053162.
[3] I. Khaouja, I. Kassou and M. Ghogho, "A Survey on Skill Identification From Online Job Ads," in IEEE Access, vol. 9, pp. 118134-118153, 2021, doi: 10.1109/ACCESS.2021.3106120.

[4] T. Ha, M. Lee, B. Yun and B.-Y. Coh, "Job Forecasting Based on the Patent Information: A Word Embedding-Based Approach," in IEEE Access, vol. 10, pp. 7223-7233, 2022, doi: 10.1109/ACCESS.2022.3141910.

[5] G. Van Dongen and D. Van Den Poel, "Influencing Factors in the Scalability of Distributed Stream Processing Jobs," in IEEE Access, vol. 9, pp. 109413-109431, 2021, doi: 10.1109/ACCESS.2021.3102645.

[6] T. Danişan, E. Özcan, and T. Eren, "Personnel selection with multi-criteria decision making methods in the ready-to-wear sector," Tehnički vjesnik, vol. 29, no. 4, pp. 1339-1347, 2022.

[7] S. G. Abbasi, M. S. Tahir, M. Abbas, and M. S. Shabbir, "Examining the relationship between recruitment & selection practices and business growth: An exploratory study," J. Public Affairs, vol. 22, no. 2, 2022, Art. no. e2438.

[8] A. B. Raj, "Impact of employee value proposition on employees' intention to stay: Moderating role of psychological contract and social identity," South Asian J. Bus. Stud., vol. 10, no. 2, pp. 203-226, Apr. 2021.

[9] A. Malik, P. Thevisuthan, and T. De Sliva, "Artificial intelligence, employee engagement, experience, and HRM," in Strategic Human Resource Management and Employment Relations: An International Perspective. Cham, Switzerland: Springer, 2022, pp. 171-184.

[10] P. Budhwar, A. Malik, M. T. De Silva, and P. Thevisuthan, "Artificial intelligence, challenges and opportunities for international HRM: A review and research agenda," Int. J. Hum. Resour. Manage., vol. 33, no. 6, pp. 1065-1097, 2022.
1st INTERNATIONAL CONFERENCE ON DATA ANALYTICS AND INTELLIGENCE COMPUTING-2024 (ICDAIC'24)

CERTIFICATE OF PARTICIPATION

This is to certify that Prof./Dr./Mr./Ms. JEFFRIN of Velammal Institute of Technology has presented a paper titled "JOB RECOMMENDATION USING AI" in the 1st International Conference on Data Analytics and Intelligence Computing organized by the Department of Artificial Intelligence and Data Science, Velammal Institute of Technology, Chennai, TamilNadu, India on April 06, 2024.

COORDINATORS: Ms. K. Sudha, Dr. S. PadmaPriya; HOD: Dr. Prameeladevi Chillakuru; VICE PRINCIPAL: Dr. S. Soundararajan; PRINCIPAL: Dr. N. Balaji