
COGNITIVE ANALYTICS: ENHANCING SUBJECTIVE ANSWER ASSESSMENT
MAJOR PROJECT REPORT
Submitted in partial fulfilment of the requirements for the award of the degree

of

BACHELOR OF TECHNOLOGY

in

COMPUTER SCIENCE & ENGINEERING

by

Name: Mohak Khatri (Enrollment No: 07615602720)
Name: Mohammad Danish Khan (Enrollment No: 07715602720)
Name: Muhammad Imran Khan (Enrollment No: 08115602720)
Name: Mukesh Chandra (Enrollment No: 08215602720)

Guided by

Ms. Vaishali Sharma


Assistant Professor

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


DR. AKHILESH DAS GUPTA INSTITUTE OF PROFESSIONAL
STUDIES (AFFILIATED TO GURU GOBIND SINGH INDRAPRASTHA UNIVERSITY,
DELHI) NEW DELHI – 110053
MAY 2024
CANDIDATE’S DECLARATION

It is hereby certified that the work which is being presented in the B. Tech Major Project
Report entitled "COGNITIVE ANALYTICS: ENHANCING SUBJECTIVE ANSWER
ASSESSMENT" in partial fulfilment of the requirements for the award of the degree of
Bachelor of Technology and submitted in the Department of Computer Science &
Engineering of Dr. Akhilesh Das Gupta Institute of Professional Studies, New Delhi
(Affiliated to Guru Gobind Singh Indraprastha University, Delhi) is an authentic record of
our own work carried out under the guidance of Ms. Vaishali Sharma, Assistant Professor.

The matter presented in the B. Tech Major Project Report has not been submitted by us for
the award of any other degree of this or any other Institute.

Name: Mohak Khatri (Enrollment No: 07615602720)
Name: Mohammad Danish Khan (Enrollment No: 07715602720)
Name: Muhammad Imran Khan (Enrollment No: 08115602720)
Name: Mukesh Chandra (Enrollment No: 08215602720)

This is to certify that the above statement made by the candidates is correct to the best of my
knowledge. They are permitted to appear in the External Major Project Examination.

(Ms. Vaishali Sharma)                              (Dr. Saurabh Gupta)
Assistant Professor                                Head, CSE

The B. Tech Major Project Viva-Voce Examination of Name of the Student (Enrollment No:
XXX), has been held on ……………………………….

(Dr. Rakesh Kumar Arora)                           (Signature of External Examiner)
Project Coordinator
ABSTRACT

This project report delves into the development of a web application aimed at facilitating
communication and assessment between students and teachers. The application allows both
students and teachers to log in, with teachers having the capability to post questions
accompanied by model answer keys, and students being able to submit their responses upon
logging in.

In contemporary educational settings, a diverse array of students participate in various types
of examinations, ranging from institutional to non-institutional, and at times, competitive
ones. Our focus primarily revolves around multiple-choice questions (MCQs) due to their
objective nature, thereby facilitating automated scoring. However, challenges persist in
automating scoring for subjective papers.

The central objective of our project is to design an algorithm that automates the evaluation of
subjective answers, thereby significantly reducing manual effort. Leveraging machine
learning techniques, we endeavour to deliver an automated scoring system that maintains
accuracy while minimizing human intervention.

Through this endeavour, we aim to streamline the assessment process, enhance efficiency,
and provide educators with a reliable tool for evaluating student responses, thereby fostering
a conducive learning environment.
ACKNOWLEDGEMENT

We express our deep gratitude to Ms. Vaishali Sharma, Assistant Professor, Department of
Computer Science & Engineering, for her valuable guidance and suggestions throughout our
project work. We are thankful to Prof. (Dr.) Rakesh Kumar Arora, Project Coordinator, for
his valuable guidance.

We would like to extend our sincere thanks to Prof. (Dr.) Saurabh Gupta, Head of
Department, Department of Computer Science & Engineering, for his time-to-time
suggestions to complete our project work. We are also thankful to Prof. (Dr.) Niranjan
Bhattacharyya, Director, for providing us the facilities to carry out our project work.

Name: Mohak Khatri (Enrollment No: 07615602720)
Name: Mohammad Danish Khan (Enrollment No: 07715602720)
Name: Muhammad Imran Khan (Enrollment No: 08115602720)
Name: Mukesh Chandra (Enrollment No: 08215602720)
TABLE OF CONTENTS
Candidates Declaration ……………………………………………………………………... i

Abstract.....................................................................................................................................ii

Acknowledgement....................................................................................................................iii

Table of Contents……………………………………………………………………………. iv

Chapter 1. Introduction….......................................................................................................1

1.1 Introduction….......................................................................................................2

1.2 Motivation ………………………………………………………………………. 3

Chapter 2. Literature Survey................................................................................................4

2.1 Survey of Existing System...................................................................................5

Chapter 3. Proposed System..................................................................................................7

3.1 Problem Statement and Objectives....................................................................8

3.2 Scope of the Work…............................................................................................10

3.3 Analysis.................................................................................................................12

3.4 Framework...........................................................................................................16

3.5 Algorithm............................................................................................................18

3.6 Details of Hardware & Software…....................................................................22

3.7 Design details........................................................................................................23

3.8 Methodology.........................................................................................................26

Chapter 4. Results..................................................................................................................27

4.1 Results...................................................................................................................28

Chapter 5. Conclusion and Future Scope ………………………………………………….29


5.1 Conclusion……………………………………………………………………… 30
5.2 Future scope.....................................................................................................31

Chapter 6. References........................................................................................................32

6.1 References........................................................................................................33
CHAPTER 1

INTRODUCTION

1.1 INTRODUCTION

Examinations have always been part of every educational and non-educational organization.
Examinations can be descriptive, objective, or both, and every examination needs evaluation.
The majority of competitive exams are objective in structure and are conducted on
computer-based systems. These systems, and related methods, offer great advantages in
terms of resource conservation. However, it has been noticed that such systems can only
contain multiple-choice questions and cannot be expanded to include subjective questions.
These methods cannot be used in board exams or university exams, where students give
subjective answers, hence there is a need for software that will aid in conserving resources.

The education system and its teachers are under considerable pressure to evaluate the large
number of answer copies produced by students. On average, each institute conducts four
examinations per year, resulting in more than 6.4 million answer sheets being generated. For
faculty, evaluating papers and assigning grades are time-consuming tasks. Traditional
examinations typically consist of subjective answers, which are not the most effective way of
assessing a student's understanding of the subject. Because examiners can grow fatigued
checking so many answer sheets, false evaluations may increase. As a result, an answer
verifier is required to grade the student after he or she has completed the question paper.
Automating the evaluation of descriptive answers will not only save resources but will also
overcome human limitations. It will also help speed up the overall educational system,
because students will not have to wait as long for a result.

The evaluation of subjective answers has long been a challenge for educators, employers, and
researchers. CheckMyAnswer, powered by machine learning algorithms, has emerged as a
solution to this challenge. These checkers can analyse and evaluate subjective responses,
providing objective and consistent feedback to users. However, the development and
implementation of machine learning algorithms come with challenges, including the need for
large amounts of training data and ensuring the transparency and explainability of the
algorithms.

1.2 MOTIVATION

Assessing subjective answers holds paramount importance across diverse domains such as
education, recruitment, and competitive examinations. However, the manual assessment of
subjective responses entails significant drawbacks including time intensiveness, high costs,
and susceptibility to subjective biases. Consequently, there arises a pressing demand for
automated systems capable of efficiently and accurately evaluating subjective answers, thus
mitigating the aforementioned challenges.

The advent of automated subjective answer checking systems addresses these concerns by
streamlining the evaluation process, thereby reducing the time and resources expended in
assessing subjective responses. Moreover, automated systems offer the promise of
consistency and impartiality in feedback provision, thereby fostering equitable opportunities
for students or candidates to enhance their performance and achieve success.

Furthermore, the implementation of a subjective answer checker holds the potential for
scalability in evaluation processes, facilitating the efficient assessment of a large volume of
responses without compromising accuracy. This scalability is particularly invaluable in
contexts where the evaluation of numerous responses is requisite, such as large-scale
examinations or recruitment drives.

In the context of this project, CheckMyAnswer emerges as a pioneering solution aimed at
analysing and evaluating subjective responses. By leveraging advanced algorithms,
CheckMyAnswer endeavours to provide objective and precise scores to students or users,
thereby enriching the assessment landscape with its automated capabilities. Through this
initiative, the project seeks to revolutionize the subjective assessment paradigm, offering a
transformative approach to evaluating subjective responses across various domains.

CHAPTER 2

LITERATURE SURVEY

2.1 SURVEY OF EXISTING SYSTEM

The papers surveyed are listed below (serial no., author(s), title of the paper and year of
publication, and major contribution):

1. Vaishali Sharma, Khyati Ahlawat, and Shorya Sharma. "Practical Approaches: User
Stories Domain Prediction and Keyword Extraction for Insights" (2022). Development of a
recommendation system for user-stories domain prediction and keyword extraction,
facilitating business analysts in writing well-defined user stories and providing insights for
business purposes through the extraction of useful keywords and key phrases.

2. Ronika Shrestha, Raj Gupta and Priya Kumari. "Automatic Answer Sheet Checker"
(2022). Answers are captured as photos; the answers are then scanned and keywords are
picked from the photos, and the system automatically calculates the result using two
algorithms based on NLP and ANN.

3. Shreya Singh, Prof. Uday Rote, Omkar Manchekar, Prof. Sheetal Jagtap, Ambar
Patwardha, Dr. Hariram Chavan. "Tool for Evaluating Subjective Answers (TESA) using
AI" (2021). Introduction of a tool leveraging AI for assessing subjective answers, potentially
improving accuracy and efficiency.

4. Prof. Era Johri, Prem Chandak, Nidhi Dedhia, Hunain Adhikari, Kunal Bohra. "ASSESS:
Automated Subjective Answer Evaluation Using Semantic Learning" (2021). Uses Google's
USE algorithm to generate sentence embeddings, encoding both answers into vectors and
computing their similarity.

5. Muhammad Farrukh Bashir, Hamza Arshad, Abdul Rehman Javed, Natalia Kryvinska
and Shahab S. Band. "Subjective Answers Evaluation Using Machine Learning and Natural
Language Processing" (2021). Cosine similarity performs poorly semantically compared to
WDM, but can make good estimates where semantics are unnecessary.

6. Jagadamba G and Chaya Shree G. "Online Subjective Answer Verifying System Using
Artificial Intelligence" (2020). Used cosine similarity and the TextGears grammar API;
system efficiency was found to be in the range of 60-90% depending on the parameters (text
length, keyword matching, or both).

7. Ashutosh Shinde, Nishit Nirbhavane, Sharda Mahajan, Vikas Katkar, Supriya Chaudhary.
"AI Answer Verifier" (2018). Introduction of an AI-based answer verifier, potentially
improving accuracy and efficiency in answer evaluation; graded multiple-choice questions
and provided scoring based on keywords, length, and grammar.

8. A. Singh. "Automated Essay Grading Using Machine Learning" (2015). Exploration of
machine learning techniques for automated essay grading, offering insights into potential
applications and methodologies.
CHAPTER 3

PROPOSED SYSTEM

3.1 PROBLEM STATEMENT AND OBJECTIVES

Problem Statement:
The challenge lies in the conventional approaches to grading and assessment, which often
prove laborious and susceptible to human fallibility. This inherent inefficiency can impede
organizations in accurately evaluating the responses furnished by employees or students,
thereby compromising performance standards and hindering productivity levels. In response
to this predicament, CheckMyAnswer emerges as a transformative solution, streamlining the
evaluation process through automation. By harnessing automated mechanisms,
CheckMyAnswer offers a remedy to the shortcomings of traditional assessment methods,
furnishing organizations with more precise and efficient feedback mechanisms. Through its
automated evaluation capabilities, CheckMyAnswer endeavours to optimize performance
assessment practices, fostering enhanced productivity and bolstering organizational efficacy.

Objectives:
The essence of a subjective answer evaluation model lies in its capacity to automate the
intricate process of assessing and grading open-ended responses provided by students during
educational assessments. This model is engineered to scrutinize various aspects of the
answers, including their relevance, coherence, clarity, and comprehensive grasp of the subject
matter.

The overarching objectives of a subjective answer evaluation model encompass several
aims, each directed at revolutionizing the assessment landscape:

1. Streamlining Evaluation: The foremost aim is to curtail the time and effort expended in
evaluating vast quantities of subjective responses, all while upholding the integrity and
impartiality of the grading process. By automating this labour-intensive task, the model seeks
to enhance efficiency without compromising accuracy or fairness.

2. Ensuring Consistency: Another pivotal goal is to ensure the consistent and dependable
grading of subjective responses, thereby mitigating the inherent risks of human bias and
variability. By adhering to predefined criteria and objective benchmarks, the model
endeavours to standardize the assessment process, fostering equity and reliability in grading
outcomes.

3. Enhancing Assessment Quality: A critical facet of the model's mission is to elevate the
calibre of educational assessments by instituting a standardized framework for grading
subjective answers. By aligning evaluation criteria with objective measures, the model
endeavours to fortify the validity and reliability of assessment outcomes, thereby bolstering
the overall quality of education.

Through the automation of the subjective answer evaluation process, educational institutions
stand to reap a plethora of benefits. Notably, this automation liberates valuable time and
resources, empowering educators and administrators to divert their focus towards pivotal
tasks such as curriculum development, student feedback mechanisms, and overall educational
enhancement initiatives. Furthermore, students stand to gain from expedited feedback
mechanisms, receiving timely insights into their performance, thereby fostering a culture of
continuous improvement and academic excellence. Thus, the implementation of a subjective
answer evaluation model transcends mere efficiency gains, heralding a paradigm shift in the
educational assessment landscape, characterized by enhanced fairness, consistency, and
educational quality.

3.2 SCOPE OF THE WORK

The ambit of CheckMyAnswer, an online tool meticulously crafted to aid both students and
teachers in the validation of answer accuracy, is delineated across several key dimensions,
each contributing to its comprehensive utility:

1. Real-Time Feedback Provision: At the core of its functionality, CheckMyAnswer
furnishes students with instantaneous feedback regarding the correctness of their
responses. This immediate feedback mechanism empowers students to promptly
identify and rectify any inaccuracies present in their answers, thereby fostering a
culture of continuous improvement and academic excellence.
2. Teacher Time Optimization: CheckMyAnswer serves as a boon for educators by
automating the arduous task of scrutinizing students' responses. By delegating the
responsibility of answer evaluation to the tool, teachers can reclaim valuable time that
would otherwise be expended on manual grading processes. This time-saving attribute
enables educators to allocate their resources more efficiently, focusing their attention
on devising innovative teaching methodologies and nurturing student engagement.
3. Cultivation of Independent Learning: A salient feature of CheckMyAnswer lies in
its capacity to cultivate independent learning among students. By affording learners
the autonomy to assess their responses sans external intervention, the tool instils a
sense of self-reliance and academic autonomy. This empowerment encourages
students to take ownership of their learning journey, fostering a culture of self-
directed inquiry and exploration.
4. Stimulating Self-Assessment Practices: CheckMyAnswer functions as a catalyst for
self-assessment and self-correction endeavours. Empowered with the ability to utilize
the tool for self-evaluation, students are incentivized to engage in critical reflection
and introspection, thereby honing their analytical skills and metacognitive abilities.
By pinpointing areas of improvement and facilitating self-directed remediation,
CheckMyAnswer catalyses the iterative process of learning and skill refinement.

In essence, the scope of CheckMyAnswer transcends mere validation of answer accuracy,
encompassing a multifaceted array of functionalities aimed at enriching the learning
experience for both students and educators. Through its real-time feedback provision, time-
saving attributes, promotion of independent learning, and stimulation of self-assessment
practices, CheckMyAnswer emerges as a transformative tool poised to revolutionize the educational
landscape. By harnessing the power of technology to augment learning outcomes and foster
academic empowerment, CheckMyAnswer embodies a paradigm shift in educational
pedagogy, heralding a future characterized by enhanced efficacy, autonomy, and student-
centred learning paradigms.

3.3 ANALYSIS
Cosine Similarity stands as a prevalent metric in text analytics, serving as a cornerstone for
gauging the resemblance between two text documents. This method quantifies the cosine of
the angle formed between two vectors within a multidimensional space, garnering acclaim
due to several key attributes:

1. Computational Efficiency: One of its prominent virtues lies in its computational
simplicity and efficiency. Unlike more intricate similarity metrics such as Euclidean
distance, computing Cosine Similarity proves relatively straightforward and
expedient.
2. Scale-Invariance: Notably, Cosine Similarity remains unaffected by the scale of the
vectors under comparison. Since it solely evaluates the angle between vectors, rather
than their magnitudes, it permits the comparison of documents with varying lengths
without necessitating normalization or weighting.
3. Ubiquitous Usage in Natural Language Processing (NLP): Cosine Similarity
enjoys widespread adoption within the realm of natural language processing.
Numerous prominent machine learning models, including BERT and word2vec,
incorporate Cosine Similarity into their training and evaluation frameworks.

While Cosine Similarity holds utility in gauging text similarity across documents or vectors,
its efficacy in comparing the likeness of two sentences may be subject to limitations. Several
factors undermine its suitability in this context:

1. Sentence Length Sensitivity: Cosine Similarity's sensitivity to vector length poses a
challenge when applied to sentences. Varying sentence lengths result in vectors of
differing dimensions, potentially diminishing the effectiveness of Cosine Similarity.
2. Ignorance of Syntax and Grammar: The metric fails to account for syntactical and
grammatical nuances, which are pivotal in determining sentence similarity.
Consequently, sentences sharing similar semantic meaning yet differing in word order
or phrasing may yield low Cosine Similarity scores.
3. Disregard for Word Order and Semantics: Cosine Similarity disregards the
positional arrangement of words within sentences, as well as their semantic
relationships. Thus, sentences expressing identical ideas through disparate wording
may exhibit low Cosine Similarity, despite semantic equivalence.

4. Contextual Blindness: Cosine Similarity overlooks the contextual nuances in which
sentences operate. Consequently, sentences possessing divergent meanings across
different contexts may erroneously yield high Cosine Similarity scores.

While Cosine Similarity remains invaluable for text comparison at the document or vector
level, its limitations render it suboptimal for sentence similarity assessments. Thus, while it
serves as a valuable tool in text analytics, it necessitates judicious consideration and
supplementation with complementary methods when applied to sentence-level comparisons.
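To make the discussion concrete, the bag-of-words cosine similarity described above can be sketched in a few lines of Python. This is a minimal illustration using term-count vectors; the whitespace tokenization, function name, and example sentences are our own simplifications, not the project's implementation:

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the term-count vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)  # products over shared terms
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

s = "the model evaluates subjective answers"
# Scale-invariance: repeating the whole sentence leaves the score at 1.0
print(cosine_similarity(s, s + " " + s))
# Word-order blindness: shuffled words still score as identical
print(cosine_similarity("dog bites man", "man bites dog"))
```

The second call demonstrates the limitation discussed above: because only term counts enter the vectors, sentences with the same words in a different order are indistinguishable to this metric.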

Jaccard Similarity:

Jaccard similarity emerges as a valuable metric for discerning text similarity between two
sentences, particularly under specific circumstances where word order holds little
significance, and the emphasis lies on the inclusion or exclusion of particular words.

This metric quantifies the similarity between two sets by examining the intersection and
union of their constituent elements. In the context of text analysis, each sentence is construed
as a set comprising its individual tokens. The Jaccard similarity score is then computed as the
ratio of the intersection of these sets to their union.

Jaccard similarity finds applicability in various scenarios, owing to its unique characteristics:

1. Keyword Matching: In scenarios necessitating the identification of documents or
sentences containing specific keywords, Jaccard similarity proves instrumental. By
discerning sentences with a substantial overlap in terms of keyword presence or
absence, it facilitates efficient keyword-based search and retrieval.
2. Short Text Analysis: When grappling with concise texts such as tweets or headlines,
Jaccard similarity emerges as a pertinent metric for gauging textual resemblance. Its
ability to capture the similarity of short texts enables effective analysis and
categorization of succinct content.
3. Plagiarism Detection: Jaccard similarity serves as a robust tool in the realm of
plagiarism detection. By comparing the similarity between a given text and a corpus
of reference documents, it facilitates the identification of instances where textual
content closely mirrors existing material, thereby aiding in the preservation of
academic integrity and intellectual property rights.

In essence, Jaccard similarity stands as a versatile measure, offering nuanced insights into
text similarity across diverse contexts. Its ability to disregard word order and focus solely on
the presence or absence of tokens renders it invaluable in scenarios necessitating efficient
keyword matching, short text analysis, and plagiarism detection. Through its distinctive
approach to similarity assessment, Jaccard similarity emerges as a pivotal asset in the arsenal
of text analytics tools, facilitating myriad applications across various domains.

Nevertheless, despite its utility, Jaccard similarity exhibits limitations in capturing the
comprehensive semantic likeness between two sentences. Its disregard for word order,
synonyms, and paraphrases renders it less suitable for applications prioritizing semantic

congruence. In such contexts, alternative measures leveraging pre-trained language models
may offer superior efficacy.

Jaccard similarity's shortcomings in certain text similarity assessments stem from several
factors:

1. Word Order Disregard: The metric solely evaluates whether two sets share identical
items, neglecting their order or position within the sentences. Consequently, sentences
with identical words arranged differently receive identical Jaccard similarity scores,
irrespective of their disparate meanings.
2. Synonyms and Paraphrases Sensitivity: Jaccard similarity's sensitivity to exact
word matches poses challenges when confronted with sentences conveying similar
meanings through varied wording. Instances where synonym usage or paraphrasing
occurs may result in artificially low Jaccard similarity scores, hindering accurate
semantic similarity assessment crucial in natural language processing (NLP)
applications.
3. Stop Words Equal Treatment: Jaccard similarity treats all words impartially,
including commonly occurring stop words such as "the," "and," and "a."
Consequently, the presence of stop words can unduly influence the Jaccard similarity
score, a phenomenon undesirable in certain applications requiring nuanced similarity
evaluation.
4. Contextual Ignorance: The metric operates in a context-agnostic manner, failing to
consider the contextual nuances in which sentences are situated. Consequently,
sentences possessing divergent meanings across different contexts may yield
misleadingly high Jaccard similarity scores, despite lacking semantic congruence
within the given context.

In summary, while Jaccard similarity remains a valuable tool for certain text similarity
assessments, its limitations underscore the necessity of exercising caution and employing
complementary measures in contexts necessitating nuanced semantic similarity evaluation.
By recognizing these constraints and supplementing Jaccard similarity with alternative
methodologies tailored to specific application requirements, practitioners can navigate the
intricacies of text similarity assessment with greater precision and efficacy.

3.4 FRAMEWORK

Fig 4: Design of CheckMyAnswer

The system workflow unfolds in a structured manner to facilitate efficient assessment
processes:

1. Model Answer Retrieval: Initially, the system retrieves the model answer provided
by the teacher, serving as the benchmark for evaluation.
2. Student Answer Submission: Subsequently, the student submits their response via
the designated text box, initiating the evaluation process.
3. Tokenization Using BERT Tokenizer: Both the model answer and the student's
response are tokenized using the BERT tokenizer from the Hugging Face Transformers
library. This step breaks the text into constituent tokens, preparing them for further
analysis.
4. Utilization of BERT Model for Analysis: The tokenized answers are then fed into our
BERT model, which leverages sophisticated algorithms to assess both the similarity and
grammar of the student's response in relation to the model answer.
5. Calculation of Similarity Score: Based on the comparison, the system generates a
similarity score encompassing three key parameters: perfect alignment, neutral
deviation, and contradiction. This comprehensive evaluation framework offers
nuanced insights into the correspondence between the student’s answer and the model
solution.
6. Display of Final Result: Finally, the system presents the outcome in a user-friendly
format, showcasing the total percentage of marks attained by the student. This
succinct representation encapsulates the evaluation results, enabling stakeholders to
gauge performance effortlessly.

In essence, the system workflow unfolds seamlessly, encompassing distinct stages ranging
from answer retrieval and tokenization to sophisticated analysis and result presentation. By
orchestrating these sequential steps, the system optimizes the assessment process, fostering
efficiency, accuracy, and transparency in evaluating student responses.
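As an illustration only, the six steps above can be sketched in Python. The real system runs a BERT model; here the model call is replaced by a stub that fakes probabilities for the three outcome labels (alignment/entailment, neutral, contradiction) from token overlap, and the marks formula (full credit for entailment, half for neutral) is a hypothetical example of turning those probabilities into a percentage, not the project's actual weighting:

```python
def classify(model_answer: str, student_answer: str) -> dict:
    """Stub standing in for the BERT model: returns label probabilities.
    Faked here from token overlap; the real system would feed both
    tokenized texts through the fine-tuned BERT model instead."""
    ref = set(model_answer.lower().split())
    got = set(student_answer.lower().split())
    overlap = len(ref & got) / len(ref) if ref else 0.0
    return {"entailment": overlap,
            "neutral": (1 - overlap) * 0.5,
            "contradiction": (1 - overlap) * 0.5}

def score(model_answer: str, student_answer: str) -> float:
    """Steps 1-6: take both answers, classify, map labels to a percentage.
    Hypothetical weighting: 100% for entailment mass, 50% for neutral."""
    probs = classify(model_answer, student_answer)
    return round(100 * probs["entailment"] + 50 * probs["neutral"], 1)

marks = score("photosynthesis converts light energy into chemical energy",
              "photosynthesis converts light energy into sugar")
print(f"Total marks: {marks}%")
```

The stub makes the data flow visible end to end: teacher's model answer in, student's answer in, label probabilities out, final percentage displayed.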

3.5 ALGORITHMS
Hugging Face Transformer:
Hugging Face, a pioneering company, spearheads the development and maintenance of
Transformers, an open-source software library tailored for seamless integration of cutting-
edge natural language processing (NLP) models like BERT, GPT, and RoBERTa.
Functioning atop PyTorch and TensorFlow frameworks, this library offers a user-friendly
interface, streamlining the utilization of pre-trained models for a myriad of NLP applications
including text classification, sentiment analysis, and question answering.
The adoption of Hugging Face is underpinned by several compelling factors:
1. Access to Pre-Trained Models: Hugging Face affords users access to a rich
repository of pre-trained NLP models, meticulously trained on vast volumes of text
data. These pre-existing models serve as a robust foundation, facilitating fine-tuning
for specific tasks and delivering state-of-the-art performance even with limited
training data.
2. Simplicity and Consistency: The Transformers library boasts an intuitive and
standardized interface, rendering it accessible to developers and researchers alike. Its
user-friendly design fosters experimentation with diverse models and tasks,
empowering users to explore novel avenues in NLP effortlessly.
3. Vibrant Community Engagement: Hugging Face cultivates a thriving community
of developers and researchers who actively contribute to the library's enhancement
and offer invaluable support to fellow users. This collaborative ethos fosters
knowledge exchange and innovation, enriching the ecosystem surrounding NLP
technologies.
4. Customizability and Adaptability: The library exhibits remarkable flexibility,
enabling users to tailor existing models to suit specific requirements, craft bespoke
models, or seamlessly integrate multiple models to enhance task performance. This
adaptability underscores its versatility and efficacy across a spectrum of NLP
applications.

The transformer architecture revolutionizes natural language processing through its fundamental concept of self-attention, enabling the model to discern the significance of various segments within the input sequence during prediction. This mechanism empowers the model to focus its attention on the most pertinent elements of the input, enhancing its predictive capabilities.

Comprising both an encoder and a decoder, the transformer architecture orchestrates the processing of input sequences. The encoder interprets the input sequence, generating a series of hidden states that encapsulate the encoded information. Subsequently, the decoder utilizes these hidden states to formulate an output sequence, thus facilitating the transformation of input data into meaningful outputs.

A pivotal feature distinguishing the transformer architecture is its adoption of multi-head attention. This innovation permits the model to attend concurrently to disparate sections of the input sequence, enabling it to capture intricate dependencies across the entirety of the input.
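The attention mechanism described above can be made concrete with a short sketch. The NumPy snippet below is an illustrative implementation of scaled dot-product attention, the building block that multi-head attention applies several times in parallel; it is a sketch for intuition, not the library code used by Transformers, and the random input stands in for learned token representations.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # how much each position attends to every other
    return weights @ V, weights

# Toy example: a sequence of 4 tokens with model dimension 8.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(x, x, x)  # self-attention: Q = K = V = x
```

Each row of `w` is a probability distribution over the input positions, which is precisely the "focus of attention" the prose above refers to.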

Furthermore, the transformer architecture incorporates positional encoding to imbue the model with an understanding of the sequential order inherent within the input sequence. This obviates the need for traditional sequential processing techniques such as recurrent or convolutional layers, streamlining the architecture while preserving its efficacy.
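The sinusoidal positional encoding used in the original transformer can be written down directly. The sketch below follows the published formulas (PE[pos, 2i] = sin(pos / 10000^(2i/d_model)), PE[pos, 2i+1] = cos of the same angle) and is illustrative only.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encodings: even columns get sine, odd get cosine,
    # with wavelengths forming a geometric progression from 2*pi to 10000*2*pi.
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]      # (1, d_model/2), equals 2i in the formula
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = positional_encoding(50, 16)  # one encoding vector per position
```

These vectors are simply added to the token embeddings, giving the model access to position without recurrence.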

The versatility of the transformer architecture extends across various NLP domains, including
machine translation, sentiment analysis, and question answering. Its widespread adoption
underscores its efficacy and adaptability in tackling diverse linguistic tasks, thus solidifying
its status as a foundational framework within the field of natural language processing.

Fig 5: Transformer architecture

BERT

BERT, short for Bidirectional Encoder Representations from Transformers, signifies a groundbreaking achievement in natural language processing (NLP), introduced by Google in 2018. Serving as a pre-trained neural network, BERT heralds a new era in NLP by offering a versatile framework that can be fine-tuned to excel in an array of language-related tasks, spanning sentiment analysis, text classification, and question answering.

Central to BERT's efficacy is its ingenious mechanism for capturing context-dependent linguistic nuances. This feat is accomplished through a bidirectional transformer, enabling the model to assimilate the entirety of the input sequence during prediction. Unlike predecessors such as Word2Vec and GloVe, which assign each word a single context-independent vector, BERT considers the context of the entire sequence, thus discerning more nuanced meanings.

BERT's proficiency is further underscored by its pre-training regimen, wherein it is exposed to vast swaths of textual data sourced from large corpora such as English Wikipedia and the BookCorpus. Employing a technique known as masked language modelling during pre-training, BERT is tasked with predicting masked words within a given sentence, thereby acquiring a comprehensive understanding of contextual word meanings. This pre-training phase imbues BERT with a nuanced representation of word and phrase semantics, laying the groundwork for subsequent fine-tuning on specific NLP tasks.
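The masking objective can be illustrated with a small sketch. The function below corrupts a token sequence in the spirit of masked language modelling; real BERT pre-training additionally operates on WordPiece subwords and sometimes replaces masked positions with random or unchanged tokens, details omitted here for brevity.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=42):
    # Replace a random subset of tokens with [MASK]; return the corrupted
    # sequence plus the (position, original token) pairs the model must predict.
    rng = random.Random(seed)
    masked, targets = [], []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets.append((i, tok))
        else:
            masked.append(tok)
    return masked, targets

tokens = "the cat sat on the mat".split()
masked, targets = mask_tokens(tokens)
```

During pre-training the model sees only `masked` and is scored on how well it recovers the tokens recorded in `targets`.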

One of BERT's hallmark features is its facilitation of transfer learning, a paradigm where the
pre-trained model is adeptly tailored to suit distinct NLP tasks with minimal additional
training data. This characteristic proves particularly advantageous in scenarios characterized
by limited availability of labelled data, or where acquiring such data entails significant costs.
BERT's versatility in accommodating diverse task requirements while minimizing the need
for extensive training data underscores its practical utility and broad applicability across
various NLP domains.

The transformative impact of BERT transcends theoretical advancements, manifesting in its remarkable performance across a gamut of NLP applications. Demonstrating state-of-the-art proficiency in tasks ranging from sentiment analysis and text classification to question answering and named entity recognition, BERT emerges as a veritable cornerstone in contemporary NLP research and application. Its ability to grasp intricate linguistic nuances and adapt to diverse task requirements with unparalleled efficiency underscores its indispensability in modern language processing endeavours.

In essence, BERT stands as a testament to the power of innovative thinking and rigorous
research in advancing the frontiers of natural language understanding. With its robust
architecture, comprehensive pre-training regimen, and seamless adaptability to diverse task
domains, BERT catalyses transformative breakthroughs in NLP, paving the way for enhanced
language comprehension and more sophisticated language-based applications.

3.6 DETAILS OF HARDWARE & SOFTWARE

Hardware Details:

1. Processor: Intel Core i3 (10th generation)
2. RAM: 4 GB
3. Operating System: Linux, Windows, or macOS
4. Graphics: NVIDIA GeForce GTX 1650
5. Architecture: 64-bit

Software Details:

1. Google Colaboratory
2. Anaconda Navigator (Anaconda 3)
3. Visual Studio Code

Libraries Details:

1. Python 3.9
2. NumPy
3. Pandas
4. Keras
5. TensorFlow

3.7 DESIGN DETAILS

Python emerges as a versatile and indispensable tool in the development of CheckMyAnswer, offering multifaceted capabilities that span natural language processing (NLP), machine learning (ML), web development, and deployment. Leveraging Python's expansive library ecosystem, developers are equipped with a comprehensive toolkit to craft end-to-end solutions tailored to CheckMyAnswer's requirements.

Python's rich library ecosystem, encompassing stalwarts like NumPy, pandas, and nltk,
empowers developers to harness sophisticated NLP and ML capabilities. These libraries
facilitate the development of advanced algorithms adept at evaluating subjective answers
with precision and efficacy. By leveraging NumPy and pandas for data manipulation and
analysis, and nltk for natural language processing tasks such as tokenization and sentiment
analysis, Python provides a robust foundation for implementing intricate evaluation
methodologies within CheckMyAnswer.

Moreover, Python frameworks like Flask facilitate the creation of intuitive web interfaces,
enabling seamless submission and viewing of answers by users. The simplicity and flexibility
of Flask streamline the development process, allowing developers to focus on crafting an
engaging user experience while ensuring efficient backend functionality.

TensorFlow, an open-source ML library, emerges as a pivotal component in CheckMyAnswer's development arsenal. Equipped with the high-level Keras API, TensorFlow empowers developers to construct and train complex deep learning models tailored to NLP tasks. By leveraging TensorFlow's formidable computational capabilities, developers can train models on extensive datasets comprising answers and their corresponding scores or probabilities. These trained models serve as the bedrock for evaluating new answers submitted by users, facilitating the provision of accurate and consistent feedback.

Flask stands out as a prominent Python web framework ideally suited for crafting web applications like CheckMyAnswer. With its lightweight and adaptable architecture, Flask offers developers a streamlined platform for implementing the complex logic underlying answer-checking algorithms. By leveraging Flask, CheckMyAnswer can be seamlessly integrated into a web-based environment, where users conveniently submit their responses via web forms.
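A minimal sketch of how such a Flask endpoint might look is shown below. The `/submit` route and the placeholder evaluator are hypothetical, standing in for CheckMyAnswer's actual answer-checking logic; they illustrate only the request/response flow.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def evaluate_answer(text):
    # Hypothetical placeholder for the real evaluator, which would combine
    # similarity, grammar, and keyword scores; here, a toy length heuristic.
    return {"score": min(1.0, len(text.split()) / 50)}

@app.route("/submit", methods=["POST"])
def submit():
    # Receive a student's response from the web form and return its score.
    answer = request.form.get("answer", "")
    return jsonify(evaluate_answer(answer))
```

In deployment the app would be served with `app.run()` or a production WSGI server; the front end simply POSTs the form data to `/submit` and renders the returned JSON.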

The flexibility inherent in Flask empowers developers to concentrate on refining the intricacies of the answer-checking logic, unencumbered by cumbersome web development tasks. As users submit their responses through the web interface, Flask efficiently processes the inputs using either pre-trained models or custom algorithms. Subsequently, the application swiftly computes and delivers a score or probability indicative of the response's correctness.

Flask's adeptness in handling HTTP requests and responses endows CheckMyAnswer with
robust functionality accessible through standard web browsers. The framework's capability to
manage the flow of data between clients and servers ensures seamless interaction, facilitating
an intuitive user experience.

On the other hand, Hugging Face Transformers emerges as a pivotal component in the arsenal
of tools for building CheckMyAnswer. This open-source library furnishes pre-trained models
tailored for diverse natural language processing tasks, ranging from text classification to
question answering and language translation. Leveraging these pre-existing models,
CheckMyAnswer harnesses the power of Hugging Face Transformers to compare a student's
response against the wealth of knowledge encoded within the model.
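The comparison step can be illustrated independently of any particular model. The sketch below computes cosine similarity over simple bag-of-words count vectors; in the actual system these vectors would be replaced by embeddings obtained from a pre-trained transformer, but the comparison itself works the same way.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    # Cosine of the angle between two sparse word-count vectors.
    common = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in common)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def text_similarity(text1, text2):
    # Bag-of-words stand-in for comparing transformer embeddings.
    return cosine_similarity(Counter(text1.lower().split()),
                             Counter(text2.lower().split()))

model_answer = "A stack is a last in first out data structure"
student_answer = "A stack is a LIFO data structure"
score = text_similarity(model_answer, student_answer)
```

Identical texts score 1.0 and texts with no shared words score 0.0, so the value maps naturally onto a marks scale.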

ReactJS emerges as an optimal choice for constructing the front end of CheckMyAnswer, facilitating the development of a user-friendly interface tailored for submitting and reviewing answers. With ReactJS at the helm, developers can craft an aesthetically pleasing and highly responsive web application geared specifically towards subjective answer assessment. This front-end interface seamlessly integrates with back-end technologies like Flask or TensorFlow, culminating in a comprehensive solution for subjective answer evaluation.

ReactJS's prowess extends beyond its capacity to create visually appealing interfaces. Its component-based architecture fosters modularity and reusability, enabling developers to build complex user interfaces with ease. By breaking down the interface into distinct, self-contained components, ReactJS promotes code organization and maintainability, facilitating collaborative development efforts.

Additionally, ReactJS's extensive ecosystem comprises a plethora of libraries and tools that
augment its functionality and streamline development workflows. From state management
solutions like Redux to routing libraries such as React Router, the ReactJS ecosystem offers a
rich array of resources to address diverse development needs.

ReactJS thus emerges as a versatile and powerful framework for building the front end of CheckMyAnswer and other web applications. Its component-based architecture, virtual DOM optimization, one-way data binding, JSX syntax, and vibrant ecosystem collectively empower developers to create intuitive, efficient, and feature-rich user interfaces, thereby elevating the user experience and driving the success of subjective answer evaluation platforms like CheckMyAnswer.

In summary, for the design details, ReactJS is chosen for CheckMyAnswer's front-end,
facilitating a user-friendly interface. Its component-based architecture ensures modularity and
reusability, while the virtual DOM enhances performance. One-way data binding simplifies
data management, and JSX streamlines development. ReactJS's extensive ecosystem offers
additional tools like Redux and React Router for enhanced functionality. Back-end
integration with Flask or TensorFlow completes the end-to-end solution for subjective answer
checking, aligning with the project's objectives for efficiency and usability.

3.8 METHODOLOGY

Dataset:

The SNLI corpus [9] (version 1.0) is a collection of 570k human-written English sentence pairs manually labelled for balanced classification with the labels entailment, contradiction, and neutral, supporting the task of natural language inference (NLI), also known as recognizing textual entailment (RTE). It serves both as a benchmark for evaluating representational systems for text, including those induced by representation-learning methods, and as a resource for developing NLP models of any kind.

Model Building:

1. Tokenize the input data: Convert the raw text input into numerical inputs that can be
fed into the model. Hugging Face provides a tokenizer for each pre-trained model,
which converts the text input into numerical input sequences.
2. Choose a pre-trained model: Hugging Face offers a range of pre-trained models for
different NLP tasks such as question answering. Choose a pre-trained model that is
appropriate for the task.
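The subword tokenization these tokenizers perform can be sketched as a greedy longest-match split, in the spirit of BERT's WordPiece; the tiny vocabulary below is a toy assumption, not the real pre-trained vocabulary shipped with any Hugging Face model.

```python
def wordpiece_tokenize(word, vocab):
    # Greedy longest-match-first subword split, as used (in spirit) by
    # BERT's WordPiece tokenizer; continuation pieces carry a "##" prefix.
    tokens, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            cand = word[start:end]
            if start > 0:
                cand = "##" + cand
            if cand in vocab:
                piece = cand
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no matching subword in the vocabulary
        tokens.append(piece)
        start = end
    return tokens

vocab = {"token", "##ization", "##ize", "un", "##done"}
print(wordpiece_tokenize("tokenization", vocab))  # ['token', '##ization']
```

Each resulting piece is then mapped to an integer ID, producing the numerical input sequence the model consumes.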

Model Training:

1. Fine-tune the model: Fine-tuning involves training the pre-trained model on a specific task using a given dataset. Hugging Face provides a range of utilities for fine-tuning pre-trained models, including trainers, optimizers, and schedulers.
2. Evaluate the model: Evaluate the performance of the model on a validation set to
determine how well it generalizes to new data. Hugging Face provides tools for
evaluating the model's performance on a range of metrics, such as accuracy.
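The evaluation metric named above is straightforward to state. The sketch below computes plain accuracy over SNLI-style three-way labels; the predictions shown are invented for illustration, not outputs of the actual fine-tuned model.

```python
def accuracy(predictions, labels):
    # Fraction of validation examples the fine-tuned model classifies correctly.
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# SNLI-style three-way labels: entailment / contradiction / neutral.
preds  = ["entailment", "neutral", "contradiction", "entailment"]
labels = ["entailment", "neutral", "neutral", "entailment"]
print(accuracy(preds, labels))  # 0.75
```

In practice the same computation runs over the full validation split, and the score guides the choice of fine-tuning hyperparameters.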

Model Deploying:

Once the model has been trained and evaluated, it can be deployed with ReactJS on the front end and Flask on the back end. Hugging Face also provides tools for deploying the model as a web application.

CHAPTER 4

RESULTS

4.1 RESULTS

The end-to-end model is implemented using Python, with Flask, HTML, and CSS on the front end. For an introductory computer science course, the assessment model was tested with more than 20 questions and responses from 14 students. Each student response is compared with the expected answer to determine a similarity score, and the keywords, grammar, and semantics of the response are checked to ensure that it is accurate. The evaluator module determines the best approach for similarity checking, or the approach can be selected manually. The total score is the sum of the similarity, grammar/language, and keyword scores.
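The score composition described above can be sketched as follows. The report describes the total as a sum of the component scores; the weighted combination and the specific weights below are illustrative assumptions, not the system's actual calibration.

```python
def total_score(similarity, grammar, keyword, weights=(0.6, 0.2, 0.2)):
    # Weighted combination of the three component scores, each in [0, 1].
    # The weights are hypothetical; a plain unweighted sum is the limiting case.
    ws, wg, wk = weights
    return ws * similarity + wg * grammar + wk * keyword

# Scale the combined score to a 10-mark question.
marks = 10 * total_score(similarity=0.85, grammar=0.9, keyword=0.7)
```

With these example inputs the combined score is 0.83, i.e. 8.3 marks out of 10, which can then be compared against the teacher-awarded marks as in the figures below.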

Fig: A comparison of the final score obtained by the students using the proposed model and marks awarded by
the teacher.

Fig: A comparison of the marks obtained by the students using various similarity methods.

CHAPTER 5

CONCLUSION AND FUTURE SCOPE

5.1 CONCLUSION
Our research endeavours culminated in the successful development and implementation of
the CheckMyAnswer website, which serves as a dedicated platform for evaluating subjective
answers. In today's educational landscape, where traditional assessments are transitioning to
online formats, there exists a pressing need for systems capable of automating the grading of
subjective responses. While online multiple-choice questions are efficiently graded by testing
machines, the assessment of end-semester exams, predominantly subjective in nature, poses a
significant challenge. Our system addresses this gap by offering a comprehensive platform
that automates the grading process, thereby alleviating the time and effort associated with
manual evaluation.

By leveraging parameters such as similarity index and word matching, our system accurately
scores student responses against model solutions provided by instructors. This comparison
enables the system to assign grades objectively, ensuring fair and consistent evaluation across
all submissions. The seamless integration of these evaluation methodologies within
educational institutes' workflows enhances efficiency and streamlines the grading process.
Moreover, our system's versatility extends beyond traditional academic settings, finding
application in various online evaluation platforms and college portals.

The implementation of CheckMyAnswer represents a significant leap forward in addressing the challenges associated with subjective answer evaluation. Through its innovative approach and robust functionality, our system empowers educational institutions to adapt to the evolving demands of modern assessment practices. By embracing automation and leveraging advanced evaluation techniques, CheckMyAnswer not only enhances the grading process but also fosters a learning environment conducive to academic excellence.

5.2 FUTURE SCOPE
Our future endeavours encompass enhancing the capability of our system to evaluate
subjective answers through text extraction from images, accommodating responses enriched
with diagrams and mathematical expressions. Additionally, while the current system focuses
solely on assessing answers in English, we aim to extend its functionality to evaluate
responses in other languages, thereby promoting inclusivity and accessibility. Furthermore,
we plan to introduce advanced analytical features that enable examiners to gain deeper
insights into the evaluation process. This includes generating comprehensive analyses of
scores, allowing examiners to identify trends and patterns in student performance. Moreover,
our vision involves implementing machine learning algorithms to refine the system's
evaluation methodologies continuously, ensuring adaptability to evolving assessment criteria.
By embracing these advancements, we are committed to providing a versatile and robust
solution that empowers educators and fosters academic excellence across diverse linguistic
and educational contexts.

CHAPTER 6

REFERENCES

REFERENCES

[1] Vaishali Sharma, Khyati Ahlawat, and Shorya Sharma. Practical Approaches: User
Stories Domain Prediction and Keyword Extraction for Insights.

[2] Shinde, A., Nirbhavane, N., Mahajan, S., Katkar, V., & Chaudhary, S. (2018). Ai Answer
Verifier.

[3] Jagadamba G., & Chaya Shree G. (2020). Online Subjective answer verifying system
Using Artificial Intelligence.

[4] Bashir, M. F., Arshad, H., Javed, A. R., Kryvinska, N., & Band, S. S. (2021). Subjective
Answers Evaluation Using Machine Learning and Natural Language Processing.

[5] Johri, E., Chandak, P., Dedhia, N., Adhikari, H., & Bohra, K. (2021). ASSESS –
Automated subjective answer evaluation using Semantic Learning.

[6] Singh, S., Rote, U., Manchekar, O., Jagtap, S., Patwardha, A., & Chavan, H. (2021). Tool
for Evaluating Subjective Answers using AI (TESA).

[7] Shrestha, R., Gupta, R., & Kumari, P. (2022). Automatic Answer Sheet Checker.

[8] Singh, A., et al. (2015). Automated Essay Grading Using Machine Learning.
