SlideShare a Scribd company logo
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2856
Jay Sharma,1 Harsh Hardel,1 Chirag Sahuji,1 and Rajesh Prasad1
1Department of Computer Science and Engineering, School Of Engineering,MIT Art, Design and
Technology University, Pune, Maharashtra – 412201
-------------------------------------------------------------***-------------------------------------------------------------
Abstract - The amount of data on the internet is increasing
day by day. So it becomes more difficult to retrieve important
points from large documents without reading the whole
document. Manually extracting key components from large
documents requires a lot of time and energy. Therefore an
automatic summarizer is must needed which will produce the
summary of a document on its own. In this, a text is given to
the computer and the computer using algorithms produces a
summary of the text as output. In this paper, we have discussed
our attempt to prepare an extraction-based automatic text
summarizer in which paragraphs of documents are split into
sentences and then these sentences are ranked based on some
features of summarization where higher rank sentences are
found to be important which is used to generate the summary.
1. INTRODUCTION
The World of text is vast and is evolving constantly. Most
information available on the internet is still in a text format.
We humans as frequent users of the internet have to go
through large and many different documents to gain that
information which takes lots of our time as well as energy.
Due to this massive increase in text information, a summary
of this information is must needed which will extract the
important points from a large set of information or
documents. Therefore, a process of producing a concise
summary while maintaining important information and the
overall meaning of the text is called Text Summarization.
When it comes to summarizing text, humans understand
the context of documents and create a summary that
conveys the same meaning as the original text but when it
comes to automation this task becomes way too complex.
Text Summarization is one of the use cases of Natural
Language Processing (NLP). Natural Language Processing is
the branch of Artificial Intelligence concerned with making
computers understand text and words in the same way as
human beings used to understand them. We can divide Text
summarization into two categories:-
1. Extractive Summarization: In this, the most
important parts of sentences are selected from the text
and a summary is generated from it.
2. Abstractive Summarization: In this method,
summary involves the words which are generally not
present in the actual text it means that it produces a
summary in a new way by selecting words on semantic
analysis just like how humans read articles and then write
the summary of it in their own words.
1.1 Contribution
Over the years, there has been a lot of discussion
regarding text summarization and how to deal with its
problems. Several papers have been published that discuss
various approaches to dealing with automating text
summarization. In this paper, we have extended this area in
novel ways and contributed to the area of text
summarization especially in the Ex- tractive approach
because computers still cannot understand every aspect of
the Abstractive approach of text summarization. Our study
can be summarized as follows:-
• We have presented the TextRank approach to deal
with extractive text summarization.
• Also, we have performed evaluation measures and
described them in detail.
• Further, we have made a comparison of our proposed
system with existing MS-Word approaches.
1.2 Organizational Structure
The outline of the rest of the paper is as follows: In section
II we have presented a literature review of text
summarization. Section III describes our pro- posed
idea/system of extractive text summarization. Section IV
describes the working algorithm we have used in our
proposed system. In section V, the result of our proposed
system is discussed which involves evaluation as well as
performance measures of our system. Section VI includes the
future scope of our system and concludes our study.
Automatic Text Summarization
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2857
2. LITERATURE SURVEY
There has been a lot of development in the field of Text
Summarizing in the past 9 years. With every advancement
comes a new technique with a more optimized approach.
Some of these techniques are mentioned below
2.1 Abstractive Text summarization Approaches
with Analysis of Evaluation Techniques [1]
The aim is to summarize large documents to extract the
key components from the text which makes it very
convenient to understand the context and main points of
the text. This paper was published in the year 2021 and the
author was Abdullah Khilji.
The approach described in the paper uses a baseline
machine learning model for summarizing long text. This
model takes raw text as input and gives a predicted
summary as output. They experiment with both abstractive
and extractive-based text summarization techniques. For
verifying the model they use BELU, ROGUE, and a textual
entailment method.
Dataset being used is Amazon fine foods reviews which
consist of half a million reviews collected over 10 years. The
methodology being used is Sequence to Sequence and
Generative Adversarial. However, the paper is missing
details of extractive summarization techniques.
2.2 A Comprehensive Survey on Text
Summarization Systems [2]
This paper describes the classification of summarization
techniques and also the criteria that are important for the
system to generate the summary. This paper was published
in 2009 and the author was Saeedeh Gholamrezazadeh.
The extractive approach is described as reusing the
portions of the main text which are italics, bold phrases, the
first line of each paragraph, special names, etc. For
Abstractive approach involves rewriting the original text in
a short version by replacing large words with their short
alternatives.
Drawbacks of Extractive summarization include
inconsistencies, lack of balance, and lack of cohesion.
Examples of extractive summarization are Summ-It applet
and examples of abstractive summarization are
SUMMARIST.
2.3 Extractive Automatic Text Summarization
Based on Lexical-Semantic Keywords [3]
This paper focuses on Automatic Text Summarization to
get the condensed version of the text. The method mentioned
also takes into account the title of the paper, the words
mentioned in it, and how they are relevant to the document.
However, the summary might not include all the topics
mentioned in the input text. This paper was published in the
year 2020 by A´ngel Hern´andez-Castan˜eda.
The proposed approach first passes the text to feature
generation methods like D2V, LDA, etc. to generate vectors
for each sentence. Then it undergoes clustering which
measures the proximity among different vectors generated
in the previous step. Then LDA is used to get the main
sentences above the rest of all sentences which are then used
to generate the summary.
This paper explains the whole process in a very precise
manner however there is less information on the methods
being used in some steps of the proposed flow. No info on
term frequency and inverse term frequency is present in the
paper.
2.4 Abstractive Text Summarization based on
Improved Semantic Graph Approach [4]
This paper talks about the graph-based method for
summarizing text. It also tells about how to graph-based
method can be used to implement abstractive as well as
extractive text summarization. This paper was published in
the year 2018 and is authored by Atif Khan.
According to almost all graph-based methods text is
considered as a bag of words and for summarization, it uses
content similarity measure but it might fail to detect
semantically equivalent redundant sentences. The proposed
system has two parts one for making the semantic graph and
the other part for improving the ranking algorithm based on
the weighted graph. Finally, after both parts are executed
successfully, an abstractive summary of the text is generated.
The paper has good information on the graph based
ranking algorithm and improved versions of it. However,
there is no mention of the text rank algorithm. Also, a
comparison between the graph-based ranking algorithm and
text rank ranking algorithm is absent.
2.5 Text Summarizer Using Abstractive and
Extractive Method [5]
In this paper the main motivation is to make computers
understand the documents with any extension and how to
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2858
make it generate the summary. In this method the system
uses a combination of both statistical and linguistic analysis.
This paper was published in the year 2014 and is authored
by Ms. Anusha Pai.
The system introduced in the paper takes text input from
user. For summarizing this input it firstly separates the
phases, then removes the stop words from the input. After
this it performs Statistical and Linguistic analysis to
generate the summary. This output is then sent and stored
in the database.
The system proposed generates summary according to
the input given by the user. This can further be improved by
adding synonyms resolution to the model which will treat
synonyms words as same. Also multiple documents
summarization support can be added.
2.6 Evolutionary Algorithm for Extractive Text
Summarization [6]
In this paper a new possibility is introduced, abstractive
text summarization might compose of novel sentences,
which are not present in the original document. The method
introduced uses an unsupervised document summarization
method which uses clustering and extracting to generate a
summary of the text in the main document. This paper was
published in 2009 by Rasim Algulie This method uses
Sentence clustering to classify sentences based on similarity
measures which classify the sentences in clusters. After this
objective function is used to calculate their importance.
Along with this Modified Discrete Differential Evolution
Algorithm is also mentioned in the paper.
No information is given about the Graph-based approach
and also no comparison between the graph based approach
and objective function is given. Also, the result is less
optimized than the approaches we saw in other papers.
2.7 Automatic Keyword Extraction for Text
Summarization: A Survey [7]
This paper talks about current present approaches for
summarization. It also talks about Extractive and
Abstractive text summarization. This paper was published
in the year 2017 and is authored by San- tosh Kumar Bharti.
This paper has divided Automatic Text Summarization
into 4 types Simple Statistics, Linguistic, Machine Learning,
and Hybrid. These are techniques in which we can
implement Automatic Text Summarization. In Simple
Statistics, we have Inverse Document Frequency, Relative
Frequency Ratio, Term Frequency, etc. In the Linguistic
Approach, we have Electronic Dictionary, Tree Tagger, n-
Grams, WordNet, etc. In Machine Learning we have SVM,
Bagging, HMM, etc. Hybrid is the combination of the previous
three mentioned.
This paper contains all the techniques present to the date
however since 2017 there has been a lot of improvement in
this field. This paper needs to be updated with the latest
technologies.
2.8 Automatic Keyword Extraction for Text
Summarization in Multi-document e-
Newspapers Articles [8]
This paper was published in the year 2017 and is authored
by Santosh Kumar Bharti. It takes about implementing
Extractive Summarization in everyday life for summarizing
e-Newspaper articles.
This paper shows the comparison between TF- IDF, TF-
AIDF, NFA, and Proposed methods. It uses F-measure as a
parameter to measure the efficiencies of the techniques. It
can be seen that the proposed method is more efficient than
the other three methods.
The paper only mentioned model implementation in
Newspaper articles and not in any other type of articles. Also,
only Extractive Summarization is used for this model. This
model should be tested on different articles and also using
different methodologies.
2.9 Graph-based keyword extraction for single-
document summarization [9]
This paper focuses on the approach used for selecting
keywords from the text input given by the user. The
comparison is done between the supervised and
unsupervised approaches to identify keywords. This paper
was published in the year 2008 by the writer Marina Litvak.
This approach takes into account some structured
document features using the graph-based syntactic
representation of the text and web documents which
improves the traditional vector-space model. For the
supervised approach, a summarized collection of documents
is used to train the classification algorithm which induces a
keyword identification model. Similarly, for the
unsupervised approach, the HITS algorithm is used on the
document graphs. This is done under the assumption that
the top-ranked nodes are representing the document
keywords.
The supervised Classification model provides the best
keyword identification accuracy. But a simple degree-based
ranking reaches the highest F-measure. Also, the only first
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2859
iteration of HITS is enough instead of running it till we get
convergence,
2.10 Extractive approach for text summarization
using graphs [10]
This paper implements the Extractive approach but uses
a different approach. It uses two matrices for sentence
overlap and edits distance to measure sentence similarity.
This paper was published in the year 2021 by the author
Kastriot Kadriu.
The proposed model takes a document as input. The
document undergoes tokenization after which
lemmatization is performed and is checked for de-
pendency parsing. Then the model checks for sentence
overlap and edits distance if necessary. After this graph
representation is done. Now we can apply different
algorithms to generate a summary of the text.
In the paper, the author has taken into account different
methods for generating the summary. This includes
Pagerank, hits, closeness, betweenness, degree, and clusters.
Finally, the summary is generated. The paper also shows a
comparison between different methods and how accurate
their summary is. We use F-score to measure their
effectiveness.
2.11 Implementation and Evaluation of
Evolutionary Connectionist Approaches to
Automated Text Summarization [11]
This paper implemented and compared the performance
of three text summarizations developed using existing
summarization systems to achieve connectionism. This
paper was published in 2010 by the authors Rajesh
Shardanand Prasad and Uday Kulkarni.
Three approaches used in the paper are based on
semantic nets, fuzzy logic, and evolutionary programming
respectively. The results they got were that the first
approach performs better than MS Word, the second
approach was resulting in an efficient system and the third
approach showed the most promising results in the case of
precision and F- measure. The paper has used the DUC 2002
dataset to evaluate summarized results based on precision
and F-measure.
Approaches used in this paper focus only on small details
related to general summarization rather than developing an
entire summarization system and thus are only helpful for
research purposes.
2.12 Feature Based Text Summarization [12]
This paper aims at creating a feature-based text
summarizer that is applied to different sizes of documents.
This paper was published in the year 2012 by author Dr.
Rajesh Prasad.
This paper follows the extractive way of summarizing the
text and utilizes the combination of nine features to calculate
the feature scores of each sentence and rank them according
to the scores they get. The higher rank sentences are part of
the final summary of the text. They have used different types
of documents that require different features to get a
summary as a data set for their model. This approach gave
better results when compared to MS word in terms of
precision, Recall and F-measure in most types of documents.
This paper shows great results for different types of
documents but some documents may require features more
than this paper has used so further research is needed in this
approach.
2.13 Review of Proposed Architectures for
Automated Text Summarization [13]
This paper aims to review various architectures which
have been proposed for automatic text summarization. This
paper was published in the year 2013 by its authors Tejas
Yedke, Vishal Jain, and Dr.Rajesh Prasad.
In this paper, different techniques for text summarization
are discussed and their advantages and drawbacks are also
reviewed. DUC 2002 data set is used to calculate results that
which approach per- forms better for which type of
document in terms of precision, recall and F-measure. The
limitation of this paper is that it doesn’t provide the most
effective technique for summarization.
2.14 Automatic Extractive Text
Summarizer(AETS): Using Genetic Algorithm
[14]
This paper aims at developing an extractive text
summarizer using the Genetic algorithm. This paper was
published in the year 2017 by its authors Alok Rai,
Yashashree Patil, Pooja Sulakhe, Gaurav Lal, and Dr.Rajesh
Prasad.
The approach this paper follows involves feature
extraction, fuzzy logic, and a genetic algorithm to train the
machine to produce better results in automatic
summarization. This paper defines Genetic algorithms as the
search strategies that cop with the population of
simultaneously seeking positions. Ac- cording to the input
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2860
file and compression rate given by the user, the authors
resulted in forming a meaningful summary as output text.
According to the paper genetic algorithm is a sentence-
choice-based technique and gives the best results when text
summarization is done.
2.15 A Novel Evolutionary Connectionist Text
Summarizer (ECTS) [15]
This paper aims to create an efficient tool that can
summarize large documents easily using the evolving
connectionist approach. This paper was published in the
year 2009 by Rajesh S.Prasad, Dr. U.V Kulkarni, and
Jayashree R.Prasad.
In this paper, a novel approach is proposed for part of
speech disambiguation using a recurrent neural network,
which deals with sequential data. Fifty random different
articles were used as data sets for this paper. Authors found
the accuracy of ECTS was ranging from 95 to 100 percent
which average accuracy of 94 percent when compared to
other summarizers.
Though this paper has explained the connectionist
approach very well there lies some issues in POS
disambiguation and deviations found in ECTS for a couple of
sentences.
2.16 Abstractive method of text summarization
with sequence to sequence RNNs [16]
This paper was published in the year 2019 and is
authored by Abu Kaisar Mohammad Masum. It takes
about how bi-directional RNN and LSTM canbe used in
the encoding layer along with the attention model in the
decoding layer for performing abstractive text
summarization on the amazon fine food review dataset.
The focus of this paper is Abstractive Summarization
by the use of sequence to sequence RNNs. It performs
Data Processing in which we Split Text, Add
Contractions, Remove stop words and perform
Lemmatization. After verifying if the text is purified,
we count the size of the vocabulary and add a word
embedding file.
The next step is the addition of special tokens which
mark important waypoints in the dataset. After this, an
Encoder layer and Decoder Layer are present in LSTM
and then the Sequence to Sequencemodel is built.
The model is then trained with data for generating the
response summary. Even though the above model
performs well for short text it suffers when long text
input is given. Another drawback is thatit is currently
trained for the English language but no such
summarizer is available for other languages.
2.17 Automatic Text Summarization Using Local
Scoring and Ranking [17]
This paper was published in the year 2017 and is
authored by Diksha Kumar. Instead of building a text
summarizer, this paper focuses on improvingthe currently
available Automatic Text Summarizer to achieve more
coherent and meaningful summaries.The model introduced
in the paper uses an automatic feature-based extractive
text summarizer to better understand the document and
improve its coherence. The summary of the given input is
generated based on local scoring and local ranking. Wecan
select the top n sentences in the ranking for thesummary.
Here n depends on the compression ratio of the
summarizer. Feature Extraction can be done based on
differentcriteria. It can be done based on the frequency of
a word appearing in a sentence by selecting the words
occurring the most, based on the length of the sentence by
avoiding too short or too long sentences, based on the
position of the sentence by giving a highscore to the first
sentence and less to the second sentence and so on, based
on sentences overlapping thetitle or heading which can be
considered important and based on similarity of a
sentence concerning all other sentences in the document.
2.18 A Genetic Fuzzy Automatic Text Summarizer
[18]
This paper was published in the year 2009 and is
authored by Daniel Leite. This paper focuses on the fuzzy-
based ranking system to select the sentences for
performing extractive summarization on the input data
set.
The fuzzy knowledge base used in this model was
generated by a genetic algorithm. For fitness function,
ROGUE in formativeness measures were adopted and a
corpus of newswire text is being used along with their
human-generated summaries for defining the fuzzy
classification rules.
The paper also talks about SuPor-2 features which use a
Na¨ıve-Bayes probabilistic classifier to find the relevance
of a sentence to be extracted from the dataset to be in the
generated summary. It has 11 features that address either
the surface or the linguistic factors that interact with one
another to find the relevance of a sentence. For future
scope, the paper talks about using ideal mutation and
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2861
crossing rates in the evolution phase for the genetic
algorithm. Also, membership functions can be explored
for modeling fuzzy sets.
2.19 Enhancing Performance of Deep Learning
Based Text Summarizer [19]
This paper was published in the year 2017 and is
authored by Maya John. This paper aims to enhance the
performance of the currently present deep learning
model used for text summarization.
In current deep learning models, the summary sentences
form the minority class which is very small when
compared to the majority class which leads to inaccuracy in
a summary generation. To enhance the performance, data
can be resampled before giving it to the deep learning
model.
The proposed system is divided into steps for
simplicity. These are text preprocessing, feature
extraction, resampling, and classification. Inside text
preprocessing we have tokenization, stop word removal,
stemming, and lemmatization. Along with this several
resampling, mechanisms are discussed in the paper
which can be used on the data set for improving the
classifier performance.
2.20 Cut and Paste Based Text Summarization
[20]
This paper was published in the year 2000 and is
authored by Hongyan Jing. This paper’s text
summarization model is based on the examination of
human written abstracts for a specific text.
The model extracts text from the input for generating a
summary and removes the inessential phrases from the
text. The phrases are given as output and then joined
together to form coherency. This is done based on a
statistically based sentence decomposition program
which finds where the phrases of a summary begin in the
original text input. This produces an aligned corpus of
the summary along with the articles used to make the
summary.
The model uses a Corpus of human-written abstracts
for analyzing the input. It also uses WordNet for Sentence
reduction and Sentence combination. Even though it is
very accurate, when we test this model with current
models it is very simple and old compared to the ones that
are being used currently. A lot of advancement has been
done in this field since the time this paper was published.
3. PROPOSED SYSTEM
3.1 Preprocessing
Preprocessing involves the taking text as input from the
user in different forms like Wikipedia or other links,
simple text box and upload text document, etc. and stop
words from the sentences are removed.
3.2 Graph Building
These sentences are represented as nodes with all their
properties and edges representing interest (similarity)
between two sentences.
3.2 Sentence Ranking Algorithm
The concept of Sentence Ranking tells that a document
ranks high in terms of ranking, if high ranking documents
are linked to it. Here we have used a Textrank algorithm
inspired by the Pagerank algorithm which extracts all
sentences from input text, create vectors for all sentences,
then the similarity between sentence pairs is calculated
and finally, sentences are ranked based on score.
3.3 Summarization
Based on the Similarity Score obtained from the
similarity matrix the top N sentences are to be included in
the final summary and thus Summary is generated from
the original text as the output of the system.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2862
FIG. 1. Proposed System Architecture or Flow Diagram
4. ALGORITHM
4.1 Preprocessing
The algorithm that we have used here is called as Text
Rank Algorithm which is inspired by Google’s Page rank
algorithm and is used for ranking web pages. Consider a
web page A which has a link to a web page Z. A PageRank
score is calculated based on the probability of users
visiting that page to rank these pages. A Matrix is created
to store the probabilities of users navigating from one
page to another is called Similarity Matrix. The
probability of going from page A to Z is M[A][Z]
initialized as 1/(number of unique links on the website).
Suppose A which contains links to 2 pages, so the
contribution of A to PageRank of Z will be PageRank
(A)/2. If there is no link between web pages then
probability should be initialized as 0. To solve this issue a
constant d called as damping constant is added. So final
equation will be:
PageRank(Z)=(1-d)+d*(PageRank(A)/2)
Text Rank has similar logic as PageRank but there are
some changes like in-place of web pages Text sentences
are taken and a similarity matrix is calculated based on
similarity values between two sentences using maximum
common words. So accordingto the algorithm to create a
graph of sentence ranking a vertex of each sentence is
created and added to the graph. Further, the similarity
between the two sentences is calculated based on the
common wordtoken present in the two sentences. In the
graph, an edge between two vertices or sentences
denotesthe similar content or interest among two sentences.
Long sentences are avoided to be recommended in summary
by multiplying with a normalizing factor.
FIG. 2. Similarity graph drawn based on similarity
matrix for sentence similarity
The similarity between the two sentences is givenby:
where given two sentences Si and Sj, with a sentence as
set N-words appear in that sentence. Let’s take an
example to illustrate the working of algorithm:
Let there be three sentences as follows:
A=’ He is a tall guy ’
B=’ He has a lot of friends ’
C=’ Jay is his close friend ’
Initialize TextRank(A), TextRank(B), TextRank(C) as 1
and take d as 0.85.
Figure 3 shows how the similarity matrix is created for
the above sentences.
TextRank(A)
=(1-0.85) + 0.85 * (TextRank(B) * M[B,A] +
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2863
TextRank(C) * M[A,C])
= 0.15 + 0.85*(1.34*0.5+1*0.2) = 0.889
TextRank(C)
= (1-0.85) + 0.85 * (TextRank(A) * M[C,A] +
TextRank(B) * M[C,B])
FIG. 3. Similarity Matrix where values shows the
similarity between two sentences
=0.15 + 0.85*(0.889*0.2+1.34*0.9) = 1.326
The above process is continue for every sentence for n
iterations. Therefore all the sentences are arranged
according to their text ranks and the most important
sentences are added to a summary.
Graphical heat map representation of similarity matrix is
shown in FIG. 4
FIG. 4. Graphical representation of Similarity Matrix
using a Heat map
4. RESULT AND ANALYSIS
Applying the proposed idea and algorithm we have
implemented an automatic text summarization system
that takes text as input from the user and gives him
precise summary. The experimental results and analysis
of our proposed system is been discussed in this section.
The proposed summarization system is implemented in
Python using Flask and nltk library. For testing purposes,
we have used BBC News summary data set from Kaggle.
This data set contains the documents of news articles in
different categories and their extractive reference
summary respectively.
Evaluation Measures: Evaluation of our proposed
Text Summarization System is carried out to determine
the quality of the summary produced by the system. We
have evaluated our system using the ROUGE evaluation
measure. ROUGE stands for Recall-Oriented Understudy
for Gisting Evaluation which is a set of metrics to evaluate
the quality of the summary produced by the system.
ROUGE-N measures the number of matching n-grams
between our model generated summary and human
generated summary. ROUGE-1 measures for uni-gram.
Similarly ROUGE-2 measures for bi-grams. ROUGE-L
measures the longest common sub-sequence between our
model and referenced summary. We have calculated the
precision, recall, and f-measure of each ROUGE-1, ROUGE-
2 and ROUGE-L. Precision is the proportion of correctness
of sentences in the summary. For precision, higher the
value, better the system performs to generate summary.
Precision = Retrieved Sentences ∩ Relevant Sentences
Retrieved Sentences
The Recall is the ratio of the relevant sentence in the
summary.
Recall = Retrieved Sentences ∩ Relevant Sentences
Relevant Sentences
F-measure calculates the harmonic mean of precision
and recall.
Higher is the value, higher is the similarity between the
model generated and referenced summary.
Performance Measures: When it comes to the
performance of the summarizer it should give a concise
summary of text similar to a human-generated
summary. The performance of our system is evaluated
TextRank(C) * M[B,C])
=0.15 + 0.85*(1*0.5+1*0.9) = 1.340
TextRank(B)
= (1-0.85) + 0.85 * (TextRank(B) * M[A,B] +
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2864
on summary available in BBC news dataset using
evaluation measures described above. We havetaken 5
documents from the dataset of each category. Then we
generated a summary for each document using our
proposed summarizer. For experimentation, the
summary is generated for different percentages of the
summary that the user wants to generate, and this
generated summary is evaluated on the extractive
summary available in the dataset using evaluation
measures.
To check for the efficiency of our proposed system we
used it to generate summary of different documents and
then compared that with the summary generated by one
of the currently present system forgenerating summary.
To check the efficiency we have
FIG. 5. Comparison of precision, recall and f-measure of 5
documents when summary generated is 15% of the total
document length
FIG. 6. Comparison of precision, recall and f-measure of 5
documents when summary generated is 30% of the total
document length
3 evaluations. First, we compare our proposed system
generated summary with the original summary of the
input. We can find the original summary of the document
in the database along with the input document. By this we
can check the accuracy of our model. Secondly, we
compare our proposed system generated summary with
the summary generated for the same document by some
other currently present summarization model. By this we
can check the performance of our model with respect to
the currently present model. Lastly, we check our system
generated summary with the summary made by a human.
This is one of the most important evaluation as it gives us
the exact idea of the summary that is required by a user
and how much of it is present in the summary generated
by our model. We have tested our model for 2 different
lengths of generated summary: 15% and 30% length of
the input document. When the summary for different
documents is evaluated at the rate of 15% of the summary
the computed evaluation measures we got can be seen in
FIG.5 and similarly for a rate of 30% of summary the
computed evaluation can be seen in FIG.6.
To test the summarizer we have summarized the
different documents from the dataset based on domains
of politics, entertainment, sports, technology, etc. This is
done to see how our proposed system reacts to
documents of different types, different lengths, etc. and
how the generated summary is affected by these.
Precision, recall, and F-measure for each document was
calculated to test the content understanding of our
proposed summarization system.
TABLE I. Comparison of precision values of proposed
system with respect to existing MS-word system
Methods Dataset R-1 R-2 R-L
TextRank
D1 0.9607 0.8833 0.9411
D2 0.9756 0.9313 0.9756
D3 0.9324 0.9052 0.9324
D4 0.9583 0.8852 0.9583
D5 0.9487 0.9189 0.9487
Average 0.9551 0.9047 0.9512
MS-Word
D1 0.9534 0.8333 0.9534
D2 0.909 0.8645 0.909
D3 0.9 0.8809 0.9
D4 0.7619 0.5925 0.7619
D5 0.7391 0.6144 0.7391
Average 0.8526 0.7571 0.8526
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2865
These evaluation measures are further compared with
MS WORD 2007 Summarizer systems to test how well
our proposed system works compared to the existing
summarization tool. The result shown in Table I shows
the comparison of average values of precision for
different ROUGE measures
i.e. ROUGE-1(R-1), ROUGE-2(R-2) and ROUGE- L (R-L)
of our proposed system and MS Word 2007
summarization system, similarly result shown in Table
II shows the average values of recall of our proposed
system and MS Word 2007 summarization system and
Table III result shows the average values of F-measure of
our proposed system and MS Word 2007
summarization system.
TABLE II. Comparison of recall values of proposed system
with respect to existing MS-word system
Methods Dataset R-1 R-2 R-L
TextRank
D1 0.4537 0.3812 0.4444
D2 0.5194 0.426 0.5194
D3 0.5036 0.4257 0.5036
D4 0.4646 0.3802 0.4646
D5 0.5401 0.4766 0.5401
Average 0.4963 0.4179 0.4944
MS-Word
D1 0.3796 0.2877 0.3796
D2 0.4545 0.3721 0.4545
D3 0.4598 0.3663 0.4598
D4 0.4848 0.338 0.4848
D5 0.3722 0.2383 0.3722
Average 0.4662 0.3736 0.4652
Methods Dataset R-1 R-2 R-L
TextRank
D1 0.6163 0.5326 0.6037
D2 0.6779 0.5846 0.6779
D3 0.654 0.5791 0.654
D4 0.6258 0.532 0.6258
D5 0.5401 0.4766 0.5401
Average 0.6228 0.541 0.6203
MS-Word
D1 0.543 0.4278 0.543
D2 0.606 0.5203 0.606
D3 0.6086 0.5174 0.6086
D4 0.5925 0.4304 0.5925
D5 0.3722 0.2383 0.3722
Average 0.5872 0.4891 0.5858
TABLE III. Comparison of f-measure values of proposed
system with respect to existing MS-word system
4. CONCLUSION AND FUTURE WORK
Graphs above show that the Precision, recall and, f-
measure values of our model for different length of
summary shows that the proposed system extracts a
good amount of retrieved sentences from the document
in the summary which shows the sign of good accuracy
of our proposed summarization system. Also for 30%
summary generated we can see that precision, recall and
f-measure values are better compared to when 15%
summary is generated.
Thus, our system performs better for longer length of
summary.
As shown in Table I it is clear that the average
precision values of our proposed system are better
compared to the values we get from the MS Word
summarizer for three documents and are close to equal
for two documents. Similarly in Table II and Table III
shows that our system also gives better recall and f-
measure average values when compared to MS-Word
2007 summarization system.
A Multilingual feature is also added to our
summarization model which can help generate summary
of different spoken languages text documents so that the
people around the world can also use our Text
summarization system.
28
66
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2866
It is concluded that the achieved results of our pro-
posed text summarization model are better in most of the
aspects compared to existing models and therefore it can
be a good start to further studies.
As extractive summarization is evolving day by day
with increase in its research and more algorithms coming
out, in the future we will try to get more accuracy in
generating summary using extractive approach which
would be more feature based i. e it will include more
features than just sentence ranking based on correlation
between them such as sentence length, word based
similarity, title features, etc. Also we would try to use
more advanced algorithms like neural networks for
generating summary.
The extractive approach is more straightforward
because copying big sections of text from the original
document ensures grammar and accuracy. Para- phrasing,
Generalization and, Assimilation are a part of abstractive
summarization. Even though abstractive summarization is
a more difficult process, thanks to recent improvements
in the deep learning field, there has been some progress.
In future it might be possible that the summary generated
using extractive approach can be more accurate and
meaningful compared to abstractive approach.
5. REFERENCES
[1] Abdullah Faiz Ur Rahman Khilji, Utkarsh Sinha, Pintu
Singh, Adnan Ali, and Partha Pakray. Abstractive text
summarization approaches with analysis of
evaluation techniques. In International Conference
on Computational Intelligence in Communications
and Business Analytics, pages 243–258. Springer,
2021.
[2] Saeedeh Gholamrezazadeh, Mohsen Amini Salehi,
and Bahareh Gholamzadeh. A comprehensive survey
on text summarization systems. In 2009 2nd
International Conference on Computer Science and
its Applications, pages 1–6. IEEE, 2009.
[3] A´ngel Hern´andez-Castan˜eda, Ren´e Arnulfo
Garc´ıa Hern´andez, Yulia Ledeneva, and
Christian Eduardo Mill´an-Hernandez. Extractive
automatic text summarization based on lexical-
semantic keywords. IEEE Access, 8:49896–49907,
2020.
[4] Atif Khan, Naomie Salim, Haleem Farman, Murad
Khan, Bilal Jan, Awais Ahmad, Imran Ahmed, and
Anand Paul. Abstractive text summarization based on
improved semantic graph approach. International
Journal of Parallel Programming, 46(5):992–1016,
2018.
[5] A Pai. Text summarizer using abstractive and
extractive method. International Journal of
Engineering Research & Technology,
3(5):22780181, 2014.
[6] Rasim Alguliev, Ramiz Aliguliyev, et al. Evolutionary
algorithm for extractive text summarization.
Intelligent Information Management, 1(02):128,
2009.
[7] Santosh Kumar Bharti and Korra Sathya Babu.
Automatic keyword extraction for text
summarization: A survey. arXiv preprint
arXiv:1704.03242, 2017.
[8] Santosh Kumar Bharti, Korra Sathya Babu, Anima
Pradhan, S Devi, TE Priya, E Orhorhoro, O
Orhorhoro, V Atumah, E Baruah, P Konwar, et al.
Automatic keyword extraction for text
summarization in multi-document e-newspapers
articles. European Journal of Advances in
Engineering and Technology, 4(6):410–427, 2017.
[9] Marina Litvak and Mark Last. Graph-based
keyword extraction for single-document
summarization. In Coling 2008: Proceedings of
the work- shop Multisource Multilingual
Information Extraction and Summarization, pages
17–24, 2008.
[10] Kastriot Kadriu and Milenko Obradovic. Extractive
approach for text summarisation using graphs. arXiv
preprint arXiv:2106.10955, 2021.
[11] Rajesh Shardan and Uday Kulkarni. Implementation
and evaluation of evolutionary connectionist
approaches to automated text summarization. 2010.
[12] Rajesh Shardanand Prasad, Nitish Milind Uplavikar,
Sanket Shantilalsa Wakhare, VY Jain, Tejas Avinash,
et al. Feature based text summarization.
International journal of advances in computing and
information researches, 1, 2012.
[13] Tejas Yedke, Vishal Jain, and RS Prasad. Review of
proposed architectures for automated text
summarization. In Proceedings of International
Conference on Advances in Computing, pages 155–
161. Springer, 2013.
[14] Alok Rai, Yashashree Patil, Pooja Sulakhe, Gaurav
Lal, and Rajesh S Prasad. Automatic extractive text
summarizer (aets): Using genetic algorithm. no,
3:2824–2833, 2017.
28
67
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2867
[15] Rajesh S Prasad, UV Kulkarni, and Jayashree R
Prasad. A novel evolutionary connectionist text
summarizer (ects). In 2009 3rd International
Conference on Anti-counterfeiting, Security, and
Identification in Communication, pages 606–610.
IEEE, 2009.
[16] Abu Kaisar Mohammad Masum, Sheikh Abujar, Md
Ashraful Islam Talukder, AKM Shahariar Azad Rabby,
and Syed Akhter Hossain. Abstractive method of text
summarization with sequence to sequence rnns. In
2019 10th International Conference on Computing,
Communication and Networking Technologies
(ICCCNT), pages 1–5. IEEE, 2019.
[17] Hari Disle Sameer Gorule Ketan Gotaranr Diksha
Kumar, Sumeet Bhalekar. Automatic text
summarization using local scoring and ranking. In
2017 international conference on computing
methodologies and communication (ICCMC), pages
59–64. IJRTI, 2019.
[18] Daniel Leite and L Rino. A genetic fuzzy automatic
text summarizer. Csbc2009. Inf. Ufrgs. Br, 2007:779–
788, 2009.
[19] Maya John and JS Jayasudha. Enhancing performance
of deep learning based text summarizer. Int. J. Appl.
Eng. Res, 12(24):15986–15993, 2017.
[20] Hongyan Jing and Kathleen McKeown. Cut and paste
based text summarization. In 1st Meeting of the
North American Chapter of the Association for
Computational Linguistics, 2000.
Ad

More Related Content

Similar to Automatic Text Summarization (20)

Review of Topic Modeling and Summarization
Review of Topic Modeling and SummarizationReview of Topic Modeling and Summarization
Review of Topic Modeling and Summarization
IRJET Journal
 
EXTRACTIVE TEXT SUMMARISATION TECHNIQUES- A SURVEY
EXTRACTIVE TEXT SUMMARISATION TECHNIQUES- A SURVEYEXTRACTIVE TEXT SUMMARISATION TECHNIQUES- A SURVEY
EXTRACTIVE TEXT SUMMARISATION TECHNIQUES- A SURVEY
IRJET Journal
 
8 efficient multi-document summary generation using neural network
8 efficient multi-document summary generation using neural network8 efficient multi-document summary generation using neural network
8 efficient multi-document summary generation using neural network
INFOGAIN PUBLICATION
 
TextlyticResearchPapdaswer bgtyn ghner.pdf
TextlyticResearchPapdaswer bgtyn ghner.pdfTextlyticResearchPapdaswer bgtyn ghner.pdf
TextlyticResearchPapdaswer bgtyn ghner.pdf
araba8
 
MULTI-DOCUMENT SUMMARIZATION SYSTEM: USING FUZZY LOGIC AND GENETIC ALGORITHM
MULTI-DOCUMENT SUMMARIZATION SYSTEM: USING FUZZY LOGIC AND GENETIC ALGORITHM MULTI-DOCUMENT SUMMARIZATION SYSTEM: USING FUZZY LOGIC AND GENETIC ALGORITHM
MULTI-DOCUMENT SUMMARIZATION SYSTEM: USING FUZZY LOGIC AND GENETIC ALGORITHM
IAEME Publication
 
Comparative Study of Abstractive Text Summarization Techniques
Comparative Study of Abstractive Text Summarization TechniquesComparative Study of Abstractive Text Summarization Techniques
Comparative Study of Abstractive Text Summarization Techniques
IRJET Journal
 
IRJET- Semantic based Automatic Text Summarization based on Soft Computing
IRJET- Semantic based Automatic Text Summarization based on Soft ComputingIRJET- Semantic based Automatic Text Summarization based on Soft Computing
IRJET- Semantic based Automatic Text Summarization based on Soft Computing
IRJET Journal
 
Comparative Analysis of Text Summarization Techniques
Comparative Analysis of Text Summarization TechniquesComparative Analysis of Text Summarization Techniques
Comparative Analysis of Text Summarization Techniques
ugginaramesh
 
IRJET- Automatic Recapitulation of Text Document
IRJET- Automatic Recapitulation of Text DocumentIRJET- Automatic Recapitulation of Text Document
IRJET- Automatic Recapitulation of Text Document
IRJET Journal
 
Text Summarization Using the T5 Transformer Model
Text Summarization Using the T5 Transformer ModelText Summarization Using the T5 Transformer Model
Text Summarization Using the T5 Transformer Model
IRJET Journal
 
IRJET- Text Highlighting – A Machine Learning Approach
IRJET- Text Highlighting – A Machine Learning ApproachIRJET- Text Highlighting – A Machine Learning Approach
IRJET- Text Highlighting – A Machine Learning Approach
IRJET Journal
 
IRJET- Sewage Treatment Potential of Coir Geotextiles in Conjunction with Act...
IRJET- Sewage Treatment Potential of Coir Geotextiles in Conjunction with Act...IRJET- Sewage Treatment Potential of Coir Geotextiles in Conjunction with Act...
IRJET- Sewage Treatment Potential of Coir Geotextiles in Conjunction with Act...
IRJET Journal
 
Automatic Text Summarization Using Natural Language Processing (1)
Automatic Text Summarization Using Natural Language Processing (1)Automatic Text Summarization Using Natural Language Processing (1)
Automatic Text Summarization Using Natural Language Processing (1)
Don Dooley
 
IRJET - Automated System for Frequently Occurred Exam Questions
IRJET - Automated System for Frequently Occurred Exam QuestionsIRJET - Automated System for Frequently Occurred Exam Questions
IRJET - Automated System for Frequently Occurred Exam Questions
IRJET Journal
 
A Survey on Automatic Text Summarization
A Survey on Automatic Text SummarizationA Survey on Automatic Text Summarization
A Survey on Automatic Text Summarization
IRJET Journal
 
Automatic Summarization in Chinese Product Reviews
Automatic Summarization in Chinese Product ReviewsAutomatic Summarization in Chinese Product Reviews
Automatic Summarization in Chinese Product Reviews
TELKOMNIKA JOURNAL
 
IRJET- A Novel Approch Automatically Categorizing Software Technologies
IRJET- A Novel Approch Automatically Categorizing Software TechnologiesIRJET- A Novel Approch Automatically Categorizing Software Technologies
IRJET- A Novel Approch Automatically Categorizing Software Technologies
IRJET Journal
 
A hybrid approach for text summarization using semantic latent Dirichlet allo...
A hybrid approach for text summarization using semantic latent Dirichlet allo...A hybrid approach for text summarization using semantic latent Dirichlet allo...
A hybrid approach for text summarization using semantic latent Dirichlet allo...
IJECEIAES
 
Development of Information Extraction for Data Analysis using NLP
Development of Information Extraction for Data Analysis using NLPDevelopment of Information Extraction for Data Analysis using NLP
Development of Information Extraction for Data Analysis using NLP
IRJET Journal
 
IRJET- A Survey Paper on Text Summarization Methods
IRJET- A Survey Paper on Text Summarization MethodsIRJET- A Survey Paper on Text Summarization Methods
IRJET- A Survey Paper on Text Summarization Methods
IRJET Journal
 
Review of Topic Modeling and Summarization
Review of Topic Modeling and SummarizationReview of Topic Modeling and Summarization
Review of Topic Modeling and Summarization
IRJET Journal
 
EXTRACTIVE TEXT SUMMARISATION TECHNIQUES- A SURVEY
EXTRACTIVE TEXT SUMMARISATION TECHNIQUES- A SURVEYEXTRACTIVE TEXT SUMMARISATION TECHNIQUES- A SURVEY
EXTRACTIVE TEXT SUMMARISATION TECHNIQUES- A SURVEY
IRJET Journal
 
8 efficient multi-document summary generation using neural network
8 efficient multi-document summary generation using neural network8 efficient multi-document summary generation using neural network
8 efficient multi-document summary generation using neural network
INFOGAIN PUBLICATION
 
TextlyticResearchPapdaswer bgtyn ghner.pdf
TextlyticResearchPapdaswer bgtyn ghner.pdfTextlyticResearchPapdaswer bgtyn ghner.pdf
TextlyticResearchPapdaswer bgtyn ghner.pdf
araba8
 
MULTI-DOCUMENT SUMMARIZATION SYSTEM: USING FUZZY LOGIC AND GENETIC ALGORITHM
MULTI-DOCUMENT SUMMARIZATION SYSTEM: USING FUZZY LOGIC AND GENETIC ALGORITHM MULTI-DOCUMENT SUMMARIZATION SYSTEM: USING FUZZY LOGIC AND GENETIC ALGORITHM
MULTI-DOCUMENT SUMMARIZATION SYSTEM: USING FUZZY LOGIC AND GENETIC ALGORITHM
IAEME Publication
 
Comparative Study of Abstractive Text Summarization Techniques
Comparative Study of Abstractive Text Summarization TechniquesComparative Study of Abstractive Text Summarization Techniques
Comparative Study of Abstractive Text Summarization Techniques
IRJET Journal
 
IRJET- Semantic based Automatic Text Summarization based on Soft Computing
IRJET- Semantic based Automatic Text Summarization based on Soft ComputingIRJET- Semantic based Automatic Text Summarization based on Soft Computing
IRJET- Semantic based Automatic Text Summarization based on Soft Computing
IRJET Journal
 
Comparative Analysis of Text Summarization Techniques
Comparative Analysis of Text Summarization TechniquesComparative Analysis of Text Summarization Techniques
Comparative Analysis of Text Summarization Techniques
ugginaramesh
 
IRJET- Automatic Recapitulation of Text Document
IRJET- Automatic Recapitulation of Text DocumentIRJET- Automatic Recapitulation of Text Document
IRJET- Automatic Recapitulation of Text Document
IRJET Journal
 
Text Summarization Using the T5 Transformer Model
Text Summarization Using the T5 Transformer ModelText Summarization Using the T5 Transformer Model
Text Summarization Using the T5 Transformer Model
IRJET Journal
 
IRJET- Text Highlighting – A Machine Learning Approach
IRJET- Text Highlighting – A Machine Learning ApproachIRJET- Text Highlighting – A Machine Learning Approach
IRJET- Text Highlighting – A Machine Learning Approach
IRJET Journal
 
IRJET- Sewage Treatment Potential of Coir Geotextiles in Conjunction with Act...
IRJET- Sewage Treatment Potential of Coir Geotextiles in Conjunction with Act...IRJET- Sewage Treatment Potential of Coir Geotextiles in Conjunction with Act...
IRJET- Sewage Treatment Potential of Coir Geotextiles in Conjunction with Act...
IRJET Journal
 
Automatic Text Summarization Using Natural Language Processing (1)
Automatic Text Summarization Using Natural Language Processing (1)Automatic Text Summarization Using Natural Language Processing (1)
Automatic Text Summarization Using Natural Language Processing (1)
Don Dooley
 
IRJET - Automated System for Frequently Occurred Exam Questions
IRJET - Automated System for Frequently Occurred Exam QuestionsIRJET - Automated System for Frequently Occurred Exam Questions
IRJET - Automated System for Frequently Occurred Exam Questions
IRJET Journal
 
A Survey on Automatic Text Summarization
A Survey on Automatic Text SummarizationA Survey on Automatic Text Summarization
A Survey on Automatic Text Summarization
IRJET Journal
 
Automatic Summarization in Chinese Product Reviews
Automatic Summarization in Chinese Product ReviewsAutomatic Summarization in Chinese Product Reviews
Automatic Summarization in Chinese Product Reviews
TELKOMNIKA JOURNAL
 
IRJET- A Novel Approch Automatically Categorizing Software Technologies
IRJET- A Novel Approch Automatically Categorizing Software TechnologiesIRJET- A Novel Approch Automatically Categorizing Software Technologies
IRJET- A Novel Approch Automatically Categorizing Software Technologies
IRJET Journal
 
A hybrid approach for text summarization using semantic latent Dirichlet allo...
A hybrid approach for text summarization using semantic latent Dirichlet allo...A hybrid approach for text summarization using semantic latent Dirichlet allo...
A hybrid approach for text summarization using semantic latent Dirichlet allo...
IJECEIAES
 
Development of Information Extraction for Data Analysis using NLP
Development of Information Extraction for Data Analysis using NLPDevelopment of Information Extraction for Data Analysis using NLP
Development of Information Extraction for Data Analysis using NLP
IRJET Journal
 
IRJET- A Survey Paper on Text Summarization Methods
IRJET- A Survey Paper on Text Summarization MethodsIRJET- A Survey Paper on Text Summarization Methods
IRJET- A Survey Paper on Text Summarization Methods
IRJET Journal
 

More from IRJET Journal (20)

Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
IRJET Journal
 
BRAIN TUMOUR DETECTION AND CLASSIFICATION
BRAIN TUMOUR DETECTION AND CLASSIFICATIONBRAIN TUMOUR DETECTION AND CLASSIFICATION
BRAIN TUMOUR DETECTION AND CLASSIFICATION
IRJET Journal
 
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
IRJET Journal
 
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ..."Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
IRJET Journal
 
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
IRJET Journal
 
Breast Cancer Detection using Computer Vision
Breast Cancer Detection using Computer VisionBreast Cancer Detection using Computer Vision
Breast Cancer Detection using Computer Vision
IRJET Journal
 
Auto-Charging E-Vehicle with its battery Management.
Auto-Charging E-Vehicle with its battery Management.Auto-Charging E-Vehicle with its battery Management.
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
Analysis of high energy charge particle in the Heliosphere
Analysis of high energy charge particle in the HeliosphereAnalysis of high energy charge particle in the Heliosphere
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
A Novel System for Recommending Agricultural Crops Using Machine Learning App...A Novel System for Recommending Agricultural Crops Using Machine Learning App...
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
IRJET Journal
 
Auto-Charging E-Vehicle with its battery Management.
Auto-Charging E-Vehicle with its battery Management.Auto-Charging E-Vehicle with its battery Management.
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
Analysis of high energy charge particle in the Heliosphere
Analysis of high energy charge particle in the HeliosphereAnalysis of high energy charge particle in the Heliosphere
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
FIR filter-based Sample Rate Convertors and its use in NR PRACH
FIR filter-based Sample Rate Convertors and its use in NR PRACHFIR filter-based Sample Rate Convertors and its use in NR PRACH
FIR filter-based Sample Rate Convertors and its use in NR PRACH
IRJET Journal
 
Kiona – A Smart Society Automation Project
Kiona – A Smart Society Automation ProjectKiona – A Smart Society Automation Project
Kiona – A Smart Society Automation Project
IRJET Journal
 
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
IRJET Journal
 
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
IRJET Journal
 
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
Invest in Innovation: Empowering Ideas through Blockchain Based CrowdfundingInvest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
IRJET Journal
 
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
IRJET Journal
 
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUBSPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
IRJET Journal
 
AR Application: Homewise VisionMs. Vaishali Rane, Om Awadhoot, Bhargav Gajare...
AR Application: Homewise VisionMs. Vaishali Rane, Om Awadhoot, Bhargav Gajare...AR Application: Homewise VisionMs. Vaishali Rane, Om Awadhoot, Bhargav Gajare...
AR Application: Homewise VisionMs. Vaishali Rane, Om Awadhoot, Bhargav Gajare...
IRJET Journal
 
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
IRJET Journal
 
BRAIN TUMOUR DETECTION AND CLASSIFICATION
BRAIN TUMOUR DETECTION AND CLASSIFICATIONBRAIN TUMOUR DETECTION AND CLASSIFICATION
BRAIN TUMOUR DETECTION AND CLASSIFICATION
IRJET Journal
 
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
IRJET Journal
 
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ..."Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
IRJET Journal
 
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
IRJET Journal
 
Breast Cancer Detection using Computer Vision
Breast Cancer Detection using Computer VisionBreast Cancer Detection using Computer Vision
Breast Cancer Detection using Computer Vision
IRJET Journal
 
Auto-Charging E-Vehicle with its battery Management.
Auto-Charging E-Vehicle with its battery Management.Auto-Charging E-Vehicle with its battery Management.
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
Analysis of high energy charge particle in the Heliosphere
Analysis of high energy charge particle in the HeliosphereAnalysis of high energy charge particle in the Heliosphere
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
A Novel System for Recommending Agricultural Crops Using Machine Learning App...A Novel System for Recommending Agricultural Crops Using Machine Learning App...
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
IRJET Journal
 
Auto-Charging E-Vehicle with its battery Management.
Auto-Charging E-Vehicle with its battery Management.Auto-Charging E-Vehicle with its battery Management.
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
Analysis of high energy charge particle in the Heliosphere
Analysis of high energy charge particle in the HeliosphereAnalysis of high energy charge particle in the Heliosphere
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
FIR filter-based Sample Rate Convertors and its use in NR PRACH
FIR filter-based Sample Rate Convertors and its use in NR PRACHFIR filter-based Sample Rate Convertors and its use in NR PRACH
FIR filter-based Sample Rate Convertors and its use in NR PRACH
IRJET Journal
 
Kiona – A Smart Society Automation Project
Kiona – A Smart Society Automation ProjectKiona – A Smart Society Automation Project
Kiona – A Smart Society Automation Project
IRJET Journal
 
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
IRJET Journal
 
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
IRJET Journal
 
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
Invest in Innovation: Empowering Ideas through Blockchain Based CrowdfundingInvest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
IRJET Journal
 
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
IRJET Journal
 
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUBSPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
IRJET Journal
 
AR Application: Homewise VisionMs. Vaishali Rane, Om Awadhoot, Bhargav Gajare...
AR Application: Homewise VisionMs. Vaishali Rane, Om Awadhoot, Bhargav Gajare...AR Application: Homewise VisionMs. Vaishali Rane, Om Awadhoot, Bhargav Gajare...
AR Application: Homewise VisionMs. Vaishali Rane, Om Awadhoot, Bhargav Gajare...
IRJET Journal
 
Ad

Recently uploaded (20)

Reagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptxReagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptx
AlejandroOdio
 
Data Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptxData Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptx
RushaliDeshmukh2
 
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E..."Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
Infopitaara
 
Smart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineeringSmart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineering
rushikeshnavghare94
 
new ppt artificial intelligence historyyy
new ppt artificial intelligence historyyynew ppt artificial intelligence historyyy
new ppt artificial intelligence historyyy
PianoPianist
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptxExplainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
MahaveerVPandit
 
railway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forgingrailway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
charlesdick1345
 
introduction to machine learining for beginers
introduction to machine learining for beginersintroduction to machine learining for beginers
introduction to machine learining for beginers
JoydebSheet
 
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptxLidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
RishavKumar530754
 
fluke dealers in bangalore..............
fluke dealers in bangalore..............fluke dealers in bangalore..............
fluke dealers in bangalore..............
Haresh Vaswani
 
RICS Membership-(The Royal Institution of Chartered Surveyors).pdf
RICS Membership-(The Royal Institution of Chartered Surveyors).pdfRICS Membership-(The Royal Institution of Chartered Surveyors).pdf
RICS Membership-(The Royal Institution of Chartered Surveyors).pdf
MohamedAbdelkader115
 
15th International Conference on Computer Science, Engineering and Applicatio...
15th International Conference on Computer Science, Engineering and Applicatio...15th International Conference on Computer Science, Engineering and Applicatio...
15th International Conference on Computer Science, Engineering and Applicatio...
IJCSES Journal
 
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdfMAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
ssuser562df4
 
Artificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptxArtificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptx
aditichinar
 
DSP and MV the Color image processing.ppt
DSP and MV the  Color image processing.pptDSP and MV the  Color image processing.ppt
DSP and MV the Color image processing.ppt
HafizAhamed8
 
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Journal of Soft Computing in Civil Engineering
 
some basics electrical and electronics knowledge
some basics electrical and electronics knowledgesome basics electrical and electronics knowledge
some basics electrical and electronics knowledge
nguyentrungdo88
 
IntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdfIntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdf
Luiz Carneiro
 
Reagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptxReagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptx
AlejandroOdio
 
Data Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptxData Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptx
RushaliDeshmukh2
 
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E..."Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
Infopitaara
 
Smart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineeringSmart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineering
rushikeshnavghare94
 
new ppt artificial intelligence historyyy
new ppt artificial intelligence historyyynew ppt artificial intelligence historyyy
new ppt artificial intelligence historyyy
PianoPianist
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptxExplainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
MahaveerVPandit
 
railway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forgingrailway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
charlesdick1345
 
introduction to machine learining for beginers
introduction to machine learining for beginersintroduction to machine learining for beginers
introduction to machine learining for beginers
JoydebSheet
 
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptxLidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
RishavKumar530754
 
fluke dealers in bangalore..............
fluke dealers in bangalore..............fluke dealers in bangalore..............
fluke dealers in bangalore..............
Haresh Vaswani
 
RICS Membership-(The Royal Institution of Chartered Surveyors).pdf
RICS Membership-(The Royal Institution of Chartered Surveyors).pdfRICS Membership-(The Royal Institution of Chartered Surveyors).pdf
RICS Membership-(The Royal Institution of Chartered Surveyors).pdf
MohamedAbdelkader115
 
15th International Conference on Computer Science, Engineering and Applicatio...
15th International Conference on Computer Science, Engineering and Applicatio...15th International Conference on Computer Science, Engineering and Applicatio...
15th International Conference on Computer Science, Engineering and Applicatio...
IJCSES Journal
 
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdfMAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
ssuser562df4
 
Artificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptxArtificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptx
aditichinar
 
DSP and MV the Color image processing.ppt
DSP and MV the  Color image processing.pptDSP and MV the  Color image processing.ppt
DSP and MV the Color image processing.ppt
HafizAhamed8
 
some basics electrical and electronics knowledge
some basics electrical and electronics knowledgesome basics electrical and electronics knowledge
some basics electrical and electronics knowledge
nguyentrungdo88
 
IntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdfIntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdf
Luiz Carneiro
 
Ad

Automatic Text Summarization

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2856 Jay Sharma,1 Harsh Hardel,1 Chirag Sahuji,1 and Rajesh Prasad1 1Department of Computer Science and Engineering, School Of Engineering,MIT Art, Design and Technology University, Pune, Maharashtra – 412201 -------------------------------------------------------------***------------------------------------------------------------- Abstract - The amount of data on the internet is increasing day by day. So it becomes more difficult to retrieve important points from large documents without reading the whole document. Manually extracting key components from large documents requires a lot of time and energy. Therefore an automatic summarizer is must needed which will produce the summary of a document on its own. In this, a text is given to the computer and the computer using algorithms produces a summary of the text as output. In this paper, we have discussed our attempt to prepare an extraction-based automatic text summarizer in which paragraphs of documents are split into sentences and then these sentences are ranked based on some features of summarization where higher rank sentences are found to be important which is used to generate the summary. 1. INTRODUCTION The World of text is vast and is evolving constantly. Most information available on the internet is still in a text format. We humans as frequent users of the internet have to go through large and many different documents to gain that information which takes lots of our time as well as energy. Due to this massive increase in text information, a summary of this information is must needed which will extract the important points from a large set of information or documents. Therefore, a process of producing a concise summary while maintaining important information and the overall meaning of the text is called Text Summarization. When it comes to summarizing text, humans understand the context of documents and create a summary that conveys the same meaning as the original text but when it comes to automation this task becomes way too complex. Text Summarization is one of the use cases of Natural Language Processing (NLP). Natural Language Processing is the branch of Artificial Intelligence concerned with making computers understand text and words in the same way as human beings used to understand them. We can divide Text summarization into two categories:- 1. Extractive Summarization: In this, the most important parts of sentences are selected from the text and a summary is generated from it. 2. Abstractive Summarization: In this method, summary involves the words which are generally not present in the actual text it means that it produces a summary in a new way by selecting words on semantic analysis just like how humans read articles and then write the summary of it in their own words. 1.1 Contribution Over the years, there has been a lot of discussion regarding text summarization and how to deal with its problems. Several papers have been published that discuss various approaches to dealing with automating text summarization. In this paper, we have extended this area in novel ways and contributed to the area of text summarization especially in the Ex- tractive approach because computers still cannot understand every aspect of the Abstractive approach of text summarization. Our study can be summarized as follows:- • We have presented the TextRank approach to deal with extractive text summarization. • Also, we have performed evaluation measures and described them in detail. • Further, we have made a comparison of our proposed system with existing MS-Word approaches. 1.2 Organizational Structure The outline of the rest of the paper is as follows: In section II we have presented a literature review of text summarization. Section III describes our pro- posed idea/system of extractive text summarization. Section IV describes the working algorithm we have used in our proposed system. In section V, the result of our proposed system is discussed which involves evaluation as well as performance measures of our system. Section VI includes the future scope of our system and concludes our study. Automatic Text Summarization
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2857 2. LITERATURE SURVEY There has been a lot of development in the field of Text Summarizing in the past 9 years. With every advancement comes a new technique with a more optimized approach. Some of these techniques are mentioned below 2.1 Abstractive Text summarization Approaches with Analysis of Evaluation Techniques [1] The aim is to summarize large documents to extract the key components from the text which makes it very convenient to understand the context and main points of the text. This paper was published in the year 2021 and the author was Abdullah Khilji. The approach described in the paper uses a baseline machine learning model for summarizing long text. This model takes raw text as input and gives a predicted summary as output. They experiment with both abstractive and extractive-based text summarization techniques. For verifying the model they use BELU, ROGUE, and a textual entailment method. Dataset being used is Amazon fine foods reviews which consist of half a million reviews collected over 10 years. The methodology being used is Sequence to Sequence and Generative Adversarial. However, the paper is missing details of extractive summarization techniques. 2.2 A Comprehensive Survey on Text Summarization Systems [2] This paper describes the classification of summarization techniques and also the criteria that are important for the system to generate the summary. This paper was published in 2009 and the author was Saeedeh Gholamrezazadeh. The extractive approach is described as reusing the portions of the main text which are italics, bold phrases, the first line of each paragraph, special names, etc. For Abstractive approach involves rewriting the original text in a short version by replacing large words with their short alternatives. Drawbacks of Extractive summarization include inconsistencies, lack of balance, and lack of cohesion. Examples of extractive summarization are Summ-It applet and examples of abstractive summarization are SUMMARIST. 2.3 Extractive Automatic Text Summarization Based on Lexical-Semantic Keywords [3] This paper focuses on Automatic Text Summarization to get the condensed version of the text. The method mentioned also takes into account the title of the paper, the words mentioned in it, and how they are relevant to the document. However, the summary might not include all the topics mentioned in the input text. This paper was published in the year 2020 by A´ngel Hern´andez-Castan˜eda. The proposed approach first passes the text to feature generation methods like D2V, LDA, etc. to generate vectors for each sentence. Then it undergoes clustering which measures the proximity among different vectors generated in the previous step. Then LDA is used to get the main sentences above the rest of all sentences which are then used to generate the summary. This paper explains the whole process in a very precise manner however there is less information on the methods being used in some steps of the proposed flow. No info on term frequency and inverse term frequency is present in the paper. 2.4 Abstractive Text Summarization based on Improved Semantic Graph Approach [4] This paper talks about the graph-based method for summarizing text. It also tells about how to graph-based method can be used to implement abstractive as well as extractive text summarization. This paper was published in the year 2018 and is authored by Atif Khan. According to almost all graph-based methods text is considered as a bag of words and for summarization, it uses content similarity measure but it might fail to detect semantically equivalent redundant sentences. The proposed system has two parts one for making the semantic graph and the other part for improving the ranking algorithm based on the weighted graph. Finally, after both parts are executed successfully, an abstractive summary of the text is generated. The paper has good information on the graph based ranking algorithm and improved versions of it. However, there is no mention of the text rank algorithm. Also, a comparison between the graph-based ranking algorithm and text rank ranking algorithm is absent. 2.5 Text Summarizer Using Abstractive and Extractive Method [5] In this paper the main motivation is to make computers understand the documents with any extension and how to
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2858 make it generate the summary. In this method the system uses a combination of both statistical and linguistic analysis. This paper was published in the year 2014 and is authored by Ms. Anusha Pai. The system introduced in the paper takes text input from user. For summarizing this input it firstly separates the phases, then removes the stop words from the input. After this it performs Statistical and Linguistic analysis to generate the summary. This output is then sent and stored in the database. The system proposed generates summary according to the input given by the user. This can further be improved by adding synonyms resolution to the model which will treat synonyms words as same. Also multiple documents summarization support can be added. 2.6 Evolutionary Algorithm for Extractive Text Summarization [6] In this paper a new possibility is introduced, abstractive text summarization might compose of novel sentences, which are not present in the original document. The method introduced uses an unsupervised document summarization method which uses clustering and extracting to generate a summary of the text in the main document. This paper was published in 2009 by Rasim Algulie This method uses Sentence clustering to classify sentences based on similarity measures which classify the sentences in clusters. After this objective function is used to calculate their importance. Along with this Modified Discrete Differential Evolution Algorithm is also mentioned in the paper. No information is given about the Graph-based approach and also no comparison between the graph based approach and objective function is given. Also, the result is less optimized than the approaches we saw in other papers. 2.7 Automatic Keyword Extraction for Text Summarization: A Survey [7] This paper talks about current present approaches for summarization. It also talks about Extractive and Abstractive text summarization. This paper was published in the year 2017 and is authored by San- tosh Kumar Bharti. This paper has divided Automatic Text Summarization into 4 types Simple Statistics, Linguistic, Machine Learning, and Hybrid. These are techniques in which we can implement Automatic Text Summarization. In Simple Statistics, we have Inverse Document Frequency, Relative Frequency Ratio, Term Frequency, etc. In the Linguistic Approach, we have Electronic Dictionary, Tree Tagger, n- Grams, WordNet, etc. In Machine Learning we have SVM, Bagging, HMM, etc. Hybrid is the combination of the previous three mentioned. This paper contains all the techniques present to the date however since 2017 there has been a lot of improvement in this field. This paper needs to be updated with the latest technologies. 2.8 Automatic Keyword Extraction for Text Summarization in Multi-document e- Newspapers Articles [8] This paper was published in the year 2017 and is authored by Santosh Kumar Bharti. It takes about implementing Extractive Summarization in everyday life for summarizing e-Newspaper articles. This paper shows the comparison between TF- IDF, TF- AIDF, NFA, and Proposed methods. It uses F-measure as a parameter to measure the efficiencies of the techniques. It can be seen that the proposed method is more efficient than the other three methods. The paper only mentioned model implementation in Newspaper articles and not in any other type of articles. Also, only Extractive Summarization is used for this model. This model should be tested on different articles and also using different methodologies. 2.9 Graph-based keyword extraction for single- document summarization [9] This paper focuses on the approach used for selecting keywords from the text input given by the user. The comparison is done between the supervised and unsupervised approaches to identify keywords. This paper was published in the year 2008 by the writer Marina Litvak. This approach takes into account some structured document features using the graph-based syntactic representation of the text and web documents which improves the traditional vector-space model. For the supervised approach, a summarized collection of documents is used to train the classification algorithm which induces a keyword identification model. Similarly, for the unsupervised approach, the HITS algorithm is used on the document graphs. This is done under the assumption that the top-ranked nodes are representing the document keywords. The supervised Classification model provides the best keyword identification accuracy. But a simple degree-based ranking reaches the highest F-measure. Also, the only first
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2859 iteration of HITS is enough instead of running it till we get convergence, 2.10 Extractive approach for text summarization using graphs [10] This paper implements the Extractive approach but uses a different approach. It uses two matrices for sentence overlap and edits distance to measure sentence similarity. This paper was published in the year 2021 by the author Kastriot Kadriu. The proposed model takes a document as input. The document undergoes tokenization after which lemmatization is performed and is checked for de- pendency parsing. Then the model checks for sentence overlap and edits distance if necessary. After this graph representation is done. Now we can apply different algorithms to generate a summary of the text. In the paper, the author has taken into account different methods for generating the summary. This includes Pagerank, hits, closeness, betweenness, degree, and clusters. Finally, the summary is generated. The paper also shows a comparison between different methods and how accurate their summary is. We use F-score to measure their effectiveness. 2.11 Implementation and Evaluation of Evolutionary Connectionist Approaches to Automated Text Summarization [11] This paper implemented and compared the performance of three text summarizations developed using existing summarization systems to achieve connectionism. This paper was published in 2010 by the authors Rajesh Shardanand Prasad and Uday Kulkarni. Three approaches used in the paper are based on semantic nets, fuzzy logic, and evolutionary programming respectively. The results they got were that the first approach performs better than MS Word, the second approach was resulting in an efficient system and the third approach showed the most promising results in the case of precision and F- measure. The paper has used the DUC 2002 dataset to evaluate summarized results based on precision and F-measure. Approaches used in this paper focus only on small details related to general summarization rather than developing an entire summarization system and thus are only helpful for research purposes. 2.12 Feature Based Text Summarization [12] This paper aims at creating a feature-based text summarizer that is applied to different sizes of documents. This paper was published in the year 2012 by author Dr. Rajesh Prasad. This paper follows the extractive way of summarizing the text and utilizes the combination of nine features to calculate the feature scores of each sentence and rank them according to the scores they get. The higher rank sentences are part of the final summary of the text. They have used different types of documents that require different features to get a summary as a data set for their model. This approach gave better results when compared to MS word in terms of precision, Recall and F-measure in most types of documents. This paper shows great results for different types of documents but some documents may require features more than this paper has used so further research is needed in this approach. 2.13 Review of Proposed Architectures for Automated Text Summarization [13] This paper aims to review various architectures which have been proposed for automatic text summarization. This paper was published in the year 2013 by its authors Tejas Yedke, Vishal Jain, and Dr.Rajesh Prasad. In this paper, different techniques for text summarization are discussed and their advantages and drawbacks are also reviewed. DUC 2002 data set is used to calculate results that which approach per- forms better for which type of document in terms of precision, recall and F-measure. The limitation of this paper is that it doesn’t provide the most effective technique for summarization. 2.14 Automatic Extractive Text Summarizer(AETS): Using Genetic Algorithm [14] This paper aims at developing an extractive text summarizer using the Genetic algorithm. This paper was published in the year 2017 by its authors Alok Rai, Yashashree Patil, Pooja Sulakhe, Gaurav Lal, and Dr.Rajesh Prasad. The approach this paper follows involves feature extraction, fuzzy logic, and a genetic algorithm to train the machine to produce better results in automatic summarization. This paper defines Genetic algorithms as the search strategies that cop with the population of simultaneously seeking positions. Ac- cording to the input
  • 5. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2860 file and compression rate given by the user, the authors resulted in forming a meaningful summary as output text. According to the paper genetic algorithm is a sentence- choice-based technique and gives the best results when text summarization is done. 2.15 A Novel Evolutionary Connectionist Text Summarizer (ECTS) [15] This paper aims to create an efficient tool that can summarize large documents easily using the evolving connectionist approach. This paper was published in the year 2009 by Rajesh S.Prasad, Dr. U.V Kulkarni, and Jayashree R.Prasad. In this paper, a novel approach is proposed for part of speech disambiguation using a recurrent neural network, which deals with sequential data. Fifty random different articles were used as data sets for this paper. Authors found the accuracy of ECTS was ranging from 95 to 100 percent which average accuracy of 94 percent when compared to other summarizers. Though this paper has explained the connectionist approach very well there lies some issues in POS disambiguation and deviations found in ECTS for a couple of sentences. 2.16 Abstractive method of text summarization with sequence to sequence RNNs [16] This paper was published in the year 2019 and is authored by Abu Kaisar Mohammad Masum. It takes about how bi-directional RNN and LSTM canbe used in the encoding layer along with the attention model in the decoding layer for performing abstractive text summarization on the amazon fine food review dataset. The focus of this paper is Abstractive Summarization by the use of sequence to sequence RNNs. It performs Data Processing in which we Split Text, Add Contractions, Remove stop words and perform Lemmatization. After verifying if the text is purified, we count the size of the vocabulary and add a word embedding file. The next step is the addition of special tokens which mark important waypoints in the dataset. After this, an Encoder layer and Decoder Layer are present in LSTM and then the Sequence to Sequencemodel is built. The model is then trained with data for generating the response summary. Even though the above model performs well for short text it suffers when long text input is given. Another drawback is thatit is currently trained for the English language but no such summarizer is available for other languages. 2.17 Automatic Text Summarization Using Local Scoring and Ranking [17] This paper was published in the year 2017 and is authored by Diksha Kumar. Instead of building a text summarizer, this paper focuses on improvingthe currently available Automatic Text Summarizer to achieve more coherent and meaningful summaries.The model introduced in the paper uses an automatic feature-based extractive text summarizer to better understand the document and improve its coherence. The summary of the given input is generated based on local scoring and local ranking. Wecan select the top n sentences in the ranking for thesummary. Here n depends on the compression ratio of the summarizer. Feature Extraction can be done based on differentcriteria. It can be done based on the frequency of a word appearing in a sentence by selecting the words occurring the most, based on the length of the sentence by avoiding too short or too long sentences, based on the position of the sentence by giving a highscore to the first sentence and less to the second sentence and so on, based on sentences overlapping thetitle or heading which can be considered important and based on similarity of a sentence concerning all other sentences in the document. 2.18 A Genetic Fuzzy Automatic Text Summarizer [18] This paper was published in the year 2009 and is authored by Daniel Leite. This paper focuses on the fuzzy- based ranking system to select the sentences for performing extractive summarization on the input data set. The fuzzy knowledge base used in this model was generated by a genetic algorithm. For fitness function, ROGUE in formativeness measures were adopted and a corpus of newswire text is being used along with their human-generated summaries for defining the fuzzy classification rules. The paper also talks about SuPor-2 features which use a Na¨ıve-Bayes probabilistic classifier to find the relevance of a sentence to be extracted from the dataset to be in the generated summary. It has 11 features that address either the surface or the linguistic factors that interact with one another to find the relevance of a sentence. For future scope, the paper talks about using ideal mutation and
  • 6. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2861 crossing rates in the evolution phase for the genetic algorithm. Also, membership functions can be explored for modeling fuzzy sets. 2.19 Enhancing Performance of Deep Learning Based Text Summarizer [19] This paper was published in the year 2017 and is authored by Maya John. This paper aims to enhance the performance of the currently present deep learning model used for text summarization. In current deep learning models, the summary sentences form the minority class which is very small when compared to the majority class which leads to inaccuracy in a summary generation. To enhance the performance, data can be resampled before giving it to the deep learning model. The proposed system is divided into steps for simplicity. These are text preprocessing, feature extraction, resampling, and classification. Inside text preprocessing we have tokenization, stop word removal, stemming, and lemmatization. Along with this several resampling, mechanisms are discussed in the paper which can be used on the data set for improving the classifier performance. 2.20 Cut and Paste Based Text Summarization [20] This paper was published in the year 2000 and is authored by Hongyan Jing. This paper’s text summarization model is based on the examination of human written abstracts for a specific text. The model extracts text from the input for generating a summary and removes the inessential phrases from the text. The phrases are given as output and then joined together to form coherency. This is done based on a statistically based sentence decomposition program which finds where the phrases of a summary begin in the original text input. This produces an aligned corpus of the summary along with the articles used to make the summary. The model uses a Corpus of human-written abstracts for analyzing the input. It also uses WordNet for Sentence reduction and Sentence combination. Even though it is very accurate, when we test this model with current models it is very simple and old compared to the ones that are being used currently. A lot of advancement has been done in this field since the time this paper was published. 3. PROPOSED SYSTEM 3.1 Preprocessing Preprocessing involves the taking text as input from the user in different forms like Wikipedia or other links, simple text box and upload text document, etc. and stop words from the sentences are removed. 3.2 Graph Building These sentences are represented as nodes with all their properties and edges representing interest (similarity) between two sentences. 3.2 Sentence Ranking Algorithm The concept of Sentence Ranking tells that a document ranks high in terms of ranking, if high ranking documents are linked to it. Here we have used a Textrank algorithm inspired by the Pagerank algorithm which extracts all sentences from input text, create vectors for all sentences, then the similarity between sentence pairs is calculated and finally, sentences are ranked based on score. 3.3 Summarization Based on the Similarity Score obtained from the similarity matrix the top N sentences are to be included in the final summary and thus Summary is generated from the original text as the output of the system.
  • 7. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2862 FIG. 1. Proposed System Architecture or Flow Diagram 4. ALGORITHM 4.1 Preprocessing The algorithm that we have used here is called as Text Rank Algorithm which is inspired by Google’s Page rank algorithm and is used for ranking web pages. Consider a web page A which has a link to a web page Z. A PageRank score is calculated based on the probability of users visiting that page to rank these pages. A Matrix is created to store the probabilities of users navigating from one page to another is called Similarity Matrix. The probability of going from page A to Z is M[A][Z] initialized as 1/(number of unique links on the website). Suppose A which contains links to 2 pages, so the contribution of A to PageRank of Z will be PageRank (A)/2. If there is no link between web pages then probability should be initialized as 0. To solve this issue a constant d called as damping constant is added. So final equation will be: PageRank(Z)=(1-d)+d*(PageRank(A)/2) Text Rank has similar logic as PageRank but there are some changes like in-place of web pages Text sentences are taken and a similarity matrix is calculated based on similarity values between two sentences using maximum common words. So accordingto the algorithm to create a graph of sentence ranking a vertex of each sentence is created and added to the graph. Further, the similarity between the two sentences is calculated based on the common wordtoken present in the two sentences. In the graph, an edge between two vertices or sentences denotesthe similar content or interest among two sentences. Long sentences are avoided to be recommended in summary by multiplying with a normalizing factor. FIG. 2. Similarity graph drawn based on similarity matrix for sentence similarity The similarity between the two sentences is givenby: where given two sentences Si and Sj, with a sentence as set N-words appear in that sentence. Let’s take an example to illustrate the working of algorithm: Let there be three sentences as follows: A=’ He is a tall guy ’ B=’ He has a lot of friends ’ C=’ Jay is his close friend ’ Initialize TextRank(A), TextRank(B), TextRank(C) as 1 and take d as 0.85. Figure 3 shows how the similarity matrix is created for the above sentences. TextRank(A) =(1-0.85) + 0.85 * (TextRank(B) * M[B,A] +
  • 8. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2863 TextRank(C) * M[A,C]) = 0.15 + 0.85*(1.34*0.5+1*0.2) = 0.889 TextRank(C) = (1-0.85) + 0.85 * (TextRank(A) * M[C,A] + TextRank(B) * M[C,B]) FIG. 3. Similarity Matrix where values shows the similarity between two sentences =0.15 + 0.85*(0.889*0.2+1.34*0.9) = 1.326 The above process is continue for every sentence for n iterations. Therefore all the sentences are arranged according to their text ranks and the most important sentences are added to a summary. Graphical heat map representation of similarity matrix is shown in FIG. 4 FIG. 4. Graphical representation of Similarity Matrix using a Heat map 4. RESULT AND ANALYSIS Applying the proposed idea and algorithm we have implemented an automatic text summarization system that takes text as input from the user and gives him precise summary. The experimental results and analysis of our proposed system is been discussed in this section. The proposed summarization system is implemented in Python using Flask and nltk library. For testing purposes, we have used BBC News summary data set from Kaggle. This data set contains the documents of news articles in different categories and their extractive reference summary respectively. Evaluation Measures: Evaluation of our proposed Text Summarization System is carried out to determine the quality of the summary produced by the system. We have evaluated our system using the ROUGE evaluation measure. ROUGE stands for Recall-Oriented Understudy for Gisting Evaluation which is a set of metrics to evaluate the quality of the summary produced by the system. ROUGE-N measures the number of matching n-grams between our model generated summary and human generated summary. ROUGE-1 measures for uni-gram. Similarly ROUGE-2 measures for bi-grams. ROUGE-L measures the longest common sub-sequence between our model and referenced summary. We have calculated the precision, recall, and f-measure of each ROUGE-1, ROUGE- 2 and ROUGE-L. Precision is the proportion of correctness of sentences in the summary. For precision, higher the value, better the system performs to generate summary. Precision = Retrieved Sentences ∩ Relevant Sentences Retrieved Sentences The Recall is the ratio of the relevant sentence in the summary. Recall = Retrieved Sentences ∩ Relevant Sentences Relevant Sentences F-measure calculates the harmonic mean of precision and recall. Higher is the value, higher is the similarity between the model generated and referenced summary. Performance Measures: When it comes to the performance of the summarizer it should give a concise summary of text similar to a human-generated summary. The performance of our system is evaluated TextRank(C) * M[B,C]) =0.15 + 0.85*(1*0.5+1*0.9) = 1.340 TextRank(B) = (1-0.85) + 0.85 * (TextRank(B) * M[A,B] +
  • 9. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2864 on summary available in BBC news dataset using evaluation measures described above. We havetaken 5 documents from the dataset of each category. Then we generated a summary for each document using our proposed summarizer. For experimentation, the summary is generated for different percentages of the summary that the user wants to generate, and this generated summary is evaluated on the extractive summary available in the dataset using evaluation measures. To check for the efficiency of our proposed system we used it to generate summary of different documents and then compared that with the summary generated by one of the currently present system forgenerating summary. To check the efficiency we have FIG. 5. Comparison of precision, recall and f-measure of 5 documents when summary generated is 15% of the total document length FIG. 6. Comparison of precision, recall and f-measure of 5 documents when summary generated is 30% of the total document length 3 evaluations. First, we compare our proposed system generated summary with the original summary of the input. We can find the original summary of the document in the database along with the input document. By this we can check the accuracy of our model. Secondly, we compare our proposed system generated summary with the summary generated for the same document by some other currently present summarization model. By this we can check the performance of our model with respect to the currently present model. Lastly, we check our system generated summary with the summary made by a human. This is one of the most important evaluation as it gives us the exact idea of the summary that is required by a user and how much of it is present in the summary generated by our model. We have tested our model for 2 different lengths of generated summary: 15% and 30% length of the input document. When the summary for different documents is evaluated at the rate of 15% of the summary the computed evaluation measures we got can be seen in FIG.5 and similarly for a rate of 30% of summary the computed evaluation can be seen in FIG.6. To test the summarizer we have summarized the different documents from the dataset based on domains of politics, entertainment, sports, technology, etc. This is done to see how our proposed system reacts to documents of different types, different lengths, etc. and how the generated summary is affected by these. Precision, recall, and F-measure for each document was calculated to test the content understanding of our proposed summarization system. TABLE I. Comparison of precision values of proposed system with respect to existing MS-word system Methods Dataset R-1 R-2 R-L TextRank D1 0.9607 0.8833 0.9411 D2 0.9756 0.9313 0.9756 D3 0.9324 0.9052 0.9324 D4 0.9583 0.8852 0.9583 D5 0.9487 0.9189 0.9487 Average 0.9551 0.9047 0.9512 MS-Word D1 0.9534 0.8333 0.9534 D2 0.909 0.8645 0.909 D3 0.9 0.8809 0.9 D4 0.7619 0.5925 0.7619 D5 0.7391 0.6144 0.7391 Average 0.8526 0.7571 0.8526
  • 10. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2865 These evaluation measures are further compared with MS WORD 2007 Summarizer systems to test how well our proposed system works compared to the existing summarization tool. The result shown in Table I shows the comparison of average values of precision for different ROUGE measures i.e. ROUGE-1(R-1), ROUGE-2(R-2) and ROUGE- L (R-L) of our proposed system and MS Word 2007 summarization system, similarly result shown in Table II shows the average values of recall of our proposed system and MS Word 2007 summarization system and Table III result shows the average values of F-measure of our proposed system and MS Word 2007 summarization system. TABLE II. Comparison of recall values of proposed system with respect to existing MS-word system Methods Dataset R-1 R-2 R-L TextRank D1 0.4537 0.3812 0.4444 D2 0.5194 0.426 0.5194 D3 0.5036 0.4257 0.5036 D4 0.4646 0.3802 0.4646 D5 0.5401 0.4766 0.5401 Average 0.4963 0.4179 0.4944 MS-Word D1 0.3796 0.2877 0.3796 D2 0.4545 0.3721 0.4545 D3 0.4598 0.3663 0.4598 D4 0.4848 0.338 0.4848 D5 0.3722 0.2383 0.3722 Average 0.4662 0.3736 0.4652 Methods Dataset R-1 R-2 R-L TextRank D1 0.6163 0.5326 0.6037 D2 0.6779 0.5846 0.6779 D3 0.654 0.5791 0.654 D4 0.6258 0.532 0.6258 D5 0.5401 0.4766 0.5401 Average 0.6228 0.541 0.6203 MS-Word D1 0.543 0.4278 0.543 D2 0.606 0.5203 0.606 D3 0.6086 0.5174 0.6086 D4 0.5925 0.4304 0.5925 D5 0.3722 0.2383 0.3722 Average 0.5872 0.4891 0.5858 TABLE III. Comparison of f-measure values of proposed system with respect to existing MS-word system 4. CONCLUSION AND FUTURE WORK Graphs above show that the Precision, recall and, f- measure values of our model for different length of summary shows that the proposed system extracts a good amount of retrieved sentences from the document in the summary which shows the sign of good accuracy of our proposed summarization system. Also for 30% summary generated we can see that precision, recall and f-measure values are better compared to when 15% summary is generated. Thus, our system performs better for longer length of summary. As shown in Table I it is clear that the average precision values of our proposed system are better compared to the values we get from the MS Word summarizer for three documents and are close to equal for two documents. Similarly in Table II and Table III shows that our system also gives better recall and f- measure average values when compared to MS-Word 2007 summarization system. A Multilingual feature is also added to our summarization model which can help generate summary of different spoken languages text documents so that the people around the world can also use our Text summarization system.
  • 11. 28 66 International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2866 It is concluded that the achieved results of our pro- posed text summarization model are better in most of the aspects compared to existing models and therefore it can be a good start to further studies. As extractive summarization is evolving day by day with increase in its research and more algorithms coming out, in the future we will try to get more accuracy in generating summary using extractive approach which would be more feature based i. e it will include more features than just sentence ranking based on correlation between them such as sentence length, word based similarity, title features, etc. Also we would try to use more advanced algorithms like neural networks for generating summary. The extractive approach is more straightforward because copying big sections of text from the original document ensures grammar and accuracy. Para- phrasing, Generalization and, Assimilation are a part of abstractive summarization. Even though abstractive summarization is a more difficult process, thanks to recent improvements in the deep learning field, there has been some progress. In future it might be possible that the summary generated using extractive approach can be more accurate and meaningful compared to abstractive approach. 5. REFERENCES [1] Abdullah Faiz Ur Rahman Khilji, Utkarsh Sinha, Pintu Singh, Adnan Ali, and Partha Pakray. Abstractive text summarization approaches with analysis of evaluation techniques. In International Conference on Computational Intelligence in Communications and Business Analytics, pages 243–258. Springer, 2021. [2] Saeedeh Gholamrezazadeh, Mohsen Amini Salehi, and Bahareh Gholamzadeh. A comprehensive survey on text summarization systems. In 2009 2nd International Conference on Computer Science and its Applications, pages 1–6. IEEE, 2009. [3] A´ngel Hern´andez-Castan˜eda, Ren´e Arnulfo Garc´ıa Hern´andez, Yulia Ledeneva, and Christian Eduardo Mill´an-Hernandez. Extractive automatic text summarization based on lexical- semantic keywords. IEEE Access, 8:49896–49907, 2020. [4] Atif Khan, Naomie Salim, Haleem Farman, Murad Khan, Bilal Jan, Awais Ahmad, Imran Ahmed, and Anand Paul. Abstractive text summarization based on improved semantic graph approach. International Journal of Parallel Programming, 46(5):992–1016, 2018. [5] A Pai. Text summarizer using abstractive and extractive method. International Journal of Engineering Research & Technology, 3(5):22780181, 2014. [6] Rasim Alguliev, Ramiz Aliguliyev, et al. Evolutionary algorithm for extractive text summarization. Intelligent Information Management, 1(02):128, 2009. [7] Santosh Kumar Bharti and Korra Sathya Babu. Automatic keyword extraction for text summarization: A survey. arXiv preprint arXiv:1704.03242, 2017. [8] Santosh Kumar Bharti, Korra Sathya Babu, Anima Pradhan, S Devi, TE Priya, E Orhorhoro, O Orhorhoro, V Atumah, E Baruah, P Konwar, et al. Automatic keyword extraction for text summarization in multi-document e-newspapers articles. European Journal of Advances in Engineering and Technology, 4(6):410–427, 2017. [9] Marina Litvak and Mark Last. Graph-based keyword extraction for single-document summarization. In Coling 2008: Proceedings of the work- shop Multisource Multilingual Information Extraction and Summarization, pages 17–24, 2008. [10] Kastriot Kadriu and Milenko Obradovic. Extractive approach for text summarisation using graphs. arXiv preprint arXiv:2106.10955, 2021. [11] Rajesh Shardan and Uday Kulkarni. Implementation and evaluation of evolutionary connectionist approaches to automated text summarization. 2010. [12] Rajesh Shardanand Prasad, Nitish Milind Uplavikar, Sanket Shantilalsa Wakhare, VY Jain, Tejas Avinash, et al. Feature based text summarization. International journal of advances in computing and information researches, 1, 2012. [13] Tejas Yedke, Vishal Jain, and RS Prasad. Review of proposed architectures for automated text summarization. In Proceedings of International Conference on Advances in Computing, pages 155– 161. Springer, 2013. [14] Alok Rai, Yashashree Patil, Pooja Sulakhe, Gaurav Lal, and Rajesh S Prasad. Automatic extractive text summarizer (aets): Using genetic algorithm. no, 3:2824–2833, 2017.
  • 12. 28 67 International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | June 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2867 [15] Rajesh S Prasad, UV Kulkarni, and Jayashree R Prasad. A novel evolutionary connectionist text summarizer (ects). In 2009 3rd International Conference on Anti-counterfeiting, Security, and Identification in Communication, pages 606–610. IEEE, 2009. [16] Abu Kaisar Mohammad Masum, Sheikh Abujar, Md Ashraful Islam Talukder, AKM Shahariar Azad Rabby, and Syed Akhter Hossain. Abstractive method of text summarization with sequence to sequence rnns. In 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pages 1–5. IEEE, 2019. [17] Hari Disle Sameer Gorule Ketan Gotaranr Diksha Kumar, Sumeet Bhalekar. Automatic text summarization using local scoring and ranking. In 2017 international conference on computing methodologies and communication (ICCMC), pages 59–64. IJRTI, 2019. [18] Daniel Leite and L Rino. A genetic fuzzy automatic text summarizer. Csbc2009. Inf. Ufrgs. Br, 2007:779– 788, 2009. [19] Maya John and JS Jayasudha. Enhancing performance of deep learning based text summarizer. Int. J. Appl. Eng. Res, 12(24):15986–15993, 2017. [20] Hongyan Jing and Kathleen McKeown. Cut and paste based text summarization. In 1st Meeting of the North American Chapter of the Association for Computational Linguistics, 2000.