0% found this document useful (0 votes)

10 views

Deep Learning-Based Feature Extraction Technique for Single Document Summarization Using Hybrid Optimization Technique

This paper introduces a novel deep learning-based text summarization approach that utilizes a hybrid optimization technique combining Cat Swarm Optimization (CSO) and Harris Hawk Optimization (HHO) for single document summarization. The method involves pre-processing, feature extraction using Restricted Boltzmann Machines (RBM), and vectorization with TF-IDF, achieving an accuracy of 99.56% on datasets from the Document Understanding Conference (DUC). The proposed approach is compared with existing methods, demonstrating superior performance in terms of various evaluation metrics including ROUGE and BLEU scores.

Uploaded by

samithaasankar21

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

10 views

Deep Learning-Based Feature Extraction Technique for Single Document Summarization Using Hybrid Optimization Technique

Uploaded by

samithaasankar21

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 15

Received 4 January 2025, accepted 27 January 2025, date of publication 3 February 2025, date of current version 7 February 2025.

Digital Object Identifier 10.1109/ACCESS.2025.3538169

Deep Learning-Based Feature Extraction

Technique for Single Document Summarization
Using Hybrid Optimization Technique
JYOTIRMAYEE RAUTARAY 1 , SANGRAM PANIGRAHI2 , AJIT KUMAR NAYAK2 ,
PREMANANDA SAHU 3 , AND KAUSHIK MISHRA 4 , (Member, IEEE)
1 Department of Computer Science and Engineering, Institute of Technical Education and Research, Siksha ‘‘O’’ Anusandhan University, Bhubaneswar, Odisha
751030, India
2 Department of Computer Science and Information Technology, Institute of Technical Education and Research, Siksha ‘‘O’’ Anusandhan University,

Bhubaneswar, Odisha 751030, India

3 School of Computer Science and Engineering, Lovely Professional University, Phagwara, Punjab 144001, India
4 Department of Computer Science and Engineering, Manipal Institute of Technology Bengaluru, Manipal Academy of Higher Education, Manipal 560064, India

Corresponding author: Kaushik Mishra ([email protected])

ABSTRACT Presently, the exponential growth of unstructured data on the web and social networks has made
it increasingly challenging for individuals to retrieve relevant information efficiently. Over the years, various
text summarization techniques have been developed to address this issue. However, traditional approaches
that rely on directly extracting words often lead to redundancies and fail to establish a strong connection
between the summary and the original document. This paper presents a novel Deep Learning (DL)-
based text summarization approach incorporating the following phases: pre-processing, feature extraction,
vectorization, and summarization using a hybrid Cat Swarm Optimization (CSO) and Harris Hawk
Optimization (HHO) algorithm. Initially, input documents undergo pre-processing steps, including sentence
segmentation, word tokenization, stop word removal, and lemmatization, to enhance text quality. Features
are then extracted using a Restricted Boltzmann Machine (RBM) to obtain nine key attributes. Vectorization
is performed using Term Frequency-Inverse Document Frequency (TF-IDF) to represent sentences in vector
form. The hybrid CSO-HHO algorithm is subsequently applied to generate summaries. The proposed
method’s efficiency was evaluated using datasets from the Document Understanding Conference (DUC),
specifically DUC-2002, DUC-2003, and DUC-2005. Metrics such as sensitivity, readability, coherence,
precision, BLEU score, ROUGE score, and F-score were analyzed to assess performance. The proposed
approach’s results were compared with existing methods, including CSO, QABC, PSO, GJO, FF, and
machine learning techniques like SVM and RF. The hybrid CSO-HHO algorithm achieved an accuracy of
99.56%, demonstrating its superiority in text summarization tasks.

INDEX TERMS Text summarization, single document summarization, pre-processing, feature extraction,
vectorization, hybrid CSO-HHO algorithm.

I. INTRODUCTION vast amounts of data being produced daily through user

The exponential growth of data generation around the interactions and information exchanges on various online
world continues to rise at an unprecedented rate. This platforms, manual data processing has become both time-
data serves as a critical asset for organizations, offering intensive and costly [1], [2]. To address this challenge,
a competitive advantage and driving significant annual summarization has emerged as an effective solution, enabling
investments in its maintenance and management. With the extraction of key insights and relevant information
from extensive datasets. By providing concise and essential
The associate editor coordinating the review of this manuscript and summaries, this process saves time and resources while
approving it for publication was M. Shamim Kaiser . facilitating efficient information retrieval [3]. In the domain

2025 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
VOLUME 13, 2025 For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/ 24515
J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

of computing, summarization holds a vital role. A summary algorithm. By integrating CSO [18] with HHO [19], this
is defined as a concise passage or collection of passages hybrid algorithm effectively addresses the challenges of
that effectively conveys the core elements of the source sentence selection for creating concise summaries.
material [4]. It should highlight key points from the document Contributions:
without repeating [5]. Text summarization techniques can ◦ It uses RBN, a DL technique as the feature extraction
be broadly classified into two main categories: extractive method.
and abstractive [6]. Extractive summarization focuses on ◦ It applies TF-IDF for vectorization to represent sentence
selecting and assembling significant sentences or paragraphs in vector forms.
directly from the original text. This method is straightforward ◦ It uses hybrid optimization algorithm CSO-HHO to
and ensures grammatical accuracy as it preserves the structure generate summaries using DUC 2002, DUC 2003 and
of the original content. However, it may result in summaries DUC 2005 dataset.
that lack coherence or appear fragmented [7]. It has a ◦ It compares with existing algorithms such as traditional
major drawback as it generates an inconsistent summary by optimization, CSO, QABC, FF, PSO, GJO, and ML
overlooking the semantic connections between sentences and algorithm such as SVM and RF.
relying solely on statistical features [8]. Conversely, abstrac- ◦ It evaluates the performance of the summary using
tive summarization generates summaries by comprehending the ROUGE score, sensitivity, readability, coherence,
the underlying content and rephrasing it in a more fluid precision, BLEU score, and F-score.
and concise manner. This approach offers greater flexibility,
The paper is structured as follows: Section I provides
producing summaries that are contextually richer and more
an overview of text summarization and its various types,
coherent. Nonetheless, it is computationally intensive and
Section II reviews existing literature that has focused on doc-
poses challenges due to the complexities of natural language
ument summarization research, and Section III demonstrates
understanding and generation. It is computationally intensive
the proposed model and implemented methods. Section IV
and poses challenges due to the complexities of natural
addresses the analysis of experiments and results, and
language understanding and generation [9]. Summarization
Section V presents the conclusions.
techniques can also be categorized into single-document
and multi-document summarization based on the number of
documents being summarized simultaneously [10]. Single- II. RELATED WORKS
document summarization focuses on generating a summary Automatic text summarization (ATS) has emerged as a
using only one document as the source. In contrast, multi- critical area of research, driven by the increasing need to
document summarization deals with a set of documents that extract meaningful insights from large volumes of text. Over
are related to a common topic, producing a cohesive summary recent years, numerous optimization-based methodologies,
by integrating information from multiple sources [11], [12]. including nature-inspired algorithms, clustering strategies,
Deep Learning (DL), a subset of Machine Learning (ML) and hybrid approaches, have been developed to address
methods, uncovers many layers of distributed representations the challenges of extractive summarization for both single
from input data [13]. DL, an improved version of Artificial and multi-document datasets. Techniques such as Cuckoo
Neural Networks (ANN), employs deep structures and Search, Genetic Algorithms, Firefly Optimization, and Cat
performs well in many fields, including Natural Language Swarm Optimization have demonstrated significant advance-
Processing (NLP). DL uses multi-layered neural networks for ments, particularly in improving summary informativeness,
hierarchical, nonlinear information processing [14]. These diversity, and ROUGE-based performance metrics. This
architectures gradually extract complicated and high-level study aims to build upon these advancements by exploring
features from fundamental parts to achieve structured and innovative optimization frameworks tailored for ATS tasks.
layered information acquisition [15]. DL advances many This section reviews prior text summarizing research.
fields by using hierarchical learning. In today’s world, In favor of a multi-report outline, Selvan et al. [20]
there is a growing requirement for text summarization, fostered an invention called the Improved Cuckoo Search
and there is a continuous effort to enhance summarization Optimization Algorithm (ICSOA). The main objective is to
methods [16]. This study aims to apply DL techniques, summarize the description task, which entails the creation
particularly RBM, to generate concise summaries from of a record outline that incorporates essential information.
single documents. RBM, a type of generative stochastic In the process of selecting optimal sentences using the ICSOA
artificial neural network, is well-suited for feature extraction model, the sentences that define the outline are identified.
and dimensionality reduction. It consists of a visible layer Debnath et al. [21] presented the Archive-based Micro
and a hidden layer, where the network learns to model Genetic-2 Algorithm (AMGA2) to address the multi-
the probability distribution of the input data, enabling the objective problem of Extractive Single Document Sum-
extraction of meaningful and high-quality features [17]. marization (ESDS). Evaluation was conducted using the
The proposed summarization approach involves three main DUC-2001 and DUC-2002 datasets, and the outcomes were
steps: pre-processing, feature extraction using RBM, and compared with those of other methods that have already
generating the final summary using a hybrid optimization been developed using ROUGE measures. According to the

24516 VOLUME 13, 2025

J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

results, the recommended approach performs better in terms Pei et al. [29] have demonstrated a Cat Swarm Opti-
of convergence rate and ROUGE ratings. mization (CSO) method was introduced, incorporating an
TimeaBezdan et al. [22] have proposed a Hybrid Fruit-Fly adjustable mode ratio and an enhanced search technique
Optimization (FFO) Method has been established to enhance known as focus boost. In order to assess the tracing mode
Text Document Clustering Utilizing K-Means. The exper- within the CSO framework, this study used six functions.
imental framework applied to text documents and specific Average ranking was used to evaluate overall performance,
functions revealed that the implemented approach is effective showing better outcomes than previously studied methods.
and distinguishes itself from alternative methodologies. The Gravitational Search Algorithm and Sequential
Akhmetov et al. [23] have proposed an innovative Hybrid Particle Swarm Optimization Using Dependent
Extractive Summarization algorithm has been developed Random Coefficients are two hybrid approaches presented by
to effectively summarize logical articles, utilizing Variable Shanhe et al. [30]. This innovative technique integrates three
Neighborhood Search (VNS) to identify the optimal perfor- distinct strategies aimed at addressing global optimization
mance within the realm of Extractive Text Outline quality, challenges. The results obtained from this method have
as measured by ROUGE scores. Consequently, this approach demonstrated superior performance compared to alternative
achieves high rough SCORE respectively. approaches.
Abo-Bakr introduced an ATS framework designed to In 2020, Singh et al. [31] presented a study titled
handle the extraction of concise summaries from extensive ‘‘Quantum-behaved Particle Swarm Optimization with
multi-document texts composed of numerous sentences. The Cauchy Distribution (QPSO-CD) for short. This paper
ATS challenge is framed as a Multi-Objective Optimization introduced a novel hybrid quantum technique known as PSO,
(MOO) problem, focusing on enhancing summary quality which employs a Cauchy distribution method in conjunction
by preserving essential content while minimizing redundant with natural selection. Upon a systematic evaluation of the
information. To address this, the authors developed an effectiveness of this strategy, it was found that the results were
efficient multi-objective algorithm tailored to the large- nearly optimal.
scale MOO problem, ensuring improved summarization Using a Deep Structured Network (DSN) to guarantee
outcomes. a minimum performance threshold, Ghadimi et al. [32]
Tomer et al. [25] have proposed a nature-inspired swarm have examined a proposed submodular chart convolutional
intelligence algorithm, referred to as Fire Fly (FF), has been summarizer (SGCSumm), which is presented as an extrac-
proposed for multi-document summarization. The evalua- tive multi-document summary technique. According to the
tions were conducted using datasets from the Document experimental findings, SGC Summ performs on par with the
Understanding Conference. The algorithm’s performance most advanced summarizing methods in terms of ROUGE
was assessed using the ROUGE score. The findings show that scores.
the algorithm performs better than the existing algorithms. Thi et al. [33] have presented a new hybrid method
Alqaisi et al. [26] have analyzed a grouping-based called PSOGA-BKSum, which combines Particle Swarm
approach has been proposed to identify the key themes within Optimization (PSO) and Genetic method (GA) approaches
messages, while an advanced Multi-Objective Optimization for the single document summarizing task. This approach
(MOO) framework enhances three aspects: relevance, diver- is based on a subset of scoring features and involves
sity/redundancy, and significance for Arabic multi-document updating candidates towards the optimal solution. When
summarization. Furthermore, when applied to the DUC applied to the DUC2002 dataset, PSOGA-BK sum achieves
2002 dataset, the proposed framework achieved high F-scores significantly improved results, with ROUGE-1 and ROUGE-
respectively. 2 scores that exceed those of the best previous algorithms by
For programmed Automatic Extractive Text Summariza- approximately 4.3% to 8.7%.
tion (AETS) tasks, Hernandez-Castaneda. et al. [27] have A comparison study between the Cuckoo Search algorithm
examined a language-agnostic and space-efficient method (CS), Firefly algorithm (FF), and Flower Pollination
that uses a clustering strategy backed by a Genetic Algorithm algorithm (FP) has been proposed by Pati and Rautray [34]
(GA) to determine the best sentence selection. According to with the goal of minimizing the original document’s content
the experimental findings, the suggested GA improves the using the DUC 2003 dataset. This study focused on extractive
accuracy of the system. summarization of a single document. The findings indicate
According to Debnath et al. [28], the suggested methodol- that the Cuckoo Search algorithm outperformed the other two
ogy tackles the extractive single-document summary problem algorithms significantly.
of automated text summarization. A Cat Swarm Optimization Chandra et al. [35] have demonstrated A comprehensive
(CSO) method has been developed to address this problem, report structure was utilized, employing the SVM classifier
with the goal of producing excellent summaries that prioritize methodology. To select the most suitable sentence sets, the
informativeness, clarity, content coverage, and redundancy proposed approach incorporates a hybrid ABC-CS optimiza-
reduction. The results indicate an improvement of high tion algorithm. Additionally, after identifying several key
ROUGE scores, respectively, when compared to the best features, the SVM classifier method is employed to determine
existing methods on the datasets. the summary by ranking all the selected sentences.

VOLUME 13, 2025 24517

J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

Mandal et al. [36] have suggested a method that uses TABLE 1. Summary of key studies on single document summarization.
sentiment analysis, Cuckoo Search (CS) computation, and
sentence scoring. Sentence scoring techniques are employed
to evaluate sentences based on mathematical frameworks,
after which CS computation is applied to identify the most
suitable sentences for generating the extractive summary.
Experimental results indicate that the proposed method sig-
nificantly improves accuracy, recall, and F1-score compared
to three existing methods, specifically CSSA.
Pati and Rautray [37]. The authors used single document
extractive summarization with Ant Colony Optimization
(ACO), Firefly Algorithm (FFA), and the Cuckoo Search
(CS) algorithm, demonstrating that CS outperformed the
other techniques. They also performed multi-document
extractive summarization on dataset with this proposed model
outperforming the other techniques.
Gamal et al. [38] presented a hybrid approach (CSO-
GA), emphasizing its feasibility and integration towards the
establishment of a Chicken Swarm Optimization (CSO) with
Genetic Algorithm (GA) in support of the text synopsis aimed
at ensuring the optimal arrangement. The proposed method
demonstrated the most significant improvement in accuracy,
and in terms of ROUGE score.
Wahab et al. [39] proposed the Multi-Objective Artificial
Bee Colony algorithm based on Decomposition (MOABC/D)
to handle the difficulty of extractive multi-document sum-
marization. An asynchronous parallel implementation of the
MOABC/D algorithm has been created to take use of multi-
core platforms. The findings show significant improvements
over current literature in terms of ROUGE-1, ROUGE-2, and
ROUGE-L scores, as well as a noticeable boost in processing
speed. Table 1 provides a summary of key studies on single
document summarization.
The literature review showcases various advanced algo-
rithms and optimization techniques, such as ICSOA,
AMGA2, FFO, CSO, and hybrid approaches like CSO-
GA and MOABC/D, applied to extractive text summa-
rization and clustering. However, a significant research
gap lies in the integration of feature extraction methods
that effectively capture domain-specific intricacies while
improving optimization efficiency. Most existing studies rely
on traditional feature scoring techniques, overlooking the
potential of advanced neural architectures like RBN for
adaptive and robust feature extraction. Furthermore, there
is a lack of comprehensive hybrid optimization frameworks
that synergize the strengths of different algorithms to address
challenges such as redundancy reduction and diversity
enhancement. Utilizing RBN for feature extraction combined
with a hybrid optimization approach can bridge this gap,
delivering improved summarization quality and adaptability
across datasets.

III. PROPOSED MODEL

The proposed model focuses on extractive text summa-
rization for individual documents. It comprises three main
phases: text pre-processing, feature extraction, and summary

24518 VOLUME 13, 2025

J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

TABLE 1. (Continued.) Summary of key studies on single document removal follows, eliminating unnecessary terms from the
summarization.
text, and finally, lemmatization converts words to their
base forms. During feature extraction, nine relevant word
features are identified using the RBM architecture, followed
by vectorization techniques like TF-IDF. The CSO-HHO
method then utilizes these features as the objective function to
generate the summary. The experimental results demonstrate
that the proposed model achieves improved summarization
performance and higher accuracy. Below Figure 1 illustrates
the proposed methodology for extractive summarization

FIGURE 1. Work flow of proposed model for single document

summarization.

A. DATASET DESCRIPTION
The input used for single document extractive text summa-
rization is gathered from the DUC 2002, DUC 2003, and
DUC 2005 datasets.

B. PRE-PROCESSING
Preprocessing is essential for the effective processing of text.
Different recognized spellings of a word, different verb forms
of a single term, and the plural and singular versions of
the same entity can all lead to ambiguities. Words like a,
an, the, is, and of are also considered stop words. These
frequently used phrases don’t convey important information
or advance our summary goal.For text summarization,
we chose segmentation, tokenization, stop word removal, and
lemmatization as preprocessing steps because they effectively
support the goal of extracting meaningful content while
minimizing noise. Segmentation and tokenization break the
text into smaller units, enabling the algorithm to focus on
key phrases. Stop word removal eliminates non-informative
words, ensuring the summary remains focused on relevant
content. Lemmatization reduces words to their base forms,
enhancing consistency and improving summary quality.
Parsing and Named Entity Recognition (NER), which focus
on syntactic structure and entity identification, are not as
directly useful for summarization. Additionally, spaCy’s
generation using a hybrid method known as CSO-HHO. lemmatizer was preferred for its speed and efficiency in
In the pre-processing phase, the text undergoes segmentation, processing large-scale text, despite the WordNet Lemmatizer
lemmatization, stopword removal, and tokenization. Initially, offering more precise word sense disambiguation in certain
lengthy sentences are segmented into shorter ones, which contexts. Therefore, in this work, segmentation, tokenization,
are subsequently tokenized into individual words. Stopword stop word removal, and lemmatization were preferred over

VOLUME 13, 2025 24519

J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

other preprocessing techniques. In this stage, we perform the Sentences containing these words generally carry more
following actions [40]. content than other sentences. For every phrase, the
Sentence Segmentation: In the process of sentence seg- frequency ratio of these words to the total number of
mentation, sentences are divided into distinct words, which words in the text was determined as shown in Eq.
are subsequently tokenized. The text is divided into distinct (1). For instance, in a document discussing ‘‘Artificial
sentences by punctuation, which includes commas, semi- Intelligence,’’ common terms like ‘‘AI,’’ ‘‘machine
colons, question marks, interjections, colons, and periods. learning,’’ and ‘‘algorithm’’ might frequently appear.
A sequential identification (ID) is given to every sentence. These words are important as they reflect the key topics
Tokenization: Sentences are tokenized by separating them of the text.
into individual words based on punctuation and spaces. no.of .thematicwords
After removing accents, conjunctions, interjections, and other Sentence_Thematic = (1)
Totalwords
symbols, the sentences are transformed into a continuous
2. Sentence position:The position of a sentence within
sequence of tokens.
a text determines its importance, with those at the
Stop Word Removal: Words that convey negation and hold
beginning or end being more significant than others.
limited significance, such as relational terms, conjunctions,
In a document, sentences at the beginning often
articles, possessives, and pronouns, are referred to as stop
contain key introductory information. This positioning
words. These words should be removed from sentences,
is considered a distinct feature as given in Eq. (2).
as they contribute to a negating effect when extracting
For instance, ‘‘Artificial Intelligence is transforming
meaningful tokens. Following the sentence segmentation
various industries’’ could be the first sentence, making
procedure, stop words such as ‘‘is’’, ‘‘and’’, and ‘‘the’’ are
it more likely to be included in the summary.
removed from the word list.
Lemmatization: After stopping, the word removal process firstorlastsentence 1
Sentence_position = (2)
reduces words to their base forms by stripping tense, prefixes, else, 0
and suffixes.
3. Sentence length:The transfer of information is often
limited by short sentences. Despite this, these sentences
C. FEATURE EXTRACTION
were not removed due to specific exceptions, and
Once the data is preprocessed, a matrix representing sen- instead, they were categorized as a feature which is
tences and their features is constructed. While BERT-based mentioned in Eq. (3). For instance, a shorter sentence
embeddings are powerful and have shown excellent results like ‘‘AI is revolutionizing healthcare’’ might have
in many NLP tasks, we chose to use the RBM (Restricted more weight than a long, complex sentence in a
Boltzmann Machine) method followed by TF-IDF for feature summary, as longer sentences can often be more
extraction in our work for a few key reasons. First, the RBM detailed or include redundant information.
and TF-IDF approach is more computationally efficient,
ifno.of .wordsarefiveorless 1

especially when working with large volumes of text. The Sentence_length =
RBM helps identify important features in the text without else, 0
requiring much computation, and TF-IDF helps prioritize the (3)
most significant words. On the other hand, BERT embeddings
4. Position of sentence relative to paragraph:The
are computationally expensive and require more time and
reason for this is that a new topic is introduced at
resources, as they need specialized hardware and fine-tuning
the beginning of every paragraph, and a definitive
with large datasets to perform well. While BERT embeddings
conclusion is presented at the end of each paragraph,
provide deeper understanding of text, the RBM + TF-
as expressed in Eq. (4). Sentences near the beginning
IDF method is simpler, faster, and easier to implement,
of a paragraph often contain main ideas, such as ‘‘Deep
making it a better fit for our task. This method also allows
learning techniques are a subset of machine learning,’’
us to handle large datasets more effectively while still
making them more significant for summarization
producing good results for summarization. Feature extraction
compared to sentences placed later in the paragraph.
involves identifying and collecting relevant attributes from
the raw data to enhance its statistical processing, as outlined PositionInpara
in [41]. This step is crucial for training machine learning
ifitisthefirstotherwiselastsentofpara 1
models. In this study, nine features were incorporated to = (4)
else, 0
train the summarizer, aiding in the identification of sentences
containing the most significant information within the text. 5. Number of proper nouns:The purpose of this function
These features form a feature vector, where each sentence is toward emphasizing sentences containing a signifi-
in the text is assigned its own feature vector. The RBM cant number of proper nouns. In this case, we calculate
architecture is employed to effectively extract these features. the overall count of words that have been labeled as
1. Frequently used words:The words frequently utilized proper nouns based on their part of speech for every
in a text are typically crucial for the subject matter. sentence. For example, a sentence like ‘‘Elon Musk and

24520 VOLUME 13, 2025

J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

Tesla are leading the development of AI technology’’ of pertinent examples to transmit to the server. RBMs are a
has two proper nouns, ‘‘Elon Musk’’ and ‘‘Tesla,’’ type of generative model and serve as undirected probabilistic
which might indicate significant entities in the text, graphical models that find application in an extensive array
making it important for the summary. of uses, such as feature extraction, collaborative filtering,
6. Number of numerals:Given that numbers are essential and [42] image reconstruction. This structure consists of
for conveying information, this function prioritizes visible units (or visible vectors, i.e., information samples) and
sentences containing specific figures. We determine the some hidden units (or hidden vectors). Whereas the hidden
proportion of numerals to the total no of words in each layer represents the latent variables that represent the features
sentence which is outlined in Eq. (5). For instance, that were extracted from the visible data, the visible layer
a sentence like ‘‘The company invested $2 million in AI represents observable variables like training data. Both the
research last year’’ contains the numeral ‘‘2 million,’’ observable vector and the concealed vector are binary vectors
which could signify an important financial figure or with their states being {0, 1}, which is illustrated in Figure 2
milestone relevant to the document’s key message. given below.
No.of .Numerals
Sentence_Numerals = (5)
Totalwords
7. Named entities:In this exercise, we tally the overall
count of named entities in every sentence. Sentences
that contain mentions of named entities such as
companies or groups of people are frequently crucial
for understanding a factual report. For instance, in the
sentence ‘‘Apple Inc. launched a new AI-powered
product,’’ ‘‘Apple Inc.’’ is a named entity, representing
a significant organization related to the topic of the
document.
8. TF ISF:We selected the TF-ISF feature over TF-IDF
FIGURE 2. The structure of restricted boltzmann machine.
since the analysis focuses on a single document. In this
approach, the frequency of each word in a specific The two layers are connected through weighted connec-
sentence is multiplied by its total occurrences across all tions and do not have connections between units within
other sentences, and the resulting products are summed every layer. The visible layer of the network receives the
for all words as mentioned in Eq. (6). For instance, training dataset and the concealed layer acquires the ability
a term like ‘‘deep learning’’ might have a high TF to represent the data represented in a low-dimensional
in a document focused on AI but a lower ISF across probabilistic framework manner. The network assigns a
multiple documents, making it important to capture its Boltzmann probability distribution for each combination of
significance in the summarization process. visible and hidden units q(w, j) as illustrated in Eq.(8).
P
log allwordsTF ∗ ISF 1
TFI SF = (6) q(v, h) = exp(−E(w, j)) (8)
Totalwords z
9. Sentence to Centroid similarity:The centroid sen- v →i/p of visible layer, v = v1 , . . . ..vn and h →o/p of hidden
tence is the one with the highest TF-ISF score. layer, h = h1 . . . . . . hm , Z → The partition function derived
The cosine similarity between each sentence and from the summation of all potential pairs of vandh. E →
the centroid sentence is then calculated as shown in Energy of the joint configuration of v, h as depicted in Eq.(9).
Eq. (7). For example, a sentence like ‘‘AI systems X X X
are designed to mimic human intelligence’’ may be E(v, h) = − ai vi − bj hj − vi hj wij
i∈visible j∈hidden i,j
compared to the centroid (average) of all sentences in
the document, with its high similarity indicating that it (9)
closely represents the core idea of the document and The probability distribution associated with the visible
should be included in the summary. layer can be learned using RBM to generate new samples
Sentence_similarity = cos ine_sim(sentence, centroid) (7) P(v). Furthermore, by comprehending the probability distri-
bution of the hidden layer P (h), RBMs may extract features
These features help capture the most relevant content by and reduce dimensionality. The probability distribution for
emphasizing keywords, entities, and sentence structures that these layers can be expressed in Eq.(10) and Eq.(11).
contribute to the overall meaning of the text.
1X
p(v) = h exp(−E(v, h) (10)
1) RESTRICTED BOLTZMANN MACHINE (RBM) z
The RBM models are utilized to acquire relevant features 1X
p(h) = v exp(−E(v, h) (11)
prior to employing an IR method to select the subsection z

VOLUME 13, 2025 24521

J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

The outcomes show that, when equipped with appropriate improved summarization performance and higher accuracy
structure and parameters, the performance of the suggested compared to traditional optimization-based approaches.
DL approach for feature extraction surpasses that of other
leading learning models like auto-encoders and Deep Belief 1) CAT SWARM OPTIMIZATION (CSO)
Network (DBN). This validates the suitability of RBM as a The CSO algorithm is a collective intelligence approach.
method for relevant feature extraction. This form of swarm intelligence operates by utilizing
multiple entities, referred to as agents, which interact and
D. VECTORIZATION communicate with one another to identify the optimal
In this stage, the sentences are transformed into vector solution. There are two phases to the movement of these
representations. Each sentence is segmented into a collection particles: exploration and exploitation [43]. They search the
of individual words. A TF-IDF score is given to each word, search space in order to find prospective locations during
and the matching list of words reflects this value. Potential the exploration phase, and they concentrate on honing their
vector shapes for the sentences are derived from the sum positions inside those areas during the exploitation phase.
of these TF-IDF scores. These vectors are then fed into the Observing the behaviour of cats in their natural environment
algorithm, which processes them and generates output. reveals two primary modes: seeking and tracing [44]. In the
seeking mode, cats tend to remain stationary while remaining
acutely aware of their surroundings. They maintain a high
E. HYBRID (CSO-HHO) level of awareness regarding their activities and actively seek
The proposed model focuses on extractive text summarization opportunities to pursue prey. When in tracing mode, cats are
for individual documents. It comprises three main phases: text generally more mobile and focused on capturing their target.
pre-processing, feature extraction, and summary generation Cats are known to be lazy, although they actually spend very
using a hybrid method known as CSO-HHO. In the pre- little of their day hunting and much of their time sleeping,
processing phase [48], the text undergoes segmentation, which is known as seeking mode. The CSO algorithm’s
lemmatization, stopword removal, and tokenization. Initially, implementation usually takes the solution’s position in the
lengthy sentences are segmented into shorter ones, which search space into account. The global best position of
are subsequently tokenized into individual words. Stopword particles, in relation to the location of the solution, and their
removal follows, eliminating unnecessary terms from the individual best positions have a major role in the movement of
text, and finally, lemmatization converts words to their particles in this space. Particles, by default, tend to follow the
base forms. During feature extraction, nine relevant word global best and try to match their individual best placements
features are identified using the RBM architecture, followed with the global best that was reached in earlier iterations [45].
by vectorization techniques like TF-IDF. After, feature If techniques other than the default spherical CSO method
extraction, it is proceeded for generating the summary using were used to evaluate the particle’s individual best and global
hybrid algorithm. The CSO-HHO method then utilizes these best scores, the particle progress in the search space would
features as the objective function to generate the summary. be substantially different from that of the spherical standard
By obtaining more precise fitness function, hybrid algorithm implementation and algorithm simulation.
is proposed. It also reduces global and local optimization
problems and their convergence speed is high. a: SEEKING MODE
Cat Swarm Optimization (CSO) is a swarm-based meta-
In this mode, the agent typically does not engage in
heuristic algorithm inspired by the social behaviour of
locomotion. Instead, it assimilates information from its sur-
cats, where their seeking and tracing modes are used
roundings and the nearby environment to determine the most
to balance exploration and exploitation. In the context
efficient next move toward its destination. To comprehend the
of text summarization, CSO efficiently selects the most
behaviour of agents in this mode, it is essential to first clarify
informative sentences by optimizing feature scores, ensuring
certain terminology, which is depicted in equation(12).
the summary retains key content while minimizing redun-
dancy. Harris Hawk Optimization (HHO), inspired by the
cooperative hunting strategy of Harris hawks, is an advanced Xjdnew = (1 + rand ∗ SRD) ∗ Xjdold (12)
optimization technique that enhances global and local search
capabilities. HHO dynamically adjusts search strategies using Here, SRD = Seeking Range of the Selected Dimensions
surprise pounce and escaping energy tactics, allowing it to in CDC, where X stands for the position, j : j for the cat, d
refine the selection of sentences for summarization. The for the length axis, and rand for the random integer.
hybrid CSO-HHO method leverages the strengths of both
algorithms: CSO’s efficient feature selection and HHO’s b: TRACING MODE
adaptive exploration and exploitation mechanism. This com- The agent’s cat is in a locomotory condition when it is in
bination improves the quality of sentence selection, leading tracing mode, attempting to move to the next spot that is
to more coherent and informative summaries. The experi- most advantageous to it in order to facilitate the most efficient
mental results demonstrate that the proposed model achieves route to the target [46]. As the particle progresses, its velocity

24522 VOLUME 13, 2025

J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

continues to change, governed by the following Eq. (13). Hard besiege with increasingly fast dives
(t) − E |JYrabbit (t) − Ym (t)| 
 
new old  Y = Yrabbit P

Vk,d = Vk,d + r ∗ c Xbest,d − Xk,d (13)
Y (t) = N1 N i=1 Yi (t) (19)
where, d : Dimension and X : Location along the axis  m 
otherwise dimension, c : A dimensional constant is necessary
Yi (t) → Location of every hawk in iteration, t and N →total
to make the equation invariant, V : velocity, and r : random
no. of hawks.
integer between 0 and 1.
Before applying the text summarization algorithm, it is
Update position of catk according to the following Eq. (14).
crucial to define parameters for certain variables. The
Yk,d = Yk,d + Vk,d (14) maximum number of iterations, denoted as max_iteration,
must be specified, along with a threshold fitness score.
Yk,d →position of catk , c1 →The acceleration coefficient, Sentences with a fitness score less than or equal to this
which facilitates the increase in the cat velocity within threshold will be included in the summary. This score is
the solution space, is typically set at 2.05. Additionally, utilized once the algorithm has been executed. The process of
r1 represents a randomly generated value that is uniformly generating the summary involves several phases, as outlined
distributed within the interval [1, 0]. in our proposed methodology. Figure 3 explains the flowchart
of the proposed hybrid algorithm.
2) HARRIS HAWK OPTIMIZATION (HHO)
In 2019, Ali Asghar Heidari et al. [47] proposed HHO as a
swarm intelligence optimization algorithm, which is based on
HH hunting procedure. The algorithm separates its hunting
process into two parts: local development and global search.
The following Eq.(15) formulates the definition of the global
search:

Y (t + 1)
 
 Yrand (t) − r1 |Yrand (t) − 2r2X (t)| q ≥ 0.5 
= (Yrabbit (t) − Ym (t)) − r3(LB + r4(UB − LB))
q ≺ 0.5
 

(15)

Here, Y (t + 1) → be the position of the hawks vector in

the subsequent iteration, UBandLB →denotes the upper and
lower boundaries, q → random values inside (0,1) that are
updated in each iteration, Yrabbit (t) → position of rabbit,
Y (t) → denotes the current position, Yrand (t) → randomly
picked hawk from the current population and Ym → current
position of hawks average location.
The local development was divided into four categories:
soft besiege, hard besiege, soft besiege with progressive quick
dives, and hard besiege with progressive rapid dives.
The energy released during the escape played a crucial
role in all categories |E| < 1 Local development has been
conducted. The formulas for local development, divided into
four parts, are outlined in Eq. (16), Eq. (17), Eq. (18) and Eq.
(19).:
FIGURE 3. Detailed flowchart illustrating the CSO-HHO hybrid algorithm.

Y (t + 1) = 1Y (t) − E |JYrabbit (t) − Y (t)| However, the hybrid algorithm (CSO-HHO) is highly
1Y (t) = Yrabbit (t) − Y (t) (16) applicable in the context of text summarization due to its
ability to balance exploration and exploitation, which is
Hard besiege
crucial for identifying key information from large documents.
Y (t + 1) = Yrabbit (t) − E |1Y (t)| (17) In the CSO phase, the algorithm explores the search space
by simulating the seeking and tracing behaviours of cats,
Progressively fast dives with a gentle besiege enabling it to discover promising sentence clusters in the
document. The HHO phase, inspired by the hunting strategy
Y = Yrabbit (t) − E |JYrabbit (t) − Y (t)| (18) of hawks, refines this exploration by using local and global

VOLUME 13, 2025 24523

J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

search mechanisms to focus on the most relevant sentences TABLE 2. IIharas. Comparison of ROUGE scores for various techniques,
including the proposed algorithm, on the DUC 2002 dataset.
and optimize the summary. By combining the strengths of
both algorithms, the CSO-HHO hybrid efficiently selects
sentences that represent the core content while eliminating
irrelevant information. This approach enhances the summa-
rization process by improving the accuracy and quality of
the summary, ensuring that the most significant details are
captured while maintaining conciseness.
Though, the proposed methodology demonstrates
improved summarization performance and higher accuracy,
it has certain limitations. Since it is designed for extractive
summarization, it may not effectively capture deep semantic
relationships or generate more coherent summaries compared
to abstractive approaches. The reliance on predefined feature
selection through RBM and TF-IDF may limit adaptability
across different document domains and writing styles.
Additionally, the model is tailored for individual document
summarization, making it less effective for multi-document
summarization tasks that require handling redundancy and
information fusion. The computational complexity may also
increase for longer documents due to extensive feature
TABLE 3. Comparative analysis of ROUGE scores for multiple techniques,
extraction and optimization processes involved in the including the proposed algorithm, on the DUC 2003 dataset.
CSO-HHO method. Future improvements could focus on
enhancing contextual understanding, reducing computational
overhead, and extending the approach to multi-document
summarization.

IV. EXPERIMENT RESULT ANALYSIS

The datasets are used to analyze the optimization technique
with an emphasis on extractive text summarization for single
documents. A 64-bit Windows 11 operating system with
an Intel Core i7 CPU running at 1.60 GHz and 1.80 GHz
and 8 GB of RAM was used for all algorithm simulations.
Python 3.10 is the version of Python used in this analysis.
The outcomes of these generated summaries, referred to as
candidate summaries, are assessed and contrasted using the
Rouge score methodology. Rouge 1 looks at the overlap of
unigrams between the reference summary and the candidate
summary, whereas Rouge 2 assesses the overlap of bigrams. can be attributed to variations in the summarization strategy,
Rouge L also discovers the longest common subsequence of the extent of content reduction, or the overall performance of
n-grams. the summarization system.
Text summarization performance is evaluated using the
DUC 2002, DUC 2003, and DUC 2005 datasets, which are A. SINGLE DOCUMENT TEXT SUMMARIZATION
standard benchmarks for assessing automatic summarization As presented in the tables below, the ROUGE scores are
systems. Each dataset contains five documents, with an calculated for the summaries generated by the framework
average of approximately 21 to 23 sentences per document. in comparison to the reference summaries. Specifically, the
The data is sourced from the TREC (Text Retrieval Confer- ROUGE scores for each of the four implementation strategies
ence) dataset, which is renowned for its diverse collection are compared and displayed in tabular form. Table 2 shows
of information retrieval tasks. The ‘‘Length of generated the ROUGE scores for different methods, including the
summary in words’’ refers to the total word count of proposed algorithm, across all five documents in the
the summaries produced by the automatic summarization DUC 2002 dataset. Table 3 presents the ROUGE scores for
model. For the three dataset versions, the word counts of various state-of-the-art techniques, including the proposed
the generated summaries are approximately 104, 97, and algorithm, across all five documents in the DUC 2003 dataset.
109 words, respectively. These variations suggest that the Lastly, Table 4 illustrates the ROUGE scores for various
methods employed for generating the summarized text may methods, including the proposed algorithm, across all five
differ slightly in terms of summary length. Such differences documents in the DUC 2005 dataset.

24524 VOLUME 13, 2025

J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

TABLE 4. The ROUGE scores for various methods, including the proposed and the average score stood at 0.456. These scores highlight
algorithm, across all five documents in the DUC 2005 dataset.
the variability in the algorithm’s performance, with the best
and worst results revealing the extremes of its summarization
capability, while the average scores offer a more generalized
view of its overall effectiveness.

TABLE 5. Analysis of ROUGE scores in terms of best, worst, and average

cases for DUC 2002, DUC 2003, and DUC 2005 datasets.

Tables 2, 3, and 4 present an analysis of the datasets in

relation to several methodologies. The datasets used include
DUC 2002, DUC 2003, and DUC 2005. The methodologies
evaluated are CSO, QABC, PSO, GJO, FF, SVM, RF, and the
proposed hybrid algorithm. For each dataset, five documents
(Doc 1, Doc 2, Doc 3, Doc 4, and Doc 5) were considered.
The best ROUGE scores were achieved by the proposed
algorithm. Different ROUGE score values were obtained for
each document.
Table 5 provides a detailed performance analysis of
different algorithms applied to the datasets, specifically in
terms of predicting ROUGE scores, which are standard
metrics for evaluating automatic summarization systems. FIGURE 4. Illustration of Rouge scores of DUC 2002 Dataset (a)Rouge
The table presents the best-case, worst-case, and average- 1 score, (b) Rouge 2 score, (c) Rouge L score.
case ROUGE scores for each algorithm across three distinct
ROUGE measures—ROUGE-1, ROUGE-2, and ROUGE-L Figure 4 compares the performance of various techniques,
for DUC 2002, DUC 2003 and DUC 2005 datasets. For including CSO, QABC, PSO, GJO, FF, SVM, and RF,
ROUGE-1, which assesses unigram overlap between the with the proposed hybrid algorithm (CSO-HHO) on the
generated and reference summaries, the algorithm’s best DUC 2002 dataset using Rouge metrics: Rouge-1, Rouge-2,
performance achieved a score of 0.492, indicating the highest and Rouge-L. The Rouge-1 scores (a) show that the pro-
level of overlap. In the worst-case scenario, the algorithm posed CSO-HHO algorithm consistently outperforms other
recorded a score of 0.367, while the average score across all methods across all documents, indicating its superior unigram
instances was 0.422. For ROUGE-2, which focuses on bigram overlap. Similarly, Rouge-2 scores (b) highlight its significant
overlap, the best score reached 0.569, the worst was 0.371, improvement in bigram overlap, especially in documents

VOLUME 13, 2025 24525

J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

3 and 5. The Rouge-L scores (c) further demonstrate

its ability to maintain sentence structure, achieving the
highest scores across all documents. The results clearly
show the superiority of the CSO-HHO algorithm over
traditional optimization methods. Its consistent performance
improvement underscores the benefits of integrating CSO
and HHO, enabling the generation of more accurate and
coherent summaries. The hybrid method outperforms its
counterparts, such as CSO, QABC, PSO, GJO, FF, SVM,
and RF, achieving the highest values for most documents,
particularly excelling in Doc 4. This highlights its robustness
and adaptability to varying document complexities. Unlike
standalone algorithms, the CSO-HHO effectively combines
the global optimization capabilities of CSO with the
adaptive exploration-exploitation balance of HHO, leading
to improved clustering and classification performance. Its
higher Precision signifies a reduction in false positives,
while superior Recall reflects its ability to comprehensively
identify relevant information. These results validate the
hybrid approach’s efficacy and its suitability for scenarios
requiring high accuracy and reliability, surpassing traditional
methods.
Figure 5 compares the Rouge scores (Rouge-1, Rouge-2,
and Rouge-L) of various state-of-the-art techniques, includ-
ing CSO, QABC, PSO, GJO, FF, SVM, and RF, with
the proposed hybrid algorithm (CSO-HHO) on the DUC
2003 dataset. Rouge-1 scores (a) show the proposed CSO-
HHO consistently outperforms other methods across all
documents, indicating superior unigram overlap. Rouge-2
scores (b) further demonstrate the effectiveness of CSO-
HHO in capturing bigram relationships, with a significant
performance gap compared to other techniques. The Rouge-L
scores (c) reveal its ability to maintain sentence structure and
coherence, achieving the highest scores across all documents.
The results emphasize the superiority of the CSO-HHO
algorithm over traditional optimization techniques. By lever-
aging the strengths of CSO and HHO, the hybrid approach
generates summaries that are more accurate, coherent, and
contextually relevant. These findings validate the robustness
and efficiency of the proposed algorithm for abstractive
FIGURE 5. (a) Rouge-1, (b) Rouge-2, and (c) Rouge-L scores comparison
summarization on the DUC 2003 dataset. The improvement of the proposed CSO-HHO algorithm with existing state-of-the-art
in Rouge-L scores for complex documents, such as Doc 4 and techniques for DUC 2003 dataset.
Doc 5, highlights the hybrid’s adaptability and robustness.
In contrast, traditional methods like CSO, QABC, and
GJO show consistently lower scores across all metrics, traditional optimization techniques such as CSO, PSO, and
reflecting their limitations in abstractive summarization tasks. QABC. Rouge-2 scores (b) further highlight the strong
By combining the global optimization strength of CSO with performance of CSO-HHO in capturing bigram relationships,
the adaptive exploration-exploitation capabilities of HHO, with significant improvement over other methods like
the hybrid approach generates summaries that are accurate, SVM and RF. Similarly, Rouge-L scores (c) demonstrate
contextually relevant, and structurally coherent. the ability of CSO-HHO to maintain sentence structure
Figure 6 illustrates the Rouge scores (Rouge-1, Rouge-2, and coherence, achieving the highest results across all
and Rouge-L) for various techniques, including CSO, QABC, cases. The findings confirm that the CSO-HHO hybrid
PSO, GJO, FF, SVM, and RF, alongside the proposed hybrid algorithm effectively leverages the strengths of CSO and
algorithm (CSO-HHO) on the DUC 2005 dataset. Rouge-1 HHO, surpassing optimization-based and machine learning
scores (a) reveal that CSO-HHO consistently achieves the methods. Its superior performance across all Rouge metrics
highest unigram overlap across all documents, outperforming validates its robustness and efficiency in generating accurate,

24526 VOLUME 13, 2025

J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

the hybrid approach delivers summaries that excel in term

relevance, contextual relationships, and structural coherence.

FIGURE 7. Comparison analysis of performance evaluation metrics across

DUC 2002, DUC 2003, and DUC 2005 Dataset.

Figure 7 compares various performance metrics, including

Precision, Recall, F-score, Accuracy, BLEU Score, Read-
ability, Cohesion, and Sensitivity, across the DUC 2002,
DUC 2003, and DUC 2005 datasets. Precision consistently
outperforms Recall across all datasets, indicating a focus on
extracting accurate information. The F-score and Accuracy
show balanced and consistent performance, with slightly
better results on DUC 2002. BLEU scores are moderate,
reflecting the models’ ability to capture linguistic features
effectively. Readability scores are higher than Cohesion,
suggesting the summaries are easy to read but could improve
in logical flow. Sensitivity, the highest among all metrics,
demonstrates the robustness of the summarization approach
in handling diverse datasets. Overall, the results highlight the
adaptability and effectiveness of the proposed method across
datasets.

FIGURE 6. Rouge scores comparison across techniques on

DUC 2005 Dataset (a)Rouge 1 score, (b)Rouge 2 score, and (c)Rouge
3 score.

FIGURE 8. Comparison of Rouge scores with traditional method and

proposed technique.
coherent, and contextually relevant summaries for the DUC
2005 dataset. The significant performance gaps underline Figure 8 compares Rouge-1, Rouge-2, and Rouge-L
the hybrid algorithm’s robustness in adapting to varying scores of several state-of-the-art techniques, including CSO,
linguistic and structural complexities. In contrast, traditional QABC, PSO, GJO, FF, SVM, RF, and the proposed hybrid
optimization methods like QABC, PSO, and GJO, along algorithm (CSO-HHO). For Rouge-1 scores, which measure
with machine learning approaches like SVM and RF, unigram overlap, the proposed CSO-HHO algorithm achieves
show moderate to inconsistent performance across metrics. the highest score, indicating superior word-level content
By leveraging the global optimization strengths of CSO selection. In Rouge-2, which captures bigram overlap, CSO-
and the adaptive exploration-exploitation balance of HHO, HHO significantly outperforms other methods, showcasing

VOLUME 13, 2025 24527

J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

its strength in preserving contextual relationships between [10] B. Ma, ‘‘Mining both commonality and specificity from multiple
words. Similarly, for Rouge-L scores, which measure the documents for multi-document summarization,’’ IEEE Access, vol. 12,
pp. 54371–54381, 2024.
longest common subsequence and reflect sentence-level [11] S. Ketineni and J. Sheela, ‘‘Metaheuristic aided improved LSTM for multi-
structure and coherence, the proposed algorithm consistently document summarization: A hybrid optimization model,’’ J. Web Eng.,
leads, demonstrating its ability to maintain sentence fluency vol. 22, no. 4, pp. 701–730, Oct. 2023.
[12] H. Rezaei, S. A. M. Mirhosseini, A. Shahgholian, and M. Saraee, ‘‘Features
and structure. This highlights the effectiveness of integrating in extractive supervised single-document summarization: Case of Persian
CSO and HHO in generating summaries that are more news,’’ Lang. Resources Eval., vol. 58, no. 4, pp. 1073–1091, Dec. 2024.
accurate, contextually rich, and linguistically coherent. [13] M. M. Taye, ‘‘Understanding of machine learning with deep learning:
Architectures, workflow, applications and future directions,’’ Computers,
vol. 12, no. 5, p. 91, Apr. 2023.
V. CONCLUSION [14] W. Khan, A. Daud, K. Khan, S. Muhammad, and R. Haq, ‘‘Exploring
Text summarization remains a challenging task due to the frontiers of deep learning and natural language processing: A
comprehensive overview of key challenges and emerging trends,’’ Natural
the need to balance accuracy, coherence, and contextual
Lang. Process. J., vol. 4, Sep. 2023, Art. no. 100026.
relevance while handling diverse datasets and varying text [15] K. A. Hashmi, G. Kallempudi, D. Stricker, and M. Z. Afzal, ‘‘FeatEn-
structures. The proposed model integrates pre-processing, Hancer: Enhancing hierarchical features for object detection and beyond
feature extraction, vectorization, and a hybrid algorithm to under low-light vision,’’ in Proc. IEEE/CVF Int. Conf. Comput. Vis.
(ICCV), Oct. 2023, pp. 6702–6712.
achieve superior results for single-document extractive text [16] P. Kouris, G. Alexandridis, and A. Stafylopatis, ‘‘Abstractive text sum-
summarization. By leveraging RBM for feature extraction marization based on deep learning and semantic content generalization,’’
and TF-IDF for vectorization, it generates summaries with Tech. Rep., 2022.
[17] Q. Wang, X. Gao, K. Wan, F. Li, and Z. Hu, ‘‘A novel restricted Boltzmann
an impressive accuracy of 99.56%, outperforming existing machine training algorithm with fast Gibbs sampling policy,’’ Math.
methods like CSO, QABC, PSO, GJO, FF, SVM, and RF. Problems Eng., vol. 2020, pp. 1–19, Mar. 2020.
The proposed CSO-HHO hybrid algorithm combines the [18] O. R. Adegboye and E. Deniz Ülker, ‘‘Hybrid artificial electric field
employing cuckoo search algorithm with refraction learning for engineer-
strengths of CSO and HHO, offering improved convergence ing optimization problems,’’ Sci. Rep., vol. 13, no. 1, p. 4098, Mar. 2023.
and robustness for text summarization tasks. This approach [19] T. Ye, H. Wang, T. Zeng, M. G. H. Omran, F. Wang, Z. Cui, and
addresses critical challenges in text summarization and finds J. Zhao, ‘‘An improved two-archive artificial bee colony algorithm for
many-objective optimization,’’ Expert Syst. Appl., vol. 236, Feb. 2024,
applications in fields such as legal, medical, and financial
Art. no. 121281.
document summarization, where precision and coherence are [20] R. S. Selvan and K. Arutchelvan, ‘‘Improved cuckoo search optimization
essential. algorithm based multi-document summarization model,’’ in Proc. 5th
The model is limited to single-document summarization Int. Conf. Comput. Methodologies Commun. (ICCMC), Apr. 2021,
pp. 735–739.
and may struggle to capture deeper semantic relationships or [21] D. Debnath, R. Das, and P. Pakray, ‘‘Extractive single document
handle large-scale applications due to its computational com- summarization using an archive-based micro genetic-2,’’ in Proc. 7th Int.
plexity. Future work can extend this model to multi-document Conf. Soft Comput. Mach. Intell. (ISCMI), Nov. 2020, pp. 244–248.
[22] T. Bezdan, C. Stoean, A. A. Naamany, N. Bacanin, T. A. Rashid,
summarization, incorporate transformer-based techniques for M. Zivkovic, and K. Venkatachalam, ‘‘Hybrid fruit-fly optimization
better feature selection, and optimize its algorithm with novel algorithm with K-means for text document clustering,’’ Mathematics,
fitness functions for enhanced accuracy and efficiency. vol. 9, no. 16, p. 1929, Aug. 2021.
[23] I. Akhmetov, A. Gelbukh, and R. Mussabayev, ‘‘Greedy optimization
method for extractive summarization of scientific articles,’’ IEEE Access,
REFERENCES vol. 9, pp. 168141–168153, 2021.
[1] C. B. V. Durga and D. Babu, ‘‘Telugu text summarization using histo fuzzy [24] H. Abo-Bakr and S. A. Mohamed, ‘‘Automatic multi-documents text
C-means and median support based grasshopper optimization algorithm summarization by a large-scale sparse multi-objective optimization
(MSGOA),’’ J. Theor. Appl. Inf. Technol., vol. 100, no. 17, pp. 1–15, 2022. algorithm,’’ Complex Intell. Syst., vol. 9, no. 4, pp. 4629–4644, Aug. 2023.
[2] M. Mohd, R. Jan, and M. Shah, ‘‘Text document summarization using word [25] M. Tomer and M. Kumar, ‘‘Multi-document extractive text summarization
embedding,’’ Expert Syst. Appl., vol. 143, Apr. 2020, Art. no. 112958. based on firefly algorithm,’’ J. King Saud Univ.-Comput. Inf. Sci., vol. 34,
[3] H. Jin, Y. Zhang, D. Meng, J. Wang, and J. Tan, ‘‘A comprehensive no. 8, pp. 6057–6065, Sep. 2022.
survey on process-oriented automatic text summarization with exploration [26] R. Alqaisi, W. Ghanem, and A. Qaroush, ‘‘Extractive multi-document
of LLM-based methods,’’ 2024, arXiv:2403.02901. Arabic text summarization using evolutionary multi-objective
[4] A. Nawaz, M. Bakhtyar, J. Baber, I. Ullah, W. Noor, and A. Basit, optimization with K-medoid clustering,’’ IEEE Access, vol. 8,
‘‘Extractive text summarization models for Urdu language,’’ Inf. Process. pp. 228206–228224, 2020.
Manage., vol. 57, no. 6, Nov. 2020, Art. no. 102383. [27] Á. Hernández-Castañeda, R. A. García-Hernández, Y. Ledeneva, and
[5] Z. Alami Merrouni, B. Frikh, and B. Ouhbi, ‘‘EXABSUM: A new C. E. Millán-Hernández, ‘‘Language-independent extractive automatic text
text summarization approach for generating extractive and abstractive summarization based on automatic keyword extraction,’’ Comput. Speech
summaries,’’ J. Big Data, vol. 10, no. 1, p. 163, Oct. 2023. Lang., vol. 71, Jan. 2022, Art. no. 101267.
[6] A. K. Yadav, R. S. Yadav, and A. K. Maurya, ‘‘State-of-the-art approach [28] D. Debnath, R. Das, and P. Pakray, ‘‘Single document text summarization
to extractive text summarization: A comprehensive review,’’ Multimedia addressed with a cat swarm optimization approach,’’ Int. J. Speech
Tools Appl., vol. 82, no. 19, pp. 29135–29197, Aug. 2023. Technol., vol. 53, no. 10, pp. 12268–12287, May 2023.
[7] T. Zhang, F. Ladhak, E. Durmus, P. Liang, K. McKeown, and [29] P.-W. Tsai, X. Xue, J. Zhang, and V. Istanda, ‘‘Adjustable mode ratio and
T. B. Hashimoto, ‘‘Benchmarking large language models for news summa- focus boost search strategy for cat swarm optimization,’’ Appl. Comput.
rization,’’ Trans. Assoc. Comput. Linguistics, vol. 12, pp. 39–57, Jan. 2024. Intell., vol. 1, no. 1, pp. 75–94, 2021.
[8] M. Kirmani, G. Kaur, and M. Mohd, ‘‘Analysis of abstractive and extractive [30] S. Jiang, C. Zhang, and S. Chen, ‘‘Sequential hybrid particle swarm
summarization methods,’’ Int. J. Emerg. Technol. Learn., vol. 19, no. 1, optimization and gravitational search algorithm with dependent random
pp. 86–96, Jan. 2024. coefficients,’’ Math. Problems Eng., vol. 2020, pp. 1–17, Apr. 2020.
[9] G. Kaur and A. Sharma, ‘‘A deep learning-based model using hybrid [31] A. S. Bhatia, M. K. Saggi, and S. Zheng, ‘‘QPSO-CD: Quantum-
feature extraction approach for consumer sentiment analysis,’’ J. Big Data, behaved particle swarm optimization algorithm with Cauchy distribution,’’
vol. 10, no. 1, p. 5, Jan. 2023. Quantum Inf. Process., vol. 19, no. 10, pp. 1–23, Oct. 2020.

24528 VOLUME 13, 2025

J. Rautaray et al.: DL-Based Feature Extraction Technique for Single Document Summarization

[32] A. Ghadimi and H. Beigy, ‘‘SGCSumm: An extractive multi-document SANGRAM PANIGRAHI received the Bachelor
summarization method based on pre-trained language model, submodular- of Technology degree in information technology
ity, and graph convolutional neural networks,’’ Expert Syst. Appl., vol. 215, from the Biju Patnaik University of Technology,
Apr. 2023, Art. no. 119308. Odisha, in 2006, the M.Tech. degree in informa-
[33] T. T. T. N. B. Thi, T. T. Dinh, and N. T. Hoai, ‘‘A hybrid PSO-GA for tion technology from Tezpur University (Central),
extractive text summarization,’’ in Proc. 35th Pacific Asia Conf. Lang., Inf. Assam, in 2009, and the Ph.D. degree from the
Comput., 2021, pp. 757–766. National Institute of Technology, Raipur, India,
[34] S. P. Pati and R. Rautray, ‘‘Single document extractive text summarization in 2018. He is currently an Associate Professor
using cuckoo search algorithm,’’ J. Inf. Optim. Sci., vol. 43, no. 5,
with the Department of Computer Science and
pp. 1089–1097, Jul. 2022.
Information Technology, Siksha ‘‘O’’ Anusandhan
[35] K. Kumar and S. Nagalla, ‘‘Multi-document summarization using CS-ABC
optimization algorithm,’’ EAI Endorsed Trans. Energy Web, vol. 7, no. 28, (Deemed to be University), Bhubaneswar, Odisha, India. With over 12 years
Jul. 2018, Art. no. 163835. of experience in teaching and research, he has more than 50 publications
[36] S. Mandal, G. K. Singh, and A. Pal, ‘‘Single document text summarization in reputed international and national journals and conferences. His research
technique using optimal combination of cuckoo search algorithm, sentence interests include natural language processing, text mining, wireless sensor
scoring and sentiment score,’’ Int. J. Inf. Technol., vol. 13, no. 5, networks, neural networks, data mining, and machine learning. He has
pp. 1805–1813, Oct. 2021. significantly contributed to advancing these fields through his numerous
[37] S. P. Patil and R. Rautray, ‘‘SMATS: Single and multi automatic text research publications.
summarization,’’ Karbala Int. J. Modern Sci., vol. 9, no. 1, p. 6, Jan. 2023.
[38] X. Chen, L. Liu, J. Du, D. Liu, L. Huang, and X. Li, ‘‘Intelligent AJIT KUMAR NAYAK received the Graduate
optimization based on a virtual marine diesel engine using GA-ICSO degree in electrical engineering from the Institu-
hybrid algorithm,’’ Machines, vol. 10, no. 4, p. 227, Mar. 2022. tion of Engineers, India, in 1994, and the M.Tech.
[39] M. H. H. Wahab, N. A. W. A. Hamid, S. Subramaniam, R. Latip, and and Ph.D. degrees in computer science from
M. Othman, ‘‘Decomposition–based multi-objective differential evolution Utkal University, in 2001 and 2010, respectively.
for extractive multi-document automatic text summarization,’’ Appl. Soft He is currently a Professor and the Head of the
Comput., vol. 151, Jan. 2024, Art. no. 110994. Department of Computer Science and Information
[40] M. Tomer, M. Kumar, A. Hashmi, B. Sharma, and U. Tomer, ‘‘Enhancing Technology, Siksha ‘‘O’’ Anusandhan Deemed to
metaheuristic based extractive text summarization with fuzzy logic,’’
be University, Bhubaneswar, Odisha. His research
Neural Comput. Appl., vol. 35, no. 13, pp. 9711–9723, May 2023.
interests include computer networking, ad-hoc and
[41] R. Hakima, Z. Maria, E. H. Norelislam, and H. Nabil, ‘‘A comparative
sensor networks, machine learning, natural language computing, and speech
study of several metaheuristic algorithms for optimization problems,’’ in
Proc. 8th Int. Conf. Optim. Appl. (ICOA), Oct. 2022, pp. 1–9. and image processing. He has published about 55 research papers in various
[42] A. Alshammari and K. El Hindi, ‘‘Privacy-preserving deep learning journals and conferences. Also, he has co-authored a book Computer
framework based on restricted Boltzmann machines and instance reduction Network Simulation Using NS2. He has six Ph.D. scholars have been awarded
algorithms,’’ Appl. Sci., vol. 14, no. 3, p. 1224, Feb. 2024. Ph.D. under his supervision. He has also participated as an organizing
[43] A. M. Ahmed, T. A. Rashid, and S. A. M. Saeed, ‘‘Cat swarm member of several conferences and workshops in international and national
optimization algorithm: A survey and performance evaluation,’’ Comput. level.
Intell. Neurosci., vol. 2020, pp. 1–20, Jan. 2020.
[44] D. David, T. Widayanti, and M. Q. Khairuzzahman, ‘‘Performance PREMANANDA SAHU received the Ph.D. degree
comparison of cat swarm optimization and genetic algorithm on optimizing from Centurion University, Odisha, India. He is
functions,’’ in Proc. 1st Int. Conf. Cybern. Intell. Syst. (ICORIS), vol. 1, currently an Associate Professor with Lovely
Aug. 2019, pp. 35–39. Professional University, Punjab, India. He has
[45] A. Sharaff, M. Jain, and G. Modugula, ‘‘Feature based cluster ranking more than 15 years of teaching and industrial expe-
approach for single document summarization,’’ Int. J. Inf. Technol., vol. 14, rience. He has published more than 30 research
no. 4, pp. 2057–2065, Jun. 2022. articles in different national and international
[46] M. Mojrian and S. A. Mirroshandel, ‘‘A novel extractive multi-document journals and conferences. He has also acted as
text summarization system using quantum-inspired genetic algorithm: a member of the reviewer board for leading
MTSQIGA,’’ Expert Syst. Appl., vol. 171, Jun. 2021, Art. no. 114555.
publishing houses, including Springer and Elsevier
[47] F. S. Gharehchopogh and B. Abdollahzadeh, ‘‘An efficient Harris hawk
and international conferences. His expertise areas are artificial intelligence,
optimization algorithm for solving the travelling salesman problem,’’
Cluster Comput., vol. 25, no. 3, pp. 1981–2005, Jun. 2022.
machine learning, and image processing computer vision.
[48] A. A. Nafea, R. R. Majeed, A. Ali, A. J. Yas, S. A. Alameri, and
M. M. Al-Ani, ‘‘A brief review of big data in healthcare: Challenges and KAUSHIK MISHRA (Member, IEEE) received
issues, recent developments, and future directions,’’ Babylonian J. Internet the Ph.D. degree from the Veer Surendra Sai Uni-
Things, vol. 2024, pp. 10–15, Feb. 2024. versity of Technology (VSSUT), Burla, in 2021.
He is currently an Associate Professor with the
Department of Computer Science and Engineering
(CSE), Manipal Institute of Technology Ben-
galuru, Manipal Academy of Higher Education,
Manipal, India. His research interests include task
JYOTIRMAYEE RAUTARAY received the bache- consolidation, load balancing, privacy protection
lor’s degree in computer science and engineering using security mechanisms for cloud computing,
from BPUT, Odisha, in 2011, and the M.Tech. fog computing, and the Internet of Medical Things. He has bagged two best
degree in computer science and engineering from paper awards in two of the conferences. His research findings are published
KIIT, Odisha, in 2013. She is currently pursuing in many journals of repute, such as IEEE TRANSACTIONS ON INDUSTRIAL
the Ph.D. degree with the Department of Computer INFORMATICS, IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT,
Science and Engineering, Siksha ‘‘O’’ Anusand- ACM Transactions on Internet Technology, Journal of Supercomputing,
han (Deemed to be University), Bhubaneswar, Journal of King Saud University, and Open Computer Science; and
Odisha, India. She has authored over 30 research conferences like IEEE TENSYMP. He has served as a Reviewer for many
articles and serves as a reviewer for several reputed journals, such as IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION
esteemed conferences and journals. Her primary research interests include SYSTEMS, Cluster Computing, Journal of Supercomputing, Multimedia Tools
natural language processing, automatic text summarization, sentiment and Applications, Journal of Grid Computing, and IGI Global.
analysis, text mining, and machine translation.