
INES 2022 • 26th IEEE International Conference on Intelligent Engineering Systems • August 12-15, 2022 • Crete, Greece

Topic Modeling on News Articles using Latent Dirichlet Allocation
Mykyta Kretinin∗ and Giang Nguyen∗†
∗ Faculty of Informatics and Information Technologies, STU in Bratislava, Ilkovičova 2, Bratislava 84216, Slovakia
† Institute of Informatics, Slovak Academy of Sciences, Dúbravská cesta 9, Bratislava 84507, Slovakia

Emails: [email protected], [email protected]

Abstract—Topic modeling is widely used to obtain the most visible topics from a given text corpus. In this work, a demonstration of modeling the most discussed topics is presented on articles from the Reuters news website. These articles are collected and subsequently processed with the Latent Dirichlet Allocation (LDA) unsupervised learning algorithm. The main goal is to build the best model(s) that accurately produce the most discussed topics. Such model(s) can be used in real life to instantly get information about current news, to classify documents in a given dataset, and to extract dominant topics with their keywords. This helps, for example, to build correlations with user preferences and recommend interesting content. There are works which use different models to evaluate texts and obtain statistics about them, such as the most popular opinions of people about some question, or the popular and dominating subtopics of a specific topic dataset (e.g., medicine articles). As a result of this work, we were able to create a generic LDA model trained on Wikipedia articles. The model successfully analyzes Reuters articles and extracts their topics as keyword sets. These can then be used to recommend content that is interesting to the target user, for example, based on the recommended content tags.

Index Terms—Topic Modeling, Latent Dirichlet Allocation, Reuters Articles, Wikipedia, Ukraine, War, Covid, NLP

I. INTRODUCTION

Today, natural language processing (NLP) methods are widely used. They help humans process documents, extract needed information from text, analyze its content, and build graphs and diagrams with statistics. NLP methods for text processing save a lot of time and optimize the process of searching for and processing information. For example, topic modeling methods can be used to predict the topics of a text, helping to recommend interesting articles and videos to users.

In medicine, NLP methods allowed us to transform raw data from unstructured clinical information on patients into a structured form [1]. There, the unstructured information about the patients took a long time for doctors to read the "free" text and search for possible symptoms in it. Thus, the usage of NLP helped to solve two important problems:
• The big amount of time spent on text analysis of electronic health records by physicians on a regular basis,
• The possibility of managing and mining large volumes of clinical data on large time scales.

Topic modeling is a broad field of NLP, and its results are used in the daily life of people, in their work, and in different fields of science. It helps to analyze large volumes of text, extracting and summarizing information, building graphics, and visualizing statistics. Therefore, to be able to achieve different results and use topic modeling on different datasets properly, many different models were developed. This work mainly focuses on the LDA model, which is one of the most popular and famous methods in topic modeling. The application of algorithms for text preprocessing, model creation, learning, and finally, model usage on real datasets is presented in the following parts of this work.

II. RELATED WORK

The use of NLP is very helpful in the analysis of social networks. Each social network contains many chats, polls, and groups with the opinions of users on a certain topic or situation. Sometimes, it may be very useful to get summarized information in graphs and tables for faster analysis, but the process of detecting and processing these data is time-consuming. This problem was very relevant during the COVID-19 epidemic, when governments of countries and world organizations wanted to know the point of view of people about preventive measures, such as mask wearing, antigen tests, and vaccination.

Another goal was described in [2]. The authors studied the possibility of using NLP in the detection of disease-gene associations within large volumes of data with a large number of complicated associations. Therefore, they described a computational framework that discovers latent disease mechanisms by dissecting disease-gene associations from more than 25 million PubMed articles. They used the LDA model and network-based analysis because of their ability to detect latent associations within text and to reduce noise in large volumes of data.

A good use of the LDA model was demonstrated in [3]. Here, the created LDA model topics were used to conduct a literature review on papers from online databases such as Web of Science, Scopus or Google Scholar. The resulting topics were then merged into clusters to get top-level topics from the former ones. This helps to understand the correlation between them, so a concept map can be made from the keywords of the models to describe the topics of the models in the most precise way.

The main objective of this work is the practical application of topic modeling methods on real datasets such as text-containing news articles.


In this work, the process and results of the use of the LDA model are described, and tests are carried out on models trained with different hyperparameters to compare their results and choose the most successful model for future use. Moreover, this paper discusses the possibility of using a generic model, trained on a Wikipedia article dataset [4], [5], instead of a specific model trained on the Reuters articles. It is supposed that the accuracy of the Wikipedia model will be lower, but it still has to be able to precisely predict a document's topics, and it might be used on almost any text, as it contains words from a wide range of topics. All used datasets are unlabeled, so the models are trained with the unsupervised learning approach.
Fig. 1: Words assigned to the topic to which they most likely belong

III. LATENT DIRICHLET ALLOCATION

Latent Dirichlet Allocation (LDA) [6], [7] is a topic modeling method which allows users to get a probabilistic distribution of the topics in a document. Topics are represented by keywords, which are the most "popular" words in the documents assigned to the given topic.

1) General requirements for the process: To start with topic modeling using the LDA model, the model should first be trained. Therefore, a corpus is needed for the training process. A corpus is usually represented as a bag-of-words (BoW), or a list of pairs "word: number of occurrences", which does not preserve the order and relations of the words, but only their counts in the text.

2) Data gathering: First, the data should be acquired. In the data collection process, general topic bias should be taken seriously, especially if the majority of the collected documents come from the same closely located source. If such a situation occurs, the resulting model can easily be overfitted for some particular topic(s), and it cannot accurately analyze other topics in deployment.

3) Corpus and Dictionary: The goal of this step is to transform the text into a form that the model can use to be trained on or to analyze. To obtain the corpus, the text should first be processed. This step includes the removal of special characters, punctuation, and stop words, and lemmatization [8], [9] of the remaining text. The text is then transformed into a list of words, which is later transformed into the corpus. In addition, words with a very low number of occurrences can be removed as well to reduce the corpus and speed up the model training process; however, this can affect the quality of the model. From the corpus we can get the "word-id" relation, or the dictionary of the corpus, because some implementations require it to work. This is the case for the LDA library gensim (https://radimrehurek.com/gensim/models/ldamulticore.html), which was used in the experiments in this paper.
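As an illustration, this preprocessing and corpus-building step can be sketched with gensim as follows. This is a minimal sketch: the sample documents, the NLTK lemmatizer, and the filter thresholds are illustrative assumptions, not the exact pipeline used in the paper.

```python
# Minimal preprocessing sketch: clean text -> token lists -> dictionary -> BoW corpus.
from gensim import corpora
from gensim.utils import simple_preprocess
from gensim.parsing.preprocessing import STOPWORDS
from nltk.stem import WordNetLemmatizer  # requires the nltk 'wordnet' data

lemmatizer = WordNetLemmatizer()

def preprocess(text):
    # simple_preprocess lowercases, tokenizes, and strips punctuation and
    # special characters; deacc=True also removes accents
    tokens = simple_preprocess(text, deacc=True)
    # remove stop words and lemmatize the remaining tokens
    return [lemmatizer.lemmatize(t) for t in tokens if t not in STOPWORDS]

raw_texts = [
    "Governments discussed new preventive measures against the virus.",
    "The central bank raised interest rates amid rising inflation.",
]  # placeholder documents; the paper uses scraped Reuters and Wikipedia text

docs = [preprocess(t) for t in raw_texts]

# The dictionary holds the "word-id" relation; the corpus is a list of
# bag-of-words vectors, i.e. (word id, count) pairs per document.
dictionary = corpora.Dictionary(docs)
dictionary.filter_extremes(no_below=5, no_above=0.5)  # drop very rare/common words
corpus = [dictionary.doc2bow(doc) for doc in docs]
```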
A. Model training

When the corpus is ready, it can be used to train the model. As a result, the trained model will be able to use the information gained on the topics and the words assigned to them in the analysis of unseen text.

The process of creating / training LDA models may be described in five steps, which together create the following algorithm [10]:
1) Choose the number k of topics which should be created by the model.
2) Distribute these k topics among the document m by assigning a topic to each word in the text. This distribution is named α.
3) Then we assume that word w in the text has been assigned the wrong topic, but every other word is assigned the correct topic.
4) Assign word w a topic based on the probability of two things:
• what topics are actually in the analyzed document m,
• how many times word w has been assigned to the particular topic z across all of the documents.
5) Repeat this process for each document to get k topics with assigned words.

This learning process is iterative, which means that it has to be repeated N times to obtain a better result. Executing this algorithm only 1-2 times on a text will not give a very good result compared to a higher number of iterations.

In Fig. 2, the relations of the variables can be seen, where:
• α is the per-document topic density,
• β is the per-topic word density,
• θ is the topic distribution for document m,
• η is the word distribution for a specific topic,
• z are the topics of the document m,
• w is the specific word.

Fig. 2: Plate notation representing the LDA model with variables and their possible values

α and β are vectors of real numbers that are usually the same for all topics/words, respectively. θ and η are matrices, where θ(i, j) represents the probability that the i-th document contains the j-th topic, and η(i, j) represents the probability that the i-th topic contains the j-th word [11].
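To make these roles concrete, here is a toy sketch of the generative process that the plate notation encodes. It is an illustration only; the tiny vocabulary and document length are assumed values.

```python
# Toy sketch of LDA's generative story: sample topic/word distributions from
# Dirichlet priors, then generate one document word by word.
import numpy as np

rng = np.random.default_rng(42)
K, V, doc_len = 3, 8, 10            # topics, vocabulary size, words in document m

alpha = np.full(K, 1.0 / K)         # α: per-document topic density (symmetric)
beta = np.full(V, 0.01)             # β: per-topic word density

eta = rng.dirichlet(beta, size=K)   # η: one word distribution per topic (K x V)
theta = rng.dirichlet(alpha)        # θ: topic distribution for document m

document = []
for _ in range(doc_len):
    z = rng.choice(K, p=theta)      # z: draw a topic for this word position
    w = rng.choice(V, p=eta[z])     # w: draw the observable word from topic z
    document.append(w)

print(document)                     # only w is observed; θ, η, and z stay hidden
```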


Among all these variables, only w is shown grayed out in Fig. 2, because it is the only observable variable in the system, while the others are hidden from us. In the algorithm above, our goal in steps 1-4 is to assign a word to a topic, so we can suggest the word w as the final result of the process. From the beginning, the topics z are unknown and need to be filled with the words w. Similarly, θ and η are also unknown and will be calculated using the words w and the topics z. These variables are affected by the predefined model hyperparameters α (per-document topic density) and β (per-topic word density), respectively. The higher α is, the more topics a document consists of (θ); the higher β is, the more of the corpus words a topic consists of (η).

IV. EXPERIMENTING WITH THE LDA MODEL

Our goal is to obtain a precise model that can accurately predict the topics of a text. Only then will it be able to produce reliable statistics on the most popular and discussed topics among the documents in the scraped article set. Therefore, a strong foundation should be built, which means that the trained model should show good metrics. In this way, its words will be distributed in the right way, and the predictions on unseen text will be more accurate. This is achieved in the hyperparameter tuning process.

A. Hyperparameter tuning

To obtain a good model, grid search is used over two hyperparameters of the model: the number of topics and α (also known as the per-document topic density). In total, 12 models were trained on a 100MB Wikipedia abstract dump dataset (enwiki-latest-pages-articles-multistream24.xml-p56564554p57025655.bz2, Feb-2022). Each model has a different number of topics and value of α. The number of topics was in the range [4, 14] with a step of 2 (4, 6, ..., 14 topics), and α was distributed symmetrically / asymmetrically between the model topics. In a symmetric distribution, α is the same for each topic:

α = 1 / (number of topics)    (1)

For an asymmetric setting, the value of α depends on the index of the topic:

α = 1 / (topic index + √(number of topics))    (2)
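This grid can be reproduced with gensim's LdaMulticore, which accepts the symmetric and asymmetric α settings from (1) and (2) by name. The following is a hedged sketch: the passes and random_state values are assumptions, and corpus / dictionary come from the earlier preprocessing sketch.

```python
# Sketch of the 12-model grid: 6 topic counts x 2 alpha settings.
from gensim.models import LdaMulticore

models = {}
for alpha in ("symmetric", "asymmetric"):
    for k in range(4, 15, 2):               # 4, 6, ..., 14 topics
        models[(alpha, k)] = LdaMulticore(
            corpus=corpus,
            id2word=dictionary,
            num_topics=k,
            alpha=alpha,      # gensim expands this to (1) or (2) internally
            passes=10,        # assumed number of passes over the corpus
            random_state=42,
        )
```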
After all the models were trained, their metrics were compared to obtain the most suitable model for further use. The results are presented in Table I.

TABLE I: Results of hyperparameter tuning - 12 models

α            num topics   perplexity    CV      CUMass
Symmetric         4         -9.304     0.550    -1.253
Symmetric         6         -9.423     0.579    -1.401
Symmetric         8         -9.819     0.572    -1.987
Symmetric        10        -10.556     0.608    -1.735
Symmetric        12        -11.855     0.574    -2.483
Symmetric        14        -13.199     0.577    -2.499
Asymmetric        4         -9.291     0.541    -1.245
Asymmetric        6         -9.396     0.550    -1.517
Asymmetric        8         -9.806     0.548    -2.067
Asymmetric       10        -10.580     0.579    -1.943
Asymmetric       12        -11.901     0.536    -2.436
Asymmetric       14        -13.153     0.575    -2.147

Three metrics are used for model evaluation and comparison: perplexity in (3), the CUMass coherence score (based on p(rare word | common word)), and the CV coherence score (based on log(PMI), for PMI in (4)). In short, a coherence score indicates the degree of semantic similarity between the words of a topic, while perplexity indicates how well a probability model predicts a sample, or how much it is confused by the analyzed content. For all of these metrics, the highest score is the best. Based on the results, the model with 6 topics and a symmetric α distribution is chosen, because it obtains the closest results to the best in all metrics, while the other models have good results (close to the best ones) in only one or two particular metrics.

Normalized perplexity score:

ln P(W) = ln P(w1, w2, ..., wN)    (3)

where
ln P is the normalized perplexity function;
W is a full sentence/text;
wi is the i-th word of the sentence/text W;
N is the number of words w in the text W;
P(w1, w2, ..., wN) is the probability that the model assigns to the text W.

Pointwise Mutual Information (PMI):

score = log( p(wi, wj) / (p(wi) p(wj)) )    (4)

where
p(w) is the probability that the word w will be seen in a random document;
p(wi, wj) is the probability of seeing both words wi and wj in the same document.
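Under these definitions, the metrics in Table I can be computed with gensim's built-in helpers. This is a sketch assuming the models dict and the tokenized docs from the previous sketches:

```python
# Sketch of computing the three comparison metrics for every model in the grid.
from gensim.models import CoherenceModel

for (alpha, k), lda in models.items():
    perplexity = lda.log_perplexity(corpus)  # per-word log-likelihood bound, cf. (3)
    c_v = CoherenceModel(model=lda, texts=docs, dictionary=dictionary,
                         coherence="c_v").get_coherence()      # PMI-based, cf. (4)
    u_mass = CoherenceModel(model=lda, corpus=corpus,
                            coherence="u_mass").get_coherence()
    print(f"{alpha:>10}, {k:2d} topics: perplexity={perplexity:.3f}, "
          f"CV={c_v:.3f}, CUMass={u_mass:.3f}")
```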
Models are not trained on the dataset of Reuters articles because it has a high potential to be unbalanced. This may lead to overfitting the model for certain topics and inaccurate results in deployment. On the contrary, Wikipedia articles are not grouped by topic, so the Wikipedia dumps used are considered balanced enough to produce good training results.
B. Topic modeling on the article datasets

The trained model can finally be used to analyze the crawled articles to get brief information about their content. The most important information we can obtain from the model analysis is the most popular topic in the given dataset. Fig. 3a and Fig. 3b present the counts of documents from the dataset that were assigned to each particular topic of the model. In Fig. 3a, the most discussed topics were about politics (topic 0, 298 assigned documents) and about people's life and health (topic 2, 642 documents, due to the words "covid", "people", "climate"). At the same time, in February (Fig. 3b) the war in Ukraine was a much discussed topic, since multiple topics (0th and 4th) are related to the war, while the "covid" topic (2nd) was still relevant.

This information can be used to help people understand the topics most discussed today and to compare the topics discussed in different periods. These data may be used, for example, by other news websites to get the currently most popular topics, or to understand trends over a particular period of time and to publish appropriate content that will be interesting to readers.
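The per-topic document counts behind Fig. 3 can be obtained by assigning each article to its highest-probability ("dominant") topic. This is a sketch; the chosen model key and the scraped_articles variable (raw article strings) are assumptions:

```python
# Sketch of counting documents per dominant topic, as visualized in Fig. 3.
from collections import Counter

lda = models[("symmetric", 6)]      # the model chosen in Section IV-A

def dominant_topic(model, bow):
    # get_document_topics returns (topic id, probability) pairs
    return max(model.get_document_topics(bow), key=lambda p: p[1])[0]

bows = [dictionary.doc2bow(preprocess(a)) for a in scraped_articles]
counts = Counter(dominant_topic(lda, bow) for bow in bows)

for topic_id, n in counts.most_common():
    keywords = [w for w, _ in lda.show_topic(topic_id, topn=5)]
    print(f"topic {topic_id}: {n} documents, keywords: {keywords}")
```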
Using the Intertopic Distance Map, the similarity of the model topics can be visually displayed by comparing the distance between them. Examples of such visualization are shown in Fig. 4, where topic "1" is highlighted with its 30 most relevant terms. The top terms are calculated with the usage of relevance (λ), with values in the range [0, 1]. With λ = 1, the most relevant terms of the topic are displayed even if they also occur elsewhere, while with λ = 0 only the terms that belong exclusively to this topic are displayed.
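Maps of this kind are commonly produced with pyLDAvis, which ranks terms by relevance(w | t) = λ · log p(w|t) + (1 − λ) · log(p(w|t) / p(w)). The paper does not name its visualization tooling, so the following is an assumed sketch:

```python
# Assumed sketch: render the Intertopic Distance Map for the chosen model.
import pyLDAvis
import pyLDAvis.gensim_models

vis = pyLDAvis.gensim_models.prepare(lda, corpus, dictionary)
pyLDAvis.save_html(vis, "intertopic_map.html")
# In the rendered page, the λ slider re-ranks the displayed terms between
# raw topic probability (λ=1) and topic exclusivity (λ=0).
```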
In Fig. 4, the top terms are the same as those of topic "2" in Fig. 3a, as they display information about the same topic of the model. However, these top keywords do not appear only in the text of this topic, so we may be interested in those that were not assigned to any other topic. This case is shown in Fig. 5, where the top 30 terms of the topic are assigned only to this topic and do not appear anywhere else. This may give us a more accurate explanation of the topics of the model, as in most cases these words have a closer meaning to each other and appear only in documents with similar content.

Fig. 3: Per-topic document counts. (a) October articles (11.05.2021); (b) February articles (26.02.2022)

C. Generic versus specific LDA model


Sometimes, it may be essential to train a model on data that are related to the future data we want to analyze. For example, if a model is used to analyze medical articles, it should also be trained on some medical data to be able to separate documents by medical subtopics and obtain better results.

Fig. 4: Intertopic Distance Map for October articles with λ=1

Fig. 5: Intertopic Distance Map for October articles with λ=0

Here, the model was trained on Wikipedia datasets instead of the Reuters news article dataset. The main reason was a possible unbalance in the Reuters dataset, as the topics discussed are very often biased, which may lead to the model overfitting on some topics. Wikipedia, in contrast, is a very generalized dataset, so it may have many terms and even topics that do not occur very often. This may also lead to inaccurate results, but the models will not be biased. Moreover, the Wikipedia trained model can potentially be used on any testing dataset, as it is generic and will show similar results on almost any possible topic. To check whether the Reuters model would work better on Reuters articles, two models were trained on Reuters and Wikipedia datasets of similar size. The Reuters training dataset was scraped on 20.04.2022, so after training it was biased toward the topics of politics and the war in Ukraine. These models had to analyze 1000 November articles to check the resulting topics and the distribution of the document words. In the test, the Reuters trained model got better results, as expected for a specific model. The documents in the test dataset were quite equally distributed by it, in contrast to the Wikipedia trained model.


However, both models were able to successfully predict the main topic of the documents. Therefore, every approach to model training has its own pros and cons, but for this work, the balanced yet generalized Wikipedia dumps were sufficient to experiment with the LDA model, as they made it possible to accurately predict the topics of documents in the testing dataset.

V. CONCLUSION

In this work, a possible usage of the LDA model to analyze news articles has been described, with preliminary acquisition, preprocessing, and model training on text datasets. To achieve better results and higher accuracy, we used a hyperparameter tuning process, in which we chose a model for dataset analysis. The results demonstrated that, with the statistics acquired on the analyzed datasets, we can understand the dominant topics by their keywords. In this way, we can compare popular topics over a particular period of time, so it is possible, for example, to monitor how topics change over time according to readers' interests. From the results, we were able to state that in October the popular topics were about politics and Covid, while in February the relevant articles were about the war in Ukraine and Covid. These topics accurately reflected the real situation in the world and reaffirmed the development and change of the relevant topics. Moreover, the acquired Intertopic Distance Maps are able to separate words that occur only in the current topic from those distributed among many topics. With that, it is possible to get more information on any topic of the model. As an additional way to use the acquired topics, their keywords and tags can be used to search for similar content on the Internet. Therefore, if a user's previous search results were analyzed this way, it would be possible to find similar content, which would save a lot of time and help find potentially interesting information.

ACKNOWLEDGEMENTS

This work is supported by VEGA 2/0125/20 New Methods and Approaches for Distributed Scalable Computing and the Operational Programme Integrated Infrastructure for the project: International Center of Excellence for Research on Intelligent and Secure Information and Communication Technologies and Systems – Phase II (ITMS code: 313021W404), co-funded by the European Regional Development Fund (ERDF).

REFERENCES

[1] K. Kreimeyer, M. Foster, A. Pandey, et al., "Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review," Journal of Biomedical Informatics, vol. 73, pp. 14–29, 2017. DOI: 10.1016/j.jbi.2017.07.012.
[2] Y. Zhang, F. Shen, M. R. Mojarad, et al., "Systematic identification of latent disease-gene associations from PubMed articles," PLoS ONE, vol. 13, no. 1, e0191568, 2018. DOI: 10.1371/journal.pone.0191568.
[3] M. Weiss and S. Muegge, "Conceptualizing a new domain using topic modeling and concept mapping: A case study of managed security services for small businesses," Technology Innovation Management Review, vol. 9, pp. 55–64, 2019. DOI: 10.22215/timreview/1261. [Online]. Available: https://timreview.ca/article/1261.
[4] E. Cambria and B. White, "Jumping NLP curves: A review of natural language processing research," IEEE Computational Intelligence Magazine, vol. 9, no. 2, pp. 48–57, 2014. DOI: 10.1109/MCI.2014.2307227.
[5] S. Dlugolinsky, G. Nguyen, M. Laclavik, and M. Seleng, "Character gazetteer for named entity recognition with linear matching complexity," in Third World Congress on Information and Communication Technologies (WICT 2013), IEEE, 2013, pp. 361–365. DOI: 10.1109/WICT.2013.7113096.
[6] D. M. Blei, A. Y. Ng, and M. I. Jordan, "Latent Dirichlet allocation," Journal of Machine Learning Research, vol. 3, pp. 993–1022, 2003. [Online]. Available: https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf.
[7] M. Weiss and S. Muegge, "Conceptualizing a new domain using topic modeling and concept mapping: A case study of managed security services for small businesses," Technology Innovation Management Review, vol. 9, no. 8, 2019. [Online]. Available: https://timreview.ca/article/1261.
[8] D. Khyani, B. Siddhartha, N. Niveditha, and B. Divya, "An interpretation of lemmatization and stemming in natural language processing," vol. 22, pp. 350–357, 2020.
[9] D. Maier, A. Waldherr, P. Miltner, et al., "Applying LDA topic modeling in communication research: Toward a valid and reliable methodology," Communication Methods and Measures, vol. 12, no. 2-3, pp. 93–118, 2018. DOI: 10.1080/19312458.2018.1430754.
[10] T. Doll, "LDA topic modeling: An explanation," 2018. Accessed: 02.03.2022. [Online]. Available: https://towardsdatascience.com/lda-topic-modeling-an-explanation-e184c90aadcd.
[11] T. Ganegedara, "Intuitive guide to latent Dirichlet allocation," 2018. Accessed: 02.03.2022. [Online]. Available: https://towardsdatascience.com/light-on-math-machine-learning-intuitive-guide-to-latent-dirichlet-allocation-437c81220158.
