CENTRAL ASIAN JOURNAL OF THEORETICAL

AND APPLIED SCIENCES

Volume: 04 Issue: 09 | Sep 2023 ISSN: 2660-5317
https://cajotas.centralasianstudies.org

Twitter Topic Modelling Using Latent Dirichlet Allocation Approach

Uce Indahyanti, Yulian Findawati, Achmad Ariansyah
Faculty of Science and Technology, Universitas Muhammadiyah Sidoarjo, Indonesia
Endah Asmawati
Faculty of Engineering, Universitas Surabaya, Indonesia

Received 4th Jul 2023, Accepted 6th Aug 2023, Online 8th Sep 2023

Abstract: This study applies topic modeling to Twitter data about the Kanjuruhan tragedy, which became
a trending topic after a fatal incident following a football match at Kanjuruhan Stadium in Malang,
Indonesia. The research uses Latent Dirichlet Allocation (LDA), a text mining method that finds
patterns in documents by producing a set of distinct topics. The data consist of 1480 pre-processed
tweets in the Indonesian language. The modeling produced five main topics related to the Kanjuruhan
tragedy: the PSSI (Indonesian Football Association) investigation, suspects, the Itaewon tragedy,
Korean netizens (Knetz), and tear gas. This research not only provides information about the comments
and expectations of Twitter users regarding the Kanjuruhan tragedy but also offers considerations for
stakeholders.
Keywords: topic modeling, Twitter data, the Kanjuruhan tragedy, LDA, text mining.
_____________________________________________________________________________________________________

1. Introduction
Twitter is a social media platform widely used by the public to express opinions and comments on
popular issues. It provides an online social networking and microblogging service that enables its
users to send and read text-based messages. Public comments on Twitter form a huge body of text data
that can be mined and analyzed. To obtain hidden topics from a corpus (a collection of natural texts),
topic modeling can be applied; the most popular topic modeling approaches are Latent Dirichlet
Allocation (LDA) and Latent Semantic Analysis (LSA). Each topic represents a variety of comments
discussing the same context.
Topic modeling has been applied in various fields, such as bioinformatics [1] and transportation [2].
Twitter data-based topic modeling using the LDA method has also been carried out by several previous
researchers [3][4][5]. Another study used the LDA method to determine the topics of Indonesian-language
tweets about football news, producing topics such as pre-match analysis, live match updates, and
football club achievements [6].
This study applies topic modeling to tweet data about the tragedy at Kanjuruhan Stadium, a fatal
incident in which more than a hundred football spectators died. The data were taken from Twitter for
the period of October 2022. We use Latent Dirichlet Allocation (LDA) as the topic modeling method to
determine which topics appear on Twitter. The remainder of this paper is organized as follows:
Section 2 describes the materials and methods, Section 3 presents the results and discussion, and
Section 4 gives the conclusions.

© 2023, CAJOTAS, Central Asian Studies, All Rights Reserved

Copyright (c) 2023 Author (s). This is an open-access article distributed under the terms of Creative Commons
Attribution License (CC BY). To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/

2. Materials and Methods
2.1. Twitter Datasets
Twitter is a website owned and operated by Twitter, Inc. that offers a social network in the form of a
microblog. The site allows users to send and read short messages, known as tweets, which are displayed
on the user's profile page. Tweets were originally limited to 140 characters, a limit that was raised
to 280 in 2017. Twitter has distinctive writing conventions, with special symbols and rules. As one of
the most popular social media platforms, Twitter makes it easy for its users to access a lot of
information and voice their opinions [7].
The use of Twitter soars when an event that attracts public attention occurs, such as the Kanjuruhan
tragedy. Tweets related to the tragedy numbered in the thousands and came from many user accounts.
This study retrieved 2000 tweets about the Kanjuruhan tragedy posted during October 2022. After
duplicate tweets were removed, 1480 tweets remained.
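The duplicate-removal step can be sketched in a few lines of Python. The paper does not say how duplicates were identified, so the rule used here (tweets are duplicates when they match after lowercasing and collapsing whitespace) is an assumption, and the function name and sample tweets are hypothetical.

```python
def drop_duplicate_tweets(tweets):
    """Return tweets with duplicates removed, keeping the first occurrence."""
    seen = set()
    unique = []
    for text in tweets:
        # Assumed matching rule: ignore case and runs of whitespace.
        key = " ".join(text.split()).lower()
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique


# Invented sample tweets, for illustration only.
sample = [
    "Tragedi Kanjuruhan harus diusut tuntas",
    "tragedi kanjuruhan  harus diusut tuntas",  # same text, different case/spacing
    "Doa untuk korban tragedi Kanjuruhan",
]
unique_sample = drop_duplicate_tweets(sample)
print(len(sample), "->", len(unique_sample))
```

Keeping the first occurrence preserves the original order of the tweets, which is convenient if the retrieval order matters later.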
2.2. Topic Modelling
The concept of topic modeling involves the entities "word", "document", and "corpus". A "word" is the
basic unit of discrete data, defined as an item from a vocabulary in which each unique word is indexed.
A "document" is a sequence of N words. A corpus is a collection of M documents ("corpora" is the plural
form). A "topic" is a distribution over a fixed vocabulary. Each document in the corpus contains its
own proportions of the topics discussed, according to the words it contains. Topic modeling has
attracted interest from researchers in the fields of text mining, natural language processing, and
machine learning [1].
The purpose of topic modeling is to discover topics automatically from a set of documents with a
hidden structure consisting of the topics, the distribution of topics per document, and the assignment
of topics to each word in each document. Topic modeling uses the observed documents to infer this
hidden topic structure. The number of topics to be generated is determined before the topic modeling
process is carried out [2].
2.3. Latent Dirichlet Allocation
Latent Dirichlet Allocation (LDA) is a generative probabilistic model of collections of discrete data
such as a set of documents (a text corpus). In the context of text modeling, the topic probabilities
provide an explicit representation of a document. The basic idea of LDA is that a document consists of
several topics. LDA is generative in that it describes an imaginary random process in which each
document arises from a mixture of topics, and each topic is a distribution over words. The LDA concept
is shown in Figure 1 [8].

Fig. 1. LDA concept [9]

LDA is also described as a text mining method for finding patterns in documents by generating several
distinct topics [10]. LDA was chosen because it can analyze large collections of documents. It uses the
bag-of-words representation to identify hidden topic information in large sets of documents [11].
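The generative process described in this section can be illustrated with a small simulation. This is a minimal sketch using only the Python standard library, not the inference procedure used in this study: it draws per-document topic proportions from a Dirichlet distribution and then, for each word position, picks a topic and a word from that topic. The toy topics, their probabilities, and all function names are invented for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalising independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, n_words, rng):
    """Generate one document under the LDA generative story.

    topics: list of dicts mapping word -> probability (one dict per topic).
    alpha:  Dirichlet concentration parameters, one per topic.
    """
    theta = sample_dirichlet(alpha, rng)  # per-document topic proportions
    words = []
    for _ in range(n_words):
        # Choose a topic for this word position according to theta.
        z = rng.choices(range(len(topics)), weights=theta)[0]
        # Choose a word from the selected topic's word distribution.
        vocab = list(topics[z])
        w = rng.choices(vocab, weights=[topics[z][v] for v in vocab])[0]
        words.append(w)
    return theta, words


# Toy topics with invented probabilities (not results from this study).
toy_topics = [
    {"suspect": 0.5, "pssi": 0.3, "director": 0.2},
    {"gas": 0.4, "eye": 0.4, "victim": 0.2},
]
rng = random.Random(0)
theta, doc = generate_document(toy_topics, alpha=[0.5, 0.5], n_words=10, rng=rng)
print(theta)
print(doc)
```

Fitting an LDA model runs this story in reverse: given only the observed words, it estimates the topic proportions and the word distributions that most plausibly produced them.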
2.4. Python and Google Colab
Google Colab is a notebook-style coding environment in which programs can be written, run, stored, and
shared via Google Drive. It is user-friendly and supports common needs in data science and machine
learning. The software is similar to Jupyter Notebook but is hosted in the cloud and accessed through a
web browser such as Google Chrome. Python is a popular programming language used in the Google Colab
environment. It is open source, easy to use, and has many supporting libraries for data science and
machine learning. For example, text pre-processing for topic modeling was carried out in Python by [6]
and [12].
2.5. Modelling Stages
The stages of topic modeling in this study are shown in Figure 2: data collection, data pre-processing,
topic modeling, visualization, and analysis of the results.

Fig. 2. Modelling stages

A. Data Collection
Data were retrieved using the Twint library with the keyword "Tragedi Kanjuruhan", language "id", a
limit of 2000 tweets, and the period 1 October to 31 October 2022. The tweets came from several
accounts, including official accounts such as detik.com, jawapos.com, and hariankompas.com.
B. Data Pre-processing
The pre-processing stage consists of data cleaning, attribute selection, case folding (converting text
to lowercase), tokenizing (splitting text into tokens and removing unnecessary characters or symbols),
stopword removal (removing words that carry little meaning), normalization (replacing certain words
with more appropriate forms, such as jatim with Jawa Timur), and stemming (stripping affixes from words
using the Sastrawi and Swifter packages).
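The pre-processing steps listed above can be sketched as a single function. This is an illustrative outline under stated assumptions, not the code used in this study: the stopword set and normalization table are hypothetical and deliberately tiny, dropping URLs, mentions, and hashtags is an assumed cleaning rule, and stemming is left out so that the sketch needs only the standard library.

```python
import re

# Hypothetical, deliberately tiny resources; a real run would use full lists.
STOPWORDS = {"yang", "di", "dan", "ini", "itu"}
NORMALIZATION = {"jatim": "jawa timur"}


def preprocess(tweet):
    """Clean one tweet and return a list of tokens."""
    text = tweet.lower()                                 # case folding
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)    # drop URLs, mentions, hashtags
    tokens = re.findall(r"[a-z]+", text)                 # tokenizing: keep alphabetic runs
    tokens = [t for t in tokens if t not in STOPWORDS]   # stopword removal
    normalized = []
    for t in tokens:                                     # normalization via lookup table
        normalized.extend(NORMALIZATION.get(t, t).split())
    return normalized                                    # stemming would follow here


print(preprocess("Tragedi di Kanjuruhan, JATIM! https://example.com @akun #duka"))
```

The order of the steps matters: case folding comes first so that the stopword and normalization lookups only need lowercase entries.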
C. LDA Topic Modelling
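LDA topic modeling in this study uses the LdaModel class provided by the Gensim library for Python
[6]. We set the number of topics to five as a parameter; the following is a snippet of the modeling
code.
import gensim
from gensim import corpora

Lda = gensim.models.LdaModel
# doc_clean: list of token lists produced by the pre-processing stage
dictionary = corpora.Dictionary(doc_clean)                    # token <-> id mapping
bow_corpus = [dictionary.doc2bow(doc) for doc in doc_clean]   # bag-of-words vectors
total_topics = 5   # number of topics
number_words = 8   # words reported per topic
# Assumed completion (the printed snippet ends above): train the model and list the topics
lda_model = Lda(bow_corpus, num_topics=total_topics, id2word=dictionary)
print(lda_model.show_topics(num_topics=total_topics, num_words=number_words))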
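To make the bag-of-words step concrete, the sketch below mimics in plain Python what a dictionary and a bag-of-words conversion produce: each distinct token receives an integer id, and each document becomes a list of (id, count) pairs. The helper names are hypothetical stand-ins rather than Gensim functions, the sample documents are invented, and the ids Gensim assigns to the same tokens may differ.

```python
from collections import Counter


def build_vocabulary(docs):
    """Assign an integer id to every distinct token, in order of first appearance."""
    token2id = {}
    for doc in docs:
        for token in doc:
            token2id.setdefault(token, len(token2id))
    return token2id


def to_bow(doc, token2id):
    """Convert a token list into sorted (token_id, count) pairs."""
    counts = Counter(token2id[t] for t in doc if t in token2id)
    return sorted(counts.items())


# Invented tokenised documents, for illustration only.
docs = [["gas", "air", "mata", "gas"], ["korban", "gas"]]
vocab = build_vocabulary(docs)
print(vocab)
print([to_bow(d, vocab) for d in docs])
```

Word order is discarded in this representation; only the counts per document are passed on to the topic model.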
D. Visualization
The results of LDA topic modeling are visualized using the Gensim library and pyLDAvis in Python.
pyLDAvis is a web-based interactive topic model visualization tool, a Python port of LDAvis, which was
built using a combination of R and D3 [13]. The pyLDAvis library is used to explore the relationships
between topics and terms in order to understand the LDA model. It has two panels: a distance map of
the topics and a bar chart of the terms most representative of the selected topic or most frequent in
the corpus. The following is a snippet of the visualization code.
import os
import pickle
import pyLDAvis
import pyLDAvis.gensim  # in pyLDAvis 3.x this module is named pyLDAvis.gensim_models

# Visualize the topics
pyLDAvis.enable_notebook()
LDAvis_data_filepath = os.path.join('ldavis_prepared_' + str(total_topics))
corpus = [dictionary.doc2bow(text) for text in doc_clean]
# Prepare the visualization data and cache it on disk
LDAvis_prepared = pyLDAvis.gensim.prepare(lda_model, corpus, dictionary)
with open(LDAvis_data_filepath, 'wb') as f:
    pickle.dump(LDAvis_prepared, f)
3. Results and Discussion
This section discusses the results of topic modeling. The modeling produced five main topics related
to the Kanjuruhan tragedy: the Itaewon tragedy (topic #1), the PSSI investigation (topic #2), suspects
(topic #3), Korean netizens/Knetz (topic #4), and tear gas (topic #5). Table 1 shows the results of the
bag-of-words weighting. We set the number of words per topic to eight (K = kata = word) and translated
common words into English.
Table 1. The results of the bag of words weighting
Topic    | K1 (Word1)      | K2                | K3            | K4              | K5               | K6              | K7           | K8
Topic #1 | itaewon (63%)   | october (40%)     | victim (31%)  | halloween (16%) | people (13%)     | dead (12%)      | stadium (8%) | closed (8%)
Topic #2 | indonesia (23%) | investigate (15%) | pssi (14%)    | ball (14%)      | thoroughly (14%) | supporter (12%) | stay (12%)   | soccer (9%)
Topic #3 | suspect (57%)   | kapolri (26%)     | pssi (23%)    | permanent (20%) | lib (19%)        | pt (16%)        | malang (14%) | director (12%)
Topic #4 | country (22%)   | knetz (15%)       | lu (10%)      | indo (10%)      | people (10%)     | gw (10%)        | itaewon (8%) | yesterday (8%)
Topic #5 | eye (19%)       | people (17%)      | itaewon (12%) | pas (12%)       | gas (11%)        | water (11%)     | hope (10%)   | victim (10%)
Figure 3, one of the outputs of the visualization, shows the distance map between the topics of this
model together with the 30 most prominent words in the corpus. The bar chart in Figure 3 shows the 30
most prominent words for the topic "Itaewon". The map shows five topic clusters that can be
distinguished from one another. The distance between clusters indicates that the distribution and
frequency of words within each topic are distinct. The word "Itaewon" appeared at the top because of
the many comments by Indonesian netizens who replied to comments by Korean netizens. Previously, many
Korean netizens had commented on the Kanjuruhan tragedy. Examples of other topic visualizations are
shown in Figure 4.
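The notion of distance between topics can be made concrete with a small calculation. In its default configuration, pyLDAvis derives the distance map from the Jensen-Shannon divergence between topic-word distributions before projecting the topics onto two dimensions; whether this study changed that setting is not stated, so the sketch below only illustrates the divergence itself. The three distributions are invented and are not taken from the fitted model.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by log(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Toy topic-word distributions over a shared four-word vocabulary (invented numbers).
topic_a = [0.7, 0.2, 0.1, 0.0]
topic_b = [0.0, 0.1, 0.2, 0.7]
topic_c = [0.6, 0.3, 0.1, 0.0]

print(js_divergence(topic_a, topic_b))  # topics with little word overlap
print(js_divergence(topic_a, topic_c))  # topics with similar word usage
```

Topics that place their probability on different words receive a larger divergence and are therefore drawn farther apart on the map.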

Fig. 3. LDA visualization (topic #1 - the Itaewon tragedy)

Figure 4 shows the 30 most prominent words in the corpus on the topic "suspects". The bar chart in Figure
4 illustrates the distribution of words that refer to the topic of the suspect in the Kanjuruhan tragedy.

Fig. 4. LDA visualization (topic #2 - the PSSI investigation)

4. Conclusion
The modeling produced five main topics related to the Kanjuruhan tragedy: the PSSI (Indonesian
Football Association) investigation, suspects, the Itaewon tragedy, Korean netizens (Knetz), and tear
gas. The word "Itaewon" appeared at the top because of the many comments by Indonesian netizens who
replied to comments by Korean netizens. Previously, many Korean netizens had commented on the
Kanjuruhan tragedy.
This research not only provides information about the comments and expectations of Twitter users
regarding the Kanjuruhan tragedy but also offers considerations for stakeholders. Meanwhile, this study
still needs to be improved, for example by using coherence scores to determine the number of topics.
Further work could also assess the performance of the LDA method in extracting topics from
Indonesian-language text documents by comparing it with other, non-topic-based methods.
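As an indication of how coherence scores could support the choice of the number of topics, the sketch below computes the UMass coherence of a single topic from document co-occurrence counts. It is a minimal illustration and not part of this study: the function name and toy documents are hypothetical, and Gensim's CoherenceModel would be the more usual tool in practice.

```python
import math
from itertools import combinations


def umass_coherence(top_words, docs):
    """UMass coherence of one topic.

    top_words: topic words ordered from most to least probable.
    docs:      tokenised documents (lists of tokens).
    """
    doc_sets = [set(d) for d in docs]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    for i, j in combinations(range(len(top_words)), 2):  # i < j: w_i ranks above w_j
        w_i, w_j = top_words[i], top_words[j]
        d_i = doc_freq(w_i)
        if d_i == 0:
            continue  # skip words that never occur in the corpus
        score += math.log((doc_freq(w_i, w_j) + 1) / d_i)
    return score


# Invented tokenised documents, for illustration only.
toy_docs = [
    ["gas", "air", "mata"],
    ["gas", "mata"],
    ["pssi", "usut"],
    ["pssi", "usut", "tuntas"],
]
coherent = umass_coherence(["gas", "mata"], toy_docs)
mixed = umass_coherence(["gas", "pssi"], toy_docs)
print(coherent, mixed)
```

One possible procedure is to train models with several candidate numbers of topics, average this score over the topics of each model, and prefer the setting with the highest average.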
5. Acknowledgements
This research is supported by Universitas Muhammadiyah Sidoarjo (UMSIDA).
References
1. L. Liu, L. Tang, W. Dong, S. Yao, and W. Zhou, “An overview of topic modeling and its current
applications in bioinformatics,” Springerplus, vol. 5, no. 1, 2016, doi: 10.1186/s40064-016-3252-8.
2. L. Sun and Y. Yin, “Discovering themes and trends in transportation research using topic modeling,”
Transp. Res. Part C Emerg. Technol., vol. 77, no. April, pp. 49–66, 2017, doi:
10.1016/j.trc.2017.01.013.
3. G. Lansley and P. A. Longley, “The geography of Twitter topics in London,” Comput. Environ.
Urban Syst., vol. 58, pp. 85–96, 2016, doi: 10.1016/j.compenvurbsys.2016.04.002.
4. A. F. Hidayatullah and M. R. Ma’arif, “Pre-processing Tasks in Indonesian Twitter Messages,” J.
Phys. Conf. Ser., 2017, doi: 10.1088/1742-6596/755/1/011001.
5. H. Jelodar et al., “Latent Dirichlet allocation (LDA) and topic modeling: models, applications, a
survey,” Multimed. Tools Appl., vol. 78, no. 11, pp. 15169–15211, 2019, doi: 10.1007/s11042-018-
6894-4.
6. A. F. Hidayatullah, E. C. Pembrani, W. Kurniawan, G. Akbar, and R. Pranata, “Twitter Topic
Modeling on Football News,” 2018 3rd Int. Conf. Comput. Commun. Syst. ICCCS 2018, pp. 94–98,
2018, doi: 10.1109/CCOMS.2018.8463231.
7. A. A. Amrullah, A. Tantoni, N. Hamdani, R. T. R. L. Bau, and E. U. Ahsan, Muhammad Rafiqudin,
“Review Atas Analisis Sentimen Pada Twitter Sebagai Representasi Opini Publik Terhadap Bakal
Calon Pemimpin. Prosiding Seminar Nasional Multi Disiplin Ilmu & Call For Papers Unisbank,”
2016.
8. D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet Allocation,” J. Mach. Learn. Res., vol. 3,
pp. 993–1022, 2003.
9. D. Blei, L. Carin, and D. Dunson, “Probabilistic topic models,” IEEE Signal Process. Mag., vol. 27,
no. 6, pp. 55–65, 2010, doi: 10.1109/MSP.2010.938079.
10. I. M. K. B. Putra and R. P. Kusumawardani, “Analisis Topik Informasi Publik Media Sosial Di
Surabaya Menggunakan Pemodelan Latent Dirichlet Allocation ( LDA ),” J. Tek. Its, vol. 6, no. 2, pp.
2–7, 2017.

11. Y. Sahria, “Analisis Topik Penelitian Kesehatan di Indonesia Menggunakan Metode Topic Modeling
LDA (Latent Dirichlet Allocation),” Resti, vol. 4, no. 2, pp. 336–344, 2020.
12. M. Cendana and S. D. H. Permana, “Pra-Pemrosesan Teks Pada Grup Whatsapp Untuk Pemodelan
Topik,” Jurnal Mantik Penusa, vol. 3, no. 3, pp. 107–116, 2019.
13. C. Sievert and K. Shirley, “LDAvis: A method for visualizing and interpreting topics,” no. September,
pp. 63–70, 2015, doi: 10.3115/v1/w14-3110.
