Topic Modeling MFM
About the Speaker
● Educational Background
○ Bachelor of Mathematics, Universitas Indonesia 2012-2017
● Working Experience
○ Backend Engineer at Dattabot (2016-2017)
○ Data Analyst at Home Credit Indonesia (2018)
○ Senior Data Engineer at Detik.com (2018-2023)
Opening
● Introduction to Topic Modelling
● Topic Modelling Application
● Document Representation
● Latent Dirichlet Allocation
● LDA Intuition and Idea
● LDA Process
● Tools and Framework
Introduction to Topic Modelling
● A topic model is a type of statistical model for
discovering the abstract "topics" that occur in a
collection of documents.
● Topic modelling aims to group articles by the
similarity of their "topics" without knowing the
topic labels in advance.
● Topic modelling can be classified as an
unsupervised method, often described as
clustering for documents or articles.
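To illustrate grouping without labels, here is a minimal sketch that groups unlabeled documents purely by the words they share. It is not LDA, only a simple word-count comparison. The stop-word list, the 0.3 threshold, the greedy grouping rule, and the toy documents are all assumptions made for this example.

```python
import math
from collections import Counter

# Assumed, tiny stop-word list for this toy example only
STOPWORDS = {"a", "the", "for", "from", "but"}

def bag_of_words(text):
    """Lowercase, split on whitespace, drop stop words, count each word."""
    return Counter(w for w in text.lower().split() if w not in STOPWORDS)

def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def group_documents(docs, threshold=0.3):
    """Greedy grouping: a document joins the first group whose first
    member is similar enough, otherwise it starts a new group."""
    vectors = [bag_of_words(d) for d in docs]
    groups = []  # each group is a list of document indices
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine_similarity(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups

docs = [
    "messi scores a goal from a penalty",
    "the election coalition backs the government",
    "ronaldo misses a penalty but scores a late goal",
    "the government prepares for the election",
]
groups = group_documents(docs)
print(groups)  # lists of document indices, one list per group
```

No label such as "football" or "politics" is ever supplied. The groups emerge only from overlapping vocabulary, which is the sense in which the method is unsupervised.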
What Are Topics?
Joko Widodo                 Messi                           iPhone
pemerintahan (government)   Ronaldo                         ChatGPT
Luhut Binsar Pandjaitan     Liga Inggris (English league)   prosesor (processor)
Anies Baswedan              Inter Milan                     deep learning
pemilu (election)           gol (goal)                      artificial intelligence
koalisi (coalition)         pinalti (penalty)               JavaScript
● Aim:
○ Discover patterns of word use and connect
documents that exhibit similar patterns
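One way to picture the word groups on this slide is as a rough sketch of how a document can mix several topics. The word sets below are copied by hand from the slide. The names "politics", "football", and "technology" are hypothetical labels added only for readability, and the counting rule is an assumption; a real topic model learns the word groups from data and leaves them unlabeled.

```python
from collections import Counter

# Word groups copied by hand from the slide. The topic names are
# hypothetical labels; a topic model would not produce them.
TOPIC_WORDS = {
    "politics": {"pemerintahan", "pemilu", "koalisi"},
    "football": {"messi", "ronaldo", "gol", "pinalti"},
    "technology": {"iphone", "chatgpt", "prosesor", "javascript"},
}

def topic_mixture(text):
    """Return, for each topic, its share of the words in the text
    that match any topic word group."""
    counts = Counter()
    for word in text.lower().split():
        for topic, vocab in TOPIC_WORDS.items():
            if word in vocab:
                counts[topic] += 1
    total = sum(counts.values())
    if total == 0:
        return {topic: 0.0 for topic in TOPIC_WORDS}
    return {topic: counts[topic] / total for topic in TOPIC_WORDS}

# Toy Indonesian sentence mixing football words with one politics word
doc = "messi cetak gol lewat pinalti jelang pemilu"
mixture = topic_mixture(doc)
print(mixture)
```

Documents with similar mixtures use words in a similar pattern, so they can be connected even when they share few exact words.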
Latent Dirichlet Allocation (LDA)
LDA
LDA Formula
Let θ_i be the topic distribution for document i,
α is the parameter of the Dirichlet prior on the per-document topic distributions,
β is the parameter of the Dirichlet prior on the per-topic word distribution,