Input to the LDA algorithm:
Latent Dirichlet Allocation (LDA) using the Gibbs sampling technique is a framework for analyzing
hidden (latent) topic structures of large-scale datasets such as collections of text documents.
M - # of Documents
V - vocabulary size
K - number of topics
alpha, beta - LDA hyperparameters
z – Matrix containing topic assignments for words
nw – Matrix containing # of instances of word i assigned to topic j [Size is V x K]
nd – Matrix containing # of words in document i assigned to topic j [Size is M x K]
nwsum – total # of words assigned to topic j [Size is K]
ndsum – total number of words in document i [Size is M]
theta – Matrix having document-topic distributions [Size is M x K]
phi – Matrix having topic-word distributions [Size is K x V]
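
To make the shapes above concrete, here is a minimal Python sketch of how these structures might be initialized and how theta and phi can be derived from the counts. It is an illustration only, not the implementation described here: the function names (init_counts, estimate_theta, estimate_phi) are hypothetical, documents are assumed to be lists of integer word ids in the range 0..V-1, and the smoothed-count formulas are the estimates commonly used with collapsed Gibbs sampling, which the text above does not spell out.

```python
import random


def init_counts(docs, V, K, seed=0):
    """Assign a random topic to every word and build the count arrays.

    docs is assumed to be a list of documents, each a list of word ids
    in range(V). This input layout is an assumption for illustration.
    """
    rng = random.Random(seed)
    M = len(docs)
    nw = [[0] * K for _ in range(V)]    # V x K: word i assigned to topic j
    nd = [[0] * K for _ in range(M)]    # M x K: words in document i assigned to topic j
    nwsum = [0] * K                     # K: total words assigned to topic j
    ndsum = [len(doc) for doc in docs]  # M: total words in document i
    z = []                              # topic assignment for each word of each document
    for m, doc in enumerate(docs):
        z_m = []
        for w in doc:
            k = rng.randrange(K)        # random initial topic
            z_m.append(k)
            nw[w][k] += 1
            nd[m][k] += 1
            nwsum[k] += 1
        z.append(z_m)
    return z, nw, nd, nwsum, ndsum


def estimate_theta(nd, ndsum, K, alpha):
    """Document-topic distributions (M x K) from smoothed counts."""
    return [
        [(nd[m][k] + alpha) / (ndsum[m] + K * alpha) for k in range(K)]
        for m in range(len(nd))
    ]


def estimate_phi(nw, nwsum, V, beta):
    """Topic-word distributions (K x V) from smoothed counts."""
    K = len(nwsum)
    return [
        [(nw[w][k] + beta) / (nwsum[k] + V * beta) for w in range(V)]
        for k in range(K)
    ]
```

With a toy corpus such as docs = [[0, 1, 2, 1], [2, 3, 3], [0, 4]], V = 5 and K = 2, each row of theta and each row of phi sums to 1, and the entries of nwsum add up to the total number of words in the corpus. In a full sampler these estimates would normally be computed after the Gibbs iterations have updated z and the counts, rather than directly after random initialization.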