Latent Dirichlet Allocation


Latent Dirichlet Allocation (LDA) is an unsupervised probabilistic model for topic modelling of unstructured text data. It assumes that each document is a mixture of topics in certain proportions and that each topic is a distribution over words. Both sets of proportions are treated as random variables with Dirichlet priors.
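The generative assumption described above can be illustrated with a minimal sketch using only the Python standard library. The function names (`sample_dirichlet`, `generate_corpus`), the toy vocabulary and the parameter values are all hypothetical and chosen for illustration. The priors are assumed to be symmetric Dirichlet distributions with concentrations `alpha` (document-topic) and `beta` (topic-word).

```python
import random

def sample_dirichlet(concentrations, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    draws = [rng.gammavariate(c, 1.0) for c in concentrations]
    total = sum(draws)
    return [d / total for d in draws]

def generate_corpus(n_docs, doc_len, vocab, n_topics, alpha, beta, seed=0):
    """Generate toy documents following the LDA generative story."""
    rng = random.Random(seed)
    # Each topic is a distribution over the vocabulary, drawn from Dirichlet(beta).
    topic_word = [sample_dirichlet([beta] * len(vocab), rng) for _ in range(n_topics)]
    docs, doc_topic = [], []
    for _ in range(n_docs):
        # Each document is a mixture of topics, drawn from Dirichlet(alpha).
        theta = sample_dirichlet([alpha] * n_topics, rng)
        words = []
        for _ in range(doc_len):
            topic = rng.choices(range(n_topics), weights=theta)[0]   # pick a topic
            word = rng.choices(vocab, weights=topic_word[topic])[0]  # pick a word from it
            words.append(word)
        docs.append(words)
        doc_topic.append(theta)
    return docs, doc_topic, topic_word

# Hypothetical toy vocabulary and settings.
vocab = ["goal", "match", "team", "vote", "party", "election"]
docs, doc_topic, topic_word = generate_corpus(
    n_docs=3, doc_len=8, vocab=vocab, n_topics=2, alpha=0.5, beta=0.5
)
for words, theta in zip(docs, doc_topic):
    print([round(p, 2) for p in theta], " ".join(words))
```

Fitting LDA runs this story in reverse: only the words are observed, and the topic mixtures and word distributions are inferred from them.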




No external dictionary is needed. The process is unsupervised, so no labelled data is needed either. It is well suited to big data analysis.


The inputs $\alpha$, $\beta$, the number of topics and the number of iterations need to be tuned to get better results.
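To show where these four inputs enter the fitting procedure, here is a minimal sketch of collapsed Gibbs sampling for LDA in plain Python. Gibbs sampling is used here because it is short to write down; it is one common training method, not necessarily the one used by any particular library. The function name `lda_gibbs`, the toy documents and the parameter values are hypothetical, and symmetric priors are assumed.

```python
import random

def lda_gibbs(docs, n_topics, alpha, beta, n_iters, seed=0):
    """Collapsed Gibbs sampling sketch; returns (doc_topic, topic_word, vocab)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic_counts = [[0] * n_topics for _ in docs]
    topic_word_counts = [[0] * n_words for _ in range(n_topics)]
    topic_totals = [0] * n_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for w in doc:
            k = rng.randrange(n_topics)
            doc_assignments.append(k)
            doc_topic_counts[d][k] += 1
            topic_word_counts[k][word_id[w]] += 1
            topic_totals[k] += 1
        assignments.append(doc_assignments)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic_counts[d][k] -= 1
                topic_word_counts[k][v] -= 1
                topic_totals[k] -= 1
                # Unnormalised conditional probability of each topic for this token.
                weights = [
                    (doc_topic_counts[d][t] + alpha)
                    * (topic_word_counts[t][v] + beta)
                    / (topic_totals[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic_counts[d][k] += 1
                topic_word_counts[k][v] += 1
                topic_totals[k] += 1

    # Smoothed estimates of the document-topic and topic-word proportions.
    doc_topic = [
        [(doc_topic_counts[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(topic_word_counts[t][v] + beta) / (topic_totals[t] + n_words * beta)
         for v in range(n_words)]
        for t in range(n_topics)
    ]
    return doc_topic, topic_word, vocab

# Hypothetical toy corpus and settings.
toy_docs = [
    "goal match team goal".split(),
    "vote party election vote".split(),
    "team match goal team".split(),
    "election party vote election".split(),
]
doc_topic, topic_word, vocab = lda_gibbs(
    toy_docs, n_topics=2, alpha=0.1, beta=0.01, n_iters=200
)
for row in doc_topic:
    print([round(p, 2) for p in row])
```

In this sketch, smaller `alpha` pushes each document towards fewer topics, smaller `beta` pushes each topic towards fewer words, and `n_iters` trades running time against how far the sampler has mixed. Results vary with the random seed, so comparing several runs is a reasonable part of tuning.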
(Original Paper) Latent Dirichlet Allocation (algorithm) | AISC Foundational
Serrano -- Latent Dirichlet Allocation (Part 1 of 2)
Serrano -- Training Latent Dirichlet Allocation: Gibbs Sampling (Part 2 of 2)

Created: 30 Oct 2022
Last Modified: 30 Oct 2022