patrikeh / go-topics

Licence: other
Latent Dirichlet Allocation

Programming Languages

go
31211 projects - #10 most used programming language

Projects that are alternatives of or similar to go-topics

BayesianTutorials
Implementing MCMC sampling from scratch in R for various Bayesian models
Stars: ✭ 75 (+226.09%)
Mutual labels:  bayesian, mcmc, gibbs
Shinystan
shinystan R package and ShinyStan GUI
Stars: ✭ 172 (+647.83%)
Mutual labels:  bayesian, mcmc
Gpstuff
GPstuff - Gaussian process models for Bayesian analysis
Stars: ✭ 106 (+360.87%)
Mutual labels:  bayesian, mcmc
ml
machine learning
Stars: ✭ 29 (+26.09%)
Mutual labels:  mcmc, gibbs
Bda py demos
Bayesian Data Analysis demos for Python
Stars: ✭ 781 (+3295.65%)
Mutual labels:  bayesian, mcmc
Posterior
The posterior R package
Stars: ✭ 75 (+226.09%)
Mutual labels:  bayesian, mcmc
smfsb
Documentation, models and code relating to the 3rd edition of the textbook Stochastic Modelling for Systems Biology
Stars: ✭ 27 (+17.39%)
Mutual labels:  inference, mcmc
policy-data-analyzer
Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-4.35%)
Mutual labels:  topic, lda
Pydlm
A python library for Bayesian time series modeling
Stars: ✭ 375 (+1530.43%)
Mutual labels:  model, bayesian
Cppflow
Run TensorFlow models in C++ without installation and without Bazel
Stars: ✭ 357 (+1452.17%)
Mutual labels:  model, inference
Bda r demos
Bayesian Data Analysis demos for R
Stars: ✭ 409 (+1678.26%)
Mutual labels:  bayesian, mcmc
binary.com-interview-question
The sample question for Interview a job in Binary options
Stars: ✭ 52 (+126.09%)
Mutual labels:  bayesian, mcmc
Bayadera
High-performance Bayesian Data Analysis on the GPU in Clojure
Stars: ✭ 342 (+1386.96%)
Mutual labels:  bayesian, mcmc
Lda Topic Modeling
A PureScript, browser-based implementation of LDA topic modeling.
Stars: ✭ 91 (+295.65%)
Mutual labels:  bayesian, lda
Bayesplot
bayesplot R package for plotting Bayesian models
Stars: ✭ 276 (+1100%)
Mutual labels:  bayesian, mcmc
Probabilistic Models
Collection of probabilistic models and inference algorithms
Stars: ✭ 217 (+843.48%)
Mutual labels:  bayesian, mcmc
cmdstanr
CmdStanR: the R interface to CmdStan
Stars: ✭ 82 (+256.52%)
Mutual labels:  bayesian, mcmc
BayesHMM
Full Bayesian Inference for Hidden Markov Models
Stars: ✭ 35 (+52.17%)
Mutual labels:  bayesian, mcmc
Elfi
ELFI - Engine for Likelihood-Free Inference
Stars: ✭ 208 (+804.35%)
Mutual labels:  inference, bayesian
Ldagibbssampling
Open Source Package for Gibbs Sampling of LDA
Stars: ✭ 218 (+847.83%)
Mutual labels:  topic, lda

go-topics

A very basic implementation of LDA (Latent Dirichlet Allocation). It is by no means finished, but it may be useful as a starting point.

Usage

Create a processor from a set of transformations of the form func(word string) (new string, keep bool):

processor := topics.NewProcessor(
  topics.Transformations{
    topics.ToLower, 
    topics.Sanitize, 
    topics.MinLen, 
    topics.GetStopwordFilter("../stopwords/en")})
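
Because a transformation is just a function of the form func(word string) (new string, keep bool), a custom filter can presumably sit alongside the built-in ones. The following is a minimal sketch, assuming topics.Transformations accepts any function with that signature. DropNumeric is a hypothetical name and is not part of go-topics.

```go
package main

import (
	"fmt"
	"unicode"
)

// DropNumeric is a hypothetical transformation (not part of go-topics).
// It follows the signature described above: the word is returned unchanged,
// and keep is false when the word is empty or consists only of digits.
func DropNumeric(word string) (out string, keep bool) {
	if word == "" {
		return word, false
	}
	for _, r := range word {
		if !unicode.IsDigit(r) {
			// At least one non-digit rune: keep the word.
			return word, true
		}
	}
	return word, false
}

func main() {
	for _, w := range []string{"banana", "2021", "b12"} {
		out, keep := DropNumeric(w)
		fmt.Printf("%q -> %q keep=%v\n", w, out, keep)
	}
}
```

If the assumption holds, such a function would be listed in topics.Transformations{...} next to topics.ToLower and the others.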

Read data and apply transformations to build a corpus:

var docs = []string{
	"I like to eat broccoli and bananas.",
	"I ate a banana and spinach smoothie for breakfast.",
	"Chinchillas and kittens are cute.",
	"My sister adopted cute kittens yesterday.",
	"Look at this cute hamster munching on a piece of chinchillas.",
}

corpus, err := processor.AddStrings(topics.NewCorpus(), docs)
if err != nil {
	log.Fatal(err) // needs the standard "log" package
}
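
Conceptually, building the corpus means passing each word of each document through the transformations in order and discarding it as soon as one of them reports keep == false. The sketch below illustrates that idea only and is not go-topics' actual code. The Transformation type, Apply, toLower and minLen are hypothetical stand-ins, and both whitespace tokenization and the minimum length of 3 are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// Transformation mirrors the function signature described in this README.
type Transformation func(word string) (out string, keep bool)

// toLower and minLen are illustrative stand-ins, not the library's functions.
func toLower(word string) (string, bool) { return strings.ToLower(word), true }

// minLen keeps words of at least 3 bytes (threshold chosen for illustration).
func minLen(word string) (string, bool) { return word, len(word) >= 3 }

// Apply splits doc on whitespace and runs every word through ts in order.
// A word is dropped as soon as any transformation returns keep == false.
func Apply(doc string, ts []Transformation) []string {
	var kept []string
	for _, word := range strings.Fields(doc) {
		keep := true
		for _, t := range ts {
			if word, keep = t(word); !keep {
				break
			}
		}
		if keep {
			kept = append(kept, word)
		}
	}
	return kept
}

func main() {
	ts := []Transformation{toLower, minLen}
	fmt.Println(Apply("I like to eat Broccoli", ts)) // prints [like eat broccoli]
}
```

Order matters in such a chain: lowercasing before a stopword filter, for example, lets the filter match words regardless of their original case.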

Run LDA and print the results:

lda := topics.NewLDA(&topics.Configuration{Verbose: true, PrintInterval: 500, PrintNumWords: 8})
// Arguments: corpus, K (number of topics), α, β (Dirichlet distribution smoothing factors)
if err = lda.Init(corpus, 2, 0, 0); err != nil {
	log.Fatal(err) // needs the standard "log" package
}

if _, err = lda.Train(1000); err != nil { // number of iterations
	log.Fatal(err)
}
lda.PrintTopWords(5)

Resulting in something like:

Topic   Tokens  Words
0       9       like(1) eat(1) broccoli(1) bananas(1) ate(1)
1       14      cute(3) kittens(2) chinchillas(2) piece(1) look(1)
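
The α and β arguments to Init are Dirichlet smoothing terms. To show where such terms typically enter, here is a sketch of the per-topic weight used in collapsed Gibbs sampling for LDA, a standard inference method. This README does not say which sampler go-topics uses, so that is an assumption, and TopicWeights and its parameters are hypothetical names rather than library API. For topic k the unnormalized weight is (n_dk + α) · (n_kw + β) / (n_k + V·β), where n_dk counts tokens in the document assigned to k, n_kw counts occurrences of the word assigned to k, n_k is the total number of tokens in k, and V is the vocabulary size.

```go
package main

import "fmt"

// TopicWeights returns, for each topic k, the normalized probability of
// assigning the current word to k under collapsed Gibbs sampling.
// All counts must exclude the token being resampled. With alpha > 0 and
// beta > 0 every weight is positive, so the normalization is well defined.
func TopicWeights(docTopic, topicWord, topicTotal []int, vocabSize int, alpha, beta float64) []float64 {
	weights := make([]float64, len(docTopic))
	sum := 0.0
	for k := range docTopic {
		w := (float64(docTopic[k]) + alpha) *
			(float64(topicWord[k]) + beta) /
			(float64(topicTotal[k]) + float64(vocabSize)*beta)
		weights[k] = w
		sum += w
	}
	for k := range weights {
		weights[k] /= sum
	}
	return weights
}

func main() {
	// Two topics with made-up counts, a 20-word vocabulary, and small smoothing values.
	p := TopicWeights([]int{3, 0}, []int{2, 0}, []int{9, 14}, 20, 0.1, 0.01)
	fmt.Printf("%.3f\n", p)
}
```

Larger α spreads each document over more topics, and larger β spreads each topic over more words. Smaller values give sparser assignments.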