
arranger1044 / spyn

License: GPL-3.0
Sum-Product Network learning routines in python

Programming Languages

python

Projects that are alternatives of or similar to spyn

ENCO
Official repository of the paper "Efficient Neural Causal Discovery without Acyclicity Constraints"
Stars: ✭ 52 (+147.62%)
Mutual labels:  structure-learning
gospn
A free, open-source inference and learning library for Sum-Product Networks (SPN)
Stars: ✭ 24 (+14.29%)
Mutual labels:  spn
deeprob-kit
A Python Library for Deep Probabilistic Modeling
Stars: ✭ 32 (+52.38%)
Mutual labels:  sum-product-networks
deepdb-public
Implementation of DeepDB: Learn from Data, not from Queries!
Stars: ✭ 61 (+190.48%)
Mutual labels:  sum-product-networks
Pgmpy
Python Library for learning (Structure and Parameter) and inference (Probabilistic and Causal) in Bayesian Networks.
Stars: ✭ 1,942 (+9147.62%)
Mutual labels:  structure-learning

spyn

An implementation of Sum-Product Networks (SPNs) in Python, providing routines for inference and learning.
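
As a quick illustration of what an SPN computes (not of spyn's own API: the class names below are made up for this example), the network is evaluated bottom-up, with leaves acting as univariate distributions, product nodes multiplying the values of their children, and sum nodes taking a weighted average of theirs. A minimal numpy sketch:

import numpy as np

# Illustrative SPN node classes -- hypothetical names, not the ones defined in spyn

class BernoulliLeaf:
    """Leaf node: a Bernoulli distribution over a single binary variable."""
    def __init__(self, var, p):
        self.var, self.p = var, p

    def log_value(self, x):
        return np.log(self.p) if x[self.var] == 1 else np.log(1.0 - self.p)

class ProductNode:
    """Product node: children are defined over disjoint sets of variables."""
    def __init__(self, children):
        self.children = children

    def log_value(self, x):
        return sum(child.log_value(x) for child in self.children)

class SumNode:
    """Sum node: a weighted mixture of children over the same variables."""
    def __init__(self, children, weights):
        self.children, self.weights = children, np.asarray(weights)

    def log_value(self, x):
        child_lls = np.array([child.log_value(x) for child in self.children])
        # log-sum-exp of log(w_i) + child log-likelihoods, for numerical stability
        return np.logaddexp.reduce(np.log(self.weights) + child_lls)

# A toy SPN over two binary variables X0 and X1
spn = SumNode(
    children=[
        ProductNode([BernoulliLeaf(0, 0.9), BernoulliLeaf(1, 0.2)]),
        ProductNode([BernoulliLeaf(0, 0.1), BernoulliLeaf(1, 0.7)]),
    ],
    weights=[0.6, 0.4],
)
print(spn.log_value(np.array([1, 0])))  # log-likelihood of the instance [1, 0]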

overview

Implementing LearnSPN and SPN-BTB as presented in:
A. Vergari, N. Di Mauro, and F. Esposito
Simplifying, Regularizing and Strengthening Sum-Product Network Structure Learning at ECML-PKDD 2015.
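
At a high level, LearnSPN builds the network top-down by recursively splitting the data matrix: if the variables of the current slice can be partitioned into (approximately) independent groups, a product node is created; otherwise the instances are clustered and a sum node is created; slices that become too small are turned into leaves. The following is only a rough, self-contained sketch of that recursion under simplifying assumptions (a pairwise G-test for the column split and plain KMeans for the row split), not spyn's actual implementation, which uses GMM/DPGMM/HOEM row clustering and supports Chow-Liu tree leaves:

import numpy as np
from sklearn.cluster import KMeans

def g_statistic(x, y, smooth=1.0):
    # G-test statistic for dependence between two binary columns,
    # with additive smoothing on the contingency table
    counts = np.array([[np.sum((x == a) & (y == b)) for b in (0, 1)]
                       for a in (0, 1)], dtype=float) + smooth
    expected = np.outer(counts.sum(axis=1), counts.sum(axis=0)) / counts.sum()
    return 2.0 * np.sum(counts * np.log(counts / expected))

def group_variables(data, threshold):
    # Connected components of the graph that links two columns
    # whenever their G-statistic exceeds the threshold
    remaining, groups = list(range(data.shape[1])), []
    while remaining:
        frontier, component = [remaining.pop(0)], []
        while frontier:
            i = frontier.pop()
            component.append(i)
            linked = [j for j in remaining
                      if g_statistic(data[:, i], data[:, j]) > threshold]
            for j in linked:
                remaining.remove(j)
            frontier.extend(linked)
        groups.append(sorted(component))
    return groups

def learn_spn(data, variables, g_factor=5.0, min_inst_slice=30, n_row_clusters=2):
    n, d = data.shape
    # Base case: slice too small or a single variable left -> smoothed Bernoulli leaves
    if n < min_inst_slice or d == 1:
        ps = (data.sum(axis=0) + 1.0) / (n + 2.0)
        return ('leaf', dict(zip(variables, ps)))
    # Try a product node: partition variables into (approximately) independent groups
    groups = group_variables(data, g_factor)
    if len(groups) > 1:
        return ('product',
                [learn_spn(data[:, g], [variables[i] for i in g],
                           g_factor, min_inst_slice, n_row_clusters)
                 for g in groups])
    # Otherwise a sum node: cluster the instances and recurse on each cluster
    labels = KMeans(n_clusters=n_row_clusters, n_init=10).fit_predict(data)
    if len(np.unique(labels)) == 1:  # clustering did not split: fall back to leaves
        ps = (data.sum(axis=0) + 1.0) / (n + 2.0)
        return ('leaf', dict(zip(variables, ps)))
    children, weights = [], []
    for k in np.unique(labels):
        rows = labels == k
        weights.append(rows.sum() / n)
        children.append(learn_spn(data[rows], variables,
                                  g_factor, min_inst_slice, n_row_clusters))
    return ('sum', weights, children)

# Toy data: X0 and X1 depend on a hidden factor, X2 is independent noise
rng = np.random.RandomState(0)
z = rng.randint(2, size=(500, 1))
data = np.hstack([z ^ (rng.rand(500, 1) < 0.1).astype(int),
                  z ^ (rng.rand(500, 1) < 0.1).astype(int),
                  rng.randint(2, size=(500, 1))])
print(learn_spn(data, variables=[0, 1, 2]))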

requirements

spyn is built upon numpy, sklearn, scipy, numba, matplotlib and theano.
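
Assuming a reasonably standard Python setup (the exact required versions are not listed here), the dependencies can usually be installed with pip; note that sklearn is distributed as the scikit-learn package:

pip install numpy scipy scikit-learn numba matplotlib theano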

usage

Several datasets are provided in the data/ folder.

To run the algorithms and their grid searches, check the scripts in the bin/ folder.
To learn a single SPN from the training set portion of the nltcs data you can call:

ipython -- bin/learnspn_exp.py nltcs

To get an overview of the possible parameters use -h:

-h, --help            show this help message and exit
-k [N_ROW_CLUSTERS], --n-row-clusters [N_ROW_CLUSTERS]
                      Number of clusters to split rows into (for DPGMM it is
                      the max num of clusters)
-c [CLUSTER_METHOD], --cluster-method [CLUSTER_METHOD]
                      Cluster method to apply on rows ["GMM"|"DPGMM"|"HOEM"]
--seed [SEED]         Seed for the random generator
-o [OUTPUT], --output [OUTPUT]
                      Output dir path
-g G_FACTOR [G_FACTOR ...], --g-factor G_FACTOR [G_FACTOR ...]
                      The "p-value like" for G-Test on columns
-i [N_ITERS], --n-iters [N_ITERS]
                      Number of iterations for the row clustering algo
-r [N_RESTARTS], --n-restarts [N_RESTARTS]
                      Number of restarts for the row clustering algo (only
                      for GMM)
-p CLUSTER_PENALTY [CLUSTER_PENALTY ...], --cluster-penalty CLUSTER_PENALTY [CLUSTER_PENALTY ...]
                      Penalty for the cluster number (i.e. alpha in DPGMM
                      and rho in HOEM, not used in GMM)
-s [SKLEARN_ARGS], --sklearn-args [SKLEARN_ARGS]
                      Additional sklearn parameters in the form of a list
                      "[name1=val1,..,namek=valk]"
-m MIN_INST_SLICE [MIN_INST_SLICE ...], --min-inst-slice MIN_INST_SLICE [MIN_INST_SLICE ...]
                      Min number of instances in a slice to split by cols
-a ALPHA [ALPHA ...], --alpha ALPHA [ALPHA ...]
                      Smoothing factor for leaf probability estimation
--clt-leaves          Whether to use Chow-Liu trees as leaves
-v [VERBOSE], --verbose [VERBOSE]
                      Verbosity level
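
For instance, a single run that uses Chow-Liu tree leaves and one specific value per hyperparameter (the values and output path below are purely illustrative) could look like:

ipython -- bin/learnspn_exp.py nltcs -k 2 -c GMM -g 10 -m 50 -a 0.1 --clt-leaves -o exp/learnspn-clt/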

To run a grid search over several hyperparameter values you can do:

ipython -- bin/learnspn_exp.py nltcs -k 2 -c GMM -g 5 10 15 20 -m 10 50 100 500 -a 0.1 0.2 0.5 1.0 2.0 -o exp/learnspn-f/