TropComplique / Lda2vec Pytorch

License: MIT
Topic modeling with word vectors

Projects that are alternatives to or similar to Lda2vec Pytorch

2018 Machinelearning Lectures Esa
Machine Learning Lectures at the European Space Agency (ESA) in 2018
Stars: ✭ 280 (+159.26%)
Mutual labels:  jupyter-notebook, topic-modeling
Learning Vis Tools
Learning Vis Tools: Tutorial materials for Data Visualization course at HKUST
Stars: ✭ 108 (+0%)
Mutual labels:  jupyter-notebook
Dwc
Darwin Core
Stars: ✭ 106 (-1.85%)
Mutual labels:  jupyter-notebook
Kalman And Bayesian Filters In Python
Kalman Filter book using Jupyter Notebook. Focuses on building intuition and experience, not formal proofs. Includes Kalman filters, extended Kalman filters, unscented Kalman filters, particle filters, and more. All exercises include solutions.
Stars: ✭ 11,233 (+10300.93%)
Mutual labels:  jupyter-notebook
Ml Ai Experiments
All my experiments with AI and ML
Stars: ✭ 107 (-0.93%)
Mutual labels:  jupyter-notebook
Getting Started With Google Bert
Build and train state-of-the-art natural language processing models using BERT
Stars: ✭ 107 (-0.93%)
Mutual labels:  jupyter-notebook
Texas Hold Em Ai
Research on Texas Hold'em AI
Stars: ✭ 107 (-0.93%)
Mutual labels:  jupyter-notebook
Py Wsi
Python package for dealing with whole slide images (.svs) for machine learning, particularly for fast prototyping. Includes patch sampling and storing using OpenSlide. Patches may be stored in LMDB, HDF5 files, or to disk. It is highly recommended to fork and download this repository so that personal customisations can be made for your work.
Stars: ✭ 107 (-0.93%)
Mutual labels:  jupyter-notebook
Robustness applications
Notebooks for reproducing the paper "Computer Vision with a Single (Robust) Classifier"
Stars: ✭ 108 (+0%)
Mutual labels:  jupyter-notebook
Numpy Ml
Machine learning, in numpy
Stars: ✭ 11,100 (+10177.78%)
Mutual labels:  topic-modeling
Prml
PRML algorithms implemented in Python
Stars: ✭ 10,206 (+9350%)
Mutual labels:  jupyter-notebook
Facemaskdetection
Open-source face mask detection model and data. Detect faces and determine whether people are wearing masks.
Stars: ✭ 1,677 (+1452.78%)
Mutual labels:  jupyter-notebook
Ganlocalediting
Stars: ✭ 108 (+0%)
Mutual labels:  jupyter-notebook
Aa228 Notebook
IJulia notebooks for AA228/CS238 Decision Making Under Uncertainty course at Stanford University
Stars: ✭ 107 (-0.93%)
Mutual labels:  jupyter-notebook
Ultra96 Pynq
Board files to build Ultra 96 PYNQ image
Stars: ✭ 108 (+0%)
Mutual labels:  jupyter-notebook
Tf Mrnn
Re-implementation of the m-RNN model using TensorFLow
Stars: ✭ 107 (-0.93%)
Mutual labels:  jupyter-notebook
Pyldavis
Python library for interactive topic model visualization. Port of the R LDAvis package.
Stars: ✭ 1,550 (+1335.19%)
Mutual labels:  jupyter-notebook
Tensorflow Examples
TensorFlow Tutorial and Examples for Beginners (support TF v1 & v2)
Stars: ✭ 41,480 (+38307.41%)
Mutual labels:  jupyter-notebook
Sw machine learning
machine learning
Stars: ✭ 108 (+0%)
Mutual labels:  jupyter-notebook
Ml Demos
Python code examples for the feedly Machine Learning blog (https://blog.feedly.com/category/all/Machine-Learning/)
Stars: ✭ 108 (+0%)
Mutual labels:  jupyter-notebook

lda2vec

A PyTorch implementation of Moody's lda2vec, a method for topic modeling with word embeddings.
The original paper: Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec.

Warning: I personally believe that it is quite hard to make the lda2vec algorithm work.
Sometimes it finds a couple of topics, sometimes it finds none, and usually many of the topics it does find are a total mess.
The algorithm is prone to poor local minima and depends greatly on the initial topic assignments.

For my results see 20newsgroups/explore_trained_model.ipynb. Also see Implementation details below.

Loss

The training proceeds as follows. First, convert the document corpus to a set of tuples
{(document id, word, the window around the word) | for each word in the corpus}.
Second, for each tuple, maximize the following objective function:

$$\sum_{i} \log \sigma(c^{\top} w_i) \;+\; \sum_{k} \log \sigma(-c^{\top} w_k) \;+\; \lambda \sum_{j} \log p_j, \qquad c = w + \sum_{j} p_j t_j$$

where:

  • c — the context vector,
  • w — the embedding vector of a word,
  • λ — a positive constant that controls sparsity,
  • i — runs over the window around the word,
  • k — runs over sampled negative words,
  • j — runs over topics,
  • p — the probability distribution over topics for the document,
  • t — the topic vectors.

When training, I also shuffle and batch the tuples.
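
As a rough illustration of this objective, here is a minimal PyTorch sketch for a single tuple. This is not the repo's actual utils/lda2vec_loss.py; the function name, the shapes, and the default value of lam are assumptions:

```python
import torch
import torch.nn.functional as F

def lda2vec_objective(word_vec, window_vecs, neg_vecs,
                      doc_topic_logits, topic_vecs, lam=1.0):
    """Objective for one (document id, word, window) tuple.

    word_vec:         (dim,)          embedding of the pivot word (w)
    window_vecs:      (win, dim)      embeddings of words in the window (w_i)
    neg_vecs:         (neg, dim)      embeddings of sampled negative words (w_k)
    doc_topic_logits: (n_topics,)     unnormalized topic weights of the document
    topic_vecs:       (n_topics, dim) topic vectors (t_j)
    lam:              the sparsity constant (lambda)
    """
    p = F.softmax(doc_topic_logits, dim=0)          # topic distribution p_j
    doc_vec = p @ topic_vecs                        # document vector = sum_j p_j * t_j
    c = word_vec + doc_vec                          # context vector c = w + doc_vec

    positive = F.logsigmoid(window_vecs @ c).sum()  # sum_i log sigmoid(c^T w_i)
    negative = F.logsigmoid(-(neg_vecs @ c)).sum()  # sum_k log sigmoid(-c^T w_k)
    sparsity = lam * torch.log(p + 1e-8).sum()      # lambda * sum_j log p_j

    return positive + negative + sparsity           # to be maximized
```

In practice one would maximize this by minimizing its negative over shuffled batches of tuples with a standard optimizer.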

How to use it

  1. Go to 20newsgroups/.
  2. Run get_windows.ipynb to prepare data.
  3. Run python train.py for training.
  4. Run explore_trained_model.ipynb.

To use this on your data you need to edit get_windows.ipynb. There are also hyperparameters in 20newsgroups/train.py, utils/training.py, and utils/lda2vec_loss.py.
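
For reference, preparing the data amounts to building the tuples described in the Loss section. A minimal sketch of the idea behind get_windows.ipynb (illustrative only; the function name and the half-window size hw are assumptions):

```python
def get_windows(tokenized_docs, hw=5):
    """Convert a corpus into (document id, word, window around the word) tuples.

    tokenized_docs: list of documents, each a list of tokens.
    hw: half-window size, i.e. up to hw tokens on each side of the pivot word.
    """
    tuples = []
    for doc_id, tokens in enumerate(tokenized_docs):
        for i, word in enumerate(tokens):
            window = tokens[max(0, i - hw):i] + tokens[i + 1:i + 1 + hw]
            tuples.append((doc_id, word, window))
    return tuples

# Example: get_windows([['topic', 'modeling', 'with', 'word', 'vectors']], hw=2)
# yields (0, 'with', ['topic', 'modeling', 'word', 'vectors']) among others.
```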

Implementation details

  • I use vanilla LDA to initialize lda2vec (the topic assignments for each document). This is not like in the original paper and not how it is supposed to work, but without it the results are quite bad.
    I also use a temperature to smooth the initialization, in the hope that lda2vec will then have a chance to find better topic assignments (see the sketch after this list).
  • I add noise to some gradients while training.
  • I reweight the loss according to document lengths.
  • Before training lda2vec, I train a 50-dimensional skip-gram word2vec model to initialize the word embeddings.
  • For text preprocessing:
    1. do word lemmatization
    2. remove rare and frequent words
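
For the temperature smoothing mentioned in the first bullet, the idea looks roughly like this (a sketch; doc_topic_dist would be the document-topic matrix from the vanilla LDA run, and the temperature value is an assumption):

```python
import numpy as np

def smooth_lda_init(doc_topic_dist, temperature=7.0):
    """Soften LDA's document-topic distributions before using them to
    initialize lda2vec, so the initial assignments are less confident.

    doc_topic_dist: (n_docs, n_topics), rows are probability distributions.
    temperature: values > 1 flatten the distributions.
    """
    logits = np.log(doc_topic_dist + 1e-10) / temperature
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)       # renormalize each row
```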

Requirements

  • pytorch 0.2, spacy 1.9, gensim 3.0
  • numpy, sklearn, tqdm
  • matplotlib, Multicore-TSNE