
mdibaiee / sibe

License: GPL-3.0
Experimental Haskell machine learning library

Programming Languages

haskell
shell

Projects that are alternatives to or similar to sibe

vector space modelling
NLP in Python: vector space modelling and document classification
Stars: ✭ 16 (-54.29%)
Mutual labels:  word2vec
NMFADMM
A sparsity aware implementation of "Alternating Direction Method of Multipliers for Non-Negative Matrix Factorization with the Beta-Divergence" (ICASSP 2014).
Stars: ✭ 39 (+11.43%)
Mutual labels:  word2vec
altair
Assessing Source Code Semantic Similarity with Unsupervised Learning
Stars: ✭ 42 (+20%)
Mutual labels:  word2vec
Embedding
A summary of embedding model code and study notes
Stars: ✭ 25 (-28.57%)
Mutual labels:  word2vec
FSCNMF
An implementation of "Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks".
Stars: ✭ 16 (-54.29%)
Mutual labels:  word2vec
sent2vec
How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.
Stars: ✭ 99 (+182.86%)
Mutual labels:  word2vec
fsauor2018
Fine-grained sentiment analysis of Chinese reviews based on an LSTM network with self-attention
Stars: ✭ 36 (+2.86%)
Mutual labels:  word2vec
cade
Compass-aligned Distributional Embeddings. Align embeddings from different corpora
Stars: ✭ 29 (-17.14%)
Mutual labels:  word2vec
SWDM
SIGIR 2017: Embedding-based query expansion for weighted sequential dependence retrieval model
Stars: ✭ 35 (+0%)
Mutual labels:  word2vec
Text-Analysis
Explaining textual analysis tools in Python. Including Preprocessing, Skip Gram (word2vec), and Topic Modelling.
Stars: ✭ 48 (+37.14%)
Mutual labels:  word2vec
codenames
Codenames AI using Word Vectors
Stars: ✭ 41 (+17.14%)
Mutual labels:  word2vec
word embedding
Sample code for training Word2Vec and FastText on a wiki corpus, plus their pretrained word embeddings.
Stars: ✭ 21 (-40%)
Mutual labels:  word2vec
spark-word2vec
A parallel implementation of word2vec based on Spark
Stars: ✭ 24 (-31.43%)
Mutual labels:  word2vec
Product-Categorization-NLP
Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).
Stars: ✭ 30 (-14.29%)
Mutual labels:  word2vec
wordfish-python
Extract relationships between standardized terms from a corpus of interest with deep learning 🐟
Stars: ✭ 19 (-45.71%)
Mutual labels:  word2vec
word2vec.r
📐Julia's implementation of word2vec in R
Stars: ✭ 23 (-34.29%)
Mutual labels:  word2vec
reach
Load embeddings and featurize your sentences.
Stars: ✭ 17 (-51.43%)
Mutual labels:  word2vec
go2vec
Read and use word2vec vectors in Go
Stars: ✭ 44 (+25.71%)
Mutual labels:  word2vec
game2vec
TensorFlow implementation of word2vec applied on https://www.kaggle.com/tamber/steam-video-games dataset, using both CBOW and Skip-gram.
Stars: ✭ 62 (+77.14%)
Mutual labels:  word2vec
Persian-Sentiment-Analyzer
Persian sentiment analysis ( آناکاوی سهش های فارسی | تحلیل احساسات فارسی )
Stars: ✭ 30 (-14.29%)
Mutual labels:  word2vec

sibe

A simple Machine Learning library.

Simple neural network

import Numeric.Sibe
import Numeric.LinearAlgebra (vector, toList) -- hmatrix vectors; harmless if Numeric.Sibe already re-exports these
import Data.Default.Class (def)               -- default Session values

let a = (sigmoid, sigmoid') -- activation function
    -- random network, seed 0, values between -1 and 1,
    -- two inputs, two nodes in hidden layer and a single output
    rnetwork = randomNetwork 0 (-1, 1) 2 [(2, a)] (1, a)

    -- inputs and labels
    inputs = [vector [0, 1], vector [1, 0], vector [1, 1], vector [0, 0]]
    labels = [vector [1], vector [1], vector [0], vector [0]]

    -- define the session which includes parameters
    session = def { network = rnetwork
                  , learningRate = 0.5
                  , epochs = 1000
                  , training = zip inputs labels
                  , test = zip inputs labels
                  , drawChart = True
                  , chartName = "nn.png" -- draws chart of loss over time
                  } :: Session

    initialCost = crossEntropy session

-- run gradient descent
-- you can also use `sgd`, see the notmnist example
newsession <- run gd session

let results = map (`forward` newsession) inputs
    rounded = map (map round . toList) results

    cost = crossEntropy newsession

putStrLn $ "- initial cost (cross-entropy): " ++ show initialCost
putStrLn $ "- actual result: " ++ show results
putStrLn $ "- rounded result: " ++ show rounded
putStrLn $ "- cost (cross-entropy): " ++ show cost
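For reference, the cross-entropy cost reported above can be written for the binary case as a plain function. This is a generic sketch of the standard formula, not sibe's internal `crossEntropy` (which takes a `Session`):

```haskell
-- Average binary cross-entropy over a batch.
-- ys are target labels (0 or 1); as are network outputs in (0, 1).
-- A generic sketch of the standard formula, not sibe's own definition.
binaryCrossEntropy :: [Double] -> [Double] -> Double
binaryCrossEntropy ys as = negate (sum terms) / fromIntegral (length ys)
  where
    terms = zipWith (\y a -> y * log a + (1 - y) * log (1 - a)) ys as
```

For an untrained network that outputs 0.5 everywhere, this evaluates to log 2 ≈ 0.69, a useful sanity check against the initial cost printed above.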

Examples

# neural network examples
stack exec example-xor
stack exec example-424
# notMNIST dataset, achieves ~87.5% accuracy after 9 epochs
stack exec example-notmnist

# Naive Bayes document classifier on the Reuters dataset,
# using Porter stemming, stopword elimination and a few custom techniques.
# The dataset is imbalanced, which biases the classifier towards some classes (earn, acq, ...).
# To work around this, the --top-ten option classifies only the 10 most popular classes,
# with evenly split datasets (100 documents each); this significantly increases the
# F-measure, along with roughly 10% better accuracy.
# N-grams don't seem to help much here (or maybe my implementation is wrong!): using
# bigrams increases accuracy while slightly decreasing the F-measure.
stack exec example-naivebayes-doc-classifier -- --verbose
stack exec example-naivebayes-doc-classifier -- --verbose --top-ten
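The core of a multinomial Naive Bayes classifier like the one above fits in a few lines. This is a minimal illustration with add-one smoothing over whitespace tokens, not the code behind example-naivebayes-doc-classifier (which additionally does stemming and stopword removal):

```haskell
import qualified Data.Map.Strict as M
import Data.List (maximumBy, nub)
import Data.Ord (comparing)

type Class = String

-- Pick the class maximising log prior + summed log-likelihoods,
-- with add-one (Laplace) smoothing over the training vocabulary.
-- A minimal sketch, not sibe's example-naivebayes-doc-classifier.
classify :: [(Class, String)] -> String -> Class
classify docs text = maximumBy (comparing score) classes
  where
    classes   = nub (map fst docs)
    vocab     = fromIntegral . length . nub . concatMap (words . snd) $ docs
    docsOf c  = [d | (c', d) <- docs, c' == c]
    counts c  = M.fromListWith (+) [(w, 1 :: Int) | w <- concatMap words (docsOf c)]
    total c   = fromIntegral (sum (M.elems (counts c)))
    prior c   = fromIntegral (length (docsOf c)) / fromIntegral (length docs)
    pWord c w = (fromIntegral (M.findWithDefault 0 w (counts c)) + 1) / (total c + vocab)
    score c   = log (prior c) + sum [log (pWord c w) | w <- words text]
```

Smoothing matters here: without the +1, any test word unseen in a class would send that class's score to negative infinity.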

notMNIST

notMNIST dataset, sigmoid hidden layer, cross-entropy loss, learning rate decay and SGD (notmnist.hs).

notMNIST dataset, ReLU hidden layer, cross-entropy loss, learning rate decay and SGD (notmnist.hs).
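The learning rate decay used in these runs is typically an inverse-time schedule; here is a sketch of one common form (sibe's exact schedule may differ):

```haskell
-- Inverse-time learning-rate decay: the step size shrinks as epochs pass.
-- One common schedule; sibe's actual decay implementation may differ.
decayedRate :: Double  -- initial learning rate
            -> Double  -- decay constant
            -> Int     -- current epoch
            -> Double
decayedRate lr0 k t = lr0 / (1 + k * fromIntegral t)
```

At epoch 0 this returns the initial rate unchanged; with a decay constant of 0.5, the rate has halved by epoch 2.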

Word2Vec

word2vec on a very small sample text:

the king loves the queen
the queen loves the king
the dwarf hates the king
the queen hates the dwarf
the dwarf poisons the king
the dwarf poisons the queen
the man loves the woman
the woman loves the man
the thief hates the man
the woman hates the thief
the thief robs the man
the thief robs the woman

The computed vectors are transformed to two dimensions using PCA:

king and queen relate to man and woman; love and hate are close to each other; and dwarf and thief relate to poisons and robs. Also, dwarf is close to queen and king, while thief is closer to man and woman. the doesn't relate to anything.
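Closeness claims like these are usually checked with cosine similarity between word vectors (before or after PCA); here is a generic helper, not part of sibe's API:

```haskell
-- Cosine similarity between two vectors: 1 for parallel vectors,
-- 0 for orthogonal ones. A generic helper, not part of sibe's API.
cosineSimilarity :: [Double] -> [Double] -> Double
cosineSimilarity xs ys = dot xs ys / (norm xs * norm ys)
  where
    dot a b = sum (zipWith (*) a b)
    norm a  = sqrt (dot a a)
```

Two words "have a relation" in the sense above when their vectors score close to 1.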

You can reproduce this result using these parameters:

let session = def { learningRate = 0.1
                  , batchSize = 1
                  , epochs = 10000
                  , debug = True
                  } :: Session
    w2v = def { docs = ds
              , dimensions = 30
              , method = SkipGram
              , window = 2
              , w2vDrawChart = True
              , w2vChartName = "w2v.png"
              } :: Word2Vec
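With `method = SkipGram` and `window = 2`, training examples are (center, context) pairs drawn from words within two positions of each other. A sketch of that pair extraction (sibe's internal generation may differ):

```haskell
-- Enumerate (center, context) skip-gram pairs from one sentence,
-- pairing each word with every other word within `win` positions.
-- A sketch of the idea; sibe's internal pair generation may differ.
skipGramPairs :: Int -> [String] -> [(String, String)]
skipGramPairs win ws =
  [ (ws !! i, ws !! j)
  | i <- [0 .. n - 1]
  , j <- [max 0 (i - win) .. min (n - 1) (i + win)]
  , j /= i
  ]
  where n = length ws
```

On "the king loves the queen" with a window of 2, the first the pairs with king and loves, while loves pairs with all four other words.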

This is a very small development dataset; the model still needs to be tested on larger datasets.
