muatik / Naive Bayes Classifier

Licence: MIT
yet another general purpose naive bayesian classifier.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Naive Bayes Classifier

Gpstuff
GPstuff - Gaussian process models for Bayesian analysis
Stars: ✭ 106 (-34.57%)
Mutual labels:  bayesian
Hbayesdm
Hierarchical Bayesian modeling of RLDM tasks, using R & Python
Stars: ✭ 124 (-23.46%)
Mutual labels:  bayesian
Awesome Decision Tree Papers
A collection of research papers on decision, classification and regression trees with implementations.
Stars: ✭ 1,908 (+1077.78%)
Mutual labels:  classifier
Tensorflow Object Detection Tutorial
The purpose of this tutorial is to learn how to install and prepare the TensorFlow framework to train your own convolutional neural network object detection classifier for multiple objects, starting from scratch.
Stars: ✭ 113 (-30.25%)
Mutual labels:  classifier
Bayesiantracker
Bayesian multi-object tracking
Stars: ✭ 121 (-25.31%)
Mutual labels:  bayesian
Naivebayes
📊 Naive Bayes classifier for JavaScript
Stars: ✭ 127 (-21.6%)
Mutual labels:  classifier
Url Classification
Machine learning to classify malicious (spam) vs. benign URLs
Stars: ✭ 95 (-41.36%)
Mutual labels:  classifier
Emlearn
Machine Learning inference engine for Microcontrollers and Embedded devices
Stars: ✭ 154 (-4.94%)
Mutual labels:  classifier
Statistical Rethinking
An interactive online reading of McElreath's Statistical Rethinking
Stars: ✭ 123 (-24.07%)
Mutual labels:  bayesian
Modelselection
Tutorial on model assessment, model selection and inference after model selection
Stars: ✭ 139 (-14.2%)
Mutual labels:  bayesian
Psycho.r
An R package for experimental psychologists
Stars: ✭ 113 (-30.25%)
Mutual labels:  bayesian
Keras transfer cifar10
Object classification with CIFAR-10 using transfer learning
Stars: ✭ 120 (-25.93%)
Mutual labels:  classifier
Dl Uncertainty
"What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?", NIPS 2017 (unofficial code).
Stars: ✭ 130 (-19.75%)
Mutual labels:  bayesian
Sytora
A sophisticated smart symptom search engine
Stars: ✭ 111 (-31.48%)
Mutual labels:  classifier
Scene Text Recognition
Scene text detection and recognition based on Extremal Regions (ER)
Stars: ✭ 146 (-9.88%)
Mutual labels:  classifier
Monkeylearn
⛔️ ARCHIVED ⛔️ 🐒 R package for text analysis with Monkeylearn 🐒
Stars: ✭ 95 (-41.36%)
Mutual labels:  classifier
Digit Recognizer
A Machine Learning classifier for recognizing the digits for humans.
Stars: ✭ 126 (-22.22%)
Mutual labels:  classifier
Speech signal processing and classification
Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification, that is, developing two-class classifiers which can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of the speech production system in humans suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) constitute a first choice for modeling the magnitude of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the system (e.g., vocal tract) contribution and that of the excitation. Taking into account the characteristics of the human ear, the mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, the perceptual linear prediction coefficients (PLPs) could also be derived. The aforementioned, so to speak traditional, features will be tested against agnostic features extracted by convolutional neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian mixture model based classifiers, K-nearest neighbor classifiers, Bayes classifiers, as well as deep neural networks. The Massachusetts Eye and Ear Infirmary dataset (MEEI-Dataset) [5] will be exploited. At the application level, a library for feature extraction and classification in Python will be developed. Credible publicly available resources will be used toward achieving our goal, such as KALDI. Comparisons will be made against [6-8].
Stars: ✭ 155 (-4.32%)
Mutual labels:  classifier
Miscellaneous R Code
Code that might be useful to others for learning/demonstration purposes, specifically along the lines of modeling and various algorithms. Now almost entirely superseded by the models-by-example repo.
Stars: ✭ 146 (-9.88%)
Mutual labels:  bayesian
Pecan
The Predictive Ecosystem Analyzer (PEcAn) is an integrated ecological bioinformatics toolbox.
Stars: ✭ 132 (-18.52%)
Mutual labels:  bayesian

# Naive Bayesian Classifier

yet another general purpose Naive Bayesian classifier.

## Installation

You can install this package using the following pip command:

$ sudo pip install naiveBayesClassifier

## Example

"""
Suppose you have some texts of news and know their categories.
You want to train a system with this pre-categorized/pre-classified 
texts. So, you have better call this data your training set.
"""
from naiveBayesClassifier import tokenizer
from naiveBayesClassifier.trainer import Trainer
from naiveBayesClassifier.classifier import Classifier

newsTrainer = Trainer(tokenizer.Tokenizer(stop_words=[], signs_to_remove=["?!#%&"]))

# You need to train the system by passing each text to the trainer, one by one.
newsSet = [
    {'text': 'not to eat too much is not enough to lose weight', 'category': 'health'},
    {'text': 'Russia is trying to invade Ukraine', 'category': 'politics'},
    {'text': 'do not neglect exercise', 'category': 'health'},
    {'text': 'Syria is the main issue, Obama says', 'category': 'politics'},
    {'text': 'eat to lose weight', 'category': 'health'},
    {'text': 'you should not eat much', 'category': 'health'}
]

for news in newsSet:
    newsTrainer.train(news['text'], news['category'])

# When you have sufficient training data, you are almost done and can start to use
# a classifier.
newsClassifier = Classifier(newsTrainer.data, tokenizer.Tokenizer(stop_words=[], signs_to_remove=["?!#%&"]))

# Now you have a classifier which you can use to classify news text whose
# category is not yet known.
unknownInstance = "Even if I eat too much, is it not possible to lose some weight"
classification = newsClassifier.classify(unknownInstance)

# The classification variable holds the possible categories sorted by
# their probability values.
print(classification)

Note: You will certainly need much more training data than the amount in the example above. A few lines of text, as shown here, are nowhere near a sufficient training set.
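If you have a larger corpus on disk, you can stream it into the trainer the same way. Below is a minimal sketch; the file name news_corpus.csv and its two-column (text, category) layout are hypothetical stand-ins for whatever data you actually have.

import csv

from naiveBayesClassifier import tokenizer
from naiveBayesClassifier.trainer import Trainer

newsTrainer = Trainer(tokenizer.Tokenizer(stop_words=[], signs_to_remove=["?!#%&"]))

# Each row of the (hypothetical) CSV file holds one training text and its category.
with open('news_corpus.csv', newline='') as f:
    for text, category in csv.reader(f):
        newsTrainer.train(text, category)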

## What is the Naive Bayes Theorem and Classifier?

There is no need to explain everything once again here. Instead, one of the most eloquent explanations is quoted below.

The following explanation is quoted from another Bayes classifier written in Go.

BAYESIAN CLASSIFICATION REFRESHER: suppose you have a set of classes (e.g. categories) C := {C_1, ..., C_n}, and a document D consisting of words D := {W_1, ..., W_k}. We wish to ascertain the probability that the document belongs to some class C_j given some set of training data associating documents and classes.

By Bayes' Theorem, we have that

P(C_j|D) = P(D|C_j)*P(C_j)/P(D).

The LHS is the probability that the document belongs to class C_j given the document itself (by which is meant, in practice, the word frequencies occurring in this document), and our program will calculate this probability for each j and spit out the most likely class for this document.

P(C_j) is referred to as the "prior" probability, or the probability that a document belongs to C_j in general, without seeing the document first. P(D|C_j) is the probability of seeing such a document, given that it belongs to C_j. Here, by assuming that words appear independently in documents (this being the "naive" assumption), we can estimate

P(D|C_j) ~= P(W_1|C_j)*...*P(W_k|C_j)

where P(W_i|C_j) is the probability of seeing the given word in a document of the given class. Finally, P(D) can be seen as merely a scaling factor and is not strictly relevant to classification, unless you want to normalize the resulting scores and actually see probabilities. In this case, note that

P(D) = SUM_j(P(D|C_j)*P(C_j))

One practical issue with performing these calculations is the possibility of float64 underflow when calculating P(D|C_j), as individual word probabilities can be arbitrarily small, and a document can have an arbitrarily large number of them. A typical method for dealing with this case is to transform the probability to the log domain and perform additions instead of multiplications:

log P(C_j|D) ~ log(P(C_j)) + SUM_i(log P(W_i|C_j))

where i = 1, ..., k. Note that by doing this, we are discarding the scaling factor P(D) and our scores are no longer probabilities; however, the monotonic relationship of the scores is preserved by the log function.
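
To make the quoted refresher concrete, here is a small self-contained sketch (independent of this package) that estimates the priors and word likelihoods from counts and scores a document in the log domain. The add-one (Laplace) smoothing is an extra practical guard so that unseen words do not produce log(0); it is not part of the quoted text.

import math
from collections import Counter, defaultdict

def train_counts(samples):
    """samples: iterable of (words, category) pairs."""
    class_docs = Counter()              # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocab = set()
    for words, category in samples:
        class_docs[category] += 1
        word_counts[category].update(words)
        vocab.update(words)
    return class_docs, word_counts, vocab

def log_scores(words, class_docs, word_counts, vocab):
    """Return classes sorted by log P(C_j) + SUM_i log P(W_i|C_j)."""
    total_docs = sum(class_docs.values())
    scores = {}
    for c, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)   # log prior
        total_words = sum(word_counts[c].values())
        for w in words:
            # add-one smoothing keeps every word probability strictly positive
            p = (word_counts[c][w] + 1) / (total_words + len(vocab))
            score += math.log(p)
        scores[c] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

data = [("eat less to lose weight".split(), "health"),
        ("parliament votes on the budget".split(), "politics")]
print(log_scores("lose weight fast".split(), *train_counts(data)))

Because only the ranking matters, the discarded scaling factor P(D) never needs to be computed unless you want normalized probabilities.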

If you are very curious about the Naive Bayes theorem, you may find the following list helpful:

## Improvements

This classifier uses a very simple tokenizer, which is just a module that splits sentences into words. If your training set is large, you can rely on the available tokenizer; otherwise you need a better tokenizer, specialized to the language of your training texts, as in the sketch below.
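
The sketch below assumes, without verifying, that Trainer and Classifier only ever call tokenize(text) on the object they are given; check the package source before relying on that interface.

import re

from naiveBayesClassifier.trainer import Trainer

class SimpleTokenizer(object):
    """A hypothetical drop-in replacement for the bundled tokenizer."""

    def __init__(self, stop_words=()):
        self.stop_words = set(stop_words)

    def tokenize(self, text):
        # lowercase, keep alphanumeric runs, drop stop words
        words = re.findall(r"[a-z0-9']+", text.lower())
        return [w for w in words if w not in self.stop_words]

newsTrainer = Trainer(SimpleTokenizer(stop_words=["the", "a", "is"]))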

## TODO

- inline docs
- unit-tests

## AUTHORS

- muatik
