andreekeberg / Ml Classify Text Js

License: MIT
Machine learning based text classification in JavaScript using n-grams and cosine similarity

Projects that are alternatives to or similar to Ml Classify Text Js

Neuronblocks
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
Stars: ✭ 1,356 (+3468.42%)
Mutual labels:  artificial-intelligence, natural-language-processing, text-classification
awesome-text-classification
Text classification meets word embeddings.
Stars: ✭ 27 (-28.95%)
Mutual labels:  sentiment-analysis, text-classification, classification
Awesome Ai Services
An overview of the AI-as-a-service landscape
Stars: ✭ 133 (+250%)
Mutual labels:  artificial-intelligence, natural-language-processing, sentiment-analysis
Deep Atrous Cnn Sentiment
Deep-Atrous-CNN-Text-Network: End-to-end word level model for sentiment analysis and other text classifications
Stars: ✭ 64 (+68.42%)
Mutual labels:  classification, text-classification, sentiment-analysis
support-tickets-classification
This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Stars: ✭ 142 (+273.68%)
Mutual labels:  classifier, text-classification, classification
Machine Learning From Scratch
Succinct Machine Learning algorithm implementations from scratch in Python, solving real-world problems (Notebooks and Book). Examples of Logistic Regression, Linear Regression, Decision Trees, K-means clustering, Sentiment Analysis, Recommender Systems, Neural Networks and Reinforcement Learning.
Stars: ✭ 42 (+10.53%)
Mutual labels:  artificial-intelligence, classification, sentiment-analysis
Pyss3
A Python package implementing a new machine learning model for text classification with visualization tools for Explainable AI
Stars: ✭ 191 (+402.63%)
Mutual labels:  artificial-intelligence, natural-language-processing, text-classification
Text Analytics With Python
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer.
Stars: ✭ 1,132 (+2878.95%)
Mutual labels:  natural-language-processing, text-classification, sentiment-analysis
text2class
Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
Stars: ✭ 15 (-60.53%)
Mutual labels:  classifier, text-classification, classification
ML4K-AI-Extension
Use machine learning in AppInventor, with easy training using text, images, or numbers through the Machine Learning for Kids website.
Stars: ✭ 18 (-52.63%)
Mutual labels:  classifier, text-classification, classification
Nlp bahasa resources
A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia
Stars: ✭ 158 (+315.79%)
Mutual labels:  library, natural-language-processing, sentiment-analysis
Spacy
💫 Industrial-strength Natural Language Processing (NLP) in Python
Stars: ✭ 21,978 (+57736.84%)
Mutual labels:  artificial-intelligence, natural-language-processing, text-classification
Text Classification Keras
📚 Text classification library with Keras
Stars: ✭ 53 (+39.47%)
Mutual labels:  library, text-classification, sentiment-analysis
Ml
A high-level machine learning and deep learning library for the PHP language.
Stars: ✭ 1,270 (+3242.11%)
Mutual labels:  artificial-intelligence, classification, natural-language-processing
Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (+6526.32%)
Mutual labels:  natural-language-processing, sentiment-analysis, text-classification
Avalanche
Avalanche: an End-to-End Library for Continual Learning.
Stars: ✭ 151 (+297.37%)
Mutual labels:  artificial-intelligence, training, library
Textblob Ar
Arabic support for textblob
Stars: ✭ 60 (+57.89%)
Mutual labels:  natural-language-processing, text-classification, sentiment-analysis
COVID-19-Tweet-Classification-using-Roberta-and-Bert-Simple-Transformers
Rank 1 / 216
Stars: ✭ 24 (-36.84%)
Mutual labels:  sentiment-analysis, text-classification, classification
Text mining resources
Resources for learning about Text Mining and Natural Language Processing
Stars: ✭ 358 (+842.11%)
Mutual labels:  natural-language-processing, text-classification, sentiment-analysis
Nlp.js
An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and more
Stars: ✭ 4,670 (+12189.47%)
Mutual labels:  natural-language-processing, sentiment-analysis, classifier

📄 ClassifyText (JS)

Use machine learning to classify text using n-grams and cosine similarity.

A minimal library for both the browser and Node.js. It lets you train a model on a large number of text samples (with their corresponding labels), and then use that model to quickly predict one or more appropriate labels for new text samples.

Installation

Using npm

npm install ml-classify-text

Using yarn

yarn add ml-classify-text

Getting started

Import as an ES6 module

import Classifier from 'ml-classify-text'

Import as a CommonJS module

const { Classifier } = require('ml-classify-text')

Basic usage

Setting up a new Classifier instance

const classifier = new Classifier()

Training a model

let positive = [
    'This is great, so cool!',
    'Wow, I love it!',
    'It really is amazing',
]

let negative = [
    'This is really bad',
    'I hate it with a passion',
    'Just terrible!',
]

classifier.train(positive, 'positive')
classifier.train(negative, 'negative')

Getting a prediction

let predictions = classifier.predict('It sure is pretty great!')

if (predictions.length) {
    predictions.forEach(prediction => {
        console.log(`${prediction.label} (${prediction.confidence})`)
    })
} else {
    console.log('No predictions returned')
}

Returning:

positive (0.5423261445466404)
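
To show where a confidence value like this can come from, here is a minimal from-scratch sketch of cosine similarity over word-count vectors. It does not use the library and is not its actual implementation. The helper names (`termCounts`, `labelVector`, `cosineSimilarity`) are hypothetical, and the tokenization (lowercasing, stripping ASCII punctuation, splitting on whitespace) is an assumption. Under those assumptions it reproduces the value printed above for the training samples from this section.

```javascript
// Hypothetical helper: lowercase, strip punctuation, count single words.
// Assumption: ASCII-only tokenization, which is a simplification.
function termCounts(text) {
    const counts = new Map()
    const words = text
        .toLowerCase()
        .replace(/[^a-z0-9\s]/g, '')
        .split(/\s+/)
        .filter(Boolean)
    for (const word of words) {
        counts.set(word, (counts.get(word) || 0) + 1)
    }
    return counts
}

// Sum the word counts of every sample that shares a label.
function labelVector(samples) {
    const total = new Map()
    for (const sample of samples) {
        for (const [term, count] of termCounts(sample)) {
            total.set(term, (total.get(term) || 0) + count)
        }
    }
    return total
}

// Cosine similarity between two sparse count vectors stored as Maps.
function cosineSimilarity(a, b) {
    let dot = 0
    for (const [term, count] of a) {
        dot += count * (b.get(term) || 0)
    }
    const norm = vector =>
        Math.sqrt([...vector.values()].reduce((sum, x) => sum + x * x, 0))
    const denominator = norm(a) * norm(b)
    return denominator === 0 ? 0 : dot / denominator
}

const positiveVector = labelVector([
    'This is great, so cool!',
    'Wow, I love it!',
    'It really is amazing',
])
const negativeVector = labelVector([
    'This is really bad',
    'I hate it with a passion',
    'Just terrible!',
])

const input = termCounts('It sure is pretty great!')
const positiveScore = cosineSimilarity(input, positiveVector)
const negativeScore = cosineSimilarity(input, negativeVector)

console.log(positiveScore.toFixed(4)) // 0.5423
console.log(negativeScore.toFixed(4)) // 0.2582
```

In this sketch the input shares "it", "is" and "great" with the positive samples but only "it" and "is" with the negative ones, so the positive label scores higher. The words "sure" and "pretty" appear in no training sample; they still lengthen the input vector and so lower both scores.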

Advanced usage

Configuration

The following configuration options can be passed either directly to a new Model, or indirectly by passing them to the Classifier constructor.

Options

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| nGramMin | int | 1 | Minimum n-gram size |
| nGramMax | int | 1 | Maximum n-gram size |
| vocabulary | Array \| Set \| false | [] | Terms mapped to indexes in the model data; set to false to store terms directly in the data entries |
| data | Object | {} | Key-value store of labels and training data vectors |

Using n-grams

The default behavior is to split texts into single words (known as a bag of words, or unigrams).

This has limitations: because the order of words is ignored, phrases and expressions cannot be matched correctly.

This is where n-grams come in. When set to use more than one word per term, they act like a sliding window that moves across the text, producing continuous sequences of words of the specified length. This can greatly improve the accuracy of predictions.

Example of using n-grams with a size of 2 (bigrams)

const classifier = new Classifier({
    nGramMin: 2,
    nGramMax: 2
})

let tokens = classifier.tokenize('I really dont like it')

console.log(tokens)

Returning:

{
    'i really': 1,
    'really dont': 1,
    'dont like': 1,
    'like it': 1
}
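
The sliding window can be sketched in a few lines of plain JavaScript. This is an illustration only, not the library's tokenizer. `ngramCounts` is a hypothetical helper, and both the normalization step (lowercasing, stripping ASCII punctuation) and the handling of a range from `nMin` to `nMax` are assumptions.

```javascript
// Hypothetical helper: count every n-gram of nMin..nMax consecutive words.
function ngramCounts(text, nMin = 1, nMax = 1) {
    const words = text
        .toLowerCase()
        .replace(/[^a-z0-9\s]/g, '')
        .split(/\s+/)
        .filter(Boolean)
    const counts = new Map()
    for (let n = nMin; n <= nMax; n++) {
        // Slide a window of n words across the text, one word at a time.
        for (let start = 0; start + n <= words.length; start++) {
            const term = words.slice(start, start + n).join(' ')
            counts.set(term, (counts.get(term) || 0) + 1)
        }
    }
    return Object.fromEntries(counts)
}

// Bigrams only: same terms as the output shown above.
const bigrams = ngramCounts('I really dont like it', 2, 2)
console.log(bigrams)

// Unigrams and bigrams together: 5 single words + 4 pairs.
const mixed = ngramCounts('I really dont like it', 1, 2)
console.log(Object.keys(mixed).length) // 9
```

A text of five words yields four bigrams, because the window can start at each word except the last. Widening the range keeps the single words while adding the phrase-level terms.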

Serializing a model

After training a model on a large data set, you'll want to store the result, so that you can later set up a new model from the same training data and make predictions right away.

To do this, simply use the serialize method on your Model, and either save the data structure to a file, send it to a server, or store it in any other way you want.

let model = classifier.model

console.log(model.serialize())

Returning:

{
    nGramMin: 1,
    nGramMax: 1,
    vocabulary: [
        'this',    'is',      'great',
        'so',      'cool',    'wow',
        'i',       'love',    'it',
        'really',  'amazing', 'bad',
        'hate',    'with',    'a',
        'passion', 'just',    'terrible'
    ],
    data: {
        positive: {
            '0': 1, '1': 2, '2': 1,
            '3': 1, '4': 1, '5': 1,
            '6': 1, '7': 1, '8': 2,
            '9': 1, '10': 1
        },
        negative: {
            '0': 1, '1': 1, '6': 1,
            '8': 1, '9': 1, '11': 1,
            '12': 1, '13': 1, '14': 1,
            '15': 1, '16': 1, '17': 1
        }
    }
}
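
The keys in each `data` entry are indexes into `vocabulary`. The sketch below copies the structure shown above, round-trips it through JSON as you might when saving it to a file, and translates the indexes back into terms. It does not call the library, and `decodeEntry` is a hypothetical helper written for this illustration.

```javascript
// Copied from the serialized output above.
const serialized = {
    nGramMin: 1,
    nGramMax: 1,
    vocabulary: [
        'this', 'is', 'great', 'so', 'cool', 'wow',
        'i', 'love', 'it', 'really', 'amazing', 'bad',
        'hate', 'with', 'a', 'passion', 'just', 'terrible'
    ],
    data: {
        positive: {
            '0': 1, '1': 2, '2': 1, '3': 1, '4': 1, '5': 1,
            '6': 1, '7': 1, '8': 2, '9': 1, '10': 1
        },
        negative: {
            '0': 1, '1': 1, '6': 1, '8': 1, '9': 1, '11': 1,
            '12': 1, '13': 1, '14': 1, '15': 1, '16': 1, '17': 1
        }
    }
}

// Hypothetical helper: replace vocabulary indexes with the terms they stand for.
function decodeEntry(entry, vocabulary) {
    return Object.fromEntries(
        Object.entries(entry).map(([index, count]) => [vocabulary[Number(index)], count])
    )
}

// Round-trip through JSON, as when storing the model and loading it later.
const json = JSON.stringify(serialized)
const restored = JSON.parse(json)

const decodedPositive = decodeEntry(restored.data.positive, restored.vocabulary)
console.log(decodedPositive.it, decodedPositive.is, decodedPositive.great) // 2 2 1
```

Since the Configuration section lists `vocabulary` and `data` among the constructor options, a restored object like this can presumably be passed to a new Classifier to recreate the model. That step is left out here to keep the sketch self-contained.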

Documentation

Contributing

Read the contribution guidelines.

Changelog

Refer to the changelog for a full history of the project.

License

ClassifyText is licensed under the MIT license.