sndsabin / Nepali-News-Classifier

License: GPL-3.0
Text classification of Nepali-language documents. This mini project was done in partial fulfillment of the NLP course COMP 473.

16NepaliNews Corpus

The '16 Nepali News' data set is a collection of approximately 14,364 Nepali-language news documents, partitioned (unevenly) across 16 different news groups: Auto, Bank, Blog, Business Interview, Economy, Employment, Entertainment, Interview, Literature, National News, Opinion, Sports, Technology, Tourism, and World.

The '16 Nepali News' data set was inspired by the 20 Newsgroups dataset.

Loading the Corpus

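# load_mlcomp ships with older scikit-learn releases (it was removed in 0.21)
from sklearn.datasets import load_mlcomp

MLCOMPDIR = r'LOCATION OF CORPUS'

trainNews = load_mlcomp('16NepaliNews', 'train', mlcomp_root=MLCOMPDIR)
testNews = load_mlcomp('16NepaliNews', 'test', mlcomp_root=MLCOMPDIR)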
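
On scikit-learn releases that no longer provide load_mlcomp, the documents can be read directly from disk. The following is only a minimal sketch using the standard library. It assumes the corpus split is laid out as one sub-folder per category with one UTF-8 text file per document; that layout and the helper name load_category_folders are assumptions for illustration, not something this project documents.

```python
import os


def load_category_folders(root):
    """Read a folder-per-category corpus into parallel lists.

    Hypothetical helper: assumes root/<category>/<document file>
    with UTF-8 encoded text in every document file.
    """
    # Category names are the sub-folder names, sorted for stable label ids.
    target_names = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    data, target = [], []
    for label, category in enumerate(target_names):
        folder = os.path.join(root, category)
        for filename in sorted(os.listdir(folder)):
            path = os.path.join(folder, filename)
            if not os.path.isfile(path):
                continue  # skip nested folders
            with open(path, encoding="utf-8") as handle:
                data.append(handle.read())
            target.append(label)
    return data, target, target_names
```

The returned lists play the same role as the data, target and target_names attributes of the object returned by load_mlcomp. Unlike that loader, this sketch does not shuffle the documents, so they come back grouped by category.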

Or Manually Preparing Training and Test Set

news = load_mlcomp('16NepaliNews', 'raw', mlcomp_root=MLCOMPDIR)

# Training and test data: the first 90% of the documents are used for
# training, the remaining 10% for testing.
SPLIT_PERCENT = 0.9

splitSize = int(len(news.data) * SPLIT_PERCENT)
print(splitSize)  # number of training documents
xTrain = news.data[:splitSize]
xTest = news.data[splitSize:]
yTrain = news.target[:splitSize]
yTest = news.target[splitSize:]
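
The slicing above can also be wrapped in a small reusable function. The sketch below is an illustration, not part of the project: split_train_test and its seed parameter are hypothetical names. With no seed it reproduces the plain slice; with a seed it shuffles documents and labels together first, which matters if the loaded documents happen to be ordered by category.

```python
import random


def split_train_test(data, target, split_percent=0.9, seed=None):
    """Split parallel lists into (x_train, x_test, y_train, y_test).

    Hypothetical helper: when seed is None the original order is kept,
    otherwise the documents are shuffled reproducibly before splitting.
    """
    if len(data) != len(target):
        raise ValueError("data and target must have the same length")
    indices = list(range(len(data)))
    if seed is not None:
        random.Random(seed).shuffle(indices)
    split_size = int(len(indices) * split_percent)
    train_idx, test_idx = indices[:split_size], indices[split_size:]
    # Index documents and labels with the same positions to keep them paired.
    x_train = [data[i] for i in train_idx]
    x_test = [data[i] for i in test_idx]
    y_train = [target[i] for i in train_idx]
    y_test = [target[i] for i in test_idx]
    return x_train, x_test, y_train, y_test
```

For example, split_train_test(news.data, news.target, 0.9) would give the same four pieces as the slicing code, assuming news was loaded as shown.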

Executing the code

Before execution, copy the file 'nepali' into the stopwords directory inside the corpora folder of your nltk_data directory (nltk_data/corpora/stopwords).
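
The copy step can be scripted. This is a minimal sketch using only the standard library; install_stopwords is a hypothetical helper, and the location of nltk_data is passed in explicitly because it differs between machines.

```python
import os
import shutil


def install_stopwords(stopword_file, nltk_data_dir):
    """Copy a stop-word list into <nltk_data_dir>/corpora/stopwords.

    Hypothetical helper: creates the target folder if it is missing
    and returns the path of the copied file.
    """
    target_dir = os.path.join(nltk_data_dir, "corpora", "stopwords")
    os.makedirs(target_dir, exist_ok=True)
    destination = os.path.join(target_dir, os.path.basename(stopword_file))
    shutil.copyfile(stopword_file, destination)
    return destination
```

A call such as install_stopwords('nepali', os.path.expanduser('~/nltk_data')) assumes that nltk_data lives in your home directory; adjust the second argument to wherever NLTK stores its data on your system.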

License

The '16NepaliNews' corpus is licensed under GPLv3.

Author

sndsabin

This corpus was developed by parsing and scraping content published from 2015 onward on various online news portals. All news content belongs to its respective owners.