
iamaziz / ar-embeddings

Licence: other
Sentiment Analysis for Arabic Text (tweets, reviews, and standard Arabic) using word2vec

Programming Languages

python

Projects that are alternatives to or similar to ar-embeddings

tajmeeaton
A collection of projects, especially open-source ones, for advancing the Arabic language and the nation. 👨‍💻 👨‍🔬👨‍🏫🧕
Stars: ✭ 115 (+38.55%)
Mutual labels:  arabic, arabic-nlp
arabic-sentiment-analysis
Sentiment Analysis in Arabic tweets
Stars: ✭ 64 (-22.89%)
Mutual labels:  sentiment-analysis, arabic-nlp
ATKSpy
A Python package that supports a SOAP interface for communicating with the Microsoft ATKS
Stars: ✭ 27 (-67.47%)
Mutual labels:  arabic, arabic-nlp
BasicArabicOCR
A very basic Arabic OCR based on tesseract OCR engine written in Java.
Stars: ✭ 19 (-77.11%)
Mutual labels:  arabic, arabic-nlp
Persian-Sentiment-Analyzer
Persian sentiment analysis
Stars: ✭ 30 (-63.86%)
Mutual labels:  sentiment-analysis, embeddings
farasapy
A Python implementation of Farasa toolkit
Stars: ✭ 69 (-16.87%)
Mutual labels:  arabic, arabic-nlp
RadiologyReportEmbedding
Intelligent Word Embeddings of Free-Text Radiology Reports
Stars: ✭ 22 (-73.49%)
Mutual labels:  embeddings, word2vec-model
arabic-tagger
AQMAR Arabic Tagger: Sequence tagger with cost-augmented structured perceptron training
Stars: ✭ 38 (-54.22%)
Mutual labels:  arabic, arabic-nlp
SentimentAnalysis
Sentiment Analysis: Deep Bi-LSTM+attention model
Stars: ✭ 32 (-61.45%)
Mutual labels:  sentiment-analysis, embeddings
sentiment-analysis-of-tweets-in-russian
Sentiment analysis of tweets in Russian using Convolutional Neural Networks (CNN) with Word2Vec embeddings.
Stars: ✭ 51 (-38.55%)
Mutual labels:  sentiment-analysis, embeddings
ArSarcasm
This repository contains the Arabic sarcasm dataset (ArSarcasm)
Stars: ✭ 18 (-78.31%)
Mutual labels:  sentiment-analysis, arabic-nlp
Datastories Semeval2017 Task4
Deep-learning model presented in "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis".
Stars: ✭ 184 (+121.69%)
Mutual labels:  sentiment-analysis, embeddings
Camel tools
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
Stars: ✭ 124 (+49.4%)
Mutual labels:  sentiment-analysis, arabic
nmatheg
A simple strategy for training and finetuning NLP models for Arabic. Specify the parameters and just wait for the results. A simple design that makes use of the different tools in our NLP pipeline.
Stars: ✭ 19 (-77.11%)
Mutual labels:  arabic, arabic-nlp
Movie-Recommendation-System-with-Sentiment-Analysis
This is a Machine Learning project to create a "Movie Recommender System" and predict user ratings for movies using cosine similarity.
Stars: ✭ 21 (-74.7%)
Mutual labels:  sentiment-analysis
cskg
CSKG: The CommonSense Knowledge Graph
Stars: ✭ 86 (+3.61%)
Mutual labels:  embeddings
AI-Sentiment-Analysis-on-IMDB-Dataset
Sentiment Analysis using Stochastic Gradient Descent on 50,000 Movie Reviews Compiled from the IMDB Dataset
Stars: ✭ 55 (-33.73%)
Mutual labels:  sentiment-analysis
arabic-jekyll
Start blogging with Jekyll in moments, without touching the command line
Stars: ✭ 36 (-56.63%)
Mutual labels:  arabic
COVID19-FeedbackApplication
A simple application that collects feedback from a user and analyzes the text to predict its sentiment.
Stars: ✭ 13 (-84.34%)
Mutual labels:  sentiment-analysis
Multi-Hop-Knowledge-Paths-Human-Needs
Ranking and Selecting Multi-Hop Knowledge Paths to Better Predict Human Needs
Stars: ✭ 17 (-79.52%)
Mutual labels:  sentiment-analysis

Code, embeddings, and datasets used in the paper:

A. Altowayan and L. Tao "Word Embeddings for Arabic Sentiment Analysis", IEEE BigData 2016 Workshop

How to run:

Extract embeddings/arabic-news.tar.gz first, then run:

$ python asa.py --vectors embeddings/arabic-news.bin --dataset datasets/LABR-book-reviews.csv

[2017-04-08 12:59:20,387] INFO: loading projection weights from embeddings/arabic-news.bin
[2017-04-08 12:59:23,408] INFO: loaded (159175, 300) matrix from embeddings/arabic-news.bin
[2017-04-08 12:59:23,408] INFO: precomputing L2-norms of word weight vectors
[2017-04-08 12:59:24,525] INFO: dataset datasets/LABR-book-reviews.csv (16448, 2). Split: 14803 training and 1645 testing.
[2017-04-08 12:59:24,526] INFO: Tokenizing the training dataset ..
[2017-04-08 12:59:24,950] INFO:  ... total 927007 training tokens.
[2017-04-08 12:59:24,950] INFO: Tokenizing the testing dataset ..
[2017-04-08 12:59:25,003] INFO:  ... total 110705 testing tokens.
[2017-04-08 12:59:25,003] INFO: Vectorizing training tokens ..
[2017-04-08 12:59:27,414] INFO:  ... total 14803 training
[2017-04-08 12:59:27,415] INFO: Vectorizing testing tokens ..
[2017-04-08 12:59:27,723] INFO:  ... total 1645 testing
[2017-04-08 12:59:27,848] INFO: Done loading and vectorizing data.
[2017-04-08 12:59:27,848] INFO: --- Sentiment CLASSIFIERS ---
[2017-04-08 12:59:27,848] INFO: fitting ...
[2017-04-08 13:02:03,397] INFO: results ...
	MacAvg. 80.41% F1. 79.95% P. 81.37 R. 78.58 : LinearSVC
	MacAvg. 77.31% F1. 76.79% P. 78.10 R. 75.52 : RandomForestClassifier
	MacAvg. 63.93% F1. 57.42% P. 72.88 R. 47.37 : GaussianNB
	MacAvg. 80.84% F1. 80.50% P. 81.45 R. 79.56 : NuSVC
	MacAvg. 81.15% F1. 80.77% P. 81.89 R. 79.68 : LogisticRegressionCV
	MacAvg. 78.97% F1. 79.00% P. 78.34 R. 79.68 : SGDClassifier
[2017-04-08 13:02:03,397] INFO: DONE!
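
The result lines above can be sanity-checked with a few lines of Python. This is a minimal sketch, assuming that "P." and "R." stand for precision and recall and that "F1." is their harmonic mean; the log does not expand these abbreviations, and the "MacAvg." column is left out because the output does not say how it is computed. The names `reported`, `f1_score`, and `f1_gaps` are illustrative only and are not part of asa.py.

```python
# (precision, recall, F1) copied from the result lines above.
# Assumption: "P." = precision, "R." = recall, "F1." = their harmonic mean.
reported = {
    "LinearSVC": (81.37, 78.58, 79.95),
    "RandomForestClassifier": (78.10, 75.52, 76.79),
    "GaussianNB": (72.88, 47.37, 57.42),
    "NuSVC": (81.45, 79.56, 80.50),
    "LogisticRegressionCV": (81.89, 79.68, 80.77),
    "SGDClassifier": (78.34, 79.68, 79.00),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall, on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_gaps(rows):
    """Absolute difference between the reported and the recomputed F1 per classifier."""
    return {
        name: abs(f1_score(p, r) - f1)
        for name, (p, r, f1) in rows.items()
    }


if __name__ == "__main__":
    for name, (p, r, f1) in reported.items():
        print(f"{name:24s} reported F1 {f1:.2f}   recomputed {f1_score(p, r):.2f}")
```

Under that reading, every recomputed F1 agrees with the reported one to within rounding of the two-decimal precision and recall values. GaussianNB shows how a large gap between precision and recall pulls the harmonic mean well below their arithmetic mean.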

Dependencies:

See the requirements.txt file. To install the dependencies:

$ pip install -r requirements.txt
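
After installing, it can be useful to confirm that the listed packages are actually visible to the interpreter that will run asa.py. The following is a rough sketch using only the standard library (Python 3.8+). It assumes requirements.txt contains simple `name` or `name==version` style lines; URL or VCS requirements are not handled. It checks only that each package is installed, not that the version matches. The helper names `requirement_names` and `missing_requirements` are hypothetical and not part of this repository.

```python
import re
from importlib import metadata


def requirement_names(lines):
    """Extract bare package names from simple requirements-style lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_requirements(lines):
    """Return the listed package names that are not installed."""
    missing = []
    for name in requirement_names(lines):
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    with open("requirements.txt", encoding="utf-8") as handle:
        missing = missing_requirements(handle)
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it from the repository root with the same interpreter that will be used for asa.py.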
