
msahamed / yelp_comments_classification_nlp

Licence: other
Yelp round-10 review comments classification using deep learning (LSTM and CNN) and natural language processing.

Programming Languages

Jupyter Notebook

Projects that are alternatives of or similar to yelp comments classification nlp

Chameleon recsys
Source code of CHAMELEON - A Deep Learning Meta-Architecture for News Recommender Systems
Stars: ✭ 202 (+180.56%)
Mutual labels:  word-embeddings, lstm-neural-networks
NTUA-slp-nlp
💻Speech and Natural Language Processing (SLP & NLP) Lab Assignments for ECE NTUA
Stars: ✭ 19 (-73.61%)
Mutual labels:  word-embeddings, lstm-neural-networks
contextualLSTM
Contextual LSTM for NLP tasks like word prediction and word embedding creation for Deep Learning
Stars: ✭ 28 (-61.11%)
Mutual labels:  word-embeddings, lstm-neural-networks
Deep-Learning
This repo provides projects on deep-learning mainly using Tensorflow 2.0
Stars: ✭ 22 (-69.44%)
Mutual labels:  lstm-neural-networks
context2vec
PyTorch implementation of context2vec from Melamud et al., CoNLL 2016
Stars: ✭ 18 (-75%)
Mutual labels:  word-embeddings
Naive-Resume-Matching
Text Similarity Applied to resume, to compare Resumes with Job Descriptions and create a score to rank them. Similar to an ATS.
Stars: ✭ 27 (-62.5%)
Mutual labels:  word-embeddings
YelpDatasetSQL
Working with the Yelp Dataset in Azure SQL and SQL Server
Stars: ✭ 16 (-77.78%)
Mutual labels:  yelp-dataset
materials-synthesis-generative-models
Public release of data and code for materials synthesis generation
Stars: ✭ 47 (-34.72%)
Mutual labels:  word-embeddings
textlytics
Text processing library for sentiment analysis and related tasks
Stars: ✭ 25 (-65.28%)
Mutual labels:  word-embeddings
SentimentAnalysis
Sentiment Analysis: Deep Bi-LSTM+attention model
Stars: ✭ 32 (-55.56%)
Mutual labels:  word-embeddings
object-tracking
Multiple Object Tracking System in Keras + (Detection Network - YOLO)
Stars: ✭ 89 (+23.61%)
Mutual labels:  lstm-neural-networks
A-Deep-Learning-Based-Illegal-Insider-Trading-Detection-and-Prediction-Technique-in-Stock-Market
Illegal insider trading of stocks is based on releasing non-public information (e.g., new product launch, quarterly financial report, acquisition or merger plan) before the information is made public. Detecting illegal insider trading is difficult due to the complex, nonlinear, and non-stationary nature of the stock market. In this work, we pres…
Stars: ✭ 66 (-8.33%)
Mutual labels:  lstm-neural-networks
codenames
Codenames AI using Word Vectors
Stars: ✭ 41 (-43.06%)
Mutual labels:  word-embeddings
wikidata-corpus
Train Wikidata with word2vec for word embedding tasks
Stars: ✭ 109 (+51.39%)
Mutual labels:  word-embeddings
SWDM
SIGIR 2017: Embedding-based query expansion for weighted sequential dependence retrieval model
Stars: ✭ 35 (-51.39%)
Mutual labels:  word-embeddings
lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (-62.5%)
Mutual labels:  word-embeddings
word embedding
Sample code for training Word2Vec and FastText using wiki corpus and their pretrained word embedding..
Stars: ✭ 21 (-70.83%)
Mutual labels:  word-embeddings
Word-recognition-EmbedNet-CAB
Code implementation for our ICPR, 2020 paper titled "Improving Word Recognition using Multiple Hypotheses and Deep Embeddings"
Stars: ✭ 19 (-73.61%)
Mutual labels:  word-embeddings
TimeSeriesPrediction
Time Series Prediction, Stateful LSTM; 时间序列预测,洗发水销量/股票走势预测,有状态循环神经网络
Stars: ✭ 34 (-52.78%)
Mutual labels:  lstm-neural-networks
nyyelp
predicting yelp review rating using recurrent neural networks
Stars: ✭ 20 (-72.22%)
Mutual labels:  yelp-dataset

Classify Yelp reviews

Classify Yelp round-10 reviews/comments

Basic Information:

In this project, I classify the Yelp round-10 review dataset. The reviews contain a lot of metadata that can be mined to infer meaning, business attributes, and sentiment. For simplicity, I classify the review comments into two classes: positive or negative. Reviews with more than three stars are regarded as positive, while reviews with three stars or fewer are regarded as negative. The problem is therefore a supervised binary classification task. To build and train the models, I first tokenize the text and convert it to integer sequences. Each review comment is limited to 50 words: texts shorter than 50 words are padded with zeros, and longer ones are truncated. After processing the review comments, I trained three models in three different ways:
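The preprocessing described above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the helper names (`label_from_stars`, `build_vocab`, `to_padded_sequence`) are hypothetical, tokenization is a simple whitespace split, and padding and truncation are applied at the end of the sequence, which the text does not specify (Keras's `pad_sequences` pads and truncates at the start by default).

```python
from collections import Counter

MAX_LEN = 50  # sequence length stated in the text


def label_from_stars(stars):
    """Return 1 (positive) for more than three stars, otherwise 0 (negative)."""
    return 1 if stars > 3 else 0


def build_vocab(texts):
    """Map each word to an integer id >= 1, most frequent first.

    Id 0 is reserved for padding.
    """
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}


def to_padded_sequence(text, vocab, max_len=MAX_LEN):
    """Convert a text to exactly max_len ids (assumption: pad/truncate at the end)."""
    ids = [vocab[w] for w in text.lower().split() if w in vocab]
    ids = ids[:max_len]                       # truncate long reviews
    return ids + [0] * (max_len - len(ids))   # zero-pad short reviews


if __name__ == "__main__":
    reviews = [("Great food and friendly staff", 5), ("Slow service", 2)]
    vocab = build_vocab(text for text, _ in reviews)
    for text, stars in reviews:
        seq = to_padded_sequence(text, vocab)
        print(label_from_stars(stars), len(seq), seq[:6])
```

Every review ends up as a fixed-length list of 50 integers with a 0/1 label, which is the shape an embedding layer expects as input.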

  • Model-1: A neural network with an LSTM layer and a single embedding layer.
  • Model-2: Model-1 with an extra 1D convolutional layer added on top of the LSTM layer to reduce the training time.
  • Model-3: The same network architecture as Model-2, but with the pre-trained 100-dimensional GloVe word embeddings as the initial input.
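For Model-3, pre-trained vectors have to be aligned with the tokenizer's word ids before they can initialise an embedding layer. The sketch below shows one common way to do that with NumPy; it is an assumption about the approach, not the project's code. The function names are hypothetical, the input is assumed to be in the plain-text GloVe format (a word followed by its space-separated vector components on each line), and words without a pre-trained vector are left as zero rows.

```python
import numpy as np

EMBEDDING_DIM = 100  # dimension of the GloVe vectors mentioned in the text


def parse_glove_lines(lines):
    """Parse lines of the form 'word v1 v2 ... vN' into a word -> vector dict."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return vectors


def build_embedding_matrix(word_index, vectors, dim=EMBEDDING_DIM):
    """Build a (vocab_size + 1, dim) matrix; row 0 stays zero for padding."""
    matrix = np.zeros((len(word_index) + 1, dim), dtype="float32")
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None and vec.shape == (dim,):
            matrix[idx] = vec  # copy the pre-trained vector into the word's row
    return matrix


if __name__ == "__main__":
    # Tiny 3-dimensional stand-in for a real GloVe file.
    sample = ["good 0.1 0.2 0.3", "bad -0.1 -0.2 -0.3"]
    vectors = parse_glove_lines(sample)
    matrix = build_embedding_matrix({"good": 1, "tasty": 2}, vectors, dim=3)
    print(matrix.shape)
```

With a real GloVe file, the same `parse_glove_lines` call can be given an open file handle, and the resulting matrix can serve as the initial weights of the embedding layer.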

    Since there are about 1.6 million input comments, the models take a while to train. To reduce the training time, I limit training to three epochs. After three epochs, it is evident that Model-2 is better in terms of both training time and validation accuracy.

    Code and Libraries

    The project requires Python 2.7 or 3; I used Python 3.0. The following Python libraries are also required:

  • NumPy
  • Pandas
  • Matplotlib
  • Scikit-learn
  • NLTK
  • Plotly
  • Keras

    Word embeddings

  • GloVe
  • word2vec

    The datasets are not included in this project due to their size.

    The Yelp dataset is a subset of Yelp's businesses, reviews, and user data, provided for personal, educational, and academic purposes. It is available as both JSON and SQL files and can be used to teach students about databases, to learn NLP, or as sample production data while learning how to build mobile apps.

    Contributors

    Sabber Ahamed

    License

    MIT
