
willinseu / kesci-urdu-sentiment-analysis

Licence: other
sentiment-analysis

Programming Languages

Jupyter Notebook

Projects that are alternatives of or similar to kesci-urdu-sentiment-analysis

CharLSTM
Bidirectional Character LSTM for Sentiment Analysis - Tensorflow Implementation
Stars: ✭ 49 (-30%)
Mutual labels:  sentiment-analysis
MemNet ABSA
No description or website provided.
Stars: ✭ 20 (-71.43%)
Mutual labels:  sentiment-analysis
Stock-Prediction
LSTM RNN for sentiment-based stock prediction
Stars: ✭ 50 (-28.57%)
Mutual labels:  sentiment-analysis
textlytics
Text processing library for sentiment analysis and related tasks
Stars: ✭ 25 (-64.29%)
Mutual labels:  sentiment-analysis
NSP-BERT
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"
Stars: ✭ 166 (+137.14%)
Mutual labels:  sentiment-analysis
LSX
A word embeddings-based semi-supervised model for document scaling
Stars: ✭ 42 (-40%)
Mutual labels:  sentiment-analysis
PBAN-PyTorch
A Position-aware Bidirectional Attention Network for Aspect-level Sentiment Analysis, PyTorch implementation.
Stars: ✭ 33 (-52.86%)
Mutual labels:  sentiment-analysis
Persian-Sentiment-Analyzer
Persian sentiment analysis ( آناکاوی سهش های فارسی | تحلیل احساسات فارسی )
Stars: ✭ 30 (-57.14%)
Mutual labels:  sentiment-analysis
bert sa
bert sentiment analysis tensorflow serving with RESTful API
Stars: ✭ 35 (-50%)
Mutual labels:  sentiment-analysis
stock-news-sentiment-analysis
This program uses Vader SentimentIntensityAnalyzer to calculate the news headline overall sentiment for a stock
Stars: ✭ 21 (-70%)
Mutual labels:  sentiment-analysis
NewsMTSC
Target-dependent sentiment classification in news articles reporting on political events. Includes a high-quality data set of over 11k sentences and a state-of-the-art classification model.
Stars: ✭ 54 (-22.86%)
Mutual labels:  sentiment-analysis
lorca
Natural Language Processing for Spanish in Node.js. Stemmer, sentiment analysis, readability, tf-idf with batteries, concordance and more!
Stars: ✭ 95 (+35.71%)
Mutual labels:  sentiment-analysis
Sentic-GCN
[KBS] Aspect-based sentiment analysis via affective knowledge enhanced graph convolutional networks
Stars: ✭ 19 (-72.86%)
Mutual labels:  sentiment-analysis
DialogueCRN
Source code for ACL-IJCNLP 2021 paper "DialogueCRN: Contextual Reasoning Networks for Emotion Recognition in Conversations"
Stars: ✭ 29 (-58.57%)
Mutual labels:  sentiment-analysis
FinBERT
A Pretrained BERT Model for Financial Communications. https://arxiv.org/abs/2006.08097
Stars: ✭ 193 (+175.71%)
Mutual labels:  sentiment-analysis
LinLP
NLP practice in Python, including new-word discovery, topic models, hidden Markov model part-of-speech tagging, Word2Vec, and sentiment analysis
Stars: ✭ 43 (-38.57%)
Mutual labels:  sentiment-analysis
Stocksent
A Python library for sentiment analysis of various tickers from the latest news by trusted sources, and tools to plot results. 📈📊📰
Stars: ✭ 35 (-50%)
Mutual labels:  sentiment-analysis
billboard
🎤 Lyrics/associated NLP data for Billboard's Top 100, 1950-2015.
Stars: ✭ 53 (-24.29%)
Mutual labels:  sentiment-analysis
Movie-Recommendation-System-with-Sentiment-Analysis
Content based movie recommendation system with sentiment analysis
Stars: ✭ 44 (-37.14%)
Mutual labels:  sentiment-analysis
Aspect-Based-Sentiment-Analysis
No description or website provided.
Stars: ✭ 29 (-58.57%)
Mutual labels:  sentiment-analysis

kesci-urdu-sentiment-analysis

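This repository mainly records two approaches to the Kesci NLP practice competition [Roman Urdu DataSet]: classical machine-learning baselines and a deep-learning LSTM baseline.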

Competition link: https://www.kesci.com/home/competition/5c77ab9c1ce0af002b55af86/content/0

Some notes:

1.lstm.ipynb:

The LSTM submission scores around 0.83-0.84. A companion blog post explaining the LSTM notebook is at https://blog.csdn.net/ssswill/article/details/88533623

Epochs: 1 to 5 is enough.

2.SGD.ipynb:

SGD classifier baseline, leaderboard score (lb) = 0.8651.
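
For readers who have not opened the notebook, the general shape of such a baseline can be sketched as follows. This is only an illustration, assuming TF-IDF features fed to scikit-learn's `SGDClassifier`. The toy sentences, the label names, the character n-gram settings, and the hyperparameters are all invented for the example and are not taken from SGD.ipynb.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Toy Roman Urdu-style sentences and labels, invented for illustration only.
texts = [
    "bohat acha hai",
    "yeh bohat acha laga",
    "bilkul bura hai",
    "yeh bura laga bilkul",
]
labels = ["Positive", "Positive", "Negative", "Negative"]

# Character n-grams are one common choice for noisy, inconsistently spelled text;
# this is an assumption, not necessarily what the notebook does.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 3)),
    SGDClassifier(loss="hinge", random_state=0),  # linear SVM trained with SGD
)
model.fit(texts, labels)

# Predict labels for two unseen sentences.
predictions = model.predict(["acha hai", "bura hai"])
print(list(predictions))
```

With the real competition data, `texts` and `labels` would come from the training file, and the vectorizer and classifier settings would need tuning against a validation split.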

3.lgb.ipynb:

LightGBM baseline, lb = 0.8447, using Bayesian optimization to search for the LightGBM hyperparameters.

You can improve your score by building on this method.

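4.Some takeaways

1. Try some simpler models, such as naive Bayes or logistic regression; they may work better, at least judging by the results so far.
2. TF-IDF results can be improved further, for example by cleaning the sentences in more detail. I found some papers on cleaning Urdu text through Google; search for the keyword "roman urdu".
3. I think adding handling for emoji and emoticons would help.
4. Try CNN, BiLSTM, attention, and similar models.
5. This is only a practice competition, so I did not put much effort into it, but I believe a score above 0.9 is reachable with deeper work. Good luck~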
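
The cleaning and emoticon suggestions above could look roughly like the sketch below. It uses only the Python standard library. The `clean_text` function, the `EMOTICONS` map, and the `emo_pos`/`emo_neg` tokens are hypothetical names made up for this example, and the specific rules (lowercasing, squeezing repeated characters, dropping punctuation) are assumptions rather than steps taken from the notebooks.

```python
import re

# Hypothetical emoticon map; extend it with the symbols that actually occur in the data.
EMOTICONS = {
    ":)": " emo_pos ",
    ":-)": " emo_pos ",
    ":(": " emo_neg ",
    ":-(": " emo_neg ",
}


def clean_text(text: str) -> str:
    """Normalize one sentence before TF-IDF vectorization."""
    text = text.lower()
    # Replace emoticons with word tokens before punctuation is stripped.
    for symbol, token in EMOTICONS.items():
        text = text.replace(symbol, token)
    # Squeeze runs of three or more identical characters down to two.
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    # Keep lowercase letters, digits and underscores; everything else becomes a space.
    text = re.sub(r"[^a-z0-9_]+", " ", text)
    # Collapse whitespace.
    return " ".join(text.split())


print(clean_text("Bohat ACHAAAA hai :)"))  # bohat achaa hai emo_pos
```

Turning emoticons into ordinary tokens lets a bag-of-words model treat them as features instead of losing them when punctuation is removed.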

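5.Improved version

I used more advanced techniques to tackle an NLP problem on another dataset. If you are interested, see https://github.com/willinseu/kaggle-Jigsaw-Unintended-Bias-in-Toxicity-Classification-solution. Progress there is slow because I need to write companion blog posts to explain things properly, so feel free to watch that repository; I will update it gradually.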
