JonathanRaiman / Pytreebank

License: MIT
😡😇 Stanford Sentiment Treebank loader in Python

SST Utils

Utilities for downloading, importing, and visualizing the Stanford Sentiment Treebank, a dataset capturing fine-grained sentiment over movie reviews. See examples below for usage. Tested in Python 3.4.3 and 2.7.12.

Jonathan Raiman, author

JavaScript code by Jason Chuang and Stanford NLP, adapted from the Stanford NLP Sentiment Analysis demo.

Visualization

Trees can be visualized inside an IPython notebook using Jason Chuang's JavaScript and CSS:

import pytreebank
# load the sentiment treebank corpus, stored in a parenthesized format,
# e.g. "(4 (2 very ) (3 good))"
dataset = pytreebank.load_sst()
# add the JavaScript and CSS to the IPython notebook
pytreebank.LabeledTree.inject_visualization_javascript()
# select an example to visualize
example = dataset["train"][0]
# display it in the page
example.display()

Example visualization using pytreebank
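
To make the parenthesized format concrete, here is a minimal, self-contained sketch of how a line such as "(4 (2 very ) (3 good))" can be read into a labeled tree. It does not use pytreebank and is not the library's own parser; the names parse_tree, _parse and labeled_spans are hypothetical helpers introduced only for illustration, and the sketch assumes every node starts with an integer label.

```python
def _parse(tokens, pos):
    """Parse one "(label ...)" node starting at tokens[pos].

    Returns the node and the index of the token after its closing ")".
    """
    if tokens[pos] != "(":
        raise ValueError("expected '(' at token %d" % pos)
    label = int(tokens[pos + 1])
    pos += 2
    words, children = [], []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = _parse(tokens, pos)
            children.append(child)
        else:
            words.append(tokens[pos])
            pos += 1
    # A leaf carries its own words; an inner node spans its children's text.
    text = " ".join(words) if words else " ".join(c["text"] for c in children)
    return {"label": label, "text": text, "children": children}, pos + 1


def parse_tree(line):
    """Turn one parenthesized line into a nested dict (hypothetical layout)."""
    tokens = line.replace("(", " ( ").replace(")", " ) ").split()
    tree, end = _parse(tokens, 0)
    if end != len(tokens):
        raise ValueError("trailing tokens after the root tree")
    return tree


def labeled_spans(node):
    """List (label, text) pairs for every node, root first."""
    spans = [(node["label"], node["text"])]
    for child in node["children"]:
        spans.extend(labeled_spans(child))
    return spans


tree = parse_tree("(4 (2 very ) (3 good))")
for label, text in labeled_spans(tree):
    print(label, text)
```

Each node keeps the label of its whole span, so the root here is labeled 4 for "very good" while its two children keep their own labels.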

Lines and Labels

To extract the labeled spans of a tree, call the to_labeled_lines or to_lines method of a LabeledTree. The first entry in each returned list is always the root sentence:

import pytreebank
dataset = pytreebank.load_sst()
example = dataset["train"][0]

# extract spans from the tree; labels are integers from 0 to 4
label_names = ["very negative", "negative", "neutral", "positive", "very positive"]
for label, sentence in example.to_labeled_lines():
    print("%s has sentiment label %s" % (sentence, label_names[label]))
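
Once spans are available as (label, sentence) pairs, a quick tally per label is often useful. The sketch below is one possible way to do that with the standard library. label_histogram is a hypothetical helper, not part of pytreebank, and the sample pairs are hand-written from the "(4 (2 very ) (3 good))" example rather than read from the corpus.

```python
from collections import Counter

LABEL_NAMES = ["very negative", "negative", "neutral", "positive", "very positive"]


def label_histogram(labeled_lines):
    """Count how many spans carry each of the five sentiment labels."""
    counts = Counter(label for label, _ in labeled_lines)
    # Report every label name, including those that never occur.
    return {name: counts.get(index, 0) for index, name in enumerate(LABEL_NAMES)}


# Hand-written pairs in the (label, sentence) shape, not real corpus output.
sample = [(4, "very good"), (2, "very"), (3, "good")]
print(label_histogram(sample))
```

The same function should accept the list returned by example.to_labeled_lines(), assuming it yields (label, sentence) pairs as in the loop above.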

Download and loading control

To change the directory that the data is saved to and loaded from, pass a path to load_sst. It looks for train.txt, dev.txt and test.txt under that directory.

dataset = pytreebank.load_sst("/path/to/sentiment/")

To just load a single dataset file:

train_data = pytreebank.import_tree_corpus("/path/to/sentiment/train.txt")
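
Because a custom directory is expected to hold train.txt, dev.txt and test.txt, it can help to check which of those files are present before loading. The following is a small sketch under that assumption. missing_split_files is a hypothetical helper, not a pytreebank function, and it makes no claim about what load_sst itself does when a file is absent.

```python
import tempfile
from pathlib import Path

# File names taken from the description of the custom directory layout.
EXPECTED_FILES = ("train.txt", "dev.txt", "test.txt")


def missing_split_files(directory):
    """Return the expected split files that are not present in `directory`."""
    root = Path(directory)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Demonstration on a throwaway directory that contains only train.txt.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "train.txt").write_text("(2 fine)\n")
    demo_missing = missing_split_files(tmp)

print(demo_missing)
```

An empty list means all three files were found under the directory.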