
WayneDW / Sentiment Analysis In Event Driven Stock Price Movement Prediction

Licence: mit
Use NLP to predict stock price movement associated with news

Programming Languages

python

Projects that are alternatives of or similar to Sentiment Analysis In Event Driven Stock Price Movement Prediction

Stock Prediction
Smart Algorithms to predict buying and selling of stocks on the basis of Mutual Funds Analysis, Stock Trends Analysis and Prediction, Portfolio Risk Factor, Stock and Finance Market News Sentiment Analysis and Selling profit ratio. Project developed as a part of NSE-FutureTech-Hackathon 2018, Mumbai. Team : Semicolon
Stars: ✭ 125 (-76.85%)
Mutual labels:  sentiment-analysis, stock-price-prediction
Stocksight
Stock market analyzer and predictor using Elasticsearch, Twitter, News headlines and Python natural language processing and sentiment analysis
Stars: ✭ 1,037 (+92.04%)
Mutual labels:  sentiment-analysis, stock-price-prediction
Steward
A stock portfolio manager that provides neural net based short-term predictions for stocks and natural language processing based analysis on community sentiments.
Stars: ✭ 25 (-95.37%)
Mutual labels:  sentiment-analysis, stock-price-prediction
Stock Market Prediction Web App Using Machine Learning And Sentiment Analysis
Stock Market Prediction Web App based on Machine Learning and Sentiment Analysis of Tweets (API keys included in code). The front end of the Web App is based on Flask and Wordpress. The App forecasts stock prices of the next seven days for any given stock under NASDAQ or NSE as input by the user. Predictions are made using three algorithms: ARIMA, LSTM, Linear Regression. The Web App combines the predicted prices of the next seven days with the sentiment analysis of tweets to give recommendation whether the price is going to rise or fall
Stars: ✭ 101 (-81.3%)
Mutual labels:  sentiment-analysis, stock-price-prediction
stocktwits-sentiment
Stocktwits market sentiment analysis in Python with Keras and TensorFlow.
Stars: ✭ 23 (-95.74%)
Mutual labels:  sentiment-analysis, stock-price-prediction
Troll
Language sentiment analysis and neural networks... for trolls.
Stars: ✭ 330 (-38.89%)
Mutual labels:  sentiment-analysis
Deep Convolution Stock Technical Analysis
Uses Deep Convolutional Neural Networks (CNNs) to model the stock market using technical analysis. Predicts the future trend of stock selections.
Stars: ✭ 407 (-24.63%)
Mutual labels:  stock-price-prediction
Aspect Based Sentiment Analysis
A paper list for aspect based sentiment analysis.
Stars: ✭ 311 (-42.41%)
Mutual labels:  sentiment-analysis
Informers
State-of-the-art natural language processing for Ruby
Stars: ✭ 306 (-43.33%)
Mutual labels:  sentiment-analysis
Rust Bert
Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
Stars: ✭ 510 (-5.56%)
Mutual labels:  sentiment-analysis
Nlp.js
An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and more
Stars: ✭ 4,670 (+764.81%)
Mutual labels:  sentiment-analysis
Text mining resources
Resources for learning about Text Mining and Natural Language Processing
Stars: ✭ 358 (-33.7%)
Mutual labels:  sentiment-analysis
Personality Detection
Implementation of a hierarchical CNN based model to detect Big Five personality traits
Stars: ✭ 338 (-37.41%)
Mutual labels:  sentiment-analysis
Sentimentanalysis
Text sentiment analysis
Stars: ✭ 421 (-22.04%)
Mutual labels:  sentiment-analysis
Sentimentr
Dictionary based sentiment analysis that considers valence shifters
Stars: ✭ 325 (-39.81%)
Mutual labels:  sentiment-analysis
Awesome Sentiment Analysis
Repository with all what is necessary for sentiment analysis and related areas
Stars: ✭ 459 (-15%)
Mutual labels:  sentiment-analysis
Awesome Sentiment Analysis
Reading list for Awesome Sentiment Analysis papers
Stars: ✭ 306 (-43.33%)
Mutual labels:  sentiment-analysis
Stock Prediction Models
Gathers machine learning and deep learning models for Stock forecasting including trading bots and simulations
Stars: ✭ 4,660 (+762.96%)
Mutual labels:  stock-price-prediction
Weibo Analyst
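A toolbox for analyzing Chinese social media (Weibo) comments. Features: 1. crawling Weibo comment data; 2. word segmentation and keyword extraction; 3. word clouds and word-frequency statistics; 4. sentiment analysis; 5. topic clustering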
Stars: ✭ 430 (-20.37%)
Mutual labels:  sentiment-analysis
Afinn
AFINN sentiment analysis in Python
Stars: ✭ 356 (-34.07%)
Mutual labels:  sentiment-analysis

Sentiment Analysis for Event-Driven Stock Prediction

Use natural-language processing (NLP) to predict stock price movement based on Reuters News

Website

You are welcome to visit our website: GolenRocks.me. The main purpose of this project is to connect Bayesian DNNs with stock price prediction based on news headlines.

Methodology

  1. Data Collection and Preprocessing

    1.1 crawl a ticker list to obtain the details of public companies

    1.2 crawl news from Reuters using BeautifulSoup

    1.3 crawl prices using urllib

  2. Feature Engineering (Tokenization)

    2.1 Unify word format: unify tense and singular & plural forms, remove punctuation and stop words

    2.2 Implement one-hot encoding

    2.3 Pad each word sequence (essentially a matrix) so that all inputs have the same dimensions

  3. Train a set of Bayesian Convolutional Neural Networks using Stochastic Gradient Langevin Dynamics (SGLD) for greater robustness

  4. Use the thinned models to predict the stock price movement implied by future news

Requirements

  • Python 3
  • PyTorch > 0.4
  • numpy
  • NLTK
  • Crawler tools
    • pip3 install lxml
    • pip3 install bs4
    • urllib (part of the Python 3 standard library, so no installation is needed)

Usage

Note: If you don't want to spend time crawling data and training the model, you can go directly to step 4.

1. Data collection

1.1 Download the ticker list from NASDAQ

$ ./crawler/all_tickers.py 20  # keep the top e.g. 20% marketcap companies
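The market-cap filter can be illustrated with a minimal sketch. The function name `keep_top_marketcap`, the `(ticker, market_cap)` layout, and the sample values are assumptions made for illustration; they are not taken from `all_tickers.py`.

```python
import math


def keep_top_marketcap(companies, top_percent):
    """Keep the top `top_percent` % of companies ranked by market cap.

    `companies` is a list of (ticker, market_cap) pairs. This layout is an
    assumption for illustration, not the crawler's actual data format.
    """
    if not 0 < top_percent <= 100:
        raise ValueError("top_percent must be in (0, 100]")
    ranked = sorted(companies, key=lambda pair: pair[1], reverse=True)
    # Round up so that a non-empty input always keeps at least one company.
    n_keep = math.ceil(len(ranked) * top_percent / 100)
    return ranked[:n_keep]


# Hypothetical tickers and market caps, for illustration only.
sample = [("AAA", 500.0), ("BBB", 50.0), ("CCC", 900.0), ("DDD", 5.0), ("EEE", 120.0)]
print(keep_top_marketcap(sample, 20))  # keeps only the largest company, CCC
```

Rounding up is a design choice here; a real script might round down or use a fixed cut-off instead.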

1.2 Use BeautifulSoup to crawl news headlines from Reuters

Note: you may need over one month to fetch the news you want.

Suppose we find a news story about the resignation of Baidu COO Lu Qi, published on May 18, 2018 at reuters.com.

We can use the following script to crawl it and save it to a local file:

$ ./crawler/reuters.py # links each news item to a company and a date, which is more precise than Bloomberg News

By brute-force iterating over company tickers and dates, we end up with a dataset of roughly 400,000 news items. Since a company may have several news items on a single day, the current version uses only the topStory item to train our models and ignores the others.
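The brute-force loop and the top-story filter might look roughly like the sketch below. It makes no network requests: `fetch_news` is a hypothetical callable standing in for the Reuters crawler, and the `headline` / `top_story` keys and the ticker names are assumed for illustration.

```python
from datetime import date, timedelta
from itertools import product


def date_range(start, end):
    """Yield every calendar day from start to end, inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


def select_top_stories(tickers, start, end, fetch_news):
    """Visit every (ticker, day) pair and keep only the top story, if any.

    `fetch_news(ticker, day)` is a hypothetical callable that returns a list
    of dicts with 'headline' and 'top_story' keys.
    """
    kept = []
    for ticker, day in product(tickers, date_range(start, end)):
        for item in fetch_news(ticker, day):
            if item["top_story"]:
                kept.append((ticker, day.isoformat(), item["headline"]))
                break  # at most one headline per ticker per day
    return kept


def fake_fetch(ticker, day):
    """Stand-in for the crawler: one ticker has news on one day."""
    if ticker == "TICK1" and day == date(2018, 5, 18):
        return [
            {"headline": "Some other story", "top_story": False},
            {"headline": "Top executive behind Baidu's artificial intelligence drive steps aside",
             "top_story": True},
        ]
    return []


stories = select_top_stories(["TICK1", "TICK2"], date(2018, 5, 17), date(2018, 5, 19), fake_fetch)
print(stories)
```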

1.3 Use urllib to crawl historical stock prices

Improvement: use the return normalized against the S&P 500 [5] instead of the raw return.

$ ./crawler/yahoo_finance.py # generate raw data: stockPrices_raw.json, containing open, close, ..., adjClose
$ ./create_label.py # use raw price data to generate stockReturns.json
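One plausible reading of the normalized return is the stock's return minus the S&P 500 return over the same period. The sketch below assumes that definition and a binary up/down label; the function names and prices are made up, and `create_label.py` may compute its labels differently.

```python
def simple_return(prev_close, close):
    """Fractional price change between two consecutive closes."""
    return (close - prev_close) / prev_close


def normalized_return(stock_prev, stock_close, index_prev, index_close):
    """Stock return minus the index return over the same period (assumed definition)."""
    return simple_return(stock_prev, stock_close) - simple_return(index_prev, index_close)


def movement_label(stock_prev, stock_close, index_prev, index_close):
    """Return 1 if the stock outperformed the index, otherwise 0."""
    return 1 if normalized_return(stock_prev, stock_close, index_prev, index_close) > 0 else 0


# Made-up prices: the stock gains 3% while the index gains 1%.
print(movement_label(100.0, 103.0, 2700.0, 2727.0))  # 1
# The stock rises 1%, but the index rises 2%, so the stock underperformed.
print(movement_label(100.0, 101.0, 2700.0, 2754.0))  # 0
```

The second case shows why normalization matters: a raw return would label the day as "up" even though the stock lagged the market.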

2. Feature engineering (Tokenization)

Unify the word format and project each word to a word vector, so that every sentence becomes a matrix.

Unifying the word format means: convert to lower case, remove punctuation, drop stop words, and unify tense and singular & plural forms.

$ ./tokenize_news.py
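A simplified sketch of these steps, using only the standard library, is shown below. The tiny stop-word set, the `<pad>` token, and the function names are assumptions for illustration. Unifying tense and plural forms needs a stemmer or lemmatizer (for example from NLTK) and is left out here, and `tokenize_news.py` itself may work differently.

```python
import string

# Tiny illustrative stop-word list; a real list would be much longer.
STOP_WORDS = {"a", "an", "the", "for", "in", "of", "to", "and"}

# Map every punctuation character to a space.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def tokenize(headline):
    """Lower-case, strip punctuation, and drop stop words."""
    words = headline.lower().translate(_PUNCT_TABLE).split()
    return [w for w in words if w not in STOP_WORDS]


def build_vocab(token_lists):
    """Assign an integer id to every token; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode_and_pad(tokens, vocab, max_len):
    """Turn tokens into ids, then truncate or pad to exactly max_len."""
    ids = [vocab[t] for t in tokens if t in vocab][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


tokens = tokenize("Google launches the second app in China, woos top smartphone market")
vocab = build_vocab([tokens])
print(tokens)
print(encode_and_pad(tokens, vocab, 12))
```

Each id can then be mapped to a one-hot or pretrained word vector, which turns the padded sequence into a fixed-size matrix.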

3. Train a Bayesian ConvNet to predict the stock price movement.

Type the following to train a set of robust Bayesian models.

$ ./main.py -epochs 500 -static False
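To give a feel for Stochastic Gradient Langevin Dynamics [3], here is a toy sketch that samples the posterior of a single Gaussian mean instead of ConvNet weights. Each step follows a minibatch estimate of the log-posterior gradient and adds Gaussian noise whose variance equals the step size; samples are kept after a burn-in period and thinned. All names and hyperparameters are assumptions for illustration and are unrelated to the settings in `main.py`.

```python
import random
from statistics import mean


def sgld_posterior_samples(data, n_steps=5000, step_size=1e-3, batch_size=10,
                           prior_var=100.0, noise_var=1.0, burn_in=1000,
                           thin=50, seed=0):
    """Sample the posterior of a Gaussian mean with SGLD.

    Model (assumed for this toy example): x ~ N(theta, noise_var),
    theta ~ N(0, prior_var).
    """
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for step in range(n_steps):
        batch = rng.sample(data, batch_size)
        grad_log_prior = -theta / prior_var
        # Rescale the minibatch gradient to estimate the full-data gradient.
        grad_log_lik = (n / batch_size) * sum((x - theta) / noise_var for x in batch)
        # Injected noise has variance equal to the step size.
        noise = rng.gauss(0.0, step_size ** 0.5)
        theta += 0.5 * step_size * (grad_log_prior + grad_log_lik) + noise
        # Keep every `thin`-th sample after burn-in.
        if step >= burn_in and (step - burn_in) % thin == 0:
            samples.append(theta)
    return samples


data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(100)]
samples = sgld_posterior_samples(data)
print(len(samples), round(mean(samples), 2), round(mean(data), 2))
```

In the Bayesian ConvNet setting, each kept sample would correspond to one saved set of network weights, and predictions are averaged over those saved models.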

4. Prediction and analysis

Let's look at one example: how the thinned models react to the resignation of Baidu's Lu Qi.

$ ./main.py -predict "Top executive behind Baidu's artificial intelligence drive steps aside"
>> Sell

The prediction makes sense. Let's try another one.

Eli Lilly and Co (LLY.N)
FRI, JUN 1 2018
UPDATE 2-Lilly gets U.S. nod for arthritis drug, sets price well below rivals
* Drug priced at $25,000/year, 60 pct lower to AbbVie's Humira
$ ./main.py -predict "UPDATE 2-Lilly gets U.S. nod for arthritis drug  sets price well below rivals"
>> Sell

Lowering drug prices may look helpful for gaining market share. However, the news says little about technological advances, so we are inclined to regard the move as fierce price competition, which does not help company earnings. Selling Eli Lilly stock is therefore not a bad decision.

Next, let's see what a buy signal looks like:

Alphabet Inc (GOOG.O)
WED, MAY 30 2018
Google launches the second app in China, woos top smartphone market
* BEIJING Alphabet Inc's Google has launched a file managing tool in several Chinese app stores as it 
* looks for fresh inroads into the world's biggest smartphone market, where most of the internet 
* giant's top products remain banned.
$ ./main.py -predict "Google launches the second app in China  woos top smartphone market"
>> Strong Buy

By now, you have a basic understanding of how the models work. Let's use backtesting to examine the performance on news from the past two weeks.

$ ./main.py -eval True
>> Testing    - loss: 0.6761  acc: 58.07%(41.8/72.0) 83.50%(3.3/3.9) 100.00%(0.0/0.0) 0.00%(0.0/0.0) 

Note: the predictions are averaged over the models, which explains the fractional counts. From left to right, the predictions become more and more confident. An accuracy of 58% is much higher than I expected, and I believe the performance would get worse when tested over a longer time horizon. However, as long as the predictions beat random guessing (50%), you are betting on a favorable game, which should pay off on average (assuming no trading costs or liquidity issues).
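The averaging and the confidence buckets can be sketched as follows. The probabilities, labels, and the 0.75 threshold are made up, and confidence is assumed to mean `max(p, 1 - p)` for the averaged probability `p` of an upward move; the thresholds actually used by `main.py` are not stated here.

```python
def average_predictions(per_model_probs):
    """Average P(up) across models; one inner list per model, one entry per headline."""
    n_models = len(per_model_probs)
    return [sum(column) / n_models for column in zip(*per_model_probs)]


def accuracy_at_confidence(avg_probs, labels, min_confidence):
    """Count (correct, total) over headlines whose confidence is at least min_confidence."""
    correct = total = 0
    for p, label in zip(avg_probs, labels):
        if max(p, 1.0 - p) >= min_confidence:
            total += 1
            correct += int((p > 0.5) == bool(label))
    return correct, total


# Hypothetical outputs of three thinned models on four headlines.
per_model = [
    [0.9, 0.4, 0.6, 0.2],
    [0.8, 0.6, 0.7, 0.1],
    [0.7, 0.2, 0.5, 0.3],
]
labels = [1, 0, 0, 0]  # 1 = price went up, 0 = price went down

avg = average_predictions(per_model)
print(accuracy_at_confidence(avg, labels, 0.5))   # all headlines: (3, 4)
print(accuracy_at_confidence(avg, labels, 0.75))  # confident ones only: (2, 2)
```

Raising the threshold shrinks the number of headlines that qualify, which matches the smaller denominators in the more confident buckets of the evaluation output.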

5. Future work

This is very rough work. A better label would compare the stock's price change with that of its industry rather than the S&P 500, which is similar in spirit to hedging.

According to Tim Loughran and Bill McDonald [4], some words strongly indicate positive or negative effects in finance, e.g. those related to company mergers and acquisitions. We therefore need to dig into these words to extract more information. Detailed analysis and comparison within each industry would also be useful.

Another simple but interesting example can be found in Financial Sentiment Analysis part1, part2.

Since a comprehensive stopword list is quite helpful in improving the prediction power, you are very welcome to build a better stopword list and share it.

References:

  1. Yoon Kim, Convolutional Neural Networks for Sentence Classification, EMNLP, 2014
  2. J Pennington, R Socher, CD Manning, GloVe: Global Vectors for Word Representation, EMNLP, 2014
  3. Max Welling, Yee Whye Teh, Bayesian Learning via Stochastic Gradient Langevin Dynamics, ICML, 2011
  4. Tim Loughran and Bill McDonald, 2011, “When is a Liability not a Liability? Textual Analysis, Dictionaries, and 10-Ks,” Journal of Finance, 66:1, 35-65.
  5. H Lee et al., On the Importance of Text Analysis for Stock Price Prediction, LREC, 2014
  6. Xiao Ding, Deep Learning for Event-Driven Stock Prediction, IJCAI, 2015
  7. Implementing a CNN for Text Classification in TensorFlow
  8. Keras predict sentiment-movie-reviews using deep learning
  9. Keras sequence-classification-lstm-recurrent-neural-networks
  10. tf-idf + t-sne
  11. Implementation of CNN in sequence classification
  12. Getting Started with Word2Vec and GloVe in Python
  13. PyTorch Implementation of Kim's Convolutional Neural Networks for Sentence Classification