gandalf1819 / Stock Market Sentiment Analysis

Licence: gpl-2.0

Identification of trends in a company's stock prices by performing fundamental analysis of the company. News articles were provided as training datasets to the model, which classified the articles as positive or negative. A sentiment score was computed as the difference between the number of positive and negative words present in a news article. The sentiment scores were then compared with the actual stock prices. Naive Bayes, OneR and Random Forest algorithms were used to observe the results of the model using Weka.

Labels

machine-learning stock-market random-forest

Stock-Market-Sentiment-Analysis

Stock prices are highly dynamic and prone to quick changes, partly because of the nature of the financial domain and partly because of the mix of known parameters (previous day's closing price, P/E ratio, etc.) and unknown factors (election results, rumors, etc.). An intelligent trader would predict the stock price and buy a stock before the price rises, or sell it before its value declines. Although it is very hard to replace the expertise of an experienced trader, an accurate prediction algorithm can translate directly into high profits for investment firms: the more accurate the prediction, the greater the profit made from using the algorithm.

In practice, there are two stock prediction methodologies:

Fundamental Analysis:

Performed by fundamental analysts, this method is concerned more with the company than with the actual stock. The analysts base their decisions on the past performance of the company, the earnings forecast, etc.

Technical Analysis:

Performed by technical analysts, this method determines the stock price from the past patterns of the stock (using time-series analysis).

When applying machine learning to stock data, we are more interested in technical analysis, to see whether our algorithm can accurately learn the underlying patterns in the stock time series. That said, machine learning can also play a major role in evaluating and forecasting the performance of the company and similar parameters that are helpful in fundamental analysis. In fact, the most successful automated stock prediction and recommendation systems use some form of hybrid analysis model involving both fundamental and technical analysis.

System Design

News Collection

We collected data on State Bank of India (SBI) for three months, from 1 Feb 2017 to 30 April 2017. This data includes news articles covering the company's major events, as well as daily stock prices of SBIN for the same period. Each daily stock price record contains six values: Open, High, Low, Close, Adjusted Close, and Volume. For consistency throughout the project, we used the Adjusted Close price as the stock price for each day. We collected this data from major news aggregators such as news.google.com, reuters.com, and finance.yahoo.com.
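As a rough illustration of using the Adjusted Close value as the daily price, the sketch below reads a price file and keeps only that column. The column names (`Date`, `Adj Close`) follow a Yahoo-Finance-style CSV layout and are an assumption, as are the function name and the sample rows, which are made up and are not the SBIN data.

```python
import csv
import io

# Made-up sample rows in an assumed Yahoo-Finance-style layout.
SAMPLE_CSV = """Date,Open,High,Low,Close,Adj Close,Volume
2017-02-01,100.0,102.0,99.0,101.0,100.5,1000
2017-02-02,101.0,103.0,100.0,102.0,101.5,1200
"""

def adjusted_close_by_date(csv_file):
    """Return {date: adjusted close} from an open CSV file object."""
    reader = csv.DictReader(csv_file)
    return {row["Date"]: float(row["Adj Close"]) for row in reader}

# io.StringIO stands in for a real file opened with open(path, newline="").
prices = adjusted_close_by_date(io.StringIO(SAMPLE_CSV))
print(prices)
```

Keying the prices by date makes it straightforward to line them up later with per-day sentiment scores.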

Pre Processing

Text data is unstructured, so we cannot feed raw text directly to a classifier. First, we tokenize each document into words so that we can operate at the word level. Text also contains noisy words that contribute nothing to classification, along with numbers, extra white space, tabs, punctuation characters, stop words, etc. We clean the data by removing all of these.

Sentiment Detection Algorithm

For automatic sentiment detection of news articles, we follow a dictionary-based approach that uses the bag-of-words technique for text mining. This method is based on the research of J. Bean in his implementation of Twitter sentiment analysis for airline companies. To build the polarity dictionary, we need two collections of words: positive words and negative words. We then match the article's words against both word lists, count how many words appear in each dictionary, and calculate the score of the document. We created the polarity dictionary from general words with positive and negative polarity.
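The cleaning steps from the Pre Processing section (tokenizing, then dropping numbers, punctuation, extra white space and stop words) can be sketched in Python as follows. This is a minimal sketch, not the project's own code: the function name and the tiny stop-word list are assumptions made for illustration.

```python
import re

# A tiny illustrative stop-word list (assumption); a real run would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "on", "for"}

def preprocess(text):
    """Lowercase, strip numbers and punctuation, tokenize, and drop stop words."""
    text = text.lower()
    # Replace every character that is not an ASCII letter or white space
    # (digits, punctuation, symbols) with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    # split() with no arguments collapses runs of spaces, tabs and newlines.
    tokens = text.split()
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess("Profit rose 12% in 2017,\tand the  outlook is strong!"))
# prints ['profit', 'rose', 'outlook', 'strong']
```

The regular expression keeps only ASCII letters, which suits English news text; articles in other scripts would need a different pattern.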

Algorithm:

Tokenize the document into a word vector.
Prepare the dictionary, which contains words with their polarity (positive or negative).
Check each word to see whether it matches a word in the positive-words dictionary or the negative-words dictionary.
Count the number of words belonging to positive and negative polarity.
Calculate the score of the document = count(pos.matches) - count(neg.matches).
If the score is 0 or more, we consider the document positive; otherwise, negative.
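The scoring steps above can be sketched in a few lines of Python. The two word sets here are hypothetical miniature dictionaries chosen for illustration, and the function names are assumptions; the project's actual polarity dictionary is not reproduced in this README.

```python
# Hypothetical miniature polarity dictionaries, for illustration only.
POSITIVE_WORDS = {"gain", "profit", "growth", "strong", "rise"}
NEGATIVE_WORDS = {"loss", "fraud", "decline", "weak", "fall"}

def sentiment_score(tokens):
    """Score = count of positive matches minus count of negative matches."""
    pos_matches = sum(1 for t in tokens if t in POSITIVE_WORDS)
    neg_matches = sum(1 for t in tokens if t in NEGATIVE_WORDS)
    return pos_matches - neg_matches

def classify(score):
    """A score of 0 or more is treated as positive, anything below as negative."""
    return "Positive" if score >= 0 else "Negative"

tokens = "strong profit growth despite one loss".split()
score = sentiment_score(tokens)
print(score, classify(score))  # prints: 2 Positive
```

Note that a score of exactly 0 falls on the positive side of the rule, so an article with no dictionary matches at all is also labelled positive.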

Classifier Learning

Prior research suggests that ZeroR, Random Forest and Naïve Bayes classification algorithms perform well in text classification, so we use all three to classify the text and check each algorithm's accuracy. We compare the results using accuracy, precision, recall and other model evaluation measures. All classification algorithms, including OneR (whose results are also reported below), were implemented and tested using the Weka tool.
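For reference, the three evaluation measures named above can be computed from a list of true labels and a list of predicted labels. The sketch below is a plain-Python illustration, not Weka's implementation; the function name is an assumption and the label lists are made up rather than taken from the project's results.

```python
def evaluate(actual, predicted, positive="pos"):
    """Return (accuracy, precision, recall) with `positive` as the target class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Made-up labels for illustration only.
actual = ["pos", "pos", "neg", "neg"]
predicted = ["pos", "neg", "neg", "pos"]
print(evaluate(actual, predicted))  # prints (0.5, 0.5, 0.5)
```

These measures need true labels for the test documents, so they apply only where the actual class is known.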

Graph Analysis

Sentiment Graph

Emotions Graph

Polarity Vs Date

Closing Price

Average Price

Weka Analysis

Results of Polarity Detection Algorithm for Test Dataset – April’17

Date        Score  Polarity
01-04-2017      1  Positive
05-04-2017     -3  Negative
19-04-2017      5  Positive
20-04-2017    -10  Negative
23-04-2017      2  Positive
24-04-2017      1  Positive
25-04-2017     -6  Negative
26-04-2017     17  Positive
27-04-2017      0  Positive
28-04-2017     -2  Negative
29-04-2017     -9  Negative
30-04-2017      6  Positive

Naïve Bayes Classification Results:

=== Predictions on test set ===

inst#     actual  predicted error prediction
    1        1:?      2:pos       1 
    2        1:?      2:pos       1 
    3        1:?      1:neg       1 
    4        1:?      2:pos       1 
    5        1:?      2:pos       1 
    6        1:?      1:neg       1 
    7        1:?      2:pos       1 
    8        1:?      2:pos       1 
    9        1:?      1:neg       0.996 
   10        1:?      2:pos       1 
   11        1:?      1:neg       1 
   12        1:?      2:pos       1

ZeroR Classification Results:

=== Predictions on test set ===

inst#     actual  predicted error prediction
    1        1:?      2:pos       0.653 
    2        1:?      2:pos       0.653 
    3        1:?      2:pos       0.653 
    4        1:?      2:pos       0.653 
    5        1:?      2:pos       0.653 
    6        1:?      2:pos       0.653 
    7        1:?      2:pos       0.653 
    8        1:?      2:pos       0.653 
    9        1:?      2:pos       0.653 
   10        1:?      2:pos       0.653 
   11        1:?      2:pos       0.653 
   12        1:?      2:pos       0.653
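ZeroR ignores all input attributes and always predicts the most frequent class seen in training, which is why every row in the ZeroR output above has the same label and the same value in the last column. A minimal sketch of that baseline follows; it uses the plain class frequency as the confidence, which is an assumption and may not match how Weka arrives at its value. The function name and the training labels are made up for illustration.

```python
from collections import Counter

def zero_r(train_labels):
    """Return (majority class, its share of the training labels)."""
    counts = Counter(train_labels)
    majority, count = counts.most_common(1)[0]
    return majority, count / len(train_labels)

# Made-up training labels for illustration only.
label, share = zero_r(["pos", "pos", "neg"])
print(label, round(share, 3))  # prints: pos 0.667

# Every test document receives the same prediction, whatever its text.
predictions = [label for _ in range(12)]
```

Because it uses no information from the text, ZeroR is mainly useful as a floor against which the other classifiers can be compared.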

OneR Classification Results:

=== Predictions on test set ===

inst#     actual  predicted error prediction
    1        1:?      2:pos       1 
    2        1:?      2:pos       1 
    3        1:?      1:neg       1 
    4        1:?      2:pos       1 
    5        1:?      2:pos       1 
    6        1:?      2:pos       1 
    7        1:?      1:neg       1 
    8        1:?      2:pos       1 
    9        1:?      2:pos       1 
   10        1:?      1:neg       1 
   11        1:?      1:neg       1 
   12        1:?      2:pos       1

Random Forest Classification Results:

=== Predictions on test set ===

inst#     actual  predicted error prediction
    1        1:?      2:pos       0.8 
    2        1:?      2:pos       0.72 
    3        1:?      2:pos       0.58 
    4        1:?      2:pos       0.65 
    5        1:?      2:pos       0.77 
    6        1:?      2:pos       0.68 
    7        1:?      2:pos       0.65 
    8        1:?      2:pos       0.73 
    9        1:?      2:pos       0.57 
   10        1:?      2:pos       0.75 
   11        1:?      1:neg       0.51 
   12        1:?      2:pos       0.77

Contributions

Please feel free to create a pull request to add implementations of the algorithms discussed in other frameworks or to improve the existing implementations.

Support

If you found this useful, please consider starring (★) the repo so that it can reach a broader audience.

License

This project is licensed under the MIT License
