All Projects → Gustrigos → Eigen-Portfolio

Gustrigos / Eigen-Portfolio

Licence: other
Unsupervised machine learning Principal Component Analysis (PCA) on the Dow Jones Industrial Average index and it's respective 30 stocks to construct an optimized diversified intelligent portfolio.

Programming Languages

Jupyter Notebook
11667 projects
python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Eigen-Portfolio

Quantlib Swig
QuantLib wrappers to other languages
Stars: ✭ 176 (+225.93%)
Mutual labels:  quantitative-finance
Tf Quant Finance
High-performance TensorFlow library for quantitative finance.
Stars: ✭ 2,925 (+5316.67%)
Mutual labels:  quantitative-finance
esapp
An unsupervised Chinese word segmentation tool.
Stars: ✭ 13 (-75.93%)
Mutual labels:  unsupervised-learning
Finance
Here you can find all the quantitative finance algorithms that I've worked on and refined over the past year!
Stars: ✭ 194 (+259.26%)
Mutual labels:  quantitative-finance
Optimalportfolio
An open source library for portfolio optimisation
Stars: ✭ 228 (+322.22%)
Mutual labels:  quantitative-finance
Stock Bot
An application that allows you to design and test your own stock trading algorithms in an attempt to beat the market.
Stars: ✭ 240 (+344.44%)
Mutual labels:  quantitative-finance
Presentations
Slide show presentations regarding data driven investing.
Stars: ✭ 162 (+200%)
Mutual labels:  quantitative-finance
Joint-Motion-Estimation-and-Segmentation
[MICCAI'18] Joint Learning of Motion Estimation and Segmentation for Cardiac MR Image Sequences
Stars: ✭ 45 (-16.67%)
Mutual labels:  unsupervised-learning
Quantlib
The QuantLib C++ library
Stars: ✭ 2,944 (+5351.85%)
Mutual labels:  quantitative-finance
TA3N
[ICCV 2019 Oral] TA3N: https://github.com/cmhungsteve/TA3N (Most updated repo)
Stars: ✭ 45 (-16.67%)
Mutual labels:  unsupervised-learning
Mlfinlab
MlFinLab helps portfolio managers and traders who want to leverage the power of machine learning by providing reproducible, interpretable, and easy to use tools.
Stars: ✭ 2,676 (+4855.56%)
Mutual labels:  quantitative-finance
Alphatools
Quantitative finance research tools in Python
Stars: ✭ 226 (+318.52%)
Mutual labels:  quantitative-finance
Trading Backtest
A stock backtesting engine written in modern Java. And a pairs trading (cointegration) strategy implementation using a bayesian kalman filter model
Stars: ✭ 247 (+357.41%)
Mutual labels:  quantitative-finance
Quant Notes
Quantitative Interview Preparation Guide, updated version here ==>
Stars: ✭ 180 (+233.33%)
Mutual labels:  quantitative-finance
CS-DisMo
[ICCVW 2021] Rethinking Content and Style: Exploring Bias for Unsupervised Disentanglement
Stars: ✭ 20 (-62.96%)
Mutual labels:  unsupervised-learning
Alpha Mind
quantitative security portfolio analysis. The analysis pipeline including data storage abstraction, alpha calculation, ML based alpha combining and portfolio calculation.
Stars: ✭ 171 (+216.67%)
Mutual labels:  quantitative-finance
Ezibpy
ezIBpy, a Pythonic Client for Interactive Brokers API
Stars: ✭ 238 (+340.74%)
Mutual labels:  quantitative-finance
unsupervised-qa
Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering
Stars: ✭ 47 (-12.96%)
Mutual labels:  unsupervised-learning
CGMM
Official Repository of "Contextual Graph Markov Model" (ICML 2018 - JMLR 2020)
Stars: ✭ 35 (-35.19%)
Mutual labels:  unsupervised-learning
Darwinexlabs
Datasets, tools and more from Darwinex Labs - Prop Investing Arm & Quant Team @ Darwinex
Stars: ✭ 248 (+359.26%)
Mutual labels:  quantitative-finance

Eigen-Portfolios

Automated unsupervised machine learning Principal Component Analysis (PCA) on the Dow Jones Industrial Average index and it's respective 30 stocks to construct an optimized diversified intelligent portfolio.

Files

  1. tiingoconnect.py: creates a function to extract historical financial data from the financial API Tiingo.
  2. getdata.py: uses three functions to extract the tickers from the components of the DJIA, downloads historical OHLC values for each ticker and the DJIA Index into individual csv batches, and compiles all the adjusted closing price for every ticker into the main csv file to use it as data for ml.py.
  3. ml.py: takes as input the dataset obtained from getdata.py and computes logarithmic average and basic average returns for each company in the DJIA as well as for the index. It then creates a covariance matrix with every ticker and trains it with Principle Component Analysis to extract its principle components. With this, we can then compute the principle components to create Eigen Portfolios. We then pass a function to compute geometric average returns, valatility, and the sharpe ratio to output an optimized portfolio with the highest sharpe ratio (Highest average return and lowest volatility).
  4. ml.ipnb: Jupyter Notebook implementation of ml.py to visualize outputs in order.
  5. data.csv: dataset containing every ticker's adjusted closing price. (You can skip the steps of gathering data).

Data

With the functions on getdata.py, we will be able to parse data with beautifulsoup4 through CNN Money to obtain the tickers of the Dow Jones Industrial Average Index. It will create a list of the 30 tickers and will pass the tickers as inputs for the financial API to download historical values.

Using Tiingo API, which gives you access to countless of historical information on stocks, we can download the necessary data for the Dow Jones Industrial Average. We download the tiingoconnect.py file and import it in the getdata.py file to download all the required information.

After downloading the information, we will have a pandas dataframe with basic information for every ticker (Open, High, Low, Close, Volume, and Adjusted values). With this we will choose all the adjusted closing values of each ticker and we will concatenate them into a big dataframe which we will use as our starting dataset for the ml.py file.

Machine Learning

We import the pandas dataframe containing all the adjusted closing prices for all the companies in the DJIA as well as the DJIA's index. Because our starting date was the year 2000, the adjusted closing prices for Dow Chemicals and Visa appear as Not a Number values. For this we will take away both respective columns. We will end up with 28 columns of companies information and an additional one for the DJIA index.

Logarithmic Returns

The price for an asset i at a given time t is pi,t and the logarithmic returns for each stock i will be written as: ri,t = ln * (pi,t)/pi,t−1.

Linear Returns

The price for an asset i at a given time t is pi,t and the linear returns for each stock i will be written as: ri,t=(pi,t−pi,t−1)/pi,t−1.

Normalized Returns

In order to create a covariance matrix that can be applied for the Principle Component Analysis, we will have to normalize the returns we choose to work with. To normalize we substract the mean of each return and divide the residual by the return's standard deviation.

Data Preprocessing

  1. We preprocess the data by getting rid of all the Not a Number rows in the returns dataframe.
  2. We split the training and testing sets with a predetermined percentage (in this case 80%, but can be modified at will).
  3. We then create the covariance matrix of all the companies exclusing the column of the DJIA Index by computing the pandas .cov() function. With this we will have a 28 x 28 matrix.

Principal Component Analysis

We create a function to compute Principle Component Analysis from Sklearn with a variance explained treshold of 95%. This function computes an inversed elbow chart that shows the amount of principle components and how many of them explain the variance treshold.

Eigen Portfolios

We compute several functions to determine the weights of each principle component. We then visualize a scatterplot that visualizes an organized descending plot with the respective weight of every company at the current chosen principle component.

Sharpe Ratio

The sharpe ratio explains the annualized returns against the annualized volatility of each company in a portfolio. A high sharpe ratio explains higher returns and lower volatility for the specified portfolio.

  1. Annualized Returns: We have to apply the geometric average of all the returns in respect to the periods per year (days of operations in the exchange in a year).
  2. Annualized Volatility: We have to take the standard deviation of the returns and multiply it by the square root of the periods per year.
  3. Annualized Sharpe: we compute the ratio by dividing the annualized returns against the annualized volatility.

Optimized Portfolio

We compute an iterable loop to compute the principle component's weights for each Eigen Portfolio, which then uses the sharpe ratio function to look for the portfolio with the highest sharpe ratio. Once we know which portfolio has the highest sharpe ratio, we can visualize its performance against the DJIA Index to understand how it outperforms it.

Acknowledgements

This project was inspired by the lectures on Portfolio Optimization on the course of Machine Learning in Finance by NYU Tandon School of Engineering. The theoretical basis and thorough implementation was modeled after Avellaneda & Lee (2008) "Statistical Arbitrage in the U.S. Equities Market."

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].