algonell / ipo-miner

Licence: MIT License
IPO Investment via Text Mining.

Programming Languages

HTML
Jupyter Notebook
Python

Projects that are alternatives to, or similar to, ipo-miner

named-entity-recognition
Notebooks for teaching Named Entity Recognition at the Cultural Heritage Data School, run by Cambridge Digital Humanities
Stars: ✭ 18 (-10%)
Mutual labels:  text-mining, jupyter-notebooks
machine-learning-scripts
Collection of scripts and tools related to machine learning
Stars: ✭ 60 (+200%)
Mutual labels:  jupyter-notebooks
aws-iot-analytics-notebook-containers
An extension for Jupyter notebooks that allows running notebooks inside a Docker container and converting them to runnable Docker images.
Stars: ✭ 25 (+25%)
Mutual labels:  jupyter-notebooks
Quran-and-Arabic-Language-Repository
Projects & Libraries related to Quran & Arabic Language
Stars: ✭ 26 (+30%)
Mutual labels:  text-mining
lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (+35%)
Mutual labels:  text-mining
restaurant-finder-featureReviews
Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying market-basket model – Apriori algorithm and NLP on user reviews).
Stars: ✭ 21 (+5%)
Mutual labels:  text-mining
Seminars
Machine Learning classes of the AI Community Innopolis club
Stars: ✭ 62 (+210%)
Mutual labels:  jupyter-notebooks
gofastr
Make a DocumentTermMatrix faster
Stars: ✭ 19 (-5%)
Mutual labels:  text-mining
adam home
ADAM python client and notebooks
Stars: ✭ 12 (-40%)
Mutual labels:  jupyter-notebooks
Introduction-to-text-mining-with-Python
Lectures in Urban Data Science Lab, Seoul
Stars: ✭ 25 (+25%)
Mutual labels:  text-mining
xyz-spaces-python
Manage your XYZ Hub or HERE Data Hub spaces from Python.
Stars: ✭ 29 (+45%)
Mutual labels:  jupyter-notebooks
civicmine
Text mining cancer biomarkers for the CIVIC database
Stars: ✭ 19 (-5%)
Mutual labels:  text-mining
Guten-gutter
Strips boilerplate from Project Gutenberg text files
Stars: ✭ 16 (-20%)
Mutual labels:  text-mining
jupyter-cache
A defined interface for working with a cache of executed jupyter notebooks
Stars: ✭ 28 (+40%)
Mutual labels:  jupyter-notebooks
SparseLSH
A Locality Sensitive Hashing (LSH) library with an emphasis on large, highly-dimensional datasets.
Stars: ✭ 127 (+535%)
Mutual labels:  text-mining
learning2hash.github.io
Website for "A survey of learning to hash for Computer Vision" https://learning2hash.github.io
Stars: ✭ 14 (-30%)
Mutual labels:  text-mining
text-mining-corona-articles
Text Mining for Indonesian Online News Articles About Corona
Stars: ✭ 15 (-25%)
Mutual labels:  text-mining
TextDatasetCleaner
🔬 Cleaning junk out of datasets (normalization, preprocessing)
Stars: ✭ 27 (+35%)
Mutual labels:  text-mining
ts-forecasting-ensemble
CentOS based Docker container for Time Series Analysis and Modeling.
Stars: ✭ 19 (-5%)
Mutual labels:  jupyter-notebooks
NEMO-examples
Simple configurations to study specific oceanic physical processes and be used as a tool for training
Stars: ✭ 14 (-30%)
Mutual labels:  jupyter-notebooks

IPOMiner

Python utilities to predict the future performance of upcoming IPOs (Initial Public Offerings).

Check out the accompanying paper for more details.


What is this project?

This project is a collection of datasets and Python code to perform Text Mining on raw SEC S-1 filings.


What is the goal of this project?

The goal of this project is to apply Text Mining tools and techniques to spot investment opportunities in upcoming IPOs. The system comprises three main modules. The first retrieves IPO data from the SEC EDGAR system. The second performs Text Mining on the filings. The third classifies the expected performance of upcoming IPOs.
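
As a rough illustration of the retrieval step, the sketch below filters S-1 rows out of the text of an EDGAR quarterly `master.idx` file, which lists filings as pipe-separated fields (CIK, company name, form type, date filed, file name). This is an assumption about one way to find filings, not necessarily what the project's downloader does. The function name, the sample companies and the sample paths are hypothetical, and no network access is performed.

```python
from typing import Dict, List

# Base URL under which EDGAR serves the file names listed in its indexes.
ARCHIVES_BASE = "https://www.sec.gov/Archives/"


def filter_s1_filings(master_idx_text: str) -> List[Dict[str, str]]:
    """Return the original S-1 rows (not S-1/A amendments) of a master.idx text."""
    filings = []
    for line in master_idx_text.splitlines():
        parts = line.strip().split("|")
        # Preamble and separator lines do not split into five fields.
        if len(parts) != 5:
            continue
        cik, company, form_type, date_filed, filename = parts
        # The column-header row has five fields too, but its form type is not "S-1".
        if form_type != "S-1":
            continue
        filings.append({
            "cik": cik,
            "company": company,
            "date_filed": date_filed,
            "url": ARCHIVES_BASE + filename,
        })
    return filings


# Hypothetical index excerpt; the companies and accession numbers are made up.
SAMPLE_INDEX = """CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------------------------------------
1111111|EXAMPLE HOLDINGS INC|S-1|2019-03-01|edgar/data/1111111/0001111111-19-000001.txt
2222222|SAMPLE CORP|10-K|2019-03-04|edgar/data/2222222/0002222222-19-000002.txt
3333333|DEMO TECHNOLOGIES INC|S-1/A|2019-03-05|edgar/data/3333333/0003333333-19-000003.txt
"""

s1_filings = filter_s1_filings(SAMPLE_INDEX)
for filing in s1_filings:
    print(filing["company"], filing["date_filed"], filing["url"])
```

Amendments (S-1/A) are skipped here only to keep the example small; whether they should be included is a modelling choice.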


How does it work?

Jupyter Notebooks are available for data retrieval, summarization, keyword extraction and Machine Learning.

Start by running all cells in the following notebooks:

  • S-1 Downloader.ipynb - Download raw IPO data.
  • Performance Downloader.ipynb - Download historical performance from Yahoo Finance.
  • Summarizer.ipynb - Summarize raw S-1 filings.
  • Keywords Extractor.ipynb - Extract keywords from S-1 filings.
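
To give a feel for what the summarization and keyword steps produce, here is a minimal frequency-based sketch using only the standard library. It is a stand-in technique chosen for brevity; the notebooks may use different algorithms and libraries. The stop-word list, function names and sample text are all hypothetical.

```python
import re
from collections import Counter
from typing import List

# Tiny illustrative stop-word list; a real run would need a much larger one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "our", "we", "for", "may"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic words that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def top_keywords(text: str, k: int = 5) -> List[str]:
    """Return the k most frequent non-stop words."""
    return [word for word, _ in Counter(tokenize(text)).most_common(k)]


def summarize(text: str, n_sentences: int = 1) -> List[str]:
    """Pick the sentences whose words are most frequent on average (extractive)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(tokenize(text))

    def score(sentence: str) -> float:
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    # Re-sort the chosen indices so the summary keeps the original sentence order.
    return [sentences[i] for i in sorted(ranked[:n_sentences])]


# Hypothetical snippet standing in for the text of an S-1 filing.
SAMPLE_TEXT = (
    "Our cloud software serves hospitals. "
    "Cloud software is growing. "
    "Risks may affect results."
)

print(top_keywords(SAMPLE_TEXT, k=2))
print(summarize(SAMPLE_TEXT, n_sentences=1))
```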

Then run all cells in the following notebooks:

  • 1 Baseline.ipynb - Transform raw IPO listings.
  • 2 Sentiment Analysis.ipynb - Add Sentiment Analysis features.
  • 3 Summarization.ipynb - Add summarization features.
  • 4 Keywords.ipynb - Add keywords analysis.
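
The feature-building notebooks each add columns to a baseline record. As a hedged sketch of that idea, the code below attaches a lexicon-based sentiment score to a listing row. The word lists, field names and scoring formula are assumptions made for illustration, not the project's actual features.

```python
import re
from typing import Dict

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"growth", "profit", "strong", "innovative"}
NEGATIVE = {"loss", "risk", "decline", "litigation"}


def sentiment_score(text: str) -> float:
    """Return (pos - neg) / (pos + neg) in [-1, 1], or 0.0 if no lexicon word occurs."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def add_sentiment_feature(row: Dict[str, object], filing_text: str) -> Dict[str, object]:
    """Return a copy of a baseline row with a 'sentiment' feature added."""
    enriched = dict(row)  # copy, so the baseline row stays untouched
    enriched["sentiment"] = sentiment_score(filing_text)
    return enriched


# Hypothetical baseline row and filing excerpt.
baseline_row = {"symbol": "XYZ"}
enriched_row = add_sentiment_feature(
    baseline_row, "Strong growth despite litigation risk and a net loss"
)
print(enriched_row)
```

Returning a copy rather than mutating the input keeps each stage's output separate, which mirrors how the numbered notebooks build on one another.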

Making predictions:

  • Run all cells in Predictor.ipynb - Get upcoming IPOs and predict their performance.
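
The README does not say which classifier the predictor uses, so the following is only a toy stand-in: a nearest-centroid classifier in pure Python that shows how text-derived features could map to a performance label. The feature columns (sentiment, keyword score), the training values and the label meaning (1 = outperform, 0 = underperform) are all hypothetical.

```python
from typing import Dict, List, Sequence


def fit_centroids(X: Sequence[Sequence[float]], y: Sequence[int]) -> Dict[int, List[float]]:
    """Compute the mean feature vector (centroid) of each class label."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids: Dict[int, List[float]], features: Sequence[float]) -> int:
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Made-up training rows: [sentiment, keyword_score].
X_train = [[0.8, 0.6], [0.6, 0.9], [-0.5, 0.1], [-0.7, 0.3]]
y_train = [1, 1, 0, 0]

centroids = fit_centroids(X_train, y_train)
print(predict(centroids, [0.5, 0.7]))
print(predict(centroids, [-0.4, 0.2]))
```
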

Who will use this project?

This project is intended for traders and researchers as a potential base to fork for alpha generation.


Directories

  • Notebooks - Python scripts and Jupyter notebooks.
  • Data - Raw S-1 SEC filings since 2000. Sample filings are provided.
  • Datasets - CSV files used for training and evaluating Machine Learning models.
  • Keywords - Top keywords for S-1 SEC filings.
  • Summary - Summarized S-1 SEC filings.