
je-suis-tm / machine-learning

Licence: Apache-2.0 license
Python machine learning applications in image processing, recommender system, matrix completion, netflix problem and algorithm implementations including Co-clustering, Funk SVD, SVD++, Non-negative Matrix Factorization, Koren Neighborhood Model, Koren Integrated Model, Dawid-Skene, Platt-Burges, Expectation Maximization, Factor Analysis, ISTA, F…



Machine Learning

Intro

Machine learning is so chic that every programmer, and even every non-programmer, has started to learn it. After several months of online courses, everyone becomes a self-proclaimed data scientist. Managers hold high hopes and deploy data scientists to machine-learn this or that. In no time, people run into a cul-de-sac: things don't work so well outside the realm of the iris dataset! If you have been to my other repositories, such as quant trading or graph theory, you must have seen me bashing reckless applications of machine learning. Stop selling AI snake oil! Don't get me wrong. I ain't no machine-learning sceptic. I see great potential in machine learning, but I am cynical about the current overstatement of artificial intelligence, which is frankly nowhere in sight.

Supervised learning, the most popular branch, has rigid requirements for both data quality and data quantity. Reinforcement learning is a drain on existing hardware. By contrast, unsupervised learning is something I mess around with frequently. It greatly boosts my work efficiency through dimensionality reduction, although from time to time I struggle to interpret the substantive meaning of the clustering patterns. In short, machine learning is no panacea. Its strongest suit is classification with discrete answers. When it comes to predicting tomorrow's stock price or computing yesterday's basic reproduction number, we still have to take the conventional path.

This repository is based upon the course material by Stanford University. Professor Andrew Ng may not teach the most comprehensive lectures, but he has inspired millions to study data science. This repository attempts to replicate every algorithm mentioned in the course, as well as the popular ones outside of it. Experienced coders urge us not to reinvent the wheel, but I firmly believe we never truly understand how a wheel works until we reinvent it. If you only learn OPTICS from articles on towardsdatascience.com, you might well skip DBSCAN, since OPTICS does not require the key input ε. By reinventing the wheel, you would come to realise that this is purely quid pro quo: the new input ξ is crucial in determining the clustering, yet few people talk about it. In that sense, data modelling is not really scientific and never will be. Machine learning is more of an art, where you fine-tune the parameters to create discrete answers to real-life problems. I sincerely hope this repository can help you see that.
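To make the point about ε concrete, here is a minimal by-hand DBSCAN in pure Python. It is only a sketch for illustration, not the code in this repository: the function names, the toy points and the `min_samples=3` setting are all assumptions. It shows how the same data yields different clusterings once ε changes.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_samples:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is a core point, keep expanding
    return labels


# Hypothetical toy data: two tight squares and one far-away outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (5, 5), (5, 6), (6, 5), (6, 6),
          (20, 20)]

for eps in (0.5, 1.5, 8.0):
    print(eps, dbscan(points, eps, min_samples=3))
```

With these toy points, ε = 0.5 marks everything as noise, ε = 1.5 finds two clusters plus one outlier, and ε = 8 merges the two squares into a single cluster. OPTICS moves that sensitivity elsewhere rather than removing it, which is why ξ deserves the same scrutiny.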


Algorithms

Supervised

Unsupervised

Applications

1. Reverse Engineering project

Creating a visualization from data is easy. In Tableau, it takes only one click. What happens if you want to extract data from a visualization? A simple Google search yields a few reverse engineering tools, yet they share the same malaise: they only work with a single curve and require a lot of clicks. This project addresses these issues by incorporating unsupervised learning into image processing. Multiple curves are separated by colour channel with clustering techniques. Data can then be extracted by computing the coordinates of each pixel. A simple conversion from resolution scale to axis scale maps those coordinates back to approximately the values in the original spreadsheet. Voilà, no more ridiculous subscription to Statista 😲
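The two steps described above, grouping pixels by colour and rescaling pixel positions to axis values, can be sketched in a few lines of pure Python. This is a toy illustration under stated assumptions, not the project's actual pipeline: the pixel list, the starting centroids, the linear axes and the helper names `kmeans` and `pixel_to_axis` are all made up for the example.

```python
from math import dist


def kmeans(colours, centroids, n_iter=10):
    """Plain Lloyd's algorithm on RGB tuples; returns (labels, centroids)."""
    centroids = [tuple(map(float, c)) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assign every colour to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda k: dist(c, centroids[k]))
                  for c in colours]
        # Move each centroid to the mean colour of its members.
        for k in range(len(centroids)):
            members = [c for c, lab in zip(colours, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels, centroids


def pixel_to_axis(pixel, pixel_range, axis_range):
    """Linearly map a pixel coordinate onto the axis scale."""
    (p0, p1), (a0, a1) = pixel_range, axis_range
    return a0 + (pixel - p0) * (a1 - a0) / (p1 - p0)


# Hypothetical 101x101 chart: ((column, row), (r, g, b)) for a few pixels.
pixels = [
    ((0, 100), (250, 10, 10)), ((50, 50), (255, 0, 0)), ((100, 0), (240, 20, 15)),  # reddish curve
    ((0, 80), (10, 10, 250)), ((50, 80), (0, 0, 255)), ((100, 80), (20, 5, 240)),   # bluish curve
    ((10, 10), (255, 255, 255)), ((70, 30), (250, 250, 250)),                       # background
]

# Assumed starting centroids: red curve, blue curve, white background.
labels, _ = kmeans([rgb for _, rgb in pixels],
                   [(255, 0, 0), (0, 0, 255), (255, 255, 255)])

# Assumed axes: columns 0..100 span x = 0..10; row 100 (bottom) is y = 0, row 0 is y = 50.
curves = {0: [], 1: []}
for ((col, row), _), lab in zip(pixels, labels):
    if lab in curves:  # skip the background cluster
        x = pixel_to_axis(col, (0, 100), (0.0, 10.0))
        y = pixel_to_axis(row, (100, 0), (0.0, 50.0))
        curves[lab].append((x, y))

print(curves)
```

Note that image rows count downwards from the top, so the row range is passed in reverse. Real charts also need axis detection and handling of anti-aliased edges, which this sketch ignores.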


For more details, please refer to the README of the separate directory or the machine learning section of my personal blog.

2. Wisdom of Crowds project

Every now and then, we read that some bulge bracket bank has hit the headlines with “XXX will reach 99999€ in 20YY”. Some forecasts hit the bull’s eye, but most projections are as accurate as astrology. Price prediction is easily influenced by cognitive bias. In the financial market, there is merit to the idea that the consensus estimate is the best oracle. By harnessing the power of ensemble learning, we are about to leverage the Dawid-Skene model and the Platt-Burges model to eliminate the idiosyncratic noise associated with each individual judgement. The end game is to reveal the underlying intrinsic value generated by the collective knowledge of research analysts from different investment banks. Is wisdom of crowds a crystal ball for trading?
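For a feel of how Dawid-Skene strips out individual noise, below is a minimal sketch of its EM loop in pure Python. It is an illustration under assumptions rather than the code in this project: the analysts "A", "B" and "C", their up (1) or down (0) calls, and the small `smoothing` constant (added only to avoid zero probabilities, not part of the textbook model) are all hypothetical.

```python
def dawid_skene(votes, n_classes, n_iter=50, smoothing=0.01):
    """votes[i] is a dict {annotator: label} for item i.

    Returns (posteriors, confusion), where posteriors[i][k] is the estimated
    probability that item i has true class k, and confusion[a][k][l] is the
    estimated probability that annotator a says l when the truth is k.
    """
    annotators = sorted({a for v in votes for a in v})
    # Initialise the posteriors with vote shares (a soft majority vote).
    post = [[sum(1 for lab in v.values() if lab == k) / len(v)
             for k in range(n_classes)] for v in votes]
    confusion = {}
    for _ in range(n_iter):
        # M-step: class priors and one confusion matrix per annotator.
        prior = [sum(p[k] for p in post) / len(post) for k in range(n_classes)]
        for a in annotators:
            mat = [[smoothing] * n_classes for _ in range(n_classes)]
            for v, p in zip(votes, post):
                if a in v:
                    for k in range(n_classes):
                        mat[k][v[a]] += p[k]
            confusion[a] = [[x / sum(row) for x in row] for row in mat]
        # E-step: posterior over the true class of every item.
        new_post = []
        for v in votes:
            like = list(prior)
            for a, lab in v.items():
                for k in range(n_classes):
                    like[k] *= confusion[a][k][lab]
            total = sum(like)
            new_post.append([x / total for x in like])
        post = new_post
    return post, confusion


# Hypothetical calls: A and B tend to agree, C is a contrarian.
votes = [{"A": 1, "B": 1, "C": 0}] * 4 + [{"A": 0, "B": 0, "C": 1}] * 4
votes.append({"A": 1, "C": 0})  # B is silent: a plain majority vote would tie here

posteriors, confusion = dawid_skene(votes, n_classes=2)
consensus = [max(range(2), key=lambda k: p[k]) for p in posteriors]
print(consensus)
```

In this toy setup the model learns that C tends to say the opposite of the consensus, so the last item, a one-against-one tie under majority voting, is resolved in favour of A. The Platt-Burges model for continuous estimates is not covered by this sketch.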


For more details, please refer to the README of the separate directory or the machine learning section of my personal blog.
