legolas140 / Competitive Data Science 1
How to Win a Data Science Competition: Learn from Top Kagglers
Stars: ✭ 133
Additional Materials and Links
Week 1
Recap of main ML algorithms
Overview of methods
- Scikit-Learn (or sklearn) library
- Overview of k-NN (sklearn's documentation)
- Overview of Linear Models (sklearn's documentation)
- Overview of Decision Trees (sklearn's documentation)
- Overview of algorithms and parameters in H2O documentation
Additional Tools
- Vowpal Wabbit repository
- XGBoost repository
- LightGBM repository
- Interactive demo of simple feed-forward Neural Net
- Frameworks for Neural Nets: Keras, PyTorch, TensorFlow, MXNet, Lasagne
- Example from sklearn with different decision surfaces
- Arbitrary order factorization machines
Software/Hardware requirements
StandCloud Computing:
AWS spot option:
Stack and packages:
- Basic SciPy stack (ipython, numpy, pandas, matplotlib)
- Jupyter Notebook
- Stand-alone python tSNE package
- Libraries to work with sparse CTR-like data: LibFM, LibFFM
- Another tree-based method: RGF (implementation, paper)
- Python distribution with all-included packages: Anaconda
- Blog "datas-frame" (contains posts about effective Pandas usage)
Feature preprocessing and generation with respect to models
Feature preprocessing
- Preprocessing in Sklearn
- Andrew Ng about gradient descent and feature scaling
- Feature Scaling and the effect of standardization for machine learning algorithms
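As a rough illustration of the feature scaling discussed in the links above, here is a minimal sketch in plain Python. The function names `standardize` and `min_max_scale` and the sample `ages` column are hypothetical, introduced only for this example; in practice the sklearn preprocessing module linked above provides equivalent transformers.

```python
from statistics import mean, pstdev


def standardize(column):
    """Return z-scores: (x - mean) / std, using the population std."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def min_max_scale(column):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical feature column, used only for illustration.
ages = [20, 30, 40, 50, 60]
ages_standardized = standardize(ages)
ages_min_max = min_max_scale(ages)
```

Scaling mostly matters for models that depend on distances or gradient magnitudes (k-NN, linear models, neural nets); tree-based models are largely insensitive to it.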
Feature generation
- Discover Feature Engineering, How to Engineer Features and How to Get Good at It
- Discussion of feature engineering on Quora
Feature extraction from text and images
Feature extraction from text
Bag of words
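A minimal sketch of the bag-of-words idea, assuming whitespace tokenization and lowercasing (real pipelines usually add stop-word removal, n-grams, or TF-IDF weighting). The helpers `tokenize` and `bag_of_words` and the sample documents are hypothetical and exist only for this illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(documents):
    """Return (vocabulary, rows), where rows[i][j] counts vocabulary[j] in document i."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({token for doc in tokenized for token in doc})
    rows = []
    for doc in tokenized:
        counts = Counter(doc)
        rows.append([counts[token] for token in vocabulary])
    return vocabulary, rows


# Hypothetical documents, used only for illustration.
docs = ["the cat sat", "the cat saw the dog"]
vocabulary, rows = bag_of_words(docs)
```

Each document becomes a fixed-length count vector over the shared vocabulary, so word order is discarded.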
Word2vec
- Tutorial to Word2vec
- Tutorial to word2vec usage
- Text Classification With Word2Vec
- Introduction to Word Embedding Models with Word2Vec
NLP Libraries
Feature extraction from images
Pretrained models
Finetuning
- How to Retrain Inception's Final Layer for New Categories in Tensorflow
- Fine-tuning Deep Learning Models in Keras
Week 2
Exploratory data analysis
Visualization tools
Others
Validation
Data leakages
- Perfect score script by Oleg Trott -- used to probe leaderboard
- Page about data leakages on Kaggle
Week 3
Metrics optimization
Classification
- Evaluation Metrics for Classification Problems: Quick Examples + References
- Decision Trees: “Gini” vs. “Entropy” criteria
- Understanding ROC curves
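For the "Gini" vs. "Entropy" comparison above, the two impurity measures can be sketched as follows. This assumes a non-empty list of class labels; the function names are illustrative and are not taken from any of the linked articles.

```python
from collections import Counter
from math import log2


def class_probabilities(labels):
    """Empirical class frequencies of a non-empty list of labels."""
    counts = Counter(labels)
    n = len(labels)
    return [c / n for c in counts.values()]


def gini(labels):
    """Gini impurity: 1 - sum(p_k^2)."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))


def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    return -sum(p * log2(p) for p in class_probabilities(labels))


balanced = [0, 0, 1, 1]
pure = [1, 1, 1]
```

Both measures are zero for a pure node and largest for a uniform class mix; they differ in scale and in how strongly they penalize intermediate mixes.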
Ranking
- Learning to Rank using Gradient Descent -- original paper about pairwise method for AUC optimization
- Overview of further developments of RankNet
- RankLib (implementations for the 2 papers from above)
- Learning to Rank Overview
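The pairwise view behind the ranking links above can be sketched in plain Python: AUC is the fraction of (positive, negative) pairs that the scores order correctly, and a RankNet-style loss replaces that hard count with a smooth logistic penalty on each pair's score difference. The function names and toy data are hypothetical, and this is a simplified sketch rather than the formulation of any particular library.

```python
from math import exp, log1p


def _split_scores(y_true, scores):
    """Separate scores of positive (label 1) and negative (label 0) examples."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    return pos, neg


def pairwise_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos, neg = _split_scores(y_true, scores)
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def pairwise_logistic_loss(y_true, scores):
    """Mean of log(1 + exp(-(s_pos - s_neg))) over all (positive, negative) pairs."""
    pos, neg = _split_scores(y_true, scores)
    total = sum(log1p(exp(-(p - n))) for p in pos for n in neg)
    return total / (len(pos) * len(neg))


# Toy labels and scores, used only for illustration.
y = [1, 0, 1, 0]
imperfect = [0.9, 0.1, 0.4, 0.6]
perfect = [0.9, 0.1, 0.8, 0.2]
```

The O(n_pos * n_neg) loops are fine for a demonstration; on large datasets a sort-based AUC computation is preferable.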
Clustering
Week 4
Hyperparameter tuning
- Tuning the hyper-parameters of an estimator (sklearn)
- Optimizing hyperparameters with hyperopt
- Complete Guide to Parameter Tuning in Gradient Boosting (GBM) in Python
Tips and tricks
- Far0n's framework for Kaggle competitions "kaggletils"
- 28 Jupyter Notebook tips, tricks and shortcuts
Advanced features II
Matrix Factorization:
t-SNE:
- Multicore t-SNE implementation
- Comparison of Manifold Learning methods (sklearn)
- How to Use t-SNE Effectively (distill.pub blog)
- tSNE homepage (Laurens van der Maaten)
- Example: tSNE with different perplexities (sklearn)
Interactions:
- Facebook Research's paper about extracting categorical features from trees
- Example: Feature transformations with ensembles of trees (sklearn)
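A minimal sketch of the idea in the two links above, assuming scikit-learn is installed: gradient-boosted trees are fit first, each sample is mapped to the index of the leaf it reaches in every tree, and those leaf indices are one-hot encoded as categorical features for a linear model. The synthetic dataset, split sizes, and hyperparameters are arbitrary choices for illustration, not values from the paper or the sklearn example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic data; separate splits for the trees and the linear model
# reduce the risk of overfitting the leaf features.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_trees, X_rest, y_trees, y_rest = train_test_split(
    X, y, test_size=0.5, random_state=0
)
X_lr, X_test, y_lr, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0
)

gbdt = GradientBoostingClassifier(n_estimators=20, max_depth=3, random_state=0)
gbdt.fit(X_trees, y_trees)


def leaf_indices(model, X):
    """Leaf index reached in each tree; shape (n_samples, n_estimators)."""
    # For binary classification, apply() returns (n_samples, n_estimators, 1).
    return model.apply(X)[:, :, 0]


# One-hot encode leaf indices; leaves unseen during fit are ignored.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(leaf_indices(gbdt, X_trees))

lr = LogisticRegression(max_iter=1000)
lr.fit(encoder.transform(leaf_indices(gbdt, X_lr)), y_lr)
accuracy = lr.score(encoder.transform(leaf_indices(gbdt, X_test)), y_test)
```

Each leaf corresponds to a conjunction of threshold conditions, so the linear model effectively learns weights for automatically discovered feature interactions.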
Ensembling
- Kaggle ensembling guide at MLWave.com (overview of approaches)
- StackNet — a computational, scalable and analytical meta modelling framework (by KazAnova)
- Heamy — a set of useful tools for competitive data science (including ensembling)
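Two of the simplest ensembling approaches, plain averaging and rank averaging, can be sketched in plain Python. The function names and the toy predictions are hypothetical; this illustrates the general technique and is not code from any of the linked tools.

```python
def average(predictions):
    """Element-wise mean of several models' prediction lists."""
    return [sum(column) / len(column) for column in zip(*predictions)]


def to_ranks(scores):
    """Map scores to normalized ranks in [0, 1]; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j over the run of scores tied with position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    denominator = max(len(scores) - 1, 1)
    return [r / denominator for r in ranks]


def rank_average(predictions):
    """Average the models' normalized ranks instead of their raw scores."""
    return average([to_ranks(p) for p in predictions])


# Toy predictions from two hypothetical models on very different scales.
model_a = [0.1, 0.2, 0.3]
model_b = [10.0, 20.0, 30.0]
blended = rank_average([model_a, model_b])
```

Rank averaging is useful for ranking metrics such as AUC, where only the ordering matters and the models' raw scores may not be calibrated to the same scale.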
Week 5
Competitions go through
You can often find solutions to a competition you're interested in on its forum. Here we link to collections of such solutions that should prove useful to you.
Past solutions