yandexdataschool / Sklearn Deeprl

Deep reinforcement learning. In scikit-learn. In less than 50 effective lines.

Currently, both demos use the vanilla cross-entropy (CE) method with a policy approximated by a neural network. For RL, it boils down to repeating three steps:

  • Generate N games
  • Take M best
  • Fit to those M best samples
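
The three steps above can be sketched with nothing but the standard library. This is a minimal illustration, not the code from the demos: the corridor environment, the tabular policy (in place of a neural network), and all names and hyperparameters (`play_session`, `fit_to_elite`, `n_games`, `m_best`, `smoothing`) are assumptions made for the example.

```python
import random

# Hypothetical toy task: walk from cell 0 to cell GOAL in as few steps as possible.
N_STATES, N_ACTIONS, GOAL, MAX_STEPS = 5, 2, 4, 20

def play_session(policy, rng):
    """Play one game; return visited states, chosen actions, total reward."""
    states, actions, total_reward = [], [], 0
    state = 0
    for _ in range(MAX_STEPS):
        action = rng.choices(range(N_ACTIONS), weights=policy[state])[0]
        states.append(state)
        actions.append(action)
        # Action 1 moves right, action 0 moves left (clipped at the wall).
        state = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
        total_reward -= 1  # every step costs 1, so shorter games score higher
        if state == GOAL:
            break
    return states, actions, total_reward

def fit_to_elite(elite, smoothing=0.1):
    """Re-estimate the tabular policy from action counts in the elite games."""
    counts = [[smoothing] * N_ACTIONS for _ in range(N_STATES)]
    for states, actions, _ in elite:
        for s, a in zip(states, actions):
            counts[s][a] += 1
    return [[c / sum(row) for c in row] for row in counts]

def cross_entropy_method(n_games=100, m_best=20, n_iterations=20, seed=0):
    rng = random.Random(seed)
    # Start from a uniform random policy.
    policy = [[1.0 / N_ACTIONS] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(n_iterations):
        sessions = [play_session(policy, rng) for _ in range(n_games)]  # generate N games
        sessions.sort(key=lambda s: s[2], reverse=True)
        elite = sessions[:m_best]                                       # take M best
        policy = fit_to_elite(elite)                                    # fit to those samples
    return policy

policy = cross_entropy_method()
```

Here `policy[s][a]` is the probability of taking action `a` in state `s`. The smoothing term keeps every action possible, so the agent never stops exploring entirely.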

The CE method is a very general approach to approximate estimation and maximization tasks; you can read about it here. For reinforcement learning, we use the optimization version, which fits the agent to generate games where the reward is high. More on that here.
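
One common way to write the optimization version down is given here. The notation is an assumption of this sketch, not taken from the demos: $\tau_i$ is the $i$-th of $N$ generated games, $R(\tau_i)$ its total reward, $\pi_\theta$ the policy, and $\gamma_k$ the reward threshold that keeps the $M$ best games at iteration $k$.

```latex
\gamma_k = \text{the } \left(1 - \tfrac{M}{N}\right)\text{-quantile of } R(\tau_1), \dots, R(\tau_N)

\theta_{k+1} = \arg\max_{\theta} \sum_{i=1}^{N} \mathbb{1}\left[ R(\tau_i) \ge \gamma_k \right]
               \sum_{(s, a) \in \tau_i} \log \pi_{\theta}(a \mid s)
```

Maximizing this log-likelihood is the same as minimizing the cross-entropy between the actions taken in the best games and the policy's predictions, which is an ordinary classification fit.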

While this approach falls flat in some cases, and it takes black magic to make it work with infinite MDPs or long session lengths, it still works unreasonably well in most cases. Another awesome trait is that it extends effortlessly to policy approximation (e.g. deep RL), partially observable MDPs, and all kinds of weird stuff you see in the wild.
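
To illustrate the policy-approximation case, here is a hedged sketch in which a scikit-learn `MLPClassifier` plays the role of the policy. It is not necessarily how the demos are written: the one-dimensional toy task, the network size, the use of `partial_fit`, and every hyperparameter below are assumptions chosen to keep the example short and free of external environments.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

N_ACTIONS = 2
ACTIONS = np.arange(N_ACTIONS)
rng = np.random.default_rng(0)

def play_session(agent, n_steps=10):
    """Toy task (an assumption): keep a point near 0 by stepping left or right."""
    x = rng.uniform(-1.0, 1.0)
    states, actions, total_reward = [], [], 0.0
    for _ in range(n_steps):
        probs = agent.predict_proba([[x]])[0]  # the network is the policy
        action = rng.choice(ACTIONS, p=probs)
        states.append([x])
        actions.append(action)
        x += 0.2 if action == 1 else -0.2
        total_reward -= abs(x)                 # closer to 0 is better
    return states, actions, total_reward

agent = MLPClassifier(hidden_layer_sizes=(16,), learning_rate_init=0.01, random_state=0)
# One dummy call so the network knows every action before the first game.
agent.partial_fit([[0.0]] * N_ACTIONS, ACTIONS, classes=ACTIONS)

mean_rewards = []
for _ in range(5):
    sessions = [play_session(agent) for _ in range(50)]   # generate N games
    rewards = np.array([reward for _, _, reward in sessions])
    threshold = np.percentile(rewards, 70)                # take the M best
    elite = [s for s in sessions if s[2] >= threshold]
    elite_states = np.concatenate([s[0] for s in elite])
    elite_actions = np.concatenate([s[1] for s in elite])
    for _ in range(10):                                   # fit to the elite samples
        agent.partial_fit(elite_states, elite_actions)
    mean_rewards.append(rewards.mean())
```

The only change from a tabular version is the fit step: instead of counting actions per state, the classifier is trained to predict the elite actions from the elite states. `mean_rewards` tracks the average reward per iteration so progress can be inspected.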

If you want something heavier, take a look at agentnet.
