1. Pycfr: A Python implementation of Counterfactual Regret Minimization for poker
2. sdp: Deep nonparametric estimation of discrete conditional distributions via smoothed dyadic partitioning
3. td cfr: An implementation of Counterfactual Regret Minimization (CFR) via Temporal Difference (TD) learning
4. rl-tictactoe: A reinforcement learning agent for tic-tac-toe. Implements the example from Chapter 1 of Sutton and Barto.
6. diffbot: A .NET library for the Diffbot Frontpage and Article APIs
7. tstd0: An experiment with Thompson sampling and TD(0) on a grid world variant