
bupticybee / Icychesszero

License: MIT
An AlphaZero program for Chinese chess (Xiangqi)

Projects that are alternatives to or similar to Icychesszero

Modular Rl
[ICML 2020] PyTorch Code for "One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control"
Stars: ✭ 126 (-38.83%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Pytorch sac
PyTorch implementation of Soft Actor-Critic (SAC)
Stars: ✭ 174 (-15.53%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Data Science Question Answer
A repo for data science related questions and answers
Stars: ✭ 2,000 (+870.87%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Advanced Deep Learning And Reinforcement Learning Deepmind
🎮 Advanced Deep Learning and Reinforcement Learning at UCL & DeepMind | YouTube videos 👉
Stars: ✭ 121 (-41.26%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Deeprl Agents
A set of Deep Reinforcement Learning Agents implemented in Tensorflow.
Stars: ✭ 2,149 (+943.2%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Rl Quadcopter
Teach a Quadcopter How to Fly!
Stars: ✭ 124 (-39.81%)
Mutual labels:  jupyter-notebook, reinforcement-learning
2048 Deep Reinforcement Learning
Trained A Convolutional Neural Network To Play 2048 using Deep-Reinforcement Learning
Stars: ✭ 169 (-17.96%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Ctc Executioner
Master Thesis: Limit order placement with Reinforcement Learning
Stars: ✭ 112 (-45.63%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Andrew Ng Notes
This is Andrew NG Coursera Handwritten Notes.
Stars: ✭ 180 (-12.62%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Deep Algotrading
A resource for learning about deep learning techniques from regression to LSTM and Reinforcement Learning using financial data and the fitness functions of algorithmic trading
Stars: ✭ 173 (-16.02%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Pytorch Rl
Tutorials for reinforcement learning in PyTorch and Gym by implementing a few of the popular algorithms. [IN PROGRESS]
Stars: ✭ 121 (-41.26%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Alpha Zero General
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
Stars: ✭ 2,617 (+1170.39%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Reinforcementlearning Atarigame
PyTorch LSTM RNN for reinforcement learning to play Atari games from OpenAI Universe. We also use Google DeepMind's Asynchronous Advantage Actor-Critic (A3C) algorithm, which is far more efficient than DQN and supersedes it. Can play many games.
Stars: ✭ 118 (-42.72%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Multihopkg
Multi-hop knowledge graph reasoning learned via policy gradient with reward shaping and action dropout
Stars: ✭ 202 (-1.94%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Coursera reinforcement learning
Coursera Reinforcement Learning Specialization by University of Alberta & Alberta Machine Intelligence Institute
Stars: ✭ 114 (-44.66%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Chess Alpha Zero
Chess reinforcement learning by AlphaGo Zero methods.
Stars: ✭ 1,868 (+806.8%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Tensorflow2.0 Examples
🙄 Difficult algorithm, Simple code.
Stars: ✭ 1,397 (+578.16%)
Mutual labels:  jupyter-notebook, reinforcement-learning
D2l Torch
A PyTorch version of "Dive into Deep Learning" (《动手学深度学习》)
Stars: ✭ 105 (-49.03%)
Mutual labels:  chinese, jupyter-notebook
Machine Learning And Reinforcement Learning In Finance
Machine Learning and Reinforcement Learning in Finance New York University Tandon School of Engineering
Stars: ✭ 173 (-16.02%)
Mutual labels:  jupyter-notebook, reinforcement-learning
Release
Deep Reinforcement Learning for de-novo Drug Design
Stars: ✭ 201 (-2.43%)
Mutual labels:  jupyter-notebook, reinforcement-learning

icyChessZero: a Chinese chess (Xiangqi) AlphaZero

This project is inspired by AlphaGo Zero and aims to train a deep neural network that plays Chinese chess (Xiangqi) at or above an intermediate human level. The project is still under active development and is not yet complete; pull requests and stars are welcome. Because of limited computing resources, a task of this scale cannot be trained on a single machine, which is why I wrote the distributed training code. I hope more people will join the cluster so we can train this Chinese chess AlphaGo-style network together.

My estimate is that the network will reach that goal at around 4000~5000 Elo. It is currently approaching 3000 Elo, so reaching an upper-intermediate human level is not out of the question.
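For reference, the relation between an Elo gap and expected score follows the standard logistic Elo formula. This is general background rather than code from this repository; a quick sanity check in Python:

# Standard Elo expected-score formula (general background, not part of this repo).
def expected_score(rating_a, rating_b):
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# A 400-point rating gap corresponds to roughly a 91% expected score:
print(round(expected_score(3400, 3000), 2))  # -> 0.91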

Current Elo:

[image: Elo curve]

Detailed win-rate table:

[image: win-rate table]

Of course, the playing strength is still fairly modest, since training started from completely random play. For example, here is a game fragment (800 playouts):

[image: gameplay demo]

Recommended software environment

  • python==3.6.0
  • tensorflow==1.4.0
  • threadpool==1.3.2
  • xmltodict==0.11.0
  • urllib3==1.22
  • numpy==1.14.3
  • tflearn==0.3.2
  • pandas==0.19.2
  • scipy==1.1.0
  • matplotlib==2.0.0
  • tqdm==4.19.4
  • tornado==4.5.1 (required on the cluster master; not needed on slaves)
  • uvloop==0.9.1 (optional on Windows)
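If you want to install these pinned versions in one go, a minimal example pip command (assuming a Python 3.6 environment is already active, and shown with shell line continuations; drop tornado on slaves and uvloop on Windows as noted above):

pip install tensorflow==1.4.0 threadpool==1.3.2 xmltodict==0.11.0 urllib3==1.22 \
    numpy==1.14.3 tflearn==0.3.2 pandas==0.19.2 scipy==1.1.0 matplotlib==2.0.0 \
    tqdm==4.19.4 tornado==4.5.1 uvloop==0.9.1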

Recommended machine environment

  • The machine is located in the BUPT (Beijing University of Posts and Telecommunications) server room (ignore this if you are not joining our cluster, or are running your own)
  • A Windows or Linux server
  • 16 GB+ of RAM
  • 4 or more physical CPU cores
  • At least one GPU capable of deep learning workloads

Joining our training cluster (BUPT campus only)

Our cluster currently has four GPU machines (two Windows, two Linux) running day and night, and we need more. If you happen to have access to an idle GPU server in the BUPT server room, we hope you will join us in training an AlphaGo Zero for Chinese chess.

The cluster consists of a master and slaves; every machine that joins the cluster runs as a slave. The work is divided as follows (an illustrative sketch of the slave loop follows the list):

  • slave: automatically pulls the latest model weights from the master, runs self-play, and uploads the resulting game records back to the master
  • master: serves weights to the slaves, performs model updates and evaluation, and receives game records from the slaves
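To make this division of labour concrete, here is a purely illustrative Python sketch of one slave iteration. The function names and the master URL below are placeholders, not icyChessZero's actual API; the real slave entry points are the multithread_start scripts under script/.

# Illustrative sketch only: the names below are placeholders, not this project's real API.
import time

MASTER_URL = "http://master-ip:port"  # in the real project the master address lives in config/conf.py

def download_latest_weights(url):
    ...  # placeholder: fetch the newest model weights from the master

def self_play(weights, playouts=800):
    ...  # placeholder: play one game against itself using MCTS with the given playout budget

def upload_game_record(url, record):
    ...  # placeholder: send the finished game record (.cbf) back to the master

while True:
    weights = download_latest_weights(MASTER_URL)
    record = self_play(weights, playouts=800)
    upload_game_record(MASTER_URL, record)
    time.sleep(1)  # small pause so a failing loop does not hammer the master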

If you want to join our training:

  1. Contact me first. QQ/WeChat: 892009517; email: [email protected]. The project is still iterating quickly and the code is updated often, so getting in touch with me for the latest news and for the timing of code updates is important.
  2. If you really do not want to contact me, you can join the cluster directly with the steps below (not recommended):

Joining the cluster from a Windows machine (BUPT campus only)

After cloning the project, run the following commands in cmd:

cd script

./multithread_start.bat [thread_number] [gpu_core] [python_env]

For example:

./multithread_start.bat 10 0 python3

This runs 10 self-play processes on GPU 0 using the python3 environment (a single 1080 Ti GPU can generally support up to 24 processes). The GPU is not the only bottleneck, however, so running more processes than twice the number of physical CPU cores is not recommended (for example, at most 16 processes on an 8-core machine).

Joining the cluster from a Linux machine (BUPT campus only)

After cloning the project, run the following in a shell:

cd script

sh multithread_start.sh -t [thread_number] -g [gpu_core] -p [python_env]

For example:

sh multithread_start.sh -t 10 -g 0 -p python3 

This runs 10 processes on GPU 0 using the python3 environment (the same as the Windows example above).

In short

In short, if you want to join the cluster, please contact me first. If you join directly, some slaves may end up running outdated code after I push an update on my side, which causes inconsistencies with unpredictable consequences.

Running your own cluster

If you have no machines at BUPT but do have some machines elsewhere and would like to run this distributed setup, follow these steps:

  1. Make sure you really want to do this: it is time-consuming, expensive, and thankless, but also kind of fun.
  2. Your machine(s) should meet the recommended configuration above and have the required packages installed.
  3. The master machine must be Linux (a Windows master is not supported yet).
  4. Fork the icyChessZero code, open config/conf.py, and change the server IP to the IP of your intended master.
  5. Clone that fork on both the master and the slaves.
  6. On the master, cd scripts and run initize_weight.py to generate the first set of random weights.
  7. On the master, cd distribute and run distributed_server.py to start the master's service port.
  8. Start slave processes on the slave machines in the same way as described in "Joining the cluster" above.
  9. If the master has spare resources, it can also run a few slave processes.
  10. The model update and validation logic lives in scripts/daily_update.sh; adapt that shell script to your needs and add it to crontab so it runs once an hour (it checks the number of game records and, once there are enough, performs the model update and evaluation). An example crontab entry follows this list.
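As an example of step 10, a crontab entry on the master that runs the update script at the top of every hour could look like the following (the checkout path and log file are hypothetical; adjust them to your setup):

# added via `crontab -e` on the master
0 * * * * cd /path/to/icyChessZero/scripts && sh daily_update.sh >> /tmp/daily_update.log 2>&1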

Viewing game records

The game records produced by the slave machines are stored under data/distributed as .cbf files. They can be opened with the "象棋桥" (Chess Bridge) software, or viewed in ipynbs/see_gameplay.ipynb.
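If you just want a quick overview from a shell or notebook, a small Python snippet like the one below lists the records (the directory comes from this README; the snippet itself is only an example, not part of the repo):

# List the self-play game records (.cbf files) uploaded by the slaves.
import glob
import os

records = sorted(glob.glob(os.path.join("data", "distributed", "*.cbf")))
print("%d game records found" % len(records))
for path in records[-5:]:  # show the last five, by filename order
    print(path)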

Checking training status

On the master machine, ipynbs/elo_graph.ipynb shows what Elo the cluster-trained model has reached.

To do

There is still plenty to do, and the project is evolving quickly. For example:

  1. Attach metadata to the game records, such as the MCTS analysis for each move, to make it easier to inspect individual cases
  2. Detection of perpetual check and perpetual chase is not implemented yet
  3. Add version checks so that the master only accepts game records from slaves running the same code version as itself
  4. Build a dedicated web UI that shows the Elo curve and game records in real time
  5. Document the module layout clearly in the README .....

And so on. If you find something you would like to work on, send a pull request or get in touch.

Some details of this work have been published as a draft at: http://icybee.cn/article/69.html

Contact: QQ/WeChat 892009517 (see "Joining our training cluster" above)
