
Fun-With-Dnc (Differentiable Neural Computing)

PyTorch implementation of the DeepMind paper [Hybrid computing using a neural network with dynamic external memory](https://pdfs.semanticscholar.org/7635/78fa9003f6c0f735bc3250fc2116f6100463.pdf). The code is based on the TensorFlow implementation [here](https://github.com/deepmind/dnc).

Todo: finish retraining, and a better writeup.

Problems and Experiments

There are a few tasks set up. One is the "Air Cargo Problem" from Artificial Intelligence: A Modern Approach (Russell & Norvig). The original code for the problem is based on the [Udacity implementation](https://github.com/udacity/AIND-Planning), and the full description is in the problem repo.

The Air Cargo problem can be seen as structured prediction: every prediction step changes the state of the problem. The algorithms used to solve it in the book include Graphplan and A* search of the state space, putting it in the same family of problems as the blocks-world problem (SHRDLU) solved in the original paper.
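As a toy illustration of this state-transition view (predicate names follow the AIMA formulation, not this repo's actual encoding), each predicted action tuple can be applied to the state:

```python
# Toy state-transition view of the Air Cargo problem. The state maps
# each object (cargo or plane) to its current location; an action
# tuple mutates it. Names follow the AIMA book, not this repo's code.

def apply_action(state, action):
    name, args = action
    s = dict(state)
    if name == "Load":        # Load(cargo, plane, airport)
        cargo, plane, airport = args
        assert s[cargo] == airport and s[plane] == airport
        s[cargo] = plane      # cargo is now inside the plane
    elif name == "Unload":    # Unload(cargo, plane, airport)
        cargo, plane, airport = args
        assert s[cargo] == plane and s[plane] == airport
        s[cargo] = airport
    elif name == "Fly":       # Fly(plane, from, to)
        plane, src, dst = args
        assert s[plane] == src
        s[plane] = dst
    return s

state = {"C1": "A1", "P1": "A1"}
for act in [("Load", ("C1", "P1", "A1")),
            ("Fly", ("P1", "A1", "A0")),
            ("Unload", ("C1", "P1", "A0"))]:
    state = apply_action(state, act)
# C1 ends up delivered at airport A0
```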

Installation

Requires PyTorch (CUDA not required):

    conda install pytorch torchvision cuda80 -c soumith

Tensorboard

Run with the "--log 10" flag to enable TensorBoard (the number is the logging frequency). Below is a screenshot taken during training. Losses are recorded separately for each entity and type, as well as for the action, for more granular monitoring.


To run with TensorBoard: pip install tensorboardX (the logging API) and pip install tensorflow (for the TensorBoard web server).

Training Scenarios

Planning

The code implements a training schedule as in the paper: start small with the minimum-sized problem (2 entities of each kind).

    python run.py --act plan --iters 1000  --ret_graph 1 --zero_at step --n_phases 20 --opt_at step
    python run.py --act plan --iters 1000  --ret_graph 1 --zero_at step --n_phases 20 --opt_at step --save opt_zero_step
    python run.py --act plan --iters 1000  --ret_graph 0 --opt_at problem --save opt_problem_plan --n_phases 20

We humans would think about the problem in terms of actions and types, so I expected the first thing the DNC would get correct to be the (Action, typeofthing1, typeofthing2, typeofthing3) 'tuple', since those must be correct in order to reliably get the instances correct. This was indeed the case, as can be seen on the 'accuracies' plots during training. By the semantics of the problem, the last 'type' is always Airplane, so that goes to 100% accuracy immediately. The next third of the training bumps the types up to the 0.9-1.0 range. Only then does the loss for the entities themselves start dropping consistently. Even then, ent1 and ent3 were coupled, which in the logic of the problem...
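The per-slot view described above can be computed with something like the following (illustrative only; the repo produces these numbers via its own logging code):

```python
# Per-slot accuracy over (action, ent1, ent2, ent3) predictions,
# mirroring the observation that the action/type slots converge
# before the entity slots do. Illustrative, not the repo's code.
def slot_accuracies(predictions, targets):
    n = len(targets)
    correct = [0, 0, 0, 0]
    for pred, tgt in zip(predictions, targets):
        for i in range(4):
            correct[i] += int(pred[i] == tgt[i])
    return [c / n for c in correct]

preds = [("Load", "C1", "P1", "A1"), ("Fly", "P1", "A1", "A1")]
tgts  = [("Load", "C1", "P1", "A1"), ("Fly", "P1", "A1", "A0")]
slot_accuracies(preds, tgts)  # [1.0, 1.0, 1.0, 0.5]
```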

To show, at each step, details of what was predicted vs. the best moves, specify the --detail flag. You will get something like this:

    trial 978, step 19514 trial accy: 6/7, 0.86, pass total 296/978, running avg 0.7463, loss 0.0774  
    best    Load    ['C1', 'P1', 'A1'], Fly ['P1', 'A1', 'A0']
    chosen: Load    ['C1', 'P1', 'A1'], guided True,    prob 0.25, T? True  ---loss 0.2553
    best    Fly     ['P1', 'A1', 'A0'], Unload ['C1', 'P1', 'A0']
    chosen: Fly     ['P1', 'A1', 'A0'], guided True,    prob 0.33, T? True  ---loss 0.0784
    best    Unload  ['C1', 'P1', 'A0'], Fly ['P0', 'A0', 'A1']
    chosen: Unload  ['C1', 'P1', 'A0'], guided True,    prob 0.33, T? True  ---loss 0.0830
    best    Fly     ['P0', 'A0', 'A1'], Load ['C0', 'P0', 'A1']
    chosen: Fly     ['C0', 'P0', 'A1'], guided True,    prob 0.25, T? False ---loss 0.3716
    best    Load    ['C0', 'P0', 'A1'], Fly ['P0', 'A1', 'A0']
    chosen: Load    ['C0', 'P0', 'A1'], guided True,    prob 0.25, T? True  ---loss 0.1288
    best    Fly     ['P0', 'A1', 'A0'], Unload ['C0', 'P0', 'A0']
    chosen: Fly     ['P0', 'A1', 'A0'], guided False,   prob 0.25, T? True  ---loss 1.1554
    best    Fly     ['P0', 'A1', 'A0'], Unload ['C0', 'P0', 'A0']
    chosen: Unload  ['C0', 'P1', 'A0'], guided True,    prob 0.33, T? False ---loss 0.9901
    best    Unload  ['C0', 'P0', 'A0']
    chosen: Unload  ['C0', 'P0', 'A0'], guided False,   prob 0.33, T? True  ---loss 0.8087
    best    Fly     ['P0', 'A1', 'A0'], Unload ['C0', 'P0', 'A0']
    chosen: Unload  ['C0', 'P0', 'A0'], guided True,    prob 0.33, T? True  ---loss 0.7677

The best actions are what was determined by the problem heuristics (not always optimal, to save time). The chosen action is what the DNC ended up choosing. 'Guided' refers to Beta from the paper. 'Prob' is the probability of choosing that action (out of all legal actions), and the loss is shown as well.
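My reading of the 'prob' column is that the network's distribution is restricted to the legal actions, so a uniform guess over 3 legal actions gives 0.33. A hedged sketch of that restriction (an assumption on my part, not the repo's actual sampling code):

```python
import math
import random

# Restrict logits to the legal action indices, softmax over those,
# and sample; returns the chosen index and its probability mass.
# Illustrative sketch, not the repo's actual code.
def sample_legal(logits, legal_idx, rng=random):
    exps = [math.exp(logits[i]) for i in legal_idx]
    z = sum(exps)
    probs = [e / z for e in exps]
    r, acc = rng.random(), 0.0
    for idx, p in zip(legal_idx, probs):
        acc += p
        if r <= acc:
            return idx, p
    return legal_idx[-1], probs[-1]

idx, p = sample_legal([0.0, 0.0, 0.0, 0.0], [0, 1, 2])
# with uniform logits and 3 legal actions, p is 1/3
```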

Question Answering

Another task I figured would be interesting: give the DNC a problem (initial state and goal), make some moves, then ask where a certain cargo is (which airport is it at? is it in a plane? which plane?). This did not work too well. See the train_qa function in run.py.

    python run.py --act qa --iters 1000 --n_phases 20
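For reference, the ground-truth answer the DNC is asked to produce is easy to compute by replaying the moves (a toy sketch; the repo encodes questions and answers as vectors, not strings):

```python
# Replay the moves against the initial state, then report whether the
# queried cargo is in a plane or at an airport. Toy sketch only.
def where_is(initial, moves, cargo):
    loc = dict(initial)
    for name, (c, plane, airport) in moves:
        if name == "Load" and c == cargo:
            loc[cargo] = plane
        elif name == "Unload" and c == cargo:
            loc[cargo] = airport
    place = loc[cargo]
    kind = "in-plane" if place.startswith("P") else "at-airport"
    return kind, place

where_is({"C1": "A1"},
         [("Load", ("C1", "P1", "A1")),
          ("Fly", ("P1", "A1", "A0"))],
         "C1")  # ("in-plane", "P1")
```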

Training Misc

Other Problems

In an initial pass I tested with the sequence memorization task from the DeepMind repo. I have not tested it recently and I doubt it works (see todo). To run it, specify the corresponding problem flag.

Other Setups

The DNC was tested against vanilla LSTMs. The LSTM appears to get stuck at ~40% on the air cargo problem. To run the training with LSTM only, specify the '--algo lstm' flag like so:

    python run.py --act plan --algo lstm --iters 1000 --n_phases 20 

Misc

Training at each 'level' took 20K steps. This is way more than reported in the paper. On my crappy home CPU, this meant about a day, aka forever. Since I also lost my computer and had to retrain everything, I only got through the first level of training (2 airports, 2 cargos, 2 planes) before having to submit.

Differences from Original

There was some experimentation here, so there are a bunch of flags controlling when to optimize. In the paper, the loss is calculated at the end of each problem. This did not work for me, so I ended up running the optimizer after each response.
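A toy contrast of the two schedules (what the --opt_at / --zero_at flags select between); the 'optimizer' here is just a step counter, so this is purely illustrative:

```python
# Counting optimizer steps under the two schedules: step after every
# response ("step") vs. accumulate loss over the whole problem and
# step once ("problem"). No real gradients involved.
def train_problem(losses, opt_at="step"):
    steps, accumulated = 0, 0.0
    for loss in losses:        # one loss per predicted response
        accumulated += loss
        if opt_at == "step":
            steps += 1         # optimizer.step(); zero the grads
            accumulated = 0.0
    if opt_at == "problem" and accumulated > 0:
        steps += 1             # single step on the summed loss
    return steps

train_problem([0.3, 0.2, 0.1], opt_at="step")     # 3 optimizer steps
train_problem([0.3, 0.2, 0.1], opt_at="problem")  # 1 optimizer step
```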

Loading Previous Run

    python run.py --act plan --iters 1000 --n_phases 20 --load the_saved_name_or_path --save the_new

Flags

Running on floydhub

Set the --env flag to floyd. Once running there, the script will create all the directories in /output. TensorBoard for PyTorch does not appear to work there, for reasons I do not understand.

    floyd run --env pytorch-0.2 --tensorboard "bash setup.sh && python run.py --act dag --iters 1000 --env floyd"

Todo

- Upload best models
- Test the sequence memorization task (probably does not work)
- Implement with GPU
- Faster problem generator
- Fix tensorboard issues
- Gradient clipping
- Visualization of what the DNC is doing internally (per paper)
- Penalty for bad actions when not using the beta coefficient for forcing
- Losses by prediction (fast loss)
- Run the whole LSTM on input and goal state?
- Document args in argparse
- Testing on moar problems
