AutoGrad


AutoGrad.jl is an automatic differentiation package for Julia. It started as a port of the popular Python autograd package and forms the foundation of the Knet Julia deep learning framework. AutoGrad can differentiate regular Julia code that includes loops, conditionals, helper functions, closures, etc. by keeping track of the primitive operations and using this execution trace to compute gradients. It uses reverse-mode differentiation (a.k.a. backpropagation), so it can efficiently handle functions with large array inputs and scalar outputs. It can also compute gradients of gradients to handle higher-order derivatives.

Installation

You can install AutoGrad in Julia using:

julia> using Pkg; Pkg.add("AutoGrad")

To use it in your code, start with:

using AutoGrad

Interface

x = Param([1,2,3])		# user declares parameters
x => P([1,2,3])			# they are wrapped in a struct
value(x) => [1,2,3]		# we can get the original value
sum(abs2,x) => 14		# they act like regular values outside of differentiation
y = @diff sum(abs2,x)	        # if you want the gradients
y => T(14)			# you get another struct
value(y) => 14			# which represents the same value
grad(y,x) => [2,4,6]	        # but also contains gradients for all Params
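
Because differentiation works by tracing the primitives that actually execute, ordinary Julia control flow needs no special treatment. Here is a minimal sketch (the poly function and its coefficients are made up for illustration):

function poly(w, x)             # evaluate w[1] + w[2]*x + w[3]*x^2 + ... with a plain loop
    y = 0.0
    for i in 1:length(w)
        y += w[i] * x^(i-1)
    end
    return y
end

w = Param([1.0, 2.0, 3.0])      # coefficients we want gradients for
J = @diff poly(w, 2.0)          # records the operations executed for this particular input
value(J)                        # => 17.0
grad(J, w)                      # => [1.0, 2.0, 4.0], i.e. x^(i-1) for each coefficient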

Old Interface

Before v1.1, AutoGrad only supported the following grad interface, which is still supported.

x = [1,2,3]
f(x) = sum(abs2,x)
g = grad(f)
f(x) => 14
g(x) => [2,4,6]
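
The gradients of gradients mentioned above can be obtained by differentiating the gradient function itself. Here is a minimal sketch with a scalar function (a sketch only; it assumes scalar arguments are handled just like the array examples above):

f(x) = x^3      # f'(x) = 3x^2, f''(x) = 6x
g = grad(f)     # first derivative
h = grad(g)     # second derivative, obtained by differentiating g
g(3.0)          # => 27.0
h(3.0)          # => 18.0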

Example

Here is a linear regression example using callable objects:

using LinearAlgebra: axpy!	# axpy!(a,x,y) updates y in place: y .+= a .* x

struct Linear; w; b; end		# user defines a model
(f::Linear)(x) = (f.w * x .+ f.b)

# Initialize a model as a callable object with parameters:
f = Linear(Param(randn(10,100)), Param(randn(10)))

# SGD training loop over (x,y) minibatches (see the note on data below):
for (x,y) in data
    loss = @diff sum(abs2,f(x)-y)
    for w in params(f)
        g = grad(loss,w)
        axpy!(-0.01, g, w)	# w .-= 0.01 .* g
    end
end
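
The data variable above stands in for any iterable of (x,y) minibatches. Purely as an illustration, a synthetic dataset matching the 100-dimensional inputs and 10-dimensional outputs of this model could be generated as follows:

data = [(randn(100,32), randn(10,32)) for _ in 1:100]   # 100 random minibatches of size 32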

See the examples directory for more examples.

Extending AutoGrad

AutoGrad can only handle a function if the primitives it uses have known gradients. You can add your own primitives with gradients using the @primitive and @zerograd macros defined in macros.jl. Here is an example:

@primitive log(x),dy,y  (dy .* (1 ./ x))

The @primitive macro marks the log(::Any) method as a new primitive, and the expression that follows defines its gradient with respect to the first argument. The gradient expression can refer to the parameter(s) x, the return variable y, and its gradient dy; the names of y and dy are (optionally) given after the argument list in the method declaration. For functions with multiple inputs, multiple gradient expressions may be given. Non-existent or zero gradients can be specified by omitting a gradient expression or using nothing in place of one. By default the broadcasting version log.(x) is also defined as a primitive; use the @primitive1 macro if you don't want this.
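
For example, a two-argument primitive takes one gradient expression per argument, and a function with no useful derivative can be registered with @zerograd. The functions below (hypot, sign) are only illustrative; AutoGrad may already define rules for them:

# y = hypot(x1,x2) = sqrt(x1^2 + x2^2), so dy/dx1 = x1/y and dy/dx2 = x2/y:
@primitive hypot(x1,x2),dy,y  (dy .* x1 ./ y)  (dy .* x2 ./ y)

# sign is piecewise constant, so its gradient is zero wherever it is defined:
@zerograd sign(x)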

Note that Julia supports multiple dispatch, i.e. a function may have multiple methods, each accepting different argument types. For example, log(::Float32) and log(::BigFloat) are two different log methods. In AutoGrad.jl, each method can be registered independently as a primitive and can have its own specific gradient. Generally, however, AutoGrad defines gradients without using argument types to keep the rules generic.
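
If a type-specific rule is needed, the method declaration given to @primitive can include argument types. A hypothetical sketch (mylog is a made-up function, used here to avoid clashing with the log primitive defined above):

mylog(x) = log(x)

# Only the Float32 method becomes a primitive with this rule;
# calls with other argument types are differentiated through the function body as usual.
@primitive mylog(x::Float32),dy,y  (dy .* (1f0 ./ x))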

Debugging and Profiling

To view the contents of the computational graph after differentiating a function, you can use the following:

julia> AutoGrad.gcnode(::AutoGrad.Node)=nothing  # without this some values may be lost
julia> w = Param(rand(2,3)); b = Param(rand(2,1)); x = rand(3,4); y = rand(2,4);
julia> J = @diff sum(abs2, w*x .+ b - y)
T(14.695603907991153)
julia> [J]  # displaying J in an Array causes pretty printing
1. P(Array{Float64,2}(2,3)) ∇=Array{Float64,2}(2,3)
2. Array{Float64,2}(2,4) = *(Array{Float64,2}(2,3), Array{Float64,2}(3,4)) ∇=Array{Float64,2}(2,4)
3. P(Array{Float64,2}(2,1)) ∇=Array{Float64,2}(2,1)
4. Array{Float64,2}(2,4) = broadcast(+, Array{Float64,2}(2,4), Array{Float64,2}(2,1)) ∇=Array{Float64,2}(2,4)
5. Array{Float64,2}(2,4) = -(Array{Float64,2}(2,4), Array{Float64,2}(2,4)) ∇=Array{Float64,2}(2,4)
6. 14.695603907991153 = sum(abs2, Array{Float64,2}(2,4)) ∇=1.0
julia> z = collect(J.list)  # collect creates a Node array with reverse order
julia> dump(z[5], maxdepth=1)  # allowing you to look at individual Nodes and Values
AutoGrad.Node
  Value: AutoGrad.Result{Array{Float64,2}}
  parents: Array{AutoGrad.Node}((2,))
  children: Array{AutoGrad.Node}((1,))
  outgrad: Array{Float64}((2, 4)) [3.82753 2.19124 3.26769 3.0075; 2.81565 2.3903 1.84373 1.60228]
  cdr: AutoGrad.Node
julia> dump(z[5].Value, maxdepth=2)
AutoGrad.Result{Array{Float64,2}}
  value: Array{Float64}((2, 4)) [1.16724 1.07224 0.935047 0.895262; 0.687182 0.589704 0.517114 0.495718]
  func: * (function of type typeof(*))
  args: Tuple{Param{Array{Float64,2}},Array{Float64,2}}
    1: Param{Array{Float64,2}}
    2: Array{Float64}((3, 4)) [0.515282 0.257471 0.140791 0.127632; 0.705288 0.783289 0.361965 0.311965; 0.780549 0.691645 0.853317 0.843374]
  kwargs: Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}
    data: NamedTuple{(),Tuple{}} NamedTuple()
    itr: Tuple{} ()

To profile AutoGrad using TimerOutputs.jl, set the environment variable ENV["AUTOGRAD_TIMER"]="true" and rebuild AutoGrad with Pkg.build("AutoGrad") before evaluating using AutoGrad. For performance reasons, AUTOGRAD_TIMER is only checked at compile time, not at run time. This collects detailed timing information but slows the code down; when you are done, don't forget to delete!(ENV,"AUTOGRAD_TIMER") and rebuild AutoGrad. In the example below, the symbol sum indicates the time spent on the forward pass of the sum function, and sum[2] indicates the time spent on the backward pass for its second argument. record and sum_outgrads are functions internal to AutoGrad.

julia> ENV["AUTOGRAD_TIMER"]="true"
julia> using Pkg; Pkg.build("AutoGrad")
julia> using AutoGrad, TimerOutputs
julia> reset_timer!(AutoGrad.to)
julia> w = Param(rand(2,3)); b = Param(rand(2,1)); x = rand(3,4); y = rand(2,4);
julia> J = @diff sum(abs2, w*x .+ b - y)
julia> AutoGrad.to
───────────────────────────────────────────────────────────────────────
                                Time                   Allocations      
                        ──────────────────────   ───────────────────────
    Tot / % measured:        4.62s / 30.4%            546MiB / 25.0%    

 Section        ncalls     time   %tot     avg     alloc   %tot      avg
 ───────────────────────────────────────────────────────────────────────
 +.[2]               1    328ms  23.3%   328ms   46.4MiB  34.1%  46.4MiB
 sum[2]              1    288ms  20.5%   288ms   40.0MiB  29.4%  40.0MiB
   *                 1   38.8ms  2.76%  38.8ms    595KiB  0.43%   595KiB
 *                   1    269ms  19.2%   269ms    955KiB  0.68%   955KiB
 +.                  1    139ms  9.92%   139ms   20.4MiB  15.0%  20.4MiB
 *[1]                1    117ms  8.33%   117ms   9.41MiB  6.90%  9.41MiB
 record              4   88.7ms  6.31%  22.2ms   3.49MiB  2.56%   894KiB
 -[1]                1   65.9ms  4.69%  65.9ms   10.0MiB  7.32%  10.0MiB
 -                   1   55.8ms  3.97%  55.8ms    929KiB  0.67%   929KiB
 sum                 1   50.0ms  3.56%  50.0ms   4.68MiB  3.44%  4.68MiB
 +.[1]               1   1.78ms  0.13%  1.78ms   37.7KiB  0.03%  37.7KiB
 sum_outgrads        5   1.41ms  0.10%   282ΞΌs   28.2KiB  0.02%  5.64KiB
 ───────────────────────────────────────────────────────────────────────

Code structure

core.jl implements the main functionality and acts as the main documentation source. macros.jl has some support functions to define and test new primitives. getindex.jl, iterate.jl and cat.jl set up support for common data structures including Arrays, Tuples, and Dictionaries. The numerical gradients are defined in files such as base.jl and math.jl.

Current status and future work

Gradient coverage and unit testing are still spotty; I am adding more gradients and tests to cover the Julia base. Documentation needs to be improved. Overwriting functions (e.g. setindex!) are not supported. Efficiency could be improved by reducing runtime compilation, by memoization, and by supporting static computation.

Acknowledgments and references

AutoGrad.jl was written by Deniz Yuret. Parts of the code were initially ported from the Python autograd package. I'd like to thank autograd author Dougal Maclaurin for his support. See (Baydin et al. 2015) for a general review of automatic differentiation, the autograd tutorial for some Python examples, and Dougal's PhD thesis for design principles. JuliaDiff and FluxML provide alternative differentiation tools for Julia. I would like to thank the current contributors:

  • Carlo Lucibello
  • Ekin Akyürek
  • Emre Yolcu
  • Jarrett Revels
  • Mike Innes
  • Ozan Arkan Can
  • Rene Donner

The suggested citation for AutoGrad is:

@inproceedings{knet2016mlsys,
  author={Yuret, Deniz},
  title={Knet: beginning deep learning with 100 lines of Julia},
  year={2016},
  booktitle={Machine Learning Systems Workshop at NIPS 2016}
}