
AStupidBear / FluxUtils.jl

Licence: other
Sklearn Interface and Distributed Training for Flux.jl

Programming Languages

julia

Projects that are alternatives of or similar to FluxUtils.jl

t8code
Parallel algorithms and data structures for tree-based AMR with arbitrary element shapes.
Stars: ✭ 37 (+208.33%)
Mutual labels:  mpi
hpc
Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc. )
Stars: ✭ 39 (+225%)
Mutual labels:  mpi
sst-core
SST Structural Simulation Toolkit Parallel Discrete Event Core and Services
Stars: ✭ 82 (+583.33%)
Mutual labels:  mpi
ravel
Ravel MPI trace visualization tool
Stars: ✭ 26 (+116.67%)
Mutual labels:  mpi
scr
SCR caches checkpoint data in storage on the compute nodes of a Linux cluster to provide a fast, scalable checkpoint / restart capability for MPI codes.
Stars: ✭ 84 (+600%)
Mutual labels:  mpi
Keras catVSdog tf estimator
Source for post "An Easy Guide to build new TensorFlow Datasets and Estimator with Keras Model"
Stars: ✭ 32 (+166.67%)
Mutual labels:  estimator
ParMmg
Distributed parallelization of 3D volume mesh adaptation
Stars: ✭ 19 (+58.33%)
Mutual labels:  mpi
XH5For
XDMF parallel partitioned mesh I/O on top of HDF5
Stars: ✭ 23 (+91.67%)
Mutual labels:  mpi
mpiBench
MPI benchmark to test and measure collective performance
Stars: ✭ 39 (+225%)
Mutual labels:  mpi
gslib
sparse communication library
Stars: ✭ 22 (+83.33%)
Mutual labels:  mpi
Singularity-tutorial
Singularity 101
Stars: ✭ 31 (+158.33%)
Mutual labels:  mpi
arbor
The Arbor multi-compartment neural network simulation library.
Stars: ✭ 87 (+625%)
Mutual labels:  mpi
raptor
General, high performance algebraic multigrid solver
Stars: ✭ 50 (+316.67%)
Mutual labels:  mpi
Theano-MPI
MPI Parallel framework for training deep learning models built in Theano
Stars: ✭ 55 (+358.33%)
Mutual labels:  mpi
faabric
Messaging and state layer for distributed serverless applications
Stars: ✭ 39 (+225%)
Mutual labels:  mpi
az-hop
The Azure HPC On-Demand Platform provides an HPC Cluster Ready solution
Stars: ✭ 33 (+175%)
Mutual labels:  mpi
Galaxy
Galaxy is an asynchronous parallel visualization ray tracer for performant rendering in distributed computing environments. Galaxy builds upon Intel OSPRay and Intel Embree, including ray queueing and sending logic inspired by TACC GraviT.
Stars: ✭ 18 (+50%)
Mutual labels:  mpi
SWCaffe
A Deep Learning Framework customized for Sunway TaihuLight
Stars: ✭ 37 (+208.33%)
Mutual labels:  mpi
EDLib
Exact diagonalization solver for quantum electron models
Stars: ✭ 18 (+50%)
Mutual labels:  mpi
SIRIUS
Domain specific library for electronic structure calculations
Stars: ✭ 87 (+625%)
Mutual labels:  mpi

Sklearn Interface and Distributed Training for Flux.jl

Installation

using Pkg
pkg"add FluxUtils"

Usage

using Flux, FluxUtils
using FluxUtils: fit!, predict!

Sklearn Interface for Time Series Prediction

First, define a simple LSTM network model.

model = Chain(LSTM(10, 10), Dense(10, 1)) |> gpu

Then, specify the loss function for this model. xs and ys are each a Vector of AbstractArrays of length seqsize.

loss = function (m, xs, ys)
    l, seqsize = 0f0, length(xs)
    for t in 1:seqsize
        x, y = xs[t], ys[t]
        l += mse(m(x), y)
    end
    return l / Float32(seqsize)
end
# The above is equivalent to 
# loss = seqloss(mse)

A spec is also needed. It can be a NamedTuple, Dict, or custom struct with at least the following three fields defined.

spec = (epochs = 2, batchsize = 20, seqsize = 10)

Finally, create the optimizer opt and estimator est.

opt = ADAMW(1f-3)
est = Estimator(model, loss, opt, spec)

You can use fit! to fit this estimator just like fit in sklearn, with minibatching, logging, parameter synchronization, and callbacks all handled internally. fit! first creates a minibatch sequence iterator from x and y, with batch size and sequence length taken from spec (@unpack batchsize, seqsize = spec), implementing truncated backpropagation through time.

F = 10     # feature dimension
N = 10000  # batch dimension
T = 1000   # time dimension
x = zeros(Float32, F, N, T)
y = zeros(Float32, 1, N, T)
fit!(est, x, y)

After the model is trained, you can use predict! to fill the preallocated ŷ with the predictions of est on x (preallocation is required because it is difficult to infer the output shape of a model without running it).

ŷ = fill!(similar(y), 0)
predict!(ŷ, est, x)

Note that the type of x, y, or ŷ is not restricted to AbstractArray; it can also be a Tuple of AbstractArrays. This is similar to the notion of multiple inputs and outputs in Keras.
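For example, multi-input data can be passed as a tuple. Here is a minimal sketch; the shapes, and the assumption that the model and loss are written to unpack the tuple, are hypothetical:

# Hypothetical shapes: two input streams and one target, each laid out as (features, batch, time).
F1, F2, N, T = 10, 5, 1000, 100
x = (zeros(Float32, F1, N, T), zeros(Float32, F2, N, T))  # tuple of two inputs
y = zeros(Float32, 1, N, T)                               # single output
fit!(est, x, y)  # the model and loss must be written to consume the tuple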

If you are not dealing with time series problems, just add a dummy time dimension to your data. If your input is multidimensional, for example size(x) == (W, H, C, N, T), you can reshape it to three dimensions (F, N, T) and reshape it back inside the definition of m(x), like this:

function (m::MyModel)(x)
    # x arrives flattened as (F, N); restore the original spatial layout
    x = reshape(x, W, H, C, N)
    # ... rest of the forward pass
end
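Concretely, the flattening before fit! and the dummy time dimension for non-sequential data might look like the following sketch (the dimension sizes are hypothetical):

# Hypothetical dimension sizes.
W, H, C, N, T = 28, 28, 3, 256, 5
ximg = rand(Float32, W, H, C, N, T)

# Flatten the spatial and channel dimensions into a single feature dimension: (F, N, T).
x = reshape(ximg, W * H * C, N, T)

# For a non-sequential problem, append a dummy time dimension of length 1.
xstatic = rand(Float32, W * H * C, N)
xseq = reshape(xstatic, size(xstatic)..., 1)   # (F, N, 1)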

Distributed Training with MPI

Distributed training can be achieved with MPI by adding just a couple of lines of code. Internally, fit! intercepts Flux's parameter update step, applies Allreduce to average gradients across processes, and then continues the update. It also synchronizes parameters by broadcasting them from rank 0 to the other ranks before backpropagation starts.
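For intuition, the gradient-averaging and broadcast steps can be pictured roughly as below. This is an illustrative sketch using MPI.jl directly, not FluxUtils' actual implementation:

using MPI

MPI.Init()
comm = MPI.COMM_WORLD
nprocs = MPI.Comm_size(comm)

# Illustrative only: average a flattened gradient vector across all ranks,
# then broadcast parameters from rank 0 so every process stays in sync.
grad = rand(Float32, 100)            # stand-in for a flattened gradient
MPI.Allreduce!(grad, MPI.SUM, comm)  # sum gradients from all ranks in place
grad ./= nprocs                      # turn the sum into an average

params = rand(Float32, 100)          # stand-in for flattened parameters
MPI.Bcast!(params, 0, comm)          # rank 0's parameters overwrite the rest

MPI.Finalize()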

If you want to train on NVIDIA GPUs, make sure you have built MPI with CUDA support (see link).

A template might look like this (run with mpirun -np 4 julia *):

using MPI
MPI.Init()
# ... code to load data
# ... code to define est
fit!(est, x, y)
# ... code to predict
MPI.Finalize()

The complete example is located at test/mpi.jl.

Dealing with Big Data

Because the data is only sliced lazily during training, you can use memory mapping to read large datasets. HDF5.jl is recommended for this kind of usage. The function h5concat in HDF5Utils.jl can help you concatenate a large number of files into a single file efficiently.

using HDF5
x, y = h5open("data.h5", "r") do fid
    readmmap(fid["x"]), readmmap(fid["y"])
end

Utility Functions

m = Chain(LSTM(10, 10), Dense(10, 1))

untrack all tracked objects in m

notrack(m)

concatenate all parameters of a model to a single vector

v = net2vec(m)

copy v to parameters of m

vec2net!(m, v)

concatenate all gradients of a model to a single vector

net2grad(m)

get all parameters of a model with names

namedparams(m)

get all states of a model

s = states(m)

load states s back into model m

loadstates!(m, s)

get all weights of a model (without biases), useful for regularization

weights(m)

batch matrix-matrix product (can be differentiated)

bmm(A, B)
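Taken together, a small sketch of how a few of these utilities might be combined (function names as listed above; the exact return types are assumptions):

# Round-trip parameters through a flat vector (useful e.g. for MPI synchronization).
v = net2vec(m)          # flatten every parameter into one vector
vec2net!(m, v)          # write the (possibly modified) vector back into m

# Snapshot and restore recurrent state around an evaluation.
s = states(m)           # capture the LSTM hidden states
# ... run the model on some data ...
loadstates!(m, s)       # restore the captured states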