rocmsys / RET

Licence: other
ROCm Machine Learning and HPC Stack installer

Programming languages: shell, Dockerfile

Welcome to RET (ROCm Enablement Tool)

RET is a tool for checking, setting up, installing, testing and benchmarking the ROCm suite. It automates the whole installation, from dependencies, drivers and toolchain to frameworks and benchmarks, which makes installing ROCm simpler and faster. The workflow has three steps:

  • Install Linux OS
  • Run ret
  • Run your TensorFlow benchmark OR Train your own model with TensorFlow

Hardware support and supported GPUs

Please refer to the ROCm main repository at ROCmInstall.

Getting started

Supported OS

  • Ubuntu:
    • 16.04
    • 18.04
  • CentOS 7.6 (TensorFlow run on Docker)
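The supported-OS list above can be checked before running RET. The following is a minimal sketch, not part of RET itself: `is_supported_os` is a hypothetical helper that only compares a distribution ID and version against the list above. It assumes the ID and version follow the `ID` and `VERSION_ID` conventions of /etc/os-release.

```shell
#!/bin/sh
# Hypothetical helper (not part of RET): succeeds only if the given
# distribution ID and version appear in the supported-OS list above.
is_supported_os() {
    case "$1:$2" in
        ubuntu:16.04 | ubuntu:18.04 | centos:7.6) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: take ID and VERSION_ID from /etc/os-release when it exists.
# Caveat: CentOS 7 reports only the major version ("7") in VERSION_ID,
# so the minor release would have to be read from /etc/centos-release.
if [ -r /etc/os-release ]; then
    . /etc/os-release
    if is_supported_os "${ID:-unknown}" "${VERSION_ID:-unknown}"; then
        echo "supported: ${ID} ${VERSION_ID}"
    else
        echo "not on the supported list: ${ID:-unknown} ${VERSION_ID:-unknown}"
    fi
fi
```

RET has its own system check steps (see the -ns option), so a check like this is only a quick pre-flight test before cloning the repository.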

Prerequisites

Note: RET must be run on a clean system.

Formatting the hard drive and installing a fresh OS is the best option. After the installation, you will need git to download the RET source:

  sudo apt -y install git
  git clone https://github.com/rocmsys/RET.git

Usage

  sudo ./ret <command> [<option>]

For example:

  sudo ./ret install rocm
  sudo ./ret install tensorflow

Command options

Command:
              [install]   <Package>              : Install ROCm or ML Framework TF/PT
              [remove]    <Package>              : Remove ROCm or ML Framework TF/PT
              [benchmark] <Packages> <Model>     : Run benchmark for specific ML Framework
              [build] <Container> <ImageName>    : Build ROCm Container either with Docker or Singularity

   Packages:
              [rocm]                             : ROCm-dkms packages
              [tensorflow]                       : TensorFlow framework

   Model:
              [resnet56]                         : ResNet-56 model. Default Model

   Container:
              [docker]                           : Build Docker Container
              [singularity]                      : Build Singularity Container
              [ImageName]                        : Choosing an OS Base Image. Default is [ubuntu:18.04]

Options:
              [-py2|-py3]                        : Python version. Default is Python3
              [-h|--help]                        : Show this help message
              [-v|--version]                     : Show version of this package
              [-V|--verbose]                     : Be verbose
              [-d|--debug]                       : Enable Debug Mode
              [-y|--yes]                         : Skip confirmation message
              [-ns|--nsc]                        : Skip system check steps
              [-nv|--nov]                        : Skip verification steps
              [-ic|--incontainer]                : Run RET on top of Container
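
To illustrate how the commands, packages and options above combine, here is a small dry-run sketch. `ret_cmd` is a hypothetical helper, not part of RET: it checks its first two arguments against the names listed above and prints the resulting command line without running it. It assumes that options follow the command and its arguments, as the usage line `sudo ./ret <command> [<option>]` suggests.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not part of RET): validates the command and
# its first argument against the help text above, then prints the command
# line instead of executing it.
ret_cmd() {
    if [ "$#" -lt 2 ]; then
        echo "usage: ret_cmd <command> <package|container> [args...]" >&2
        return 1
    fi
    command_name="$1"
    target="$2"
    shift 2
    case "$command_name" in
        install | remove)
            case "$target" in
                rocm | tensorflow) ;;
                *) echo "unknown package: $target" >&2; return 1 ;;
            esac ;;
        benchmark)
            case "$target" in
                tensorflow) ;;
                *) echo "unknown framework: $target" >&2; return 1 ;;
            esac ;;
        build)
            case "$target" in
                docker | singularity) ;;
                *) echo "unknown container type: $target" >&2; return 1 ;;
            esac ;;
        *) echo "unknown command: $command_name" >&2; return 1 ;;
    esac
    line="sudo ./ret $command_name $target"
    for arg in "$@"; do
        line="$line $arg"
    done
    echo "$line"
}

# Examples using only names and options from the help text above.
ret_cmd install rocm -y -V
ret_cmd benchmark tensorflow resnet56
ret_cmd build docker ubuntu:18.04
```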

Run RET

   cd RET
   sudo ./ret install rocm         # install ROCm stack
   sudo reboot
   sudo ./ret install tensorflow   # install TensorFlow
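
Because the sequence above is interrupted by a reboot, it can help to know which step comes next. The sketch below is a rough heuristic, not something RET provides: `next_ret_step` is a hypothetical helper. It assumes that ROCm installs under /opt/rocm and that the ROCm kernel driver exposes the /dev/kfd device node once it is loaded. Both paths can be overridden, and the function only prints a suggestion.

```shell
#!/bin/sh
# Hypothetical helper (not part of RET): suggests the next step of the
# install sequence above. Assumed defaults: ROCm lives in /opt/rocm and
# the loaded kernel driver provides /dev/kfd.
next_ret_step() {
    rocm_dir="${1:-/opt/rocm}"
    kfd_node="${2:-/dev/kfd}"
    if [ ! -d "$rocm_dir" ]; then
        # No ROCm directory yet: the stack has not been installed.
        echo "sudo ./ret install rocm"
    elif [ ! -e "$kfd_node" ]; then
        # ROCm is installed but the driver node is missing: reboot first.
        echo "sudo reboot"
    else
        echo "sudo ./ret install tensorflow"
    fi
}

# Print the suggested next step for this machine.
next_ret_step
```

On systems whose stock kernel already provides /dev/kfd, the heuristic cannot tell whether the reboot has happened, so treat its output as a hint only.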

TensorFlow benchmarks

Details on the benchmarks can be found at this Link.

Here are the basic instructions to run the ResNet-56 benchmark:

sudo ./ret benchmark tensorflow resnet56

You can also run the TensorFlow benchmarks directly.

Download the TensorFlow benchmark:

git clone https://github.com/reger-men/tensorflow_benchmark.git

Run the training benchmark (e.g. ResNet-56):

python3 train.py

Note: You may need to specify the number of GPUs with --num_gpus=YOUR_GPU_NUMBER
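
One way to fill in YOUR_GPU_NUMBER is to count the GPU render nodes on the machine. The sketch below is an assumption-laden illustration, not part of RET or the benchmark repository: `count_render_nodes` is a hypothetical helper that counts renderD* entries under /dev/dri, which includes every GPU with a DRM render node (integrated or non-AMD GPUs too). It prints the resulting command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of RET): counts DRM render nodes, used here
# as a rough proxy for the number of GPUs. The directory can be overridden.
count_render_nodes() {
    dri_dir="${1:-/dev/dri}"
    count=0
    for node in "$dri_dir"/renderD*; do
        # The glob stays unexpanded when nothing matches, so test each entry.
        if [ -e "$node" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

num_gpus="$(count_render_nodes)"
if [ "$num_gpus" -gt 0 ]; then
    # Dry run: print the benchmark command with the detected GPU count.
    echo "python3 train.py --num_gpus=${num_gpus}"
else
    echo "no render nodes found under /dev/dri" >&2
fi
```

If the machine has GPUs that should not take part in training, set the number by hand instead.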

ToDo Checklist

  • Support Ubuntu 16.04
  • Support Ubuntu 18.04
  • Support CentOS 7.6
  • Support RHEL 7.6
  • TensorFlow on Ubuntu
  • TensorFlow on CentOS
  • TensorFlow on RHEL
  • PyTorch on Ubuntu
  • PyTorch on CentOS
  • PyTorch on RHEL
  • Check System Compatibility
  • Check HW Compatibility
  • Adapt RET on top of Docker Container
  • Cloud Support
