NVIDIA / Amgx

Licence: bsd-3-clause
Distributed multigrid linear solver library on GPU

Algebraic Multigrid Solver (AmgX) Library

AmgX is a GPU-accelerated core solver library that speeds up the computationally intense linear solver portion of simulations. The library includes a flexible solver composition system that allows a user to easily construct complex nested solvers and preconditioners. The library is well suited for implicit unstructured methods. The AmgX library offers optimized methods for massive parallelism and the flexibility to choose how the solvers are constructed, and it is accessible through a simple C API that abstracts the parallelism and scales across a single GPU or multiple GPUs using user-provided MPI.

This is the source of the AMGX library on the NVIDIA Registered Developer Program portal.

Key features of the library include:

  • fp32, fp64 and mixed precision solve
  • Complex datatype support (currently limited)
  • Scalar or coupled block systems
  • Distributed solvers using provided MPI
  • Flexible configuration allows for nested solvers, smoothers and preconditioners
  • Classical (Ruge-Stüben) and Unsmoothed Aggregation algebraic multigrid
  • Krylov methods: CG, BiCGSTAB, GMRES, etc. with optional preconditioning
  • Various smoothers: Jacobi, Gauss-Seidel, Incomplete LU, Chebyshev Polynomial, etc.
  • Many algorithm parameters exposed through solver configuration in JSON format
  • Modular structure for easy implementation of your own methods
  • Linux and Windows support
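
As a rough illustration of the nested composition and JSON configuration mentioned in the list above, the sketch below builds a small configuration in which a Krylov solver wraps an AMG preconditioner, which in turn wraps a smoother. The key names and values (config_version, solver, preconditioner, smoother, algorithm, max_iters, tolerance) and the output file name my_config.json are assumptions modeled on the sample files in core/configs, not an authoritative schema; check them against those samples before relying on them.

```python
import json

# Hypothetical nested configuration: FGMRES preconditioned by aggregation AMG,
# which uses a block Jacobi smoother. All key names and values are assumptions;
# verify them against the sample files shipped in core/configs.
config = {
    "config_version": 2,
    "solver": {
        "solver": "FGMRES",
        "max_iters": 100,
        "tolerance": 1e-6,
        "preconditioner": {
            "solver": "AMG",
            "algorithm": "AGGREGATION",
            "smoother": {"solver": "BLOCK_JACOBI"},
        },
    },
}


def solver_chain(node):
    """Return the solver names from the outermost solver to the innermost."""
    chain = [node["solver"]]
    for child_key in ("preconditioner", "smoother"):
        if isinstance(node.get(child_key), dict):
            chain.extend(solver_chain(node[child_key]))
    return chain


def write_config(cfg, path):
    """Serialize the configuration dictionary to a JSON file."""
    with open(path, "w") as f:
        json.dump(cfg, f, indent=4)


if __name__ == "__main__":
    write_config(config, "my_config.json")
    print(" -> ".join(solver_chain(config["solver"])))
```

Keeping the configuration as a plain dictionary makes it easy to swap the smoother or preconditioner in a script and regenerate the file for each experiment.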

Quickstart

The following instructions show how to build the library and run an example solver on a matrix stored in a Matrix Market format file. By default, the provided examples use a vector of ones as the RHS of the linear system and a vector of zeros as the initial solution. To provide your own values for the RHS and the initial solution, edit the examples.
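
To try the examples on your own matrix, you need a file in Matrix Market format. The sketch below writes a small sparse matrix in the standard coordinate layout (header line, size line, then one 1-based "row column value" triple per line). The function names, the file name poisson1d.mtx and the tridiagonal test matrix are illustrative choices, not part of AMGX.

```python
def write_matrix_market(path, n_rows, n_cols, entries):
    """Write a sparse matrix in Matrix Market coordinate format.

    entries is a list of (row, col, value) tuples with 0-based indices;
    the file format itself uses 1-based indices.
    """
    with open(path, "w") as f:
        f.write("%%MatrixMarket matrix coordinate real general\n")
        f.write(f"{n_rows} {n_cols} {len(entries)}\n")
        for row, col, value in entries:
            f.write(f"{row + 1} {col + 1} {value!r}\n")


def tridiagonal_entries(n):
    """Entries of the n x n matrix tridiag(-1, 2, -1) (1D Poisson stencil)."""
    entries = []
    for i in range(n):
        if i > 0:
            entries.append((i, i - 1, -1.0))
        entries.append((i, i, 2.0))
        if i < n - 1:
            entries.append((i, i + 1, -1.0))
    return entries


if __name__ == "__main__":
    # Hypothetical output file; pass it to an example binary with -m.
    write_matrix_market("poisson1d.mtx", 4, 4, tridiagonal_entries(4))
```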

Dependencies and requirements

To build the project you need CMake and the CUDA Toolkit. To try the distributed version of the AMGX library you also need an MPI implementation, such as OpenMPI for Linux or MPICH for Windows. You need a compiler with C++11 support (for example GCC 4.8 or MSVC 14.0). You also need an NVIDIA GPU with Compute Capability >=3.0; NVIDIA's list of CUDA-enabled GPUs shows whether yours qualifies.
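
Before configuring, it can help to confirm that the required tools are on your PATH. The sketch below is one way to do that; the executable names cmake, nvcc and mpirun are the usual ones but are an assumption here (some MPI distributions ship mpiexec instead), and the helper names are hypothetical.

```python
import shutil


def find_build_tools(names=("cmake", "nvcc", "mpirun")):
    """Map each tool name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


def compute_capability_ok(major, minor, minimum=(3, 0)):
    """True if (major, minor) meets the minimum Compute Capability."""
    return (major, minor) >= minimum


if __name__ == "__main__":
    for name, path in find_build_tools().items():
        print(f"{name}: {path or 'not found'}")
```

A missing mpirun is only a problem if you want the distributed build; the single GPU build does not need MPI.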

Building

Typical build commands from the project root:

mkdir build
cd build
cmake ../
make -j16 all

There are a few custom CMake flags that you can use:

  • CUDA_ARCH: List of virtual architecture values that the CMakeLists file translates into the corresponding nvcc flags. For example:
cmake ....  -DCUDA_ARCH="35 52 60" ....
  • CMAKE_NO_MPI: Boolean value. If True, a non-MPI (single GPU) build is forced. This results in a smaller library that can run on systems without MPI installed. If not specified, the MPI build is enabled when the FindMPI script finds an MPI installation.
  • AMGX_NO_RPATH: Boolean value. By default CMake adds -rpath flags to binaries. Setting this flag to True tells CMake not to do that, which is useful for controlling the execution environment.
  • MKL_ROOT_DIR and MAGMA_ROOT_DIR: String values. MAGMA/MKL functionality is used to accelerate some of the AMGX eigensolvers. Those solvers return the error 'not supported' if AMGX was not built with MKL/MAGMA support.
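
If you script your builds, the flags above can be assembled programmatically. The following sketch builds the cmake argument list from optional settings; the helper name cmake_command and its parameters are hypothetical, while the flag names are the ones listed above.

```python
def cmake_command(source_dir="../", cuda_arch=None, no_mpi=None, no_rpath=None):
    """Build the argument list for a cmake invocation from optional flags."""
    cmd = ["cmake", source_dir]
    if cuda_arch:
        # Architectures are passed as one space-separated value.
        cmd.append("-DCUDA_ARCH=" + " ".join(str(a) for a in cuda_arch))
    if no_mpi is not None:
        cmd.append(f"-DCMAKE_NO_MPI={no_mpi}")
    if no_rpath is not None:
        cmd.append(f"-DAMGX_NO_RPATH={no_rpath}")
    return cmd


if __name__ == "__main__":
    import subprocess

    # Run from the project root after creating the build directory.
    cmd = cmake_command(cuda_arch=[35, 52, 60], no_mpi=True)
    subprocess.run(cmd, cwd="build", check=True)
```

Passing the arguments as a list avoids shell quoting problems with the space-separated CUDA_ARCH value.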

CMakeLists uses the FindCUDA and FindMPI module scripts to locate the corresponding software, so refer to those scripts in your CMake installation for module-specific flags.

The build produces shared and static libraries (libamgxsh.so or amgxsh.dll, and libamgx.a or amgx.lib) and a few binaries from the 'examples' directory that demonstrate various parts of the AMGX C API. MPI examples are built only if the MPI build was enabled.

Running examples

A sample input matrix, matrix.mtx, is in the examples directory. Sample AMGX solver configurations are located in the core/configs directory in the root folder. Make sure that the examples can find the AMGX shared library: by default the -rpath flag is used for binaries, but you can also specify the path manually in an environment variable, LD_LIBRARY_PATH on Linux or PATH on Windows.
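
If the binaries were built without -rpath, the library directory has to be added to the search path before launching an example. The sketch below prepends a directory to LD_LIBRARY_PATH (or PATH on Windows) in a copy of the environment. The helper name library_search_env and the library directory "." are assumptions; point it at wherever your build placed the shared library.

```python
import os
import sys


def library_search_env(lib_dir, base_env=None, platform=sys.platform):
    """Return a copy of the environment with lib_dir prepended to the
    library search variable: PATH on Windows, LD_LIBRARY_PATH otherwise."""
    env = dict(os.environ if base_env is None else base_env)
    is_windows = platform.startswith("win")
    var = "PATH" if is_windows else "LD_LIBRARY_PATH"
    sep = ";" if is_windows else ":"
    current = env.get(var, "")
    env[var] = lib_dir + (sep + current if current else "")
    return env


if __name__ == "__main__":
    import subprocess

    # Hypothetical library location: the current (build) directory.
    env = library_search_env(os.path.abspath("."))
    subprocess.run(
        ["examples/amgx_capi",
         "-m", "../examples/matrix.mtx",
         "-c", "../core/configs/FGMRES_AGGREGATION.json"],
        env=env,
        check=True,
    )
```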

Running the single GPU example from the build directory:

> examples/amgx_capi -m ../examples/matrix.mtx -c ../core/configs/FGMRES_AGGREGATION.json
AMGX version 2.0.0-public-build125
Built on Oct  7 2017, 04:51:11
Compiled with CUDA Runtime 9.0, using CUDA driver 9.0
Warning: No mode specified, using dDDI by default.
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
AMG Grid:
         Number of Levels: 1
            LVL         ROWS               NNZ    SPRSTY       Mem (GB)
         --------------------------------------------------------------
           0(D)           12                61     0.424       8.75e-07
         --------------------------------------------------------------
         Grid Complexity: 1
         Operator Complexity: 1
         Total Memory Usage: 8.75443e-07 GB
         --------------------------------------------------------------
           iter      Mem Usage (GB)       residual           rate
         --------------------------------------------------------------
            Ini            0.403564   3.464102e+00
              0            0.403564   1.619840e-14         0.0000
         --------------------------------------------------------------
         Total Iterations: 1
         Avg Convergence Rate:               0.0000
         Final Residual:           1.619840e-14
         Total Reduction in Residual:      4.676075e-15
         Maximum Memory Usage:                0.404 GB
         --------------------------------------------------------------
Total Time: 0.00169123
    setup: 0.00100198 s
    solve: 0.000689248 s
    solve(per iteration): 0.000689248 s

Running the multi GPU example from the build directory:

> mpirun -n 2 examples/amgx_mpi_capi.exe -m ../examples/matrix.mtx -c ../core/configs/FGMRES_AGGREGATION.json
Process 0 selecting device 0
Process 1 selecting device 0
AMGX version 2.0.0-public-build125
Built on Oct  7 2017, 04:51:11
Compiled with CUDA Runtime 9.0, using CUDA driver 9.0
Warning: No mode specified, using dDDI by default.
Warning: No mode specified, using dDDI by default.
Cannot read file as JSON object, trying as AMGX config
Converting config string to current config version
Parsing configuration string: exception_handling=1 ; 
Using Normal MPI (Hostbuffer) communicator...
Reading matrix dimensions in file: ../examples/matrix.mtx
Reading data...
RHS vector was not found. Using RHS b=[1,…,1]^T
Solution vector was not found. Setting initial solution to x=[0,…,0]^T
Finished reading
Using Normal MPI (Hostbuffer) communicator...
Using Normal MPI (Hostbuffer) communicator...
Using Normal MPI (Hostbuffer) communicator...
AMG Grid:
         Number of Levels: 1
            LVL         ROWS               NNZ    SPRSTY       Mem (GB)
         --------------------------------------------------------------
           0(D)           12                61     0.424        1.1e-06
         --------------------------------------------------------------
         Grid Complexity: 1
         Operator Complexity: 1
         Total Memory Usage: 1.09896e-06 GB
         --------------------------------------------------------------
           iter      Mem Usage (GB)       residual           rate
         --------------------------------------------------------------
            Ini             0.79834   3.464102e+00
              0             0.79834   3.166381e+00         0.9141
              1              0.7983   3.046277e+00         0.9621
              2              0.7983   2.804132e+00         0.9205
              3              0.7983   2.596292e+00         0.9259
              4              0.7983   2.593806e+00         0.9990
              5              0.7983   3.124839e-01         0.1205
              6              0.7983   5.373423e-02         0.1720
              7              0.7983   9.795357e-04         0.0182
              8              0.7983   1.651436e-13         0.0000
         --------------------------------------------------------------
         Total Iterations: 9
         Avg Convergence Rate:               0.0331
         Final Residual:           1.651436e-13
         Total Reduction in Residual:      4.767284e-14
         Maximum Memory Usage:                0.798 GB
         --------------------------------------------------------------
Total Time: 0.0170917
    setup: 0.00145344 s
    solve: 0.0156382 s
    solve(per iteration): 0.00173758 s
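
The summary figures in the multi GPU log above can be reproduced from the residual column. The sketch below assumes, based only on the printed numbers and not on the AMGX source, that "rate" is the ratio of consecutive residuals, "Total Reduction in Residual" is the final residual divided by the initial one, and "Avg Convergence Rate" is the geometric mean of the per-iteration rates. The residual values are copied from the log; the function names are illustrative.

```python
# Residual column from the multi GPU log: the "Ini" row, then iterations 0 to 8.
residuals = [
    3.464102e+00, 3.166381e+00, 3.046277e+00, 2.804132e+00, 2.596292e+00,
    2.593806e+00, 3.124839e-01, 5.373423e-02, 9.795357e-04, 1.651436e-13,
]


def iteration_rates(res):
    """Ratio of each residual to the previous one."""
    return [res[k] / res[k - 1] for k in range(1, len(res))]


def total_reduction(res):
    """Final residual divided by the initial residual."""
    return res[-1] / res[0]


def average_rate(res):
    """Geometric mean of the per-iteration rates."""
    iterations = len(res) - 1
    return total_reduction(res) ** (1.0 / iterations)


if __name__ == "__main__":
    for k, rate in enumerate(iteration_rates(residuals)):
        print(f"{k:4d}  {rate:.4f}")
    print(f"Total reduction: {total_reduction(residuals):.6e}")
    print(f"Average rate:    {average_rate(residuals):.4f}")
```

Under these assumptions the script matches the logged values to the printed precision, which also explains why the single GPU run, converging in one iteration, reports an average rate of 0.0000.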

Further reading

Plugins and bindings to other software

User @shwina built Python bindings to AMGX; check out the following repository: https://github.com/shwina/pyamgx.

User @piyueh provided a link to their work on PETSc wrapper plugins for AMGX: https://github.com/barbagroup/AmgXWrapper

See the API reference doc for a detailed description of the interface. In the next few weeks we will be providing more information and details on the project, such as:

  • Plans on the project development and priorities
  • Issues
  • Information on contributing
  • Information on solver configurations
  • Information on the code and algorithms