libomptarget - OpenMP offloading runtime libraries for Clang

This is a prototype implementation of the OpenMP offloading library to be supported by Clang. The current implementation has so far been tested only on Linux.

The current implementation of this library consists of three components: target-agnostic offloading, target-specific offloading plugins, and target-specific runtime libraries.

In order to build the libraries, run:

make

or

mkdir build
cd build
cmake ..
make

Both build systems automatically detect the host machine as well as the target devices by looking for the corresponding toolkit (e.g. CUDA). If a supported toolkit is detected, the makefiles will create a library for it. However, some systems may require adjustments to the look-up paths.
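If the toolkit is installed in a non-default location, one way to adjust the look-up path is to point CMake at it explicitly. As a sketch (assuming this project's CMake uses the standard FindCUDA module; the exact variable it honors may differ, and the path below is a placeholder):

```shell
mkdir build
cd build
# CUDA_TOOLKIT_ROOT_DIR is the hint variable used by CMake's FindCUDA module
cmake -DCUDA_TOOLKIT_ROOT_DIR=/opt/cuda-7.0 ..
make
```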

All the libraries will be created in the ./lib folder. Typically, you will get:

libomptarget.so - The main and target agnostic library

libomptarget.rtl.[toolkit name].so - The target specific plugins. These plugins are loaded at runtime by libomptarget.so to interact with a given device.

libomptarget-[target name].[a,so] - The target specific runtime libraries. These libraries should be passed to the target linker as they implement the runtime calls produced by Clang during code generation.

Note that the interface of all the libraries in this project is likely to change in the future.

Target agnostic offloading - libomptarget.so

This component contains the logic to launch the initialization of the devices supported by the current program, create device data environments and launch executions of kernels (OpenMP target regions). In order to deal with a specific device this component detects and loads the corresponding plugin.

This component has been tested for:

  • powerpc64-ibm-linux-gnu
  • powerpc64le-ibm-linux-gnu
  • x86_64-pc-linux-gnu

The code of this component is under ./src

Target specific plugins - libomptarget.rtl.[toolkit name].so

These plugins are used by libomptarget.so to deal with a given target. They all use the same interface and implement basic functionality like device initialization, data movement to/from device and kernel launching.

The current implementation supports the following plugins:

  • generic 64-bit - this implementation is suitable for powerpc64, powerpc64le and x86_64 targets

  • cuda - plugin for Nvidia GPUs implemented on top of the CUDA device runtime library

The code for this component is under ./RTLs

Target specific runtime libraries - libomptarget-[target name].[a,so]

These libraries implement the OpenMP runtime calls used by a given device during execution.

The current implementation includes a library for:

  • nvptx: library written in CUDA for Nvidia GPUs. Tested with CUDA compilation tools V7.0.27. The CUDA architecture can be set using cmake by setting OMPTARGET_NVPTX_SM to a comma separated list of target architectures. For example, to compile for sm_30 and sm_35 one can define -DOMPTARGET_NVPTX_SM=30,35 when calling cmake. If not using cmake the same goal can be achieved by passing OMPTARGET_NVPTX_SM=30,35 to make. In order to use this library with Clang the user has to set LIBRARY_PATH to point to ./lib so that Clang passes the right information to the target linker.
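The nvptx configuration described above can be sketched as the following shell session (paths are placeholders for your checkout):

```shell
# CMake build targeting sm_30 and sm_35
mkdir build
cd build
cmake -DOMPTARGET_NVPTX_SM=30,35 ..
make

# Equivalent with the plain Makefile build
make OMPTARGET_NVPTX_SM=30,35

# Let Clang pass the right paths to the target linker
export LIBRARY_PATH="$PWD/lib:$LIBRARY_PATH"
```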

For powerpc64, powerpc64le and x86_64 devices, existing host runtime libraries (e.g. openmp.llvm.org) can be used when these devices are used as OpenMP targets.

The code for this component is under ./DevRTLs
