
celerity / Celerity Runtime

License: MIT
High-level C++ for Accelerator Clusters

Labels: gpgpu, hpc

Projects that are alternatives of or similar to Celerity Runtime

Arrayfire Rust
Rust wrapper for ArrayFire
Stars: ✭ 525 (+609.46%)
Mutual labels:  gpgpu, hpc
Futhark
💥💻💥 A data-parallel functional programming language
Stars: ✭ 1,641 (+2117.57%)
Mutual labels:  gpgpu, hpc
Compute
A C++ GPU Computing Library for OpenCL
Stars: ✭ 1,192 (+1510.81%)
Mutual labels:  gpgpu, hpc
arrayfire-java
Java wrapper for ArrayFire
Stars: ✭ 34 (-54.05%)
Mutual labels:  hpc, gpgpu
Arrayfire
ArrayFire: a general purpose GPU library.
Stars: ✭ 3,693 (+4890.54%)
Mutual labels:  gpgpu, hpc
MatX
An efficient C++17 GPU numerical computing library with Python-like syntax
Stars: ✭ 418 (+464.86%)
Mutual labels:  hpc, gpgpu
Occa
JIT Compilation for Multiple Architectures: C++, OpenMP, CUDA, HIP, OpenCL, Metal
Stars: ✭ 230 (+210.81%)
Mutual labels:  gpgpu, hpc
Arrayfire Python
Python bindings for ArrayFire: A general purpose GPU library.
Stars: ✭ 358 (+383.78%)
Mutual labels:  gpgpu, hpc
Parenchyma
An extensible HPC framework for CUDA, OpenCL and native CPU.
Stars: ✭ 71 (-4.05%)
Mutual labels:  gpgpu, hpc
Sos
Sandia OpenSHMEM is an implementation of the OpenSHMEM specification over multiple Networking APIs, including Portals 4, the Open Fabric Interface (OFI), and UCX. Please click on the Wiki tab for help with building and using SOS.
Stars: ✭ 34 (-54.05%)
Mutual labels:  hpc
Fgci Ansible
🔬 Collection of the Finnish Grid and Cloud Infrastructure Ansible playbooks
Stars: ✭ 49 (-33.78%)
Mutual labels:  hpc
Svm kernel
x86_64 AMD kernel optimized for performance & hypervisor usage
Stars: ✭ 32 (-56.76%)
Mutual labels:  hpc
Sst Elements
SST Architectural Simulation Components and Libraries
Stars: ✭ 36 (-51.35%)
Mutual labels:  hpc
Cbrain
CBRAIN is a flexible Ruby on Rails framework for accessing and processing of large data on high-performance computing infrastructures.
Stars: ✭ 51 (-31.08%)
Mutual labels:  hpc
Ktt
Kernel Tuning Toolkit
Stars: ✭ 33 (-55.41%)
Mutual labels:  hpc
Sycl Dnn
SYCL-DNN is a library implementing neural network algorithms written using SYCL
Stars: ✭ 67 (-9.46%)
Mutual labels:  gpgpu
Computeshader Unity Macos
Stars: ✭ 31 (-58.11%)
Mutual labels:  gpgpu
Wfl
A Simple Way of Creating Job Workflows in Go running in Processes, Containers, Tasks, Pods, or Jobs
Stars: ✭ 30 (-59.46%)
Mutual labels:  hpc
Maestrowf
A tool to easily orchestrate general computational workflows both locally and on supercomputers
Stars: ✭ 72 (-2.7%)
Mutual labels:  hpc
Slurm In Docker
Slurm in Docker - Exploring Slurm using CentOS 7 based Docker images
Stars: ✭ 63 (-14.86%)
Mutual labels:  hpc

Celerity Logo

Celerity Runtime - MIT License, Semver 2.0, PRs Welcome

The Celerity distributed runtime and API aims to bring the power and ease of use of SYCL to distributed memory clusters.

If you want a step-by-step introduction on how to set up dependencies and implement your first Celerity application, check out the tutorial!

Overview

Programming modern accelerators is already challenging in and of itself. Combine it with the distributed memory semantics of a cluster, and the complexity can become so daunting that many leave it unattempted. Celerity wants to relieve you of some of this burden, allowing you to target accelerator clusters with programs that look like they are written for a single device.

High-level API based on SYCL

Celerity makes it a priority to stay as close to the SYCL API as possible. If you have an existing SYCL application, you should be able to migrate it to Celerity without much hassle. If you know SYCL already, this will probably look very familiar to you:

celerity::buffer<float, 1> buf(sycl::range<1>(1024));
queue.submit([=](celerity::handler& cgh) {
  auto acc = buf.get_access<sycl::access::mode::discard_write>(
    cgh,
    celerity::access::one_to_one<1>()           // 1
  );
  cgh.parallel_for<class MyKernel>(
    sycl::range<1>(1024),                       // 2
    [=](sycl::item<1> item) {                   // 3
      acc[item] = sycl::sin(item[0] / 1024.f);  // 4
    });
});
  1. Provide a range-mapper to tell Celerity which parts of the buffer will be accessed by the kernel.

  2. Submit a kernel to be executed by 1024 parallel work items. This kernel may be split across any number of nodes.

  3. Kernels can be expressed as C++11 lambda functions, just like in SYCL. In fact, no changes to your existing kernels are required *.

  4. Access your buffers as if they reside on a single device -- even though they might be scattered throughout the cluster.

* There are currently some limitations to what types of kernels Celerity supports - see Issues & Limitations.
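
Putting it together, a complete (if minimal) Celerity program does not need much more than the snippet above. The following sketch adds the surrounding boilerplate; the celerity.h header name and the celerity::distr_queue setup reflect the Celerity API and may need to be adapted to your version, so treat it as an illustration rather than copy-paste-ready code.

#include <celerity.h>  // umbrella header (assumed name; adjust to your installation)

int main() {
  // Constructing the distributed queue initializes the Celerity runtime.
  celerity::distr_queue queue;

  // A one-dimensional buffer of 1024 floats, managed by the runtime across all nodes.
  celerity::buffer<float, 1> buf(sycl::range<1>(1024));

  queue.submit([=](celerity::handler& cgh) {
    // Range-mapper: each work item writes only the buffer element at its own index.
    auto acc = buf.get_access<sycl::access::mode::discard_write>(
      cgh, celerity::access::one_to_one<1>());
    cgh.parallel_for<class MyKernel>(
      sycl::range<1>(1024),
      [=](sycl::item<1> item) { acc[item] = sycl::sin(item[0] / 1024.f); });
  });

  return 0;
}

Compiled against Celerity and a SYCL implementation, this program can then be launched with mpirun as described in the next section.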

Run it like any other MPI application

The kernel shown above can be run on a single GPU, just like in SYCL, or on a whole cluster -- without having to change anything about the program itself.

For example, if we were to run it on two GPUs using mpirun -n 2 ./my_example, the first GPU might compute the range 0-512 of the kernel, while the second one computes 512-1024. However, as the user, you don't have to care how exactly your computation is being split up.

To see how you can use the result of your computation, look at some of our fully-fledged examples, or follow the tutorial!

Building Celerity

Celerity uses CMake as its build system. The build process itself is rather simple; however, you first have to make sure that a few dependencies are installed.

Dependencies

  • A supported SYCL implementation: either hipSYCL or ComputeCpp
  • Boost (we recommend versions 1.65 - 1.68)
    • If you use hipSYCL to target the CUDA platform, you may run into issues with newer versions of Boost.
  • An MPI 2 implementation (tested with OpenMPI 4.0; MPICH 3.3 should work as well)
  • CMake (3.5.1 or newer)
  • A C++17 compiler

Building can be as simple as calling cmake && make; depending on your setup, however, you may also have to provide some library paths etc. See our installation guide for more information.

The runtime comes with several examples that are built automatically when the CELERITY_BUILD_EXAMPLES CMake option is set (true by default).

Using Celerity as a Library

Simply run make install (or equivalent, depending on build system) to copy all relevant header files and libraries to the CMAKE_INSTALL_PREFIX. This includes a CMake package configuration file which is placed inside the lib/cmake directory. You can then use find_package(Celerity CONFIG) to include Celerity into your CMake project. Once included, you can use the add_celerity_to_target(TARGET target SOURCES source1 source2...) function to set up the required dependencies for a target (no need to link manually).

Running a Celerity Application

Celerity is built on top of MPI, which means a Celerity application can be executed like any other MPI application (i.e., using mpirun or equivalent). There are several environment variables that you can use to influence Celerity's runtime behavior:

Environment Variables

  • CELERITY_LOG_LEVEL controls the logging output level. One of trace, debug, info, warn, err, critical, or off.
  • CELERITY_DEVICES can be used to assign different compute devices to Celerity worker nodes on a single host. The syntax is as follows: CELERITY_DEVICES="<platform_id> <first device_id> <second device_id> ... <nth device_id>". For example, CELERITY_DEVICES="0 1 2" assigns device 1 of platform 0 to the first worker on the host and device 2 to the second. Note that this should normally not be required, as Celerity will attempt to automatically assign a unique device to each worker on a host.
  • CELERITY_FORCE_WG=<work_group_size> can be used to force a particular work group size for every kernel and every dimension. This currently exists as a workaround until Celerity supports ND-range kernels.
  • CELERITY_PROFILE_OCL controls whether OpenCL-level profiling information should be queried (currently not supported when using hipSYCL).

Disclaimer

Celerity is a research project first and foremost, and is still in early development. While it does work for certain applications, it probably does not fully support your use case just yet. We would however love for you to give it a try and tell us how you could imagine using Celerity for your projects in the future!
