penn-graphics-research / Claymore

License: MIT

Programming language: C++14

Projects that are alternatives of or similar to Claymore

Arraymancer
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends
Stars: ✭ 793 (+487.41%)
Mutual labels:  cuda, high-performance-computing, gpu-computing
Neanderthal
Fast Clojure Matrix Library
Stars: ✭ 927 (+586.67%)
Mutual labels:  cuda, high-performance-computing, gpu-computing
Hipsycl
Implementation of SYCL for CPUs, AMD GPUs, NVIDIA GPUs
Stars: ✭ 377 (+179.26%)
Mutual labels:  cuda, high-performance-computing, gpu-computing
GOSH
An ultra-fast, GPU-based large graph embedding algorithm utilizing a novel coarsening algorithm requiring not more than a single GPU.
Stars: ✭ 12 (-91.11%)
Mutual labels:  cuda, high-performance-computing, gpu-computing
Bayadera
High-performance Bayesian Data Analysis on the GPU in Clojure
Stars: ✭ 342 (+153.33%)
Mutual labels:  cuda, high-performance-computing, gpu-computing
rbcuda
CUDA bindings for Ruby
Stars: ✭ 57 (-57.78%)
Mutual labels:  cuda, high-performance-computing, gpu-computing
Cuda Api Wrappers
Thin C++-flavored wrappers for the CUDA Runtime API
Stars: ✭ 362 (+168.15%)
Mutual labels:  cuda, gpu-computing
Stdgpu
stdgpu: Efficient STL-like Data Structures on the GPU
Stars: ✭ 531 (+293.33%)
Mutual labels:  cuda, gpu-computing
Luxcore
LuxCore source repository
Stars: ✭ 601 (+345.19%)
Mutual labels:  cuda, gpu-computing
Clojurecl
ClojureCL is a Clojure library for parallel computations with OpenCL.
Stars: ✭ 266 (+97.04%)
Mutual labels:  high-performance-computing, gpu-computing
Accelerate
Embedded language for high-performance array computations
Stars: ✭ 751 (+456.3%)
Mutual labels:  cuda, gpu-computing
Accelerate Llvm
LLVM backend for Accelerate
Stars: ✭ 134 (-0.74%)
Mutual labels:  cuda, gpu-computing
Tutorials
Some basic programming tutorials
Stars: ✭ 353 (+161.48%)
Mutual labels:  cuda, gpu-computing
Taskflow
A General-purpose Parallel and Heterogeneous Task Programming System
Stars: ✭ 6,128 (+4439.26%)
Mutual labels:  cuda, high-performance-computing
Heteroflow
Concurrent CPU-GPU Programming using Task Models
Stars: ✭ 57 (-57.78%)
Mutual labels:  cuda, gpu-computing
Sixtyfour
How fast can we brute force a 64-bit comparison?
Stars: ✭ 41 (-69.63%)
Mutual labels:  cuda, gpu-computing
Pycuda
CUDA integration for Python, plus shiny features
Stars: ✭ 1,112 (+723.7%)
Mutual labels:  cuda, gpu-computing
Deepnet
Deep.Net machine learning framework for F#
Stars: ✭ 99 (-26.67%)
Mutual labels:  cuda, gpu-computing
MatX
An efficient C++17 GPU numerical computing library with Python-like syntax
Stars: ✭ 418 (+209.63%)
Mutual labels:  cuda, gpu-computing
Vuh
Vulkan compute for people
Stars: ✭ 264 (+95.56%)
Mutual labels:  high-performance-computing, gpu-computing

A Massively Parallel and Scalable Multi-GPU Material Point Method

Description

This is the open-source code for the SIGGRAPH 2020 paper:

A Massively Parallel and Scalable Multi-GPU Material Point Method

page, pdf, supp, video

Authors: Xinlei Wang*, Yuxing Qiu*, Stuart R. Slattery, Yu Fang, Minchen Li, Song-Chun Zhu, Yixin Zhu, Min Tang, Dinesh Manocha, Chenfanfu Jiang (* equal contributions)

Harnessing the power of modern multi-GPU architectures, we present a massively parallel simulation system based on the Material Point Method (MPM) for simulating physical behaviors of materials undergoing complex topological changes, self-collision, and large deformations. Our system makes three critical contributions. First, we introduce a new particle data structure that promotes coalesced memory access patterns on the GPU and eliminates the need for complex atomic operations on the memory hierarchy when writing particle data to the grid. Second, we propose a kernel fusion approach using a new Grid-to-Particles-to-Grid (G2P2G) scheme, which efficiently reduces GPU kernel launches, improves latency, and significantly reduces the amount of global memory needed to store particle data. Finally, we introduce optimized algorithmic designs that allow for efficient sparse grids in a shared memory context, enabling us to best utilize modern multi-GPU computational platforms for hybrid Lagrangian-Eulerian computational patterns.

We demonstrate the effectiveness of our method with extensive benchmarks, evaluations, and dynamic simulations with elastoplasticity, granular media, and fluid dynamics. In comparisons against an open-source and heavily optimized CPU-based MPM codebase on an elastic-sphere collision scene with particle counts ranging from 5 to 40 million, our GPU MPM achieves over 100X per-time-step speedup on a workstation with an Intel 8086K CPU and a single Quadro P6000 GPU, exposing exciting possibilities for future MPM simulations in computer graphics and computational science. Moreover, compared to the state-of-the-art GPU MPM method, we not only achieve 2X acceleration on a single GPU, but our kernel fusion strategy and Array-of-Structs-of-Arrays (AoSoA) data structure design also generalize to multi-GPU systems.

Our multi-GPU MPM exhibits near-perfect weak and strong scaling with 4 GPUs, enabling performant and large-scale simulations on a 1024x1024x1024 grid with close to 100 million particles at less than 4 minutes per frame on a single 4-GPU workstation, and 134 million particles at less than 1 minute per frame on an 8-GPU workstation.
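The Array-of-Structs-of-Arrays (AoSoA) particle layout mentioned above can be illustrated with a minimal C++ sketch. The bin width and field set below are hypothetical, chosen only to show the idea; the actual layout in this codebase differs:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// AoSoA layout: particles are grouped into fixed-width "bins". Within a bin,
// each attribute is stored contiguously (SoA), so neighboring particles read
// e.g. their x-coordinates with unit stride -- the coalesced access pattern
// that matters on GPUs. The bins themselves form a plain array (AoS), so a
// whole bin can be moved or partitioned as one unit.
constexpr std::size_t kBinWidth = 32; // hypothetical lane count (warp-sized)

struct ParticleBin {                          // the Struct-of-Arrays block
    std::array<float, kBinWidth> x, y, z;     // positions, one lane per particle
    std::array<float, kBinWidth> mass;
};

struct ParticleBuffer {                       // the Array of such blocks
    std::vector<ParticleBin> bins;

    explicit ParticleBuffer(std::size_t particleCount)
        : bins((particleCount + kBinWidth - 1) / kBinWidth) {}

    // Map a global particle index to its (bin, lane) slot.
    float& px(std::size_t i) { return bins[i / kBinWidth].x[i % kBinWidth]; }
};
```

Compared with a plain array of `struct Particle { float x, y, z, mass; }`, this keeps per-attribute accesses contiguous within a warp-sized group of particles while still allowing whole bins to be assigned to different grid blocks or GPUs.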

Compilation

This is a cross-platform C++/CUDA CMake project. The minimum required CMake version is 3.15, though the latest version is generally recommended. CUDA 10.2 or 11 is required.

Currently supported operating systems include Windows 10 and Ubuntu (>= 18.04); tested compilers include GCC 8.4, MSVC v142, and Clang 9 (including its MSVC-compatible version).

Build

Run the following commands in the project root directory. Note that appending "--config Release" to the last command is needed when compiling with MSVC.

mkdir build
cd build
cmake ..
cmake --build .

Or configure the project using the CMake Tools extension in Visual Studio Code (recommended).

Data

Currently, binary position data and level-set (signed distance field) data are accepted as particle input files. Uniformly sampling particles from analytic geometries is another viable way to initialize models.
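As a rough illustration of the "uniform sampling from analytic geometries" route, the sketch below lattice-samples a sphere and dumps raw float positions. The function names and the binary convention are hypothetical; consult the project's own IO code for the exact format it expects:

```cpp
#include <cassert>
#include <fstream>
#include <vector>

// Sample particle positions on a regular lattice with spacing dx, keeping
// those that fall inside a sphere of the given center and radius.
std::vector<float> sampleSphere(float cx, float cy, float cz,
                                float radius, float dx) {
    std::vector<float> pos; // packed as x0,y0,z0, x1,y1,z1, ...
    for (float x = cx - radius; x <= cx + radius; x += dx)
        for (float y = cy - radius; y <= cy + radius; y += dx)
            for (float z = cz - radius; z <= cz + radius; z += dx)
                if ((x - cx) * (x - cx) + (y - cy) * (y - cy) +
                        (z - cz) * (z - cz) <= radius * radius)
                    pos.insert(pos.end(), {x, y, z});
    return pos;
}

// Dump raw float triplets to disk -- one common convention for
// "binary position data"; the project's IO code defines the real layout.
void writeBinary(const char* path, const std::vector<float>& pos) {
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(pos.data()),
              static_cast<std::streamsize>(pos.size() * sizeof(float)));
}
```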

Run Demos

The project provides the following GPU-based schemes for MPM:

  • GMPM: improved single-GPU pipeline
  • MGSP: static geometry (particle) partitioning multi-GPU pipeline

Go to the corresponding subdirectory under Projects/ and run the executable.

Code Usage

  • Use the codebase in another CMake C++ project: directly include the codebase as a submodule and follow the examples in Projects.
  • Develop upon the codebase: create a sub-folder in Projects with a CMake file at its root.
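A minimal CMakeLists.txt for such a sub-folder might look like the following sketch. The target name, source file, and linked targets are placeholders; mirror one of the existing examples under Projects for the exact setup:

```cmake
# Projects/my_demo/CMakeLists.txt -- hypothetical sketch
add_executable(my_demo my_demo.cu)
target_compile_features(my_demo PRIVATE cxx_std_14)
# Link against the same library targets the existing Projects/* examples use.
```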

Bibtex

Please cite our paper if you use this code for your research:

@article{Wang2020multiGMPM,
    author = {Xinlei Wang* and Yuxing Qiu* and Stuart R. Slattery and Yu Fang and Minchen Li and Song-Chun Zhu and Yixin Zhu and Min Tang and Dinesh Manocha and Chenfanfu Jiang},
    title = {A Massively Parallel and Scalable Multi-GPU Material Point Method},
    journal = {ACM Transactions on Graphics},
    year = {2020},
    volume = {39},
    number = {4},
    articleno = {30}
}

Credits

This project draws inspiration from Taichi and GMPM.

Acknowledgement

We thank Yuanming Hu for useful discussions and proofreading, and Feng Gao for his help configuring workstations. We also appreciate Prof. Chenfanfu Jiang and Yuanming Hu for their insightful advice on the documentation.

Dependencies

The following libraries are used in our project:

  • cub (now replaced by Thrust)
  • fmt

For particle data IO and generation, we use these two libraries in addition:

Because compiling CUDA (10.2) code restricts the C++ standard to at most C++14, we also import the following libraries:
