HiPerCoRe / Ktt

Licence: MIT
Kernel Tuning Toolkit

KTT - Kernel Tuning Toolkit

KTT is an autotuning framework for OpenCL and CUDA kernels and GLSL compute shaders. Version 1.3, which introduces a public searcher API and user-provided compute queues and buffers, is now available.

Main features

  • Ability to define kernel tuning parameters such as kernel thread sizes, vector data types and loop unroll factors in order to optimize computation for a particular device
  • Support for iterative kernel launches and composite kernels
  • Support for multiple compute queues and asynchronous operations
  • Support for online autotuning: kernel tuning combined with regular kernel execution
  • Ability to automatically verify the correctness of a tuned computation against a reference kernel or C++ function
  • Support for multiple compute APIs; switching between CUDA, OpenCL and Vulkan requires only minor changes in C++ code (e.g. changing the kernel source file), and no library recompilation is needed
  • A large number of customization options, including support for kernel arguments with user-defined data types, the ability to change kernel compiler flags and more
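
To make the idea of tuning parameters more concrete, the following sketch enumerates every combination of parameter values and keeps only those that satisfy a constraint. It is plain C++ written for illustration and does not use the KTT API; the names TuningParameter, Configuration, Constraint and enumerateConfigurations are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; these are not KTT classes.
using Configuration = std::map<std::string, int>;
using Constraint = std::function<bool(const Configuration&)>;

struct TuningParameter {
    std::string name;
    std::vector<int> values;
};

// Builds the Cartesian product of all parameter values and returns only the
// combinations accepted by every constraint.
std::vector<Configuration> enumerateConfigurations(
    const std::vector<TuningParameter>& parameters,
    const std::vector<Constraint>& constraints)
{
    // Start with a single empty configuration and extend it parameter by parameter.
    std::vector<Configuration> partial{Configuration{}};

    for (const auto& parameter : parameters) {
        assert(!parameter.values.empty());
        std::vector<Configuration> extended;
        for (const auto& configuration : partial) {
            for (int value : parameter.values) {
                Configuration next = configuration;
                next[parameter.name] = value;
                extended.push_back(std::move(next));
            }
        }
        partial = std::move(extended);
    }

    // Drop every configuration rejected by at least one constraint.
    std::vector<Configuration> valid;
    for (const auto& configuration : partial) {
        bool accepted = true;
        for (const auto& constraint : constraints) {
            if (!constraint(configuration)) {
                accepted = false;
                break;
            }
        }
        if (accepted) {
            valid.push_back(configuration);
        }
    }
    return valid;
}
```

For example, with a hypothetical block_size parameter taking the values 64, 128 and 256, an unroll parameter taking 1, 2 and 4, and a constraint that requires block_size * unroll <= 256, six of the nine combinations remain. A real tuner would then compile and time the kernel once per remaining configuration.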

Getting started

  • Documentation for KTT API can be found here.
  • The newest release of KTT framework can be found here.
  • Prebuilt binaries are not provided because of the many possible combinations of compute APIs and build options. See the Building KTT section for detailed build instructions.

Tutorials

Tutorials are short examples which serve as an introduction to the KTT framework. Each tutorial covers a specific part of the API. All tutorials are available for both the OpenCL and CUDA backends, and most of them are also available for Vulkan. Tutorials assume that the reader has some knowledge of C++ and GPU programming. The currently available tutorials are:

  • compute_api_info: Retrieving information about compute API platforms and devices through the KTT API.
  • running_kernel: Running a simple kernel with the KTT framework and retrieving its output.
  • tuning_kernel_simple: Simple kernel tuning using a small number of tuning parameters and a reference class to ensure correctness of the computation.
  • custom_kernel_arguments: Using kernel arguments with custom data types and validating the output with an argument comparator.
  • user_tuner_initializer: Providing the tuner with a custom compute context, queues and buffers.
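
As a rough illustration of what validating custom-typed output with a comparator involves, the sketch below compares a result buffer to a reference buffer element by element with a tolerance. It is plain C++ and does not use the KTT API; the Particle type and the functions particlesMatch and validateOutput are hypothetical and only show the general idea.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical user-defined element type of a kernel argument.
struct Particle {
    float position;
    float charge;
};

// Comparator for a single element: both fields must agree within the tolerance.
bool particlesMatch(const Particle& result, const Particle& reference, float tolerance)
{
    assert(tolerance >= 0.0f);
    return std::fabs(result.position - reference.position) <= tolerance
        && std::fabs(result.charge - reference.charge) <= tolerance;
}

// Validates a whole output buffer against reference data, for example data
// produced by a reference kernel or a plain C++ implementation.
bool validateOutput(const std::vector<Particle>& result,
                    const std::vector<Particle>& reference,
                    float tolerance)
{
    if (result.size() != reference.size()) {
        return false;
    }
    for (std::size_t i = 0; i < result.size(); ++i) {
        if (!particlesMatch(result[i], reference[i], tolerance)) {
            return false;
        }
    }
    return true;
}
```

A tolerance is used instead of exact equality because different kernel configurations may reorder floating-point operations and so produce slightly different results.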

Examples

Examples showcase how the KTT framework can be used in real-world scenarios. They are more complex than the tutorials and assume that the reader is familiar with the KTT API. Some of the currently available examples are:

  • coulomb_sum_2d: Tuning of an electrostatic potential map computation, focusing on a single slice.
  • coulomb_sum_3d_iterative: 3D version of the previous example; reuses the kernel from the 2D version and launches it iteratively.
  • coulomb_sum_3d: Alternative to the iterative version; uses a kernel which computes the entire map in a single invocation.
  • nbody: Tuning of an N-body simulation.
  • reduction: Tuning of a vector reduction; launches a kernel iteratively.
  • sort: Radix sort example; combines multiple kernels into a kernel composition.
  • bicg: Biconjugate gradients method example; features a reference class, kernel compositions and constraints.

Building KTT

  • KTT can be built as a dynamic (shared) library using the command-line build tool Premake. The currently supported operating systems are Linux and Windows.

  • The prerequisites to build KTT are:

    • A C++14 compiler, for example Clang 3.5, GCC 5.0, MSVC 19.0 (Visual Studio 2015) or newer
    • An OpenCL, CUDA or Vulkan library; the supported SDKs are AMD APP SDK, Intel SDK for OpenCL, NVIDIA CUDA Toolkit and Vulkan SDK
    • Premake 5
  • Build under Linux (inside KTT root folder):

    • ensure that the path to the vendor SDK is correctly set in the environment variables
    • run premake5 gmake to generate a makefile
    • run cd build to enter the build directory
    • then run make config={configuration}_{architecture} to build the project (e.g. make config=release_x86_64)
  • Build under Windows (inside KTT root folder):

    • ensure that the path to the vendor SDK is correctly set in the environment variables; this should be done automatically during SDK installation
    • run premake5.exe vs20xx (e.g. premake5.exe vs2019) to generate Visual Studio project files
    • open the generated solution file and build the project inside Visual Studio
  • The following build options are available:

    • --outdir=path specifies a custom build directory; the default build directory is build
    • --platform=vendor specifies the SDK used for building KTT, useful when multiple SDKs are installed
    • --profiling=library enables compilation of kernel profiling functionality using the specified library
    • --vulkan enables compilation of experimental Vulkan backend
    • --no-examples disables compilation of examples
    • --no-tutorials disables compilation of tutorials
    • --tests enables compilation of unit tests
    • --no-cuda disables inclusion of the CUDA API during compilation; only affects the NVIDIA platform
    • --no-opencl disables inclusion of OpenCL API during compilation
  • KTT and the applications that use it rely on external dynamic (shared) libraries in order to work correctly. There are multiple ways to provide access to these libraries, e.g. copying a given library into the application folder or adding its containing folder to the library path (example for Linux: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/shared/library). Libraries which are bundled with device drivers are usually visible by default. The libraries currently used by KTT are:

    • OpenCL distributed with specific device drivers (OpenCL only)
    • cuda distributed with specific device drivers (CUDA only)
    • nvrtc distributed with specific device drivers (CUDA only)
    • cupti bundled with NVIDIA CUDA Toolkit (CUDA profiling only)
    • vulkan distributed with specific device drivers (Vulkan only)
    • shaderc_shared bundled with KTT distribution (Vulkan only)

Original project

KTT is based on the CLTune project. Some parts of the KTT API are similar to the CLTune API; however, the internal structure was almost completely rewritten from scratch. Portions of code for the following features were ported from CLTune:

  • Annealing searcher
  • Generation of kernel configurations
  • Tuning parameter constraints
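
For readers unfamiliar with annealing-based search, the sketch below shows the general technique of simulated annealing over a set of numbered configurations. It is not the code ported from CLTune and does not use the KTT API. The function annealingSearch, the SearchResult type, the linear cooling schedule and the choice of neighbouring configurations (adjacent indices) are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>

// Hypothetical result type: index of the best configuration seen and its cost.
struct SearchResult {
    std::size_t index;
    double cost;
};

// Simulated annealing over configuration indices 0 .. configurationCount - 1.
// 'measure' returns the cost (e.g. kernel run time) of a configuration.
SearchResult annealingSearch(std::size_t configurationCount,
                             const std::function<double(std::size_t)>& measure,
                             std::size_t iterations,
                             double initialTemperature,
                             unsigned int seed)
{
    assert(configurationCount > 0);
    assert(initialTemperature > 0.0);

    std::mt19937 generator(seed);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::uniform_int_distribution<std::size_t> pick(0, configurationCount - 1);

    std::size_t current = pick(generator);
    double currentCost = measure(current);
    SearchResult best{current, currentCost};

    for (std::size_t i = 0; i < iterations; ++i) {
        // Linear cooling; the small offset keeps the temperature positive.
        const double progress = static_cast<double>(i) / static_cast<double>(iterations);
        const double temperature = initialTemperature * (1.0 - progress) + 1e-9;

        // Assumed neighbourhood: the previous or next index, with wrap-around.
        const std::size_t step = unit(generator) < 0.5 ? 1 : configurationCount - 1;
        const std::size_t candidate = (current + step) % configurationCount;
        const double candidateCost = measure(candidate);

        // Always accept improvements; accept worse candidates with a
        // probability that shrinks as the temperature drops.
        const double acceptance = std::exp((currentCost - candidateCost) / temperature);
        if (candidateCost <= currentCost || unit(generator) < acceptance) {
            current = candidate;
            currentCost = candidateCost;
        }
        if (currentCost < best.cost) {
            best = SearchResult{current, currentCost};
        }
    }
    return best;
}
```

Compared with exhaustive search, this approach measures only a subset of configurations, which can help when the configuration space is large. It offers no guarantee of finding the global optimum.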