

oneAPI Deep Neural Network Library (oneDNN)

This software was previously known as Intel(R) Math Kernel Library for Deep Neural Networks (Intel(R) MKL-DNN) and Deep Neural Network Library (DNNL).

oneAPI logo

oneAPI Deep Neural Network Library (oneDNN) is an open-source, cross-platform performance library of basic building blocks for deep learning applications. oneDNN is part of oneAPI. The library is optimized for Intel(R) Architecture Processors, Intel Processor Graphics, and Xe Architecture graphics. oneDNN has experimental support for the following architectures: Arm* 64-bit Architecture (AArch64), NVIDIA* GPU, OpenPOWER* Power ISA (PPC64), IBMz* (s390x), and RISC-V.

oneDNN is intended for deep learning applications and framework developers interested in improving application performance on Intel CPUs and GPUs. Deep learning practitioners should use one of the applications enabled with oneDNN.


Documentation

  • Developer guide explains programming model, supported functionality, and implementation details, and includes annotated examples.
  • API reference provides a comprehensive reference of the library API.

Installation

Binary distributions of this software are available in:

The packages do not include library dependencies; these need to be resolved in the application at build time. See the System Requirements section below and the Build Options section in the developer guide for more details on CPU and GPU runtimes.
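As a sketch, an application linking against a Linux binary package typically needs only the oneDNN headers and libdnnl at build time; the install prefix and the app.cpp name below are hypothetical, and the chosen threading runtime (e.g. libgomp or libiomp5 for OMP builds) must still be resolvable when the program runs:

```shell
# Hypothetical install prefix; adjust to where the package was unpacked.
g++ -std=c++11 app.cpp \
    -I/opt/onednn/include \
    -L/opt/onednn/lib -ldnnl \
    -o app
```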

If the configuration you need is not available, you can build the library from source.

System Requirements

oneDNN supports platforms based on the following architectures:

WARNING

Arm 64-bit Architecture (AArch64), Power ISA (PPC64), IBMz (s390x), and RISC-V (RV64) support is experimental, with limited testing and validation.

The library is optimized for the following CPUs:

  • Intel Atom(R) processors (at least Intel SSE4.1 support is required)
  • Intel Core(TM) processors (at least Intel SSE4.1 support is required)
  • Intel Xeon(R) processor E3, E5, and E7 family (formerly Sandy Bridge, Ivy Bridge, Haswell, and Broadwell)
  • Intel Xeon Phi(TM) processor (formerly Knights Landing and Knights Mill)
  • Intel Xeon Scalable processor (formerly Skylake, Cascade Lake, and Cooper Lake)
  • future Intel Xeon Scalable processor (code name Sapphire Rapids)

On a CPU based on Intel 64 or on AMD64 architecture, oneDNN detects the instruction set architecture (ISA) at runtime and uses just-in-time (JIT) code generation to deploy the code optimized for the latest supported ISA. Future ISAs may have initial support in the library disabled by default and require the use of run-time controls to enable them. See CPU dispatcher control for more details.
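For example, the dispatched ISA can be capped with an environment variable (ONEDNN_MAX_CPU_ISA in recent releases; older releases use DNNL_MAX_CPU_ISA). This sketch assumes a build with run-time CPU dispatcher control enabled:

```shell
# Cap JIT code generation at AVX2, even on CPUs that support AVX-512.
export ONEDNN_MAX_CPU_ISA=AVX2
echo "oneDNN ISA cap: ${ONEDNN_MAX_CPU_ISA}"
# ./app   # hypothetical: run the application under this cap
```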

On a CPU based on Arm AArch64 architecture, oneDNN can be built with Arm Compute Library integration. Compute Library is an open-source library for machine learning applications and provides AArch64 optimized implementations of core functions. This functionality currently requires that Compute Library is downloaded and built separately, see Build from Source. oneDNN is only compatible with Compute Library versions 21.11 or later.
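As a sketch, an AArch64 build with Compute Library integration uses the DNNL_AARCH64_USE_ACL CMake option and points the build at a separately built Compute Library tree via ACL_ROOT_DIR (the path below is hypothetical):

```shell
# Hypothetical location of a separately built Arm Compute Library (21.11+).
export ACL_ROOT_DIR=/opt/acl
cmake .. -DDNNL_AARCH64_USE_ACL=ON
cmake --build . -j
```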

WARNING

On macOS, applications that use oneDNN may need to request special entitlements if they use the hardened runtime. See the linking guide for more details.

The library is optimized for the following GPUs:

  • Intel Processor Graphics based on Gen9, Gen9.5, Gen11, and Gen12 architectures
  • Intel Iris(R) Xe graphics (formerly DG1)
  • future Intel Arc(TM) graphics (code name Alchemist and DG2)

Requirements for Building from Source

oneDNN supports systems meeting the following requirements:

  • Operating system with Intel 64 / Arm 64 / Power / IBMz architecture support
  • C++ compiler with C++11 standard support
  • CMake 2.8.12 or later
  • Arm Compute Library for builds using Compute Library on AArch64.
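With these requirements in place, a from-source build is a plain CMake flow; this is a sketch, and the Build from Source page in the developer guide is the authoritative reference:

```shell
git clone https://github.com/oneapi-src/oneDNN.git
cd oneDNN
mkdir -p build && cd build
cmake ..                 # default configuration: CPU engine, OpenMP runtime
cmake --build . -j       # optionally follow with an install step
```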

The following tools are required to build oneDNN documentation:

Configurations of CPU and GPU engines may introduce additional build time dependencies.

CPU Engine

oneDNN CPU engine is used to execute primitives on Intel Architecture Processors, 64-bit Arm Architecture (AArch64) processors, 64-bit Power ISA (PPC64) processors, IBMz (s390x), and compatible devices.

The CPU engine is built by default but can be disabled at build time by setting DNNL_CPU_RUNTIME to NONE. In this case, the GPU engine must be enabled. The CPU engine can be configured to use the OpenMP, TBB, or DPCPP runtime. The following additional requirements apply:

Some implementations rely on OpenMP 4.0 SIMD extensions. For the best performance on Intel Architecture Processors, we recommend using the Intel C++ Compiler.
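The runtime choice is made at configure time via the DNNL_CPU_RUNTIME option (and DNNL_GPU_RUNTIME for the GPU engine); a sketch:

```shell
# CPU engine on TBB instead of the default OpenMP runtime:
cmake .. -DDNNL_CPU_RUNTIME=TBB

# CPU engine disabled entirely; the GPU engine must then be enabled:
cmake .. -DDNNL_CPU_RUNTIME=NONE -DDNNL_GPU_RUNTIME=OCL
```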

GPU Engine

Intel Processor Graphics and Xe Architecture graphics are supported by the oneDNN GPU engine. The GPU engine is disabled in the default build configuration. The following additional requirements apply when the GPU engine is enabled:

  • OpenCL runtime requires
    • OpenCL* runtime library (OpenCL version 1.2 or later)
    • OpenCL driver (with kernel language support for OpenCL C 2.0 or later) with Intel subgroups and USM extensions support
  • DPCPP runtime requires
  • DPCPP runtime with NVIDIA GPU support requires

WARNING

NVIDIA GPU support is experimental. General information, build instructions, and implementation limitations are available in the NVIDIA backend readme.

Runtime Dependencies

When oneDNN is built from source, the library runtime dependencies and specific versions are defined by the build environment.

Linux

Common dependencies:

  • GNU C Library (libc.so)
  • GNU Standard C++ Library v3 (libstdc++.so)
  • Dynamic Linking Library (libdl.so)
  • C Math Library (libm.so)
  • POSIX Threads Library (libpthread.so)

Runtime-specific dependencies:

Runtime configuration   | Compiler                     | Dependency
DNNL_CPU_RUNTIME=OMP    | GCC                          | GNU OpenMP runtime (libgomp.so)
DNNL_CPU_RUNTIME=OMP    | Intel C/C++ Compiler         | Intel OpenMP runtime (libiomp5.so)
DNNL_CPU_RUNTIME=OMP    | Clang                        | Intel OpenMP runtime (libiomp5.so)
DNNL_CPU_RUNTIME=TBB    | any                          | TBB (libtbb.so)
DNNL_CPU_RUNTIME=DPCPP  | Intel oneAPI DPC++ Compiler  | Intel oneAPI DPC++ Compiler runtime (libsycl.so), TBB (libtbb.so), OpenCL loader (libOpenCL.so)
DNNL_GPU_RUNTIME=OCL    | any                          | OpenCL loader (libOpenCL.so)
DNNL_GPU_RUNTIME=DPCPP  | Intel oneAPI DPC++ Compiler  | Intel oneAPI DPC++ Compiler runtime (libsycl.so), OpenCL loader (libOpenCL.so), oneAPI Level Zero loader (libze_loader.so)

Windows

Common dependencies:

  • Microsoft Visual C++ Redistributable (msvcrt.dll)

Runtime-specific dependencies:

Runtime configuration   | Compiler                     | Dependency
DNNL_CPU_RUNTIME=OMP    | Microsoft Visual C++ Compiler | No additional requirements
DNNL_CPU_RUNTIME=OMP    | Intel C/C++ Compiler         | Intel OpenMP runtime (iomp5.dll)
DNNL_CPU_RUNTIME=TBB    | any                          | TBB (tbb.dll)
DNNL_CPU_RUNTIME=DPCPP  | Intel oneAPI DPC++ Compiler  | Intel oneAPI DPC++ Compiler runtime (sycl.dll), TBB (tbb.dll), OpenCL loader (OpenCL.dll)
DNNL_GPU_RUNTIME=OCL    | any                          | OpenCL loader (OpenCL.dll)
DNNL_GPU_RUNTIME=DPCPP  | Intel oneAPI DPC++ Compiler  | Intel oneAPI DPC++ Compiler runtime (sycl.dll), OpenCL loader (OpenCL.dll), oneAPI Level Zero loader (ze_loader.dll)

macOS

Common dependencies:

  • System C/C++ runtime (libc++.dylib, libSystem.dylib)

Runtime-specific dependencies:

Runtime configuration   | Compiler                     | Dependency
DNNL_CPU_RUNTIME=OMP    | Intel C/C++ Compiler         | Intel OpenMP runtime (libiomp5.dylib)
DNNL_CPU_RUNTIME=TBB    | any                          | TBB (libtbb.dylib)

Validated Configurations

The CPU engine was validated on RedHat* Enterprise Linux 7 with

on Windows Server* 2016 with

on macOS 10.13 (High Sierra) with

The GPU engine was validated on Ubuntu* 20.04 with

on Windows Server 2019 with

Requirements for Pre-built Binaries

See the README included in the corresponding binary package.

Applications Enabled with oneDNN

Support

Please submit your questions, feature requests, and bug reports on the GitHub issues page.

You may reach out to project maintainers privately at [email protected].

Contributing

We welcome community contributions to oneDNN. If you have an idea for how to improve the library:

For additional details, see contribution guidelines.

This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

License

oneDNN is licensed under Apache License Version 2.0. Refer to the "LICENSE" file for the full license text and copyright notice.

This distribution includes third party software governed by separate license terms.

3-clause BSD license:

2-clause BSD license:

Apache License Version 2.0:

Boost Software License, Version 1.0:

MIT License:

This third party software, even if included with the distribution of the Intel software, may be governed by separate license terms, including without limitation, third party license terms, other Intel software license terms, and open source software license terms. These separate license terms govern your use of the third party programs as set forth in the "THIRD-PARTY-PROGRAMS" file.

Security

See Intel's Security Center for information on how to report a potential security issue or vulnerability.

See also: Security Policy

Trademark Information

Intel, the Intel logo, Arc, Intel Atom, Intel Core, Intel Xeon Phi, Iris, OpenVINO, the OpenVINO logo, Pentium, VTune, and Xeon are trademarks of Intel Corporation or its subsidiaries.

* Other names and brands may be claimed as the property of others.

Microsoft, Windows, and the Windows logo are trademarks, or registered trademarks of Microsoft Corporation in the United States and/or other countries.

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.

(C) Intel Corporation
