libmir / mir-glas

Licence: other
[Experimental] LLVM-accelerated Generic Linear Algebra Subprograms

glas

LLVM-accelerated Generic Linear Algebra Subprograms (GLAS)

Description

GLAS is a C library written in Dlang. It requires neither the C++ runtime nor the D runtime, only libc, which is available everywhere.

The library provides

  1. BLAS (Basic Linear Algebra Subprograms) API.
  2. GLAS (Generic Linear Algebra Subprograms) API.

The CBLAS API can be provided by linking with Netlib's CBLAS library.

dub

GLAS can be used with both DMD and LDC, but LDC (LLVM D Compiler) >= 1.1.0 beta 6 must be installed in a common path in either case.

Note performance issue #18.

GLAS can be included automatically in a project using dub (the D package manager). DUB builds GLAS and CPUID itself, using LDC.

{
   ...
   "dependencies": {
      "mir-glas": "~><current_mir-glas_version>",
      "mir-cpuid": "~><current_mir-cpuid_version>"
   },
   "lflags": ["-L$MIR_GLAS_PACKAGE_DIR", "-L$MIR_CPUID_PACKAGE_DIR"]
}

DUB automatically replaces $MIR_GLAS_PACKAGE_DIR and $MIR_CPUID_PACKAGE_DIR with the appropriate directories.

Usage

mir-glas can be used like a common C library. It should be linked with mir-cpuid. A compiler, for example GCC, may require mir-cpuid to be passed after mir-glas: -lmir-glas -lmir-cpuid.

GLAS API

The GLAS API is based on the new ndslice from mir-algorithm. Other languages can use a simple structure definition. Examples are available for C and for Dlang.
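
As a rough illustration of what a "simple structure definition" for an ndslice-like matrix could look like in C, the sketch below defines a strided 2-D view. The type matrix_view and the helpers matrix_at and matrix_transposed are hypothetical names chosen for this example. They are not taken from glas/ndslice.h, whose actual definitions should be checked before use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strided 2-D view; NOT the actual GLAS/ndslice definition. */
typedef struct {
    size_t    lengths[2];  /* number of rows, number of columns */
    ptrdiff_t strides[2];  /* element step between rows, between columns */
    double   *ptr;         /* pointer to element (0, 0) */
} matrix_view;

/* Return a pointer to element (i, j) of the view. */
static double *matrix_at(matrix_view m, size_t i, size_t j)
{
    assert(i < m.lengths[0] && j < m.lengths[1]);
    return m.ptr + (ptrdiff_t)i * m.strides[0] + (ptrdiff_t)j * m.strides[1];
}

/* Transposing a strided view swaps lengths and strides; no data is copied. */
static matrix_view matrix_transposed(matrix_view m)
{
    matrix_view t = { { m.lengths[1], m.lengths[0] },
                      { m.strides[1], m.strides[0] },
                      m.ptr };
    return t;
}
```

With this layout, a row-major 2 x 3 matrix stored in a 6-element array would have lengths {2, 3} and strides {3, 1}. Describing data by pointer, lengths, and strides is what lets a caller pass sub-matrices or transposed views without copying.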

Headers

C/C++ headers are located in include/. D headers are located in source/.

There are two files:

  1. glas/fortran.h / glas/fortran.d - for Netlib's BLAS API
  2. glas/ndslice.h / glas/ndslice.d - for GLAS API

Manual Compilation

Compiler installation

LDC (LLVM D Compiler) >= 1.1.0 beta 6 is required to build a project. You may want to build LDC from source or use LDC 1.1.0 beta 6. Beta 2 generates a lot of warnings that can be ignored. Beta 3 is not supported.

LDC binaries contain two compilers: ldc2 and ldmd2. It is recommended to use ldmd2 with mir-glas.

Recent LDC packages come with the dub package manager. dub is used to build the project.

Mir CPUID

Mir CPUID provides CPU identification routines.

Download mir-cpuid

dub fetch mir-cpuid --cache=local

Change the directory

cd mir-cpuid-<current-mir-cpuid-version>/mir-cpuid

Build mir-cpuid

dub build --build=release-nobounds --compiler=ldmd2 --build-mode=singleFile --parallel --force

You may need to add --arch=x86_64 if you use Windows.

Copy libmir-cpuid.a to your project or add its directory to the library path.

Mir GLAS

Download mir-glas

dub fetch mir-glas --cache=local

Change the directory

cd mir-glas-<current-mir-glas-version>/mir-glas

Build mir-glas

dub build --config=static --build=target-native --compiler=ldmd2 --build-mode=singleFile --parallel --force

You may need to add --arch=x86_64 if you use Windows.

Copy libmir-glas.a to your project or add its directory to the library path.

Status

We are open to contributions! The hardest part (GEMM) is already implemented.

  • CI testing with Netlib's BLAS test suite.
  • CI testing with Netlib's CBLAS test suite.
  • CI testing with Netlib's LAPACK test suite.
  • CI testing with Netlib's LAPACKE test suite.
  • Multi-threading
  • GPU back-end
  • Shared library support - requires only DUB configuration fixes.
  • Level 3 - matrix-matrix operations
    • GEMM - matrix matrix multiply
    • SYMM - symmetric matrix matrix multiply
    • HEMM - hermitian matrix matrix multiply
    • SYRK - symmetric rank-k update to a matrix
    • HERK - hermitian rank-k update to a matrix
    • SYR2K - symmetric rank-2k update to a matrix
    • HER2K - hermitian rank-2k update to a matrix
    • TRMM - triangular matrix matrix multiply
    • TRSM - solving triangular matrix with multiple right hand sides
  • Level 2 - matrix-vector operations
    • GEMV - matrix vector multiply
    • GBMV - banded matrix vector multiply
    • HEMV - hermitian matrix vector multiply
    • HBMV - hermitian banded matrix vector multiply
    • HPMV - hermitian packed matrix vector multiply
    • TRMV - triangular matrix vector multiply
    • TBMV - triangular banded matrix vector multiply
    • TPMV - triangular packed matrix vector multiply
    • TRSV - solving triangular matrix problems
    • TBSV - solving triangular banded matrix problems
    • TPSV - solving triangular packed matrix problems
    • GERU - performs the rank 1 operation A := alpha*x*y' + A
    • GERC - performs the rank 1 operation A := alpha*x*conjg( y' ) + A
    • HER - hermitian rank 1 operation A := alpha*x*conjg(x') + A
    • HPR - hermitian packed rank 1 operation A := alpha*x*conjg( x' ) + A
    • HER2 - hermitian rank 2 operation
    • HPR2 - hermitian packed rank 2 operation
  • Level 1 - vector-vector and scalar operations. Note: Mir already provides generic implementation.
    • ROTG - setup Givens rotation
    • ROTMG - setup modified Givens rotation
    • ROT - apply Givens rotation
    • ROTM - apply modified Givens rotation
    • SWAP - swap x and y
    • SCAL - x = a*x. Note: requires additional optimization for complex numbers.
    • COPY - copy x into y
    • AXPY - y = a*x + y. Note: requires additional optimization for complex numbers.
    • DOT - dot product
    • DOTU - dot product. Note: requires additional optimization for complex numbers.
    • DOTC - dot product, conjugating the first vector. Note: requires additional optimization for complex numbers.
    • DSDOT - dot product with extended precision accumulation and result
    • SDSDOT - dot product with extended precision accumulation
    • NRM2 - Euclidean norm
    • ASUM - sum of absolute values
    • IAMAX - index of max abs value
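
For readers unfamiliar with the Level 3 naming, GEMM computes C := alpha*A*B + beta*C. The following is a naive reference loop in C for contiguous row-major double matrices, of the kind one might use to check results. It is not how GLAS implements GEMM, and gemm_reference is a name made up for this sketch rather than part of the GLAS or BLAS API.

```c
#include <assert.h>
#include <stddef.h>

/* Naive reference GEMM: C := alpha*A*B + beta*C.
   A is m x k, B is k x n, C is m x n, all row-major and contiguous.
   Illustrative only; hypothetical name, not part of GLAS or BLAS. */
static void gemm_reference(size_t m, size_t n, size_t k,
                           double alpha, const double *a, const double *b,
                           double beta, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            /* Dot product of row i of A with column j of B. */
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}
```

An optimized library produces the same result up to floating-point rounding, but reorders and blocks the loops to make better use of caches and SIMD registers.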

Porting to a new target

Five steps

  1. Implement cpuid_init function for mir-cpuid. This function should be implemented per platform or OS. Already implemented targets are
    • x86, any OS
    • x86_64, any OS
  2. Verify that source/glas/internal/memory.d contains an implementation for the OS. Already implemented targets are
    • Posix (Linux, macOS, and others)
    • Windows
  3. Add a new configuration for register blocking to source/glas/internal/config.d. Configurations are already implemented for
    • x87
    • SSE2
    • AVX / AVX2
    • AVX512 (requires LLVM bug fixes).
  4. Create a Pull Request.
  5. Coordinate with LDC team in case of compiler bugs.
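
Step 3 refers to register blocking. As a rough illustration of the general idea only (not of the contents of source/glas/internal/config.d), the C sketch below computes a 2 x 2 tile of the result while keeping the four accumulators in local variables that a compiler can hold in registers. The 2 x 2 tile size and the name kernel_2x2 are assumptions made for this example; real configurations choose tile sizes to match the register file of the target.

```c
#include <stddef.h>

/* Accumulate a 2x2 tile: C(2 x 2) += A(2 x k) * B(k x 2).
   a: row-major 2 x k with leading dimension lda
   b: row-major k x 2 with leading dimension ldb
   c: row-major 2 x 2 with leading dimension ldc
   Hypothetical illustration; not taken from GLAS. */
static void kernel_2x2(size_t k,
                       const double *a, size_t lda,
                       const double *b, size_t ldb,
                       double *c, size_t ldc)
{
    /* Four accumulators stay in registers for the whole k loop. */
    double c00 = 0.0, c01 = 0.0, c10 = 0.0, c11 = 0.0;
    for (size_t p = 0; p < k; ++p) {
        /* Each loaded value of A and B is reused for two products. */
        double a0 = a[p], a1 = a[lda + p];
        double b0 = b[p * ldb], b1 = b[p * ldb + 1];
        c00 += a0 * b0; c01 += a0 * b1;
        c10 += a1 * b0; c11 += a1 * b1;
    }
    /* Write the tile back to memory once, after the loop. */
    c[0] += c00;   c[1] += c01;
    c[ldc] += c10; c[ldc + 1] += c11;
}
```

Larger tiles increase the reuse of each loaded value but need more registers, which is why the blocking configuration differs between targets such as SSE2 and AVX.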

Questions & Answers

Why is GLAS called "Generic ..."?

  1. GLAS has a generic internal implementation, which can be easily ported to any other architecture with minimal effort (5 minutes).
  2. The GLAS API provides more functionality than BLAS.
  3. It is written in Dlang using generic programming.

Why is it better than other open source BLAS libraries like OpenBLAS and Eigen?

  1. GLAS is faster.
  2. The GLAS API is more user-friendly and does not require additional data copying.
  3. Unlike Eigen, GLAS does not require the C++ runtime.
  4. GLAS does not require platform-specific optimizations such as Eigen's intrinsics micro kernels and OpenBLAS's assembler macro kernels.
  5. GLAS has a simple implementation, which can be easily ported and extended.

Why does GLAS not have Lazy Evaluation and Aliasing like Eigen?

GLAS is a lower-level library than Eigen. For example, GLAS could serve as an Eigen BLAS back-end in the future. Lazy Evaluation and Aliasing can be easily implemented in D. Explicit composition of operations can be done using mir.ndslice.algorithm and the multidimensional map from mir.ndslice.topology, which is a generic way to perform any lazy operations you want.
