WojciechMula / Base64simd

Licence: bsd-2-clause

Base64 coding and decoding with SIMD instructions (SSE/AVX2/AVX512F/AVX512BW/AVX512VBMI/ARM Neon)

================================================================================
                    base64 using SIMD instructions
================================================================================

Overview

This repository contains code for encoding and decoding base64 using SIMD instructions. Depending on the CPU architecture, vectorized encoding is 2 to 4 times faster than the scalar versions; decoding is 2 to 2.7 times faster.

There are several versions of the procedures, using the following instruction sets:

* SSE,
* AVX2,
* AVX512F,
* AVX512BW,
* AVX512VBMI,
* AVX512VL,
* BMI2, and
* ARM Neon.

Vectorization approaches were described in a series of articles:

* `Base64 encoding with SIMD instructions`__,
* `Base64 decoding with SIMD instructions`__,
* `Base64 encoding & decoding using AVX512BW instructions`__ (includes AVX512VBMI and AVX512VL),
* `AVX512F base64 coding and decoding`__.

__ http://0x80.pl/notesen/2016-01-12-sse-base64-encoding.html
__ http://0x80.pl/notesen/2016-01-17-sse-base64-decoding.html
__ http://0x80.pl/notesen/2016-04-03-avx512-base64.html
__ http://0x80.pl/articles/avx512-foundation-base64.html

`Daniel Lemire`__ and I also wrote the paper `Faster Base64 Encoding and Decoding Using AVX2 Instructions`__, which was published in `ACM Transactions on the Web`__.

__ http://lemire.me
__ https://arxiv.org/abs/1704.00605
__ https://tweb.acm.org/

Performance results from various machines are located in the ``results`` subdirectories.

Project organization

There are separate subdirectories for the two algorithms (encoding and decoding), and both have the same structure. Each project contains four programs:

* ``verify`` --- does simple validation of particular parts of the algorithms,
* ``check`` --- validates whole procedures,
* ``speed`` --- compares the speed of different variants of the procedures,
* ``benchmark`` --- similar to ``speed``, but works on small buffers and calculates the CPU cycle rate (available only for Intel architectures).

Building

Change to either the ``encode`` or the ``decode`` directory and then use the following ``make`` commands.

.. list-table::
   :header-rows: 1

   * - command
     - tools
     - instruction sets

   * - ``make``
     - ``verify``, ``check``, ``speed``, ``benchmark``
     - scalar, SSE, BMI2

   * - ``make avx2``
     - ``verify_avx2``, ``check_avx2``, ``speed_avx2``, ``benchmark_avx2``
     - scalar, SSE, BMI2, AVX2

   * - ``make avx512``
     - ``verify_avx512``, ``check_avx512``, ``speed_avx512``, ``benchmark_avx512``
     - scalar, SSE, BMI2, AVX2, AVX512F

   * - ``make avx512bw``
     - ``verify_avx512bw``, ``check_avx512bw``, ``speed_avx512bw``, ``benchmark_avx512bw``
     - scalar, SSE, BMI2, AVX2, AVX512F, AVX512BW

   * - ``make avx512vbmi``
     - ``verify_avx512vbmi``, ``check_avx512vbmi``, ``benchmark_avx512vbmi``
     - scalar, SSE, BMI2, AVX2, AVX512F, AVX512BW, AVX512VBMI

   * - ``make xop``
     - ``verify_xop``, ``check_xop``, ``speed_xop``, ``benchmark_xop``
     - scalar, SSE and AMD XOP

   * - ``make arm``
     - ``verify_arm``, ``check_arm``, ``speed_arm``
     - scalar, ARM Neon

Type ``make run`` (for SSE) or ``make run_ARCH`` to run all programs for the given instruction set; ``ARCH`` can be "sse", "avx2", "avx512", "avx512bw", "avx512vbmi", "avx512vl".

The presence of BMI2 is determined from ``/proc/cpuinfo`` or its counterpart. When an AVX2 or AVX512 target is used, BMI2 is enabled by default.
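
How the makefiles actually perform this check is not shown here. As a rough illustration only, the following Python sketch looks for a flag in text in the Linux x86 ``/proc/cpuinfo`` format; the helper name ``has_cpu_flag`` is hypothetical, and the sketch assumes the ``flags`` line layout used by Linux on x86 (other systems expose this information differently).

```python
import os


def has_cpu_flag(cpuinfo_text, flag):
    """Return True if `flag` appears on any 'flags' line of the given text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only lines of the form "flags : fpu vme ... bmi2 ..." are relevant.
        if sep and key.strip() == "flags":
            if flag in value.split():
                return True
    return False


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # Linux only; an assumption of this sketch
    if os.path.exists(path):
        with open(path) as f:
            print("BMI2 available:", has_cpu_flag(f.read(), "bmi2"))
    else:
        print("no /proc/cpuinfo on this system")
```

Matching whole whitespace-separated tokens, rather than searching for a substring, keeps ``bmi2`` from being confused with similarly named flags such as ``bmi1``.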

AVX512

To compile the AVX512 versions of the programs, at least GCC 5.3 is required. GCC 4.9.2 doesn't have AVX512 support.

Please download the `Intel Software Development Emulator`__ in order to run the AVX512 variants via ``make run_avx512``, ``run_avx512bw`` or ``run_avx512vbmi``. The emulator's directory should be added to ``PATH``.

__ https://software.intel.com/en-us/articles/intel-software-development-emulator

Known problems

Neither encoding nor decoding matches the base64 specification: there is no processing of the data tail, i.e. the encoder never produces '=' chars at the end, and the decoder doesn't handle them at all.

None of these shortcomings are present in the brilliant library by Alfred Klomp: https://github.com/aklomp/base64.
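
To make the missing tail handling concrete, here is a minimal scalar sketch in Python. It is not code from this repository, and the names ``encode_full_groups`` and ``encode_with_tail`` are hypothetical. The first function encodes only complete 3-byte groups, which is the behaviour described above; the second shows one way a caller could add the standard '=' padding on top of it.

```python
import base64

ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/")


def encode_full_groups(data):
    """Encode complete 3-byte groups only; return (text, unprocessed tail)."""
    full = len(data) - len(data) % 3
    out = []
    for i in range(0, full, 3):
        # Pack three bytes into a 24-bit word, then split it into 6-bit indices.
        word = (data[i] << 16) | (data[i + 1] << 8) | data[i + 2]
        for shift in (18, 12, 6, 0):
            out.append(ALPHABET[(word >> shift) & 0x3F])
    return "".join(out), data[full:]


def encode_with_tail(data):
    """Add spec-compliant handling of the 1- or 2-byte tail with '=' padding."""
    text, tail = encode_full_groups(data)
    if tail:
        padded = tail + b"\x00" * (3 - len(tail))
        extra, _ = encode_full_groups(padded)
        # A tail of n bytes yields n + 1 meaningful characters.
        text += extra[:len(tail) + 1] + "=" * (3 - len(tail))
    return text


if __name__ == "__main__":
    print(encode_full_groups(b"Many"))  # the last byte is left unprocessed
    print(encode_with_tail(b"Many"))
    print(base64.b64encode(b"Many").decode("ascii"))
```

The last two printed lines agree, while the first one stops after the last full group and returns the leftover byte to the caller.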

See also

Daniel's benchmarks and comparison with state-of-the-art solutions: https://github.com/lemire/fastbase64

Who uses our algorithms?

* C/C++ library by Alfred Klomp: https://github.com/aklomp/base64
* .NET library by Günther Foidl: https://github.com/gfoidl/Base64
* There was an attempt to include an assembly implementation in Go: https://github.com/golang/go/issues/20206
