TfjsA WebGL accelerated JavaScript library for training and deploying ML models.
DreamplaceDeep learning toolkit-enabled VLSI placement
HyperformulaA complete, open-source Excel-like calculation engine written in TypeScript. Includes 380+ built-in functions. Maintained by the Handsontable team⚡
BohriumAutomatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX
GpytorchA highly efficient and modular implementation of Gaussian Processes in PyTorch
GpufitGPU-accelerated Levenberg-Marquardt curve fitting in CUDA
Montecarlomeasurements.jlPropagation of distributions by Monte-Carlo sampling: Real number types with uncertainty represented by samples.
RemixautomlR package for automation of machine learning, forecasting, feature engineering, model evaluation, model interpretation, data generation, and recommenders.
StitchemVahana VR & VideoStitch Studio: software to create immersive 360° VR video, live and in post-production
GpurirPython library for Room Impulse Response (RIR) simulation with GPU acceleration
Marian DevFast Neural Machine Translation in C++ - development repository
Hedgehog LabRun, compile, and execute JavaScript for scientific computing and data visualization entirely in your browser. An open-source scientific computing environment for JavaScript with GPU-accelerated matrix operations, TeX support, data visualization, and symbolic computation.
PysnnEfficient Spiking Neural Network framework, built on top of PyTorch for GPU acceleration
BlazingsqlBlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
OhgodatoolA tool for manipulating and editing AMD VBIOSes.
DeepnetDeep.Net machine learning framework for F#
EmuThe write-once-run-anywhere GPGPU library for Rust
Glove As A Tensorflow Embedding LayerTakes a pretrained GloVe model and uses it as a TensorFlow embedding weight layer **inside the GPU**, so only the word indices need to cross the GPU data transfer bus, reducing data transfer overhead.
CekirdeklerMulti-device OpenCL kernel load balancer and pipeliner API for C#. Uses a shared-distributed memory model to keep GPUs updated quickly while running the same kernel on all devices (for simplicity).
Tfjs CoreWebGL-accelerated ML // linear algebra // automatic differentiation for JavaScript.
SluggishToy CPU and GPU implementations of the Slug rendering algorithm
HeteroflowConcurrent CPU-GPU Programming using Task Models
Stdgpustdgpu: Efficient STL-like Data Structures on the GPU
DlwinGPU-accelerated Deep Learning on Windows 10 native
Opt einsum⚡️Optimizing einsum functions in NumPy, Tensorflow, Dask, and more with contraction order optimization.
Neuralnetwork.netA TensorFlow-inspired neural network library built from scratch in C# 7.3 for .NET Standard 2.0, with GPU support through cuDNN
BayaderaHigh-performance Bayesian Data Analysis on the GPU in Clojure
VuhVulkan compute for people
TaichiGAMEGPU Accelerated Motion Engine based on Taichi Lang.
CascadeNode-based image editor with GPU-acceleration.
JampackExperimental parallel compression algorithm
vegasflowVegasFlow: accelerating Monte Carlo simulation across multiple hardware platforms
pyanime4kAn easy way to use anime4k in python
Galaxia-RuntimeGalaxy generator for Unity 3D, with custom particle distributors, DirectX 11 particles, and highly customizable, curve-driven generation.
dpnpNumPy drop-in replacement for Intel(R) XPUs
KRSThe Kria Robotics Stack (KRS) is a ROS 2 superset for industry, an integrated set of robot libraries and utilities to accelerate the development, maintenance and commercialization of industrial-grade robotic solutions while using adaptive computing.
Nexus🖼️ Actionscript 3, GPU accelerated 2D game engine using Stage3D
Jamais-VuAudio Fingerprinting and Recognition in Python using NVIDIA's CUDA
gpuhdMassively Parallel Huffman Decoding on GPUs
PHCpackThe primary source code repository for PHCpack, a software package to solve polynomial systems with homotopy continuation methods.
Apriori-and-Eclat-Frequent-Itemset-MiningPython implementation of the Apriori and Eclat algorithms, two of the best-known basic algorithms for mining frequent item sets in a set of transactions.
GoldenSunA path tracer based on hardware ray tracing
CARECHAI and RAJA provide an excellent base on which to build portable codes. CARE expands that functionality, adding new features such as loop fusion capability and a portable interface for many numerical algorithms. It provides all the basics for anyone wanting to write portable code.
pytodTOD: GPU-accelerated Outlier Detection via Tensor Operations
QUICKQUICK: A GPU-enabled ab initio quantum chemistry software package
CrossbowCrossbow: A Multi-GPU Deep Learning System for Training with Small Batch Sizes
brian2cudaA brian2 extension to simulate spiking neural networks on GPUs
gpuvmemGPU Framework for Radio Astronomical Image Synthesis
cef-mixerHigh Performance off-screen rendering (OSR) demo using CEF