isarn-sketches-spark: Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (+12%)
Spark Nlp: State of the Art Natural Language Processing
Stars: ✭ 2,518 (+9972%)
Tdigest: t-Digest data structure in Python. Useful for percentiles and quantiles, including in distributed environments like PySpark
Stars: ✭ 274 (+996%)
Spark With Python: Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+500%)
pyspark-algorithms: PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (+188%)
jessica: Jessica - Jessie (secure distributed JavaScript) Compiler Architecture
Stars: ✭ 27 (+8%)
hyperqueue: Scheduler for sub-node tasks for HPC systems with batch scheduling
Stars: ✭ 48 (+92%)
high-assurance-legacy: Legacy code connected to the high-assurance implementation of the Ouroboros protocol family
Stars: ✭ 81 (+224%)
lazycluster: 🎛 Distributed machine learning made simple.
Stars: ✭ 43 (+72%)
dask-pytorch-ddp: dask-pytorch-ddp is a Python package that makes it easy to train PyTorch models on dask clusters using distributed data parallel.
Stars: ✭ 50 (+100%)
zmq: ZeroMQ based distributed patterns
Stars: ✭ 27 (+8%)
plinycompute: A system for development of high-performance, data-intensive, distributed computing applications, tools, and libraries.
Stars: ✭ 27 (+8%)
kuwala: Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition, we provide third-party data into data sc…
Stars: ✭ 474 (+1796%)
sparklanes: A lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-32%)
Spark-Scala-EKS: Spark Scala docker container sample for AWS testing - EKS & S3
Stars: ✭ 23 (-8%)
dcf: Yet another distributed compute framework
Stars: ✭ 48 (+92%)
hydra-hpp: Hydra Hot Potato Player (game)
Stars: ✭ 12 (-52%)
gordo: An API-first distributed deployment system of deep learning models using timeseries data to predict the behaviour of systems
Stars: ✭ 25 (+0%)
protoactor-go: Proto Actor - Ultra fast distributed actors for Go, C# and Java/Kotlin
Stars: ✭ 4,138 (+16452%)
good-karma-kit: 😇 A Docker Compose bundle to run on servers with spare CPU, RAM, disk, and bandwidth to help the world. Includes Tor, ArchiveWarrior, BOINC, and more...
Stars: ✭ 238 (+852%)
distex: Distributed process pool for Python
Stars: ✭ 101 (+304%)
check-engine: Data validation library for PySpark 3.0.0
Stars: ✭ 29 (+16%)
marsjs: Label images from Unsplash in the browser, using MobileNet on TensorFlow.js
Stars: ✭ 53 (+112%)
DataEngineering: This repo contains commands that data engineers use in day-to-day work.
Stars: ✭ 47 (+88%)
nebula: A distributed block-based data storage and compute engine
Stars: ✭ 127 (+408%)
SynapseML: Simple and Distributed Machine Learning
Stars: ✭ 3,355 (+13320%)
IoTPy: Python for streams
Stars: ✭ 24 (-4%)
asyncoro: Python framework for asynchronous, concurrent, distributed, network programming with coroutines
Stars: ✭ 50 (+100%)
tutorial: Tutorials to help you build your first Swim app
Stars: ✭ 27 (+8%)
pycondor: Build and submit workflows to HTCondor in Python
Stars: ✭ 23 (-8%)
JOLI.jl: Julia Operators LIbrary
Stars: ✭ 14 (-44%)
machinaris: An easy-to-use WebUI for crypto plotting and farming. Offers Plotman, MadMax, Chiadog, Bladebit, Farmr, and Forktools in a Docker container. Supports Chia, MMX, Chives, Flax, HDDCoin, and BPX among others.
Stars: ✭ 324 (+1196%)
Prime95: Prime95 source code from GIMPS to find Mersenne primes.
Stars: ✭ 25 (+0%)
ceja: PySpark phonetic and string matching algorithms
Stars: ✭ 24 (-4%)
ripple: Simple shared surface streaming application
Stars: ✭ 17 (-32%)
raven-distribution-framework: Decentralized Computing Backend for Artificial Intelligence, Web3, Metaverse, and Gaming Applications
Stars: ✭ 31 (+24%)
jobAnalytics and search: JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (+0%)
cre: common runtime environment for distributed programming languages
Stars: ✭ 20 (-20%)
pyspark-ML-in-Colab: PySpark in Google Colab: A simple machine learning (Linear Regression) model
Stars: ✭ 32 (+28%)
rce: Distributed, workflow-driven integration environment
Stars: ✭ 42 (+68%)
Sparkora: Powerful, rapid, automatic EDA and feature engineering library with a very easy-to-use API 🌟
Stars: ✭ 51 (+104%)
nsmc-zeppelin-notebook: Movie review dataset Word2Vec & sentiment classification Zeppelin notebook
Stars: ✭ 26 (+4%)
jupyterlab-sparkmonitor: JupyterLab extension that enables monitoring launched Apache Spark jobs from within a notebook
Stars: ✭ 78 (+212%)
ShadowClone: Unleash the power of cloud
Stars: ✭ 224 (+796%)