752 open source projects that are alternatives to or similar to Ruby Spark

H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+2459.28%)
Mutual labels:  spark, distributed
Ballista
Distributed compute platform implemented in Rust, and powered by Apache Arrow.
Stars: ✭ 2,274 (+928.96%)
Mutual labels:  spark, distributed
Ytk Learn
Ytk-learn is a distributed machine learning library that implements most popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).
Stars: ✭ 337 (+52.49%)
Mutual labels:  spark, distributed
Distributed Dataset
A distributed data processing framework in Haskell.
Stars: ✭ 108 (-51.13%)
Mutual labels:  spark, distributed
Js Spark
Realtime calculation distributed system. AKA distributed lodash
Stars: ✭ 187 (-15.38%)
Mutual labels:  spark, distributed
Sparklyr
R interface for Apache Spark
Stars: ✭ 775 (+250.68%)
Mutual labels:  spark, distributed
Xlearning Xdml
extremely distributed machine learning
Stars: ✭ 113 (-48.87%)
Mutual labels:  spark, distributed
Kotlin Spark Api
This project provides Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x
Stars: ✭ 183 (-17.19%)
Mutual labels:  spark
Voik
♒︎ [WIP] An experimental ~distributed~ commit-log
Stars: ✭ 200 (-9.5%)
Mutual labels:  distributed
Spark Streaming With Kafka
Self-contained examples of Apache Spark streaming integrated with Apache Kafka.
Stars: ✭ 180 (-18.55%)
Mutual labels:  spark
Kraps Rpc
A RPC framework leveraging Spark RPC module
Stars: ✭ 175 (-20.81%)
Mutual labels:  spark
Azuredatabricksbestpractices
Version 1 of Technical Best Practices of Azure Databricks based on real world Customer and Technical SME inputs
Stars: ✭ 186 (-15.84%)
Mutual labels:  spark
Mmlspark
Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+1211.76%)
Mutual labels:  spark
Bastion
Highly-available Distributed Fault-tolerant Runtime
Stars: ✭ 2,333 (+955.66%)
Mutual labels:  distributed
Gerapy
Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js
Stars: ✭ 2,601 (+1076.92%)
Mutual labels:  distributed
Xsql
Unified SQL Analytics Engine Based on SparkSQL
Stars: ✭ 176 (-20.36%)
Mutual labels:  spark
Atomnas
Code for ICLR 2020 paper 'AtomNAS: Fine-Grained End-to-End Neural Architecture Search'
Stars: ✭ 197 (-10.86%)
Mutual labels:  distributed
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (-2.71%)
Mutual labels:  spark
Bigben
BigBen - a generic, multi-tenant, time-based event scheduler and cron scheduling framework
Stars: ✭ 174 (-21.27%)
Mutual labels:  distributed
Example Spark
Spark, Spark Streaming and Spark SQL unit testing strategies
Stars: ✭ 205 (-7.24%)
Mutual labels:  spark
Lingvo
Lingvo
Stars: ✭ 2,361 (+968.33%)
Mutual labels:  distributed
Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (+1039.37%)
Mutual labels:  spark
Lightgbm
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Stars: ✭ 13,293 (+5914.93%)
Mutual labels:  distributed
Tfmesos
Tensorflow in Docker on Mesos #tfmesos #tensorflow #mesos
Stars: ✭ 194 (-12.22%)
Mutual labels:  distributed
Transmogrifai
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
Stars: ✭ 2,084 (+842.99%)
Mutual labels:  spark
Spark Structured Streaming Examples
Spark Structured Streaming / Kafka / Cassandra / Elastic
Stars: ✭ 168 (-23.98%)
Mutual labels:  spark
Bit
A tool for component-driven application development.
Stars: ✭ 14,443 (+6435.29%)
Mutual labels:  distributed
Plynx
PLynx is a domain agnostic platform for managing reproducible experiments and data-oriented workflows.
Stars: ✭ 192 (-13.12%)
Mutual labels:  distributed
Geopyspark
GeoTrellis for PySpark
Stars: ✭ 167 (-24.43%)
Mutual labels:  spark
Improved Body Parts
Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation
Stars: ✭ 202 (-8.6%)
Mutual labels:  distributed
Diaspora
A privacy-aware, distributed, open source social network.
Stars: ✭ 12,937 (+5753.85%)
Mutual labels:  distributed
Pottery
Redis for humans. 🌎🌍🌏
Stars: ✭ 204 (-7.69%)
Mutual labels:  distributed
Roaringbitmap
A better compressed bitset in Java
Stars: ✭ 2,460 (+1013.12%)
Mutual labels:  spark
Spark Practice
Apache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (-9.5%)
Mutual labels:  spark
Dkeras
Distributed Keras Engine, Make Keras faster with only one line of code.
Stars: ✭ 181 (-18.1%)
Mutual labels:  distributed
Gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (-2.26%)
Mutual labels:  spark
Sparkstreaming
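💥 🚀 Wraps Spark Streaming to adjust the batch time dynamically (computation runs as soon as data arrives); 🚀 supports adding and removing topics at runtime; 🚀 wraps Spark Streaming 1.6 - kafka 010 to support SSL.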
Stars: ✭ 179 (-19%)
Mutual labels:  spark
Cookim
Distributed web chat application based on WebSocket, built on Akka.
Stars: ✭ 198 (-10.41%)
Mutual labels:  distributed
Spark Kafka Writer
Write your Spark data to Kafka seamlessly
Stars: ✭ 175 (-20.81%)
Mutual labels:  spark
Scannerl
The modular distributed fingerprinting engine
Stars: ✭ 208 (-5.88%)
Mutual labels:  distributed
Spark
Firely's open source FHIR server
Stars: ✭ 174 (-21.27%)
Mutual labels:  spark
Dsock
Distributed WebSocket broker
Stars: ✭ 197 (-10.86%)
Mutual labels:  distributed
Spoon
🥄 A package for building specific Proxy Pool for different Sites.
Stars: ✭ 173 (-21.72%)
Mutual labels:  distributed
Vernemq
A distributed MQTT message broker based on Erlang/OTP. Built for high quality & Industrial use cases.
Stars: ✭ 2,628 (+1089.14%)
Mutual labels:  distributed
Idworker
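idworker is a distributed ID generation tool based on ZooKeeper and the Snowflake algorithm. It registers machines automatically through ZooKeeper (up to 1024), so workerId and datacenterId do not need to be specified manually.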
Stars: ✭ 171 (-22.62%)
Mutual labels:  distributed
Deeplearning4j
Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learni…
Stars: ✭ 12,277 (+5455.2%)
Mutual labels:  spark
Spark Knn
k-Nearest Neighbors algorithm on Spark
Stars: ✭ 205 (-7.24%)
Mutual labels:  spark
Onyx
Distributed, masterless, high performance, fault tolerant data processing
Stars: ✭ 2,019 (+813.57%)
Mutual labels:  distributed
Herddb
A JVM-embeddable Distributed Database
Stars: ✭ 192 (-13.12%)
Mutual labels:  distributed
Spark Iforest
Isolation Forest on Spark
Stars: ✭ 166 (-24.89%)
Mutual labels:  spark
Pysr
Simple, fast, and parallelized symbolic regression in Python/Julia via regularized evolution and simulated annealing
Stars: ✭ 213 (-3.62%)
Mutual labels:  distributed
Azure Cosmosdb Spark
Apache Spark Connector for Azure Cosmos DB
Stars: ✭ 165 (-25.34%)
Mutual labels:  spark
Zi5book
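Crawls all Kindle e-books on book.zi5.me, organized by author and book title; each book comes in two formats, mobi and epub; the whole site is crawled in a distributed fashion.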
Stars: ✭ 191 (-13.57%)
Mutual labels:  distributed
Big Whale
Scheduling of offline tasks such as Spark and Flink, and monitoring of real-time tasks
Stars: ✭ 163 (-26.24%)
Mutual labels:  spark
Whylogs Java
Profile and monitor your ML data pipeline end-to-end
Stars: ✭ 164 (-25.79%)
Mutual labels:  spark
Oneflow
OneFlow is a performance-centered and open-source deep learning framework.
Stars: ✭ 2,868 (+1197.74%)
Mutual labels:  distributed
Scanns
A scalable nearest neighbor search library in Apache Spark
Stars: ✭ 190 (-14.03%)
Mutual labels:  spark
Dop
JavaScript implementation for Distributed Object Protocol
Stars: ✭ 163 (-26.24%)
Mutual labels:  distributed
Bigdata docker
Big Data Ecosystem Docker
Stars: ✭ 161 (-27.15%)
Mutual labels:  spark
Arewedistributedyet
Website + Community effort to unlock the peer-to-peer web at arewedistributedyet.com ⚡🌐🔑
Stars: ✭ 189 (-14.48%)
Mutual labels:  distributed