arrow-datafusion: Apache Arrow DataFusion SQL Query Engine
Stars: ✭ 2,360 (+3.78%)
Mutual labels: arrow, dataframe, datafusion
Datafusion: DataFusion has now been donated to the Apache Arrow project
Stars: ✭ 611 (-73.13%)
Mutual labels: dataframe, spark, arrow
Sparklyr: R interface for Apache Spark
Stars: ✭ 775 (-65.92%)
Mutual labels: spark, distributed
Spark Redis: A connector for Spark that allows reading and writing to/from Redis cluster
Stars: ✭ 773 (-66.01%)
Mutual labels: dataframe, spark
Net.jgp.labs.spark: Apache Spark examples exclusively in Java
Stars: ✭ 55 (-97.58%)
Mutual labels: dataframe, spark
Spark Daria: Essential Spark extensions and helper methods ✨😲
Stars: ✭ 553 (-75.68%)
Mutual labels: dataframe, spark
H2o 3: H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+148.72%)
Mutual labels: spark, distributed
Mobius: C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (-59.15%)
Mutual labels: dataframe, spark
aut: The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-95.12%)
Mutual labels: spark, dataframe
Xlearning Xdml: Extremely distributed machine learning
Stars: ✭ 113 (-95.03%)
Mutual labels: spark, distributed
Distributed Dataset: A distributed data processing framework in Haskell.
Stars: ✭ 108 (-95.25%)
Mutual labels: spark, distributed
Nd4j: Fast, Scientific and Numerical Computing for the JVM (NDArrays)
Stars: ✭ 1,742 (-23.39%)
Mutual labels: spark, jvm
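Pdf: Programming e-books and programming books, covering C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithms, computer science, design patterns, software testing, refactoring and optimization, and more categories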
Stars: ✭ 12,009 (+428.1%)
Mutual labels: spark, jvm
Modin: Speed up your Pandas workflows by changing a single line of code
Stars: ✭ 6,639 (+191.95%)
Mutual labels: dataframe, distributed
Ytk Learn: Ytk-learn is a distributed machine learning library which implements most popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).
Stars: ✭ 337 (-85.18%)
Mutual labels: spark, distributed
Titanoboa: Titanoboa makes complex workflows easy. It is a low-code workflow orchestration platform for JVM - distributed, highly scalable and fault tolerant.
Stars: ✭ 787 (-65.39%)
Mutual labels: jvm, distributed
Js Spark: Realtime calculation distributed system. AKA distributed lodash
Stars: ✭ 187 (-91.78%)
Mutual labels: spark, distributed
polars: Fast multi-threaded DataFrame library in Rust | Python | Node.js
Stars: ✭ 6,368 (+180.04%)
Mutual labels: arrow, dataframe
delta: DDD-centric event-sourcing library for the JVM
Stars: ✭ 15 (-99.34%)
Mutual labels: jvm, distributed
Java Notes: 📚 Computer science fundamentals, Java development, backend/server-side, and interview topics 📚 computer-science/Java-development/backend/interview
Stars: ✭ 1,284 (-43.54%)
Mutual labels: jvm, distributed