Frameless: Expressive types for Spark.
Stars: ✭ 717 (+497.5%)
xxhadoop: Data analysis using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. These are my daily notes/code/demos. Don't fork, just star!
Stars: ✭ 37 (-69.17%)
interview-refresh-java-bigdata: A one-stop repo for looking up code snippets on core Java concepts, SQL, data structures, and big data. It also includes interview questions asked in real life.
Stars: ✭ 25 (-79.17%)
Waimak: Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-50%)
Tweet-Analysis-With-Kafka-and-Spark: A real-time analytics dashboard for analyzing trending hashtags and @mentions at any location using Kafka and Spark Streaming.
Stars: ✭ 18 (-85%)
Spark Lucenerdd: Spark RDD with Lucene's query and entity linkage capabilities
Stars: ✭ 114 (-5%)
Spark ALS: Recommendation algorithm implementations based on spark-ml, spark-mllib, and spark-streaming
Stars: ✭ 89 (-25.83%)
Freestyle: A cohesive & pragmatic framework of FP-centric Scala libraries
Stars: ✭ 627 (+422.5%)
fdp-modelserver: An umbrella project for multiple implementations of model serving
Stars: ✭ 47 (-60.83%)
Zemberek Nlp Server: A REST Docker server built on the Zemberek Turkish NLP Java library
Stars: ✭ 60 (-50%)
H2o 3: H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+4613.33%)
Spark Nlp Models: Models and Pipelines for the Spark NLP library
Stars: ✭ 88 (-26.67%)
Zeppelin: Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+4494.17%)
basin: Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser.
Stars: ✭ 25 (-79.17%)
Hyperspace: An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Stars: ✭ 246 (+105%)
Rumble: ⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-51.67%)
Dpark: Python clone of Spark, a MapReduce-like framework in Python
Stars: ✭ 2,668 (+2123.33%)
Alluxio: Alluxio, data orchestration for analytics and machine learning in the cloud
Stars: ✭ 5,379 (+4382.5%)
Seldon Server: Machine Learning Platform and Recommendation Engine built on Kubernetes
Stars: ✭ 1,435 (+1095.83%)
Azure Event Hubs: ☁️ Cloud-scale telemetry ingestion from any stream of data with Azure Event Hubs
Stars: ✭ 233 (+94.17%)
Justenoughscalaforspark: A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+348.33%)
Ruby Spark: Ruby wrapper for Apache Spark
Stars: ✭ 221 (+84.17%)
Mlfeature: Feature engineering toolkit for Spark MLlib.
Stars: ✭ 12 (-90%)
daf-kylo: Kylo integration with PDND (previously DAF).
Stars: ✭ 20 (-83.33%)
dllib: dllib is a distributed deep learning library running on Apache Spark
Stars: ✭ 32 (-73.33%)
Labs: Research on distributed systems
Stars: ✭ 73 (-39.17%)
Mare: MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.
Stars: ✭ 11 (-90.83%)
Magellan: Geospatial Data Analytics on Spark
Stars: ✭ 507 (+322.5%)
Elephas: Distributed Deep Learning with Keras & Spark
Stars: ✭ 1,521 (+1167.5%)
Spark Practice: Apache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (+66.67%)
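Pdf: Programming e-books and programming books, covering C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithms, computer science, design patterns, software testing, refactoring and optimization, and more categories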
Stars: ✭ 12,009 (+9907.5%)
Scanns: A scalable nearest neighbor search library in Apache Spark
Stars: ✭ 190 (+58.33%)
Docker Hadoop: A Docker container with a full Hadoop cluster setup with Spark and Zeppelin
Stars: ✭ 54 (-55%)
Azuredatabricksbestpractices: Version 1 of Technical Best Practices of Azure Databricks based on real-world Customer and Technical SME inputs
Stars: ✭ 186 (+55%)
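Bdp Dataplatform: A data platform for big data ecosystem solutions: a big data solution built on big data, data platforms, microservices, machine learning, e-commerce, automated operations, DevOps, a container deployment platform, and data platform collection, storage, computation, development, and application building.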
Stars: ✭ 456 (+280%)
Sparkjni: A heterogeneous Apache Spark framework.
Stars: ✭ 11 (-90.83%)
spark-data-sources: Developing Spark External Data Sources using the V2 API
Stars: ✭ 36 (-70%)
prosto: Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Stars: ✭ 54 (-55%)
Bigdataie: A collection of big data blog posts, written test questions, tutorials, projects, and interview experiences
Stars: ✭ 445 (+270.83%)
Elassandra: Elassandra = Elasticsearch + Apache Cassandra
Stars: ✭ 1,610 (+1241.67%)
Cube.js: 📊 Cube — Open-Source Analytics API for Building Data Apps
Stars: ✭ 11,983 (+9885.83%)
Xlearning Xdml: Extremely distributed machine learning
Stars: ✭ 113 (-5.83%)
Parquet Index: Spark SQL index for Parquet tables
Stars: ✭ 109 (-9.17%)