Rumble⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-48.67%)
interview-refresh-java-bigdataa one-stop repo to lookup for code snippets of core java concepts, sql, data structures as well as big data. It also consists of interview questions asked in real-life.
Stars: ✭ 25 (-77.88%)
FreestyleA cohesive & pragmatic framework of FP centric Scala libraries
Stars: ✭ 627 (+454.87%)
FlintWebex Bot SDK for Node.js (deprecated in favor of https://github.com/webex/webex-bot-node-framework)
Stars: ✭ 85 (-24.78%)
Tweet-Analysis-With-Kafka-and-SparkA real time analytics dashboard to analyze the trending hashtags and @ mentions at any location using kafka and spark streaming.
Stars: ✭ 18 (-84.07%)
H2o 3H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+4905.31%)
Spark ALS基于spark-ml,spark-mllib,spark-streaming的推荐算法实现
Stars: ✭ 89 (-21.24%)
fdp-modelserverAn umbrella project for multiple implementations of model serving
Stars: ✭ 47 (-58.41%)
ZeppelinWeb-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+4778.76%)
ElephasDistributed Deep learning with Keras & Spark
Stars: ✭ 1,521 (+1246.02%)
AlluxioAlluxio, data orchestration for analytics and machine learning in the cloud
Stars: ✭ 5,379 (+4660.18%)
HyperspaceAn open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Stars: ✭ 246 (+117.7%)
DparkPython clone of Spark, a MapReduce alike framework in Python
Stars: ✭ 2,668 (+2261.06%)
JustenoughscalaforsparkA tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+376.11%)
Azure Event Hubs☁️ Cloud-scale telemetry ingestion from any stream of data with Azure Event Hubs
Stars: ✭ 233 (+106.19%)
Docker HadoopA Docker container with a full Hadoop cluster setup with Spark and Zeppelin
Stars: ✭ 54 (-52.21%)
prostoProsto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Stars: ✭ 54 (-52.21%)
Tiledb VcfEfficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-76.99%)
bigkubeMinikube for big data with Scala and Spark
Stars: ✭ 16 (-85.84%)
Ruby SparkRuby wrapper for Apache Spark
Stars: ✭ 221 (+95.58%)
MagellanGeo Spatial Data Analytics on Spark
Stars: ✭ 507 (+348.67%)
Spark Submit UiThis is a based on playframwork for submit spark app
Stars: ✭ 53 (-53.1%)
SparkrdmaRDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+90.27%)
Pdf编程电子书,电子书,编程书籍,包括C,C#,Docker,Elasticsearch,Git,Hadoop,HeadFirst,Java,Javascript,jvm,Kafka,Linux,Maven,MongoDB,MyBatis,MySQL,Netty,Nginx,Python,RabbitMQ,Redis,Scala,Solr,Spark,Spring,SpringBoot,SpringCloud,TCPIP,Tomcat,Zookeeper,人工智能,大数据类,并发编程,数据库类,数据挖掘,新面试题,架构设计,算法系列,计算机类,设计模式,软件测试,重构优化,等更多分类
Stars: ✭ 12,009 (+10527.43%)
confluent-spark-avroSpark UDFs to deserialize Avro messages with schemas stored in Schema Registry.
Stars: ✭ 18 (-84.07%)
Bdp Dataplatform大数据生态解决方案数据平台:基于大数据、数据平台、微服务、机器学习、商城、自动化运维、DevOps、容器部署平台、数据平台采集、数据平台存储、数据平台计算、数据平台开发、数据平台应用搭建的大数据解决方案。
Stars: ✭ 456 (+303.54%)
Spark PracticeApache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (+76.99%)
Spark NkpNatural Korean Processor for Apache Spark
Stars: ✭ 50 (-55.75%)
ScannsA scalable nearest neighbor search library in Apache Spark
Stars: ✭ 190 (+68.14%)
Bigdataie大数据博客、笔试题、教程、项目、面经的整理
Stars: ✭ 445 (+293.81%)
AzuredatabricksbestpracticesVersion 1 of Technical Best Practices of Azure Databricks based on real world Customer and Technical SME inputs
Stars: ✭ 186 (+64.6%)
KontextfreiWriting application logic for Spark jobs that can be unit-tested without a SparkContext
Stars: ✭ 67 (-40.71%)
Spark SwaggerSpark (http://sparkjava.com/) support for Swagger (https://swagger.io/)
Stars: ✭ 25 (-77.88%)
Covid19TrackerA Robinhood style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.
Stars: ✭ 65 (-42.48%)
Sparkstreaming💥 🚀 封装sparkstreaming动态调节batch time(有数据就执行计算);🚀 支持运行过程中增删topic;🚀 封装sparkstreaming 1.6 - kafka 010 用以支持 SSL。
Stars: ✭ 179 (+58.41%)
Awesome Recommendation EngineThe purpose of this tiny project is to put things together with the know how that i learned from the course big data expert from formacionhadoop.com The idea is to show how to play with apache spark streaming, kafka,mongo, spark machine learning algorithms.
Stars: ✭ 47 (-58.41%)
Dji Firmware ToolsTools for handling firmwares of DJI products, with focus on quadcopters.
Stars: ✭ 424 (+275.22%)
SparkFirely's open source FHIR server
Stars: ✭ 174 (+53.98%)
LeharVisualize data using relative ordering
Stars: ✭ 81 (-28.32%)
blogblog entries
Stars: ✭ 39 (-65.49%)
SparkV🤖⚡ | The most POWERFUL multipurpose chat/meme bot that will boost the activity in your server.
Stars: ✭ 24 (-78.76%)
Spark TdaSparkTDA is a package for Apache Spark providing Topological Data Analysis Functionalities.
Stars: ✭ 45 (-60.18%)
Xlearning Xdmlextremely distributed machine learning
Stars: ✭ 113 (+0%)
ArchivesparkAn Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.
Stars: ✭ 111 (-1.77%)