AbrisAvro SerDe for Apache Spark structured APIs.
Stars: ✭ 130 (+209.52%)
Sparkling WaterSparkling Water provides H2O functionality inside Spark cluster
Stars: ✭ 887 (+2011.9%)
BigdataguideBig data learning: learn big data from scratch, with study videos for each stage of learning and interview materials
Stars: ✭ 817 (+1845.24%)
Spylon KernelJupyter kernel for Scala and Spark
Stars: ✭ 129 (+207.14%)
Zemberek Nlp ServerREST Docker server built on the Zemberek Turkish NLP Java library
Stars: ✭ 60 (+42.86%)
AngelA Flexible and Powerful Parameter Server for large-scale machine learning
Stars: ✭ 6,458 (+15276.19%)
GafferA large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+3809.52%)
Spark Movie LensAn on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+1673.81%)
confluent-spark-avroSpark UDFs to deserialize Avro messages with schemas stored in Schema Registry.
Stars: ✭ 18 (-57.14%)
CdhprojectUsage of the various Hadoop components, continuously updated
Stars: ✭ 733 (+1645.24%)
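Spring Boot Quick🌿 Quick-start learning examples based on Spring Boot, integrating open-source frameworks the author has encountered, such as RabbitMQ (delayed queues), Kafka, JPA, Redis, OAuth2, Swagger, JSP, Docker, spring-batch, exception handling, log output, multi-module development, multi-environment packaging, caching, web crawlers, JWT, GraphQL, Dubbo, ZooKeeper, Async, and more📌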
Stars: ✭ 1,819 (+4230.95%)
FramelessExpressive types for Spark.
Stars: ✭ 717 (+1607.14%)
Azure Event Hubs☁️ Cloud-scale telemetry ingestion from any stream of data with Azure Event Hubs
Stars: ✭ 233 (+454.76%)
LiftThe LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the measurement of fairness in large scale machine learning workflows.
Stars: ✭ 127 (+202.38%)
spark-druid-olapSparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.
Stars: ✭ 286 (+580.95%)
FreestyleA cohesive & pragmatic framework of FP centric Scala libraries
Stars: ✭ 627 (+1392.86%)
HadoopcryptoledgerHadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (+200%)
H2o 3H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+13366.67%)
ZeppelinWeb-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+13026.19%)
Scala SamplesPieces of Scala code that explain Scala syntax and related topics, such as what you can do with it
Stars: ✭ 125 (+197.62%)
AlluxioAlluxio, data orchestration for analytics and machine learning in the cloud
Stars: ✭ 5,379 (+12707.14%)
Spark-PMoFSpark Shuffle Optimization with RDMA+AEP
Stars: ✭ 28 (-33.33%)
Spark DariaEssential Spark extensions and helper methods ✨😲
Stars: ✭ 553 (+1216.67%)
Spark AlchemyCollection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (+190.48%)
LopqTraining of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
Stars: ✭ 530 (+1161.9%)
CdapAn open source framework for building data analytic applications.
Stars: ✭ 509 (+1111.9%)
ZparkioBoilerplate framework to use Spark and ZIO together.
Stars: ✭ 121 (+188.1%)
PointblankData validation and organization of metadata for data frames and database tables
Stars: ✭ 480 (+1042.86%)
swordfishOpen-source distributed workflow scheduling tool that also supports streaming tasks.
Stars: ✭ 35 (-16.67%)
SparkCross-platform real-time collaboration client optimized for business and organizations.
Stars: ✭ 471 (+1021.43%)
Data Science Ipython NotebooksData science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+52395.24%)
Ruby SparkRuby wrapper for Apache Spark
Stars: ✭ 221 (+426.19%)
God Of BigdataFocused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+14204.76%)
Kinesis SqlKinesis Connector for Structured Streaming
Stars: ✭ 120 (+185.71%)
YanagishimaWeb UI for Trino, Presto, Hive, Elasticsearch, SparkSQL
Stars: ✭ 424 (+909.52%)
spark-extensionA library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-40.48%)
MoonboxMoonbox is a DVtaaS (Data Virtualization as a Service) Platform
Stars: ✭ 424 (+909.52%)
Cube.js📊 Cube — Open-Source Analytics API for Building Data Apps
Stars: ✭ 11,983 (+28430.95%)
LearningsparkScala examples for learning to use Spark
Stars: ✭ 421 (+902.38%)
Spark ExcelA Spark plugin for reading Excel files via Apache POI
Stars: ✭ 216 (+414.29%)
SparkleHaskell on Apache Spark.
Stars: ✭ 419 (+897.62%)
Spring Shiro SparkSpring-Shiro-Spark is an attempt at integrating Spring-Boot Hibernate Spark Spark-SQL Shiro iView VueJs... ...
Stars: ✭ 114 (+171.43%)
Enterprise gatewayA lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.
Stars: ✭ 412 (+880.95%)
Spark-ArResources for Spark AR
Stars: ✭ 43 (+2.38%)
SparkmonitorMonitor Apache Spark from Jupyter Notebook
Stars: ✭ 154 (+266.67%)
Xlearning XdmlExtremely distributed machine learning
Stars: ✭ 113 (+169.05%)
Rumble⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (+38.1%)
bigkubeMinikube for big data with Scala and Spark
Stars: ✭ 16 (-61.9%)
SparkV🤖⚡ | The most POWERFUL multipurpose chat/meme bot that will boost the activity in your server.
Stars: ✭ 24 (-42.86%)
spark-demosCollection of different demo applications using Apache Spark
Stars: ✭ 15 (-64.29%)
albisAlbis: High-Performance File Format for Big Data Systems
Stars: ✭ 20 (-52.38%)