Utils4s
A collection of test cases and related materials gathered while using Scala and Spark
Stars: ✭ 1,070 (+846.9%)
Spark
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+1423.01%)
Gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+91.15%)
Waterdrop
Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+1542.48%)
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+353.98%)
Learningspark
Scala examples for learning to use Spark
Stars: ✭ 421 (+272.57%)
Cdap
An open source framework for building data analytic applications.
Stars: ✭ 509 (+350.44%)
Example Spark
Spark, Spark Streaming and Spark SQL unit testing strategies
Stars: ✭ 205 (+81.42%)
Coolplayspark
Cool Play Spark: Spark source code analysis, Spark libraries, and more
Stars: ✭ 3,318 (+2836.28%)
Angel
A Flexible and Powerful Parameter Server for large-scale machine learning
Stars: ✭ 6,458 (+5615.04%)
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+722.12%)
Spark States
Custom state store providers for Apache Spark
Stars: ✭ 83 (-26.55%)
Kinesis Sql
Kinesis Connector for Structured Streaming
Stars: ✭ 120 (+6.19%)
Azure Event Hubs Spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (+23.89%)
Data Accelerator
Data Accelerator for Apache Spark simplifies onboarding to streaming of Big Data. It offers a rich, easy-to-use experience for creating, editing, and managing Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (+118.58%)
Pyspark Examples
Code examples on Apache Spark using Python
Stars: ✭ 58 (-48.67%)
Ammonite Spark
Run Spark calculations from Ammonite
Stars: ✭ 88 (-22.12%)
Sparktutorial
Source code for James Lee's Apache Spark with Java course
Stars: ✭ 105 (-7.08%)
Cuesheet
A framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (-23.89%)
Parquet Index
Spark SQL index for Parquet tables
Stars: ✭ 109 (-3.54%)
Hops Examples
Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-25.66%)
Spark On K8s Operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Stars: ✭ 1,780 (+1475.22%)
Spark Nlp Models
Models and Pipelines for the Spark NLP library
Stars: ✭ 88 (-22.12%)
Bigdataclass
Two-day workshop that covers how to use R to interact with databases and Spark
Stars: ✭ 110 (-2.65%)
Splash
Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
Stars: ✭ 105 (-7.08%)
Flint
Webex Bot SDK for Node.js (deprecated in favor of https://github.com/webex/webex-bot-node-framework)
Stars: ✭ 85 (-24.78%)
Elephas
Distributed Deep Learning with Keras & Spark
Stars: ✭ 1,521 (+1246.02%)
Spark Ffm
FFM (Field-Aware Factorization Machine) on Spark
Stars: ✭ 101 (-10.62%)
Hadoop cookbook
Cookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (-27.43%)
Mleap
MLeap: Deploy ML Pipelines to Production
Stars: ✭ 1,232 (+990.27%)
Lehar
Visualize data using relative ordering
Stars: ✭ 81 (-28.32%)
Spark Gbtlr
Hybrid model of Gradient Boosting Trees and Logistic Regression (GBDT+LR) on Spark
Stars: ✭ 81 (-28.32%)
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-30.09%)
Python Bigdata
Data science and Big Data with Python
Stars: ✭ 112 (-0.88%)
Lambda Arch
Applying Lambda Architecture with Spark, Kafka, and Cassandra.
Stars: ✭ 111 (-1.77%)
Pyspark Cheatsheet
🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-4.42%)
Almond
A Scala kernel for Jupyter
Stars: ✭ 1,354 (+1098.23%)
Docker Spark
🚢 Docker image for Apache Spark
Stars: ✭ 78 (-30.97%)
Home
ApacheCN open source organization: announcements, introduction, members, events, and contact channels
Stars: ✭ 1,199 (+961.06%)
Logisland
Scalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready-to-use processors, data sources and sinks is available.
Stars: ✭ 97 (-14.16%)
Cleanframes
Type-class based data cleansing library for Apache Spark SQL
Stars: ✭ 75 (-33.63%)
Hnswlib
Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Stars: ✭ 108 (-4.42%)
Schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
Stars: ✭ 97 (-14.16%)
Ds Cheatsheets
List of Data Science Cheatsheets to rule the world
Stars: ✭ 9,452 (+8264.6%)
Dataspherestudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+957.52%)