ZparkioBoiler plate framework to use Spark and ZIO together.
Stars: ✭ 121 (-20.39%)
Cube.js📊 Cube — Open-Source Analytics API for Building Data Apps
Stars: ✭ 11,983 (+7783.55%)
OpaqueAn encrypted data analytics platform
Stars: ✭ 129 (-15.13%)
Scala SamplesThere are pieces of scala code that explain Scala syntax and related things - like what you can do with all this
Stars: ✭ 125 (-17.76%)
ElephasDistributed Deep learning with Keras & Spark
Stars: ✭ 1,521 (+900.66%)
HorovodDistributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Stars: ✭ 11,943 (+7757.24%)
TeddySpark Streaming监控平台,支持任务部署与告警、自启动
Stars: ✭ 120 (-21.05%)
Azure Event Hubs SparkEnabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (-7.89%)
Serverless Url ShortenerAzure Function for a URL shortening website. Uses serverless functions, Azure Table Storage and Application Insights.
Stars: ✭ 113 (-25.66%)
Airflow PipelineAn Airflow docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (-15.79%)
HadoopcryptoledgerHadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (-17.11%)
World Cleanup Day☀️ World Cleanup Day: App (React Native) & Platform (Node). Join us in building software for a cleaner planet! PRs welcome!
Stars: ✭ 110 (-27.63%)
Spark AlchemyCollection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-19.74%)
RasterframesGeospatial Raster support for Spark DataFrames
Stars: ✭ 142 (-6.58%)
ElassandraElassandra = Elasticsearch + Apache Cassandra
Stars: ✭ 1,610 (+959.21%)
Spring Shiro SparkSpring-Shiro-Spark是Spring-Boot Hibernate Spark Spark-SQL Shiro iView VueJs... ...的集成尝试
Stars: ✭ 114 (-25%)
Spark.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+1032.24%)
Python BigdataData science and Big Data with Python
Stars: ✭ 112 (-26.32%)
WaterdropProduction Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+1121.05%)
Spring Boot Quick🌿 基于springboot的快速学习示例,整合自己遇到的开源框架,如:rabbitmq(延迟队列)、Kafka、jpa、redies、oauth2、swagger、jsp、docker、spring-batch、异常处理、日志输出、多模块开发、多环境打包、缓存cache、爬虫、jwt、GraphQL、dubbo、zookeeper和Async等等📌
Stars: ✭ 1,819 (+1096.71%)
Cape PythonCollaborate on privacy-preserving policy for data science projects in Pandas and Apache Spark
Stars: ✭ 125 (-17.76%)
BigdataclassTwo-day workshop that covers how to use R to interact databases and Spark
Stars: ✭ 110 (-27.63%)
QuicksqlA Flexible, Fast, Federated(3F) SQL Analysis Middleware for Multiple Data Sources
Stars: ✭ 1,821 (+1098.03%)
Spark Bigquery ConnectorBigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
Stars: ✭ 126 (-17.11%)
Spark AuthorizerA Spark SQL extension which provides SQL Standard Authorization for Apache Spark
Stars: ✭ 141 (-7.24%)
Spark Infotheoretic Feature SelectionThis package contains a generic implementation of greedy Information Theoretic Feature Selection (FS) methods. The implementation is based on the common theoretic framework presented by Gavin Brown. Implementations of mRMR, InfoGain, JMI and other commonly used FS filters are provided.
Stars: ✭ 123 (-19.08%)
Apache Spark NodeNode.js bindings for Apache Spark DataFrame APIs
Stars: ✭ 136 (-10.53%)
DeequDeequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Stars: ✭ 2,020 (+1228.95%)
DatacompyPandas and Spark DataFrame comparison for humans
Stars: ✭ 147 (-3.29%)
BlobxferAzure Storage transfer tool and data movement library
Stars: ✭ 120 (-21.05%)
Data science blogsA repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (-8.55%)
Kinesis SqlKinesis Connector for Structured Streaming
Stars: ✭ 120 (-21.05%)
AbrisAvro SerDe for Apache Spark structured APIs.
Stars: ✭ 130 (-14.47%)
IbisA pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+972.37%)
Cc PysparkProcess Common Crawl data with Python and Spark
Stars: ✭ 147 (-3.29%)
Spark LucenerddSpark RDD with Lucene's query and entity linkage capabilities
Stars: ✭ 114 (-25%)
Spylon KernelJupyter kernel for scala and spark
Stars: ✭ 129 (-15.13%)
Laravel Azure StorageMicrosoft Azure Blob Storage integration for Laravel's Storage API
Stars: ✭ 139 (-8.55%)
Xlearning Xdmlextremely distributed machine learning
Stars: ✭ 113 (-25.66%)
GafferA large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+980.26%)
ArchivesparkAn Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.
Stars: ✭ 111 (-26.97%)
Technology Talk汇总java生态圈常用技术框架、开源中间件,系统架构、数据库、大公司架构案例、常用三方类库、项目管理、线上问题排查、个人成长、思考等知识
Stars: ✭ 12,136 (+7884.21%)
Lambda ArchApplying Lambda Architecture with Spark, Kafka, and Cassandra.
Stars: ✭ 111 (-26.97%)
FeastFeature Store for Machine Learning
Stars: ✭ 2,576 (+1594.74%)
Sparkling GraphSparklingGraph provides easy to use set of features that will give you ability to proces large scala graphs using Spark and GraphX.
Stars: ✭ 139 (-8.55%)
OpenubaA robust, and flexible open source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]
Stars: ✭ 127 (-16.45%)
Spark With PythonFundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-1.32%)
Nd4jFast, Scientific and Numerical Computing for the JVM (NDArrays)
Stars: ✭ 1,742 (+1046.05%)
Isolation ForestA Spark/Scala implementation of the isolation forest unsupervised outlier detection algorithm.
Stars: ✭ 139 (-8.55%)
LiftThe LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the measurement of fairness in large scale machine learning workflows.
Stars: ✭ 127 (-16.45%)