Ruby Spark: Ruby wrapper for Apache Spark
Stars: ✭ 221 (-11.24%)
Spark With Python: Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-39.76%)
Hadoop Docker: A Docker-based Hadoop development and test environment, including Hadoop, Hive, HBase, and Spark
Stars: ✭ 238 (-4.42%)
Xsql: Unified SQL Analytics Engine Based on SparkSQL
Stars: ✭ 176 (-29.32%)
Nd4j: Fast, Scientific and Numerical Computing for the JVM (NDArrays)
Stars: ✭ 1,742 (+599.6%)
Spark Excel: A Spark plugin for reading Excel files via Apache POI
Stars: ✭ 216 (-13.25%)
Rasterframes: Geospatial Raster support for Spark DataFrames
Stars: ✭ 142 (-42.97%)
Kraps Rpc: An RPC framework leveraging the Spark RPC module
Stars: ✭ 175 (-29.72%)
Azure Event Hubs Spark: Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (-43.78%)
Dpark: Python clone of Spark, a MapReduce-like framework in Python
Stars: ✭ 2,668 (+971.49%)
Spark Nlp: State of the Art Natural Language Processing
Stars: ✭ 2,518 (+911.24%)
Isolation Forest: A Spark/Scala implementation of the isolation forest unsupervised outlier detection algorithm.
Stars: ✭ 139 (-44.18%)
Sparkrdma: RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (-13.65%)
Deeplearning4j: Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for Keras, TensorFlow, and ONNX/PyTorch, a modular and tiny C++ library for running math code, and a Java-based math library on top of the core C++ library. Also includes SameDiff: a PyTorch/TensorFlow-like library for running deep learni…
Stars: ✭ 12,277 (+4830.52%)
Horovod: Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Stars: ✭ 11,943 (+4696.39%)
Tf: ✔️ tf is a microframework for parameterized testing of functions and HTTP in Go.
Stars: ✭ 133 (-46.59%)
Abris: Avro SerDe for Apache Spark structured APIs.
Stars: ✭ 130 (-47.79%)
Vim Vspec: Vim plugin: Testing framework for Vim script
Stars: ✭ 207 (-16.87%)
Spylon Kernel: Jupyter kernel for Scala and Spark
Stars: ✭ 129 (-48.19%)
Geopyspark: GeoTrellis for PySpark
Stars: ✭ 167 (-32.93%)
Gaffer: A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+559.44%)
Hyperspace: An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Stars: ✭ 246 (-1.2%)
Feast: Feature Store for Machine Learning
Stars: ✭ 2,576 (+934.54%)
Big Whale: Scheduling of offline jobs (Spark, Flink, and others) and monitoring of real-time jobs
Stars: ✭ 163 (-34.54%)
Openuba: A robust and flexible open source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]
Stars: ✭ 127 (-49%)
Spark Knn: k-Nearest Neighbors algorithm on Spark
Stars: ✭ 205 (-17.67%)
Cape Python: Collaborate on privacy-preserving policy for data science projects in Pandas and Apache Spark
Stars: ✭ 125 (-49.8%)
Spark Bigquery Connector: BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
Stars: ✭ 126 (-49.4%)
Mydatascienceportfolio: Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (-8.84%)
Spark Infotheoretic Feature Selection: This package contains a generic implementation of greedy Information Theoretic Feature Selection (FS) methods. The implementation is based on the common theoretic framework presented by Gavin Brown. Implementations of mRMR, InfoGain, JMI and other commonly used FS filters are provided.
Stars: ✭ 123 (-50.6%)
Vue Info Card: Simple and beautiful card component with an elegant spark line, for VueJS.
Stars: ✭ 159 (-36.14%)
Deequ: Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Stars: ✭ 2,020 (+711.24%)
Mmlspark: Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+1064.26%)
Teddy: Spark Streaming monitoring platform, supporting job deployment, alerting, and auto-start
Stars: ✭ 120 (-51.81%)
Bandit: Human-friendly unit testing for C++11
Stars: ✭ 240 (-3.61%)
Elassandra: Elassandra = Elasticsearch + Apache Cassandra
Stars: ✭ 1,610 (+546.59%)
Geni: A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (-38.96%)
Cube.js: 📊 Cube: Open-Source Analytics API for Building Data Apps
Stars: ✭ 11,983 (+4712.45%)
Ballista: Distributed compute platform implemented in Rust, and powered by Apache Arrow.
Stars: ✭ 2,274 (+813.25%)
Spring Shiro Spark: Spring-Shiro-Spark is an attempt at integrating Spring-Boot, Hibernate, Spark, Spark-SQL, Shiro, iView, VueJs... ...
Stars: ✭ 114 (-54.22%)
Xlearning Xdml: Extremely distributed machine learning
Stars: ✭ 113 (-54.62%)
Spark Workshop: Apache Spark™ and Scala Workshops
Stars: ✭ 224 (-10.04%)
Quill: Compile-time Language Integrated Queries for Scala
Stars: ✭ 1,998 (+702.41%)
Data Accelerator: Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (-0.8%)
Neo4j Spark Connector: Neo4j Connector for Apache Spark, which provides bi-directional read/write access to Neo4j from Spark, using the Spark DataSource APIs
Stars: ✭ 245 (-1.61%)
Recheck Web: recheck for web apps – change comparison tool with local Golden Masters, Git-like ignore syntax and "Unbreakable Selenium" tests.
Stars: ✭ 224 (-10.04%)
Azuredatabricksbestpractices: Version 1 of Technical Best Practices of Azure Databricks based on real world Customer and Technical SME inputs
Stars: ✭ 186 (-25.3%)
Spark Tsne: Distributed t-SNE via Apache Spark
Stars: ✭ 151 (-39.36%)