Elassandra: Elassandra = Elasticsearch + Apache Cassandra
Stars: ✭ 1,610 (+887.73%)
Cleanframes: Type-class-based data cleansing library for Apache Spark SQL
Stars: ✭ 75 (-53.99%)
Flint: A Time Series Library for Apache Spark
Stars: ✭ 878 (+438.65%)
Sparkling Graph: SparklingGraph provides an easy-to-use set of features for processing large-scale graphs using Spark and GraphX.
Stars: ✭ 139 (-14.72%)
Yanagishima: Web UI for Trino, Presto, Hive, Elasticsearch, SparkSQL
Stars: ✭ 424 (+160.12%)
Hdfs Shell: HDFS Shell is an HDFS manipulation tool for working with the functions integrated in Hadoop DFS
Stars: ✭ 117 (-28.22%)
Tedsds: Apache Spark - Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Stars: ✭ 14 (-91.41%)
Learningspark: Scala examples for learning to use Spark
Stars: ✭ 421 (+158.28%)
Griffon Vm: Griffon Data Science Virtual Machine
Stars: ✭ 128 (-21.47%)
Antsdb: AntsDB is a low-latency, high-concurrency, MySQL-compliant SQL layer for HBase
Stars: ✭ 99 (-39.26%)
Live log analyzer spark: Spark application that analyzes Apache access logs and detects anomalies, with an accompanying Medium article.
Stars: ✭ 14 (-91.41%)
Sparkle: Haskell on Apache Spark.
Stars: ✭ 419 (+157.06%)
Vue Info Card: Simple and beautiful card component with an elegant sparkline, for VueJS.
Stars: ✭ 159 (-2.45%)
Agile data code 2: Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+153.37%)
Almond: A Scala kernel for Jupyter
Stars: ✭ 1,354 (+730.67%)
Urhox: Urho3D extension library
Stars: ✭ 13 (-92.02%)
Xlearning: AI on Hadoop
Stars: ✭ 1,709 (+948.47%)
Sparkling Titanic: Training models with Apache Spark and PySpark for the Titanic Kaggle competition
Stars: ✭ 12 (-92.64%)
Luigi Warehouse: A Luigi-powered analytics / warehouse stack
Stars: ✭ 72 (-55.83%)
Ignite: Apache Ignite
Stars: ✭ 4,027 (+2370.55%)
Datax: DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto (Trino), PostgreSQL, SQL Server
Stars: ✭ 116 (-28.83%)
Quill: Compile-time Language Integrated Queries for Scala
Stars: ✭ 1,998 (+1125.77%)
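Technology Talk: A collection of knowledge on the Java ecosystem: commonly used frameworks, open source middleware, system architecture, databases, architecture case studies from large companies, commonly used third-party libraries, project management, production troubleshooting, personal growth, and reflections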
Stars: ✭ 12,136 (+7345.4%)
Openuba: A robust and flexible open source User & Entity Behavior Analytics (UEBA) framework used for security analytics. Developed with luv by data scientists and security analysts from the cyber security industry. [PRE-ALPHA]
Stars: ✭ 127 (-22.09%)
Logisland: Scalable stream processing platform for advanced real-time analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of ready-to-use processors, data sources, and sinks is available.
Stars: ✭ 97 (-40.49%)
Mlfeature: Feature engineering toolkit for Spark MLlib.
Stars: ✭ 12 (-92.64%)
Atsd: Axibase Time Series Database Documentation
Stars: ✭ 68 (-58.28%)
Spark Lucenerdd: Spark RDD with Lucene's query and entity linkage capabilities
Stars: ✭ 114 (-30.06%)
Sidekick: High Performance HTTP Sidecar Load Balancer
Stars: ✭ 366 (+124.54%)
Kontextfrei: Writing application logic for Spark jobs that can be unit-tested without a SparkContext
Stars: ✭ 67 (-58.9%)
Metorikku: A simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (+121.47%)
Sparkler: Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Stars: ✭ 362 (+122.09%)
Src: A lightweight distributed stream computing framework for Golang
Stars: ✭ 67 (-58.9%)
Sparkstreaming: Real-time log analysis and statistics with Spark Streaming + Flume + Kafka + HBase + Hadoop + Zookeeper; data visualization with SpringBoot + Echarts
Stars: ✭ 349 (+114.11%)
Spring Shiro Spark: Spring-Shiro-Spark is an attempt at integrating Spring-Boot, Hibernate, Spark, Spark-SQL, Shiro, iView, VueJs, and more
Stars: ✭ 114 (-30.06%)
Sparklens: Qubole Sparklens tool for performance tuning Apache Spark
Stars: ✭ 345 (+111.66%)
Rsparkling: RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-60.12%)
Hadoop Common: Mirror of Apache Hadoop common
Stars: ✭ 155 (-4.91%)
Ozone: Scalable, redundant, and distributed object store for Apache Hadoop
Stars: ✭ 330 (+102.45%)
Jumbune: Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. The enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,
Stars: ✭ 64 (-60.74%)
Gather Deployment: Gathers scalable TensorFlow and infrastructure deployment
Stars: ✭ 326 (+100%)
Sparklint: A tool for monitoring and tuning Spark jobs for efficiency.
Stars: ✭ 316 (+93.87%)
Mare: MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.
Stars: ✭ 11 (-93.25%)
Schemer: Schema registry for CSV, TSV, JSON, AVRO, and Parquet schemas. Supports schema inference and a GraphQL API.
Stars: ✭ 97 (-40.49%)
Sparkjni: A heterogeneous Apache Spark framework.
Stars: ✭ 11 (-93.25%)
Lift: The LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the measurement of fairness in large-scale machine learning workflows.
Stars: ✭ 127 (-22.09%)
Hadoop Pot: A scalable Apache Hadoop-based implementation of the Pooled Time Series video similarity algorithm, based on the CVPR 2015 paper by M. Ryoo et al.
Stars: ✭ 8 (-95.09%)
Spark Py Notebooks: Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+720.86%)
Tiledb Vcf: Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-84.05%)
Stormtweetssentimentd3viz: Computes and visualizes sentiment analysis of tweets from US states in real time using Storm.
Stars: ✭ 25 (-84.66%)