651 Open source projects that are alternatives to or similar to Big Whale

Elassandra
Elassandra = Elasticsearch + Apache Cassandra
Stars: ✭ 1,610 (+887.73%)
Mutual labels:  spark
Cleanframes
type-class based data cleansing library for Apache Spark SQL
Stars: ✭ 75 (-53.99%)
Mutual labels:  spark
Flint
A Time Series Library for Apache Spark
Stars: ✭ 878 (+438.65%)
Mutual labels:  spark
Sparkling Graph
SparklingGraph provides an easy-to-use set of features that give you the ability to process large-scale graphs using Spark and GraphX.
Stars: ✭ 139 (-14.72%)
Mutual labels:  spark
Yanagishima
Web UI for Trino, Presto, Hive, Elasticsearch, SparkSQL
Stars: ✭ 424 (+160.12%)
Mutual labels:  spark
Hdfs Shell
HDFS Shell is an HDFS manipulation tool that works with the functions integrated in Hadoop DFS
Stars: ✭ 117 (-28.22%)
Mutual labels:  hadoop
Tedsds
Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Stars: ✭ 14 (-91.41%)
Mutual labels:  spark
Learningspark
Scala examples for learning to use Spark
Stars: ✭ 421 (+158.28%)
Mutual labels:  spark
Griffon Vm
Griffon Data Science Virtual Machine
Stars: ✭ 128 (-21.47%)
Mutual labels:  hadoop
Antsdb
AntsDB is a low latency, high concurrency, MySQL compliant SQL layer for HBase
Stars: ✭ 99 (-39.26%)
Mutual labels:  hadoop
Live log analyzer spark
Spark application for analyzing Apache access logs and detecting anomalies, along with a Medium article.
Stars: ✭ 14 (-91.41%)
Mutual labels:  spark
Sparkle
Haskell on Apache Spark.
Stars: ✭ 419 (+157.06%)
Mutual labels:  spark
Vue Info Card
Simple and beautiful card component with an elegant spark line, for VueJS.
Stars: ✭ 159 (-2.45%)
Mutual labels:  spark
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+153.37%)
Mutual labels:  spark
Spark Twitter Stream Example
"Sentiment analysis" on a live Twitter feed with Apache Spark and Apache Bahir
Stars: ✭ 73 (-55.21%)
Mutual labels:  spark
Cdc Kafka Hadoop
MySQL to NoSQL real time dataflow
Stars: ✭ 13 (-92.02%)
Mutual labels:  hadoop
Almond
A Scala kernel for Jupyter
Stars: ✭ 1,354 (+730.67%)
Mutual labels:  spark
Urhox
Urho3D extension library
Stars: ✭ 13 (-92.02%)
Mutual labels:  spark
Xlearning
AI on Hadoop
Stars: ✭ 1,709 (+948.47%)
Mutual labels:  hadoop
Sparkling Titanic
Training models with Apache Spark, PySpark for Titanic Kaggle competition
Stars: ✭ 12 (-92.64%)
Mutual labels:  spark
Luigi Warehouse
A luigi powered analytics / warehouse stack
Stars: ✭ 72 (-55.83%)
Mutual labels:  spark
Ignite
Apache Ignite
Stars: ✭ 4,027 (+2370.55%)
Mutual labels:  hadoop
Datax
DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto (Trino), PostgreSQL, SQL Server
Stars: ✭ 116 (-28.83%)
Mutual labels:  hadoop
Big Data Engineering Coursera Yandex
Big Data for Data Engineers Coursera Specialization from Yandex
Stars: ✭ 71 (-56.44%)
Mutual labels:  spark
Quill
Compile-time Language Integrated Queries for Scala
Stars: ✭ 1,998 (+1125.77%)
Mutual labels:  spark
Technology Talk
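A collection of knowledge on the Java ecosystem: commonly used technical frameworks, open source middleware, system architecture, databases, architecture case studies from large companies, commonly used third-party libraries, project management, production troubleshooting, personal growth, and reflections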
Stars: ✭ 12,136 (+7345.4%)
Mutual labels:  spark
Openuba
A robust and flexible open source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]
Stars: ✭ 127 (-22.09%)
Mutual labels:  spark
Logisland
Scalable stream processing platform for advanced real-time analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of ready-to-use processors, data sources and sinks is available.
Stars: ✭ 97 (-40.49%)
Mutual labels:  spark
Mlfeature
Feature engineering toolkit for Spark MLlib.
Stars: ✭ 12 (-92.64%)
Mutual labels:  spark
Spark Ml Source Analysis
An analysis of the principles behind Spark ML algorithms and of their concrete source code implementations
Stars: ✭ 1,873 (+1049.08%)
Mutual labels:  spark
Atsd
Axibase Time Series Database Documentation
Stars: ✭ 68 (-58.28%)
Mutual labels:  hadoop
Spark Structured Streaming Book
The Internals of Spark Structured Streaming
Stars: ✭ 371 (+127.61%)
Mutual labels:  spark
Spark Lucenerdd
Spark RDD with Lucene's query and entity linkage capabilities
Stars: ✭ 114 (-30.06%)
Mutual labels:  spark
Sidekick
High Performance HTTP Sidecar Load Balancer
Stars: ✭ 366 (+124.54%)
Mutual labels:  spark
Kontextfrei
Writing application logic for Spark jobs that can be unit-tested without a SparkContext
Stars: ✭ 67 (-58.9%)
Mutual labels:  spark
Metorikku
A simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (+121.47%)
Mutual labels:  spark
Spark On Lambda
Apache Spark on AWS Lambda
Stars: ✭ 137 (-15.95%)
Mutual labels:  spark
Sparkler
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Stars: ✭ 362 (+122.09%)
Mutual labels:  spark
Src
A light-weight distributed stream computing framework for Golang
Stars: ✭ 67 (-58.9%)
Mutual labels:  hadoop
Sparkstreaming
Real-time log analysis and statistics with Spark Streaming + Flume + Kafka + HBase + Hadoop + Zookeeper; data visualization with SpringBoot + Echarts
Stars: ✭ 349 (+114.11%)
Mutual labels:  spark
Spring Shiro Spark
Spring-Shiro-Spark is an attempt at integrating Spring-Boot, Hibernate, Spark, Spark-SQL, Shiro, iView, VueJs... ...
Stars: ✭ 114 (-30.06%)
Mutual labels:  spark
Sparklens
Qubole Sparklens tool for performance tuning Apache Spark
Stars: ✭ 345 (+111.66%)
Mutual labels:  spark
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-60.12%)
Mutual labels:  spark
Hadoop Common
Mirror of Apache Hadoop common
Stars: ✭ 155 (-4.91%)
Mutual labels:  hadoop
Ozone
Scalable, redundant, and distributed object store for Apache Hadoop
Stars: ✭ 330 (+102.45%)
Mutual labels:  hadoop
Jumbune
Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,
Stars: ✭ 64 (-60.74%)
Mutual labels:  hadoop
Gather Deployment
Gathers scalable tensorflow and infrastructure deployment
Stars: ✭ 326 (+100%)
Mutual labels:  hadoop
Spark Mllib Twitter Sentiment Analysis
🌟 ✨ Analyze and visualize Twitter Sentiment on a world map using Spark MLlib
Stars: ✭ 113 (-30.67%)
Mutual labels:  spark
Sparklint
A tool for monitoring and tuning Spark jobs for efficiency.
Stars: ✭ 316 (+93.87%)
Mutual labels:  spark
Pyspark Twitter Stream Mining
Real-time Machine Learning with Apache Spark on Twitter Public Stream
Stars: ✭ 64 (-60.74%)
Mutual labels:  spark
Hbaseclient
HBase client data management software
Stars: ✭ 135 (-17.18%)
Mutual labels:  hadoop
Mare
MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.
Stars: ✭ 11 (-93.25%)
Mutual labels:  spark
Schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
Stars: ✭ 97 (-40.49%)
Mutual labels:  spark
Sparkjni
A heterogeneous Apache Spark framework.
Stars: ✭ 11 (-93.25%)
Mutual labels:  spark
Lift
The LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the measurement of fairness in large scale machine learning workflows.
Stars: ✭ 127 (-22.09%)
Mutual labels:  spark
Relation extraction
Relation Extraction using Deep Learning (CNN)
Stars: ✭ 96 (-41.1%)
Mutual labels:  spark
Hadoop Pot
A scalable Apache Hadoop-based implementation of the Pooled Time Series video similarity algorithm, based on the CVPR 2015 paper by M. Ryoo et al.
Stars: ✭ 8 (-95.09%)
Mutual labels:  hadoop
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+720.86%)
Mutual labels:  spark
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-84.05%)
Mutual labels:  spark
Stormtweetssentimentd3viz
Computes and visualizes the sentiment analysis of tweets of US States in real-time using Storm.
Stars: ✭ 25 (-84.66%)
Mutual labels:  hadoop