1458 open source projects that are alternatives to or similar to Gaffer

Trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Stars: ✭ 4,581 (+178.99%)

Mutual labels: big-data, hadoop

Succinct

Enabling queries on compressed data.

Stars: ✭ 257 (-84.35%)

Mutual labels: spark, big-data

Cloudbreak

A tool for provisioning and managing Apache Hadoop clusters in the cloud. Cloudbreak, as part of the Hortonworks Data Platform, makes it easy to provision, configure and elastically grow HDP clusters on cloud infrastructure. Cloudbreak can be used to provision Hadoop across cloud infrastructure providers including AWS, Azure, GCP and OpenStack.

Stars: ✭ 301 (-81.67%)

Mutual labels: big-data, hadoop

Elasticluster

Create clusters of VMs on the cloud and configure them with Ansible.

Stars: ✭ 298 (-81.85%)

Mutual labels: spark, hadoop

Xlearning Xdml

extremely distributed machine learning

Stars: ✭ 113 (-93.12%)

Mutual labels: spark, hadoop

basin

Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser.

Stars: ✭ 25 (-98.48%)

Mutual labels: spark, hadoop

Ytk Learn

Ytk-learn is a distributed machine learning library that implements most popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).

Stars: ✭ 337 (-79.48%)

Mutual labels: spark, hadoop

Ozone

Scalable, redundant, and distributed object store for Apache Hadoop

Stars: ✭ 330 (-79.9%)

Mutual labels: big-data, hadoop

Oap

Optimized Analytics Package for Spark* Platform

Stars: ✭ 343 (-79.11%)

Mutual labels: spark, parquet

Tez

Apache Tez

Stars: ✭ 313 (-80.94%)

Mutual labels: big-data, hadoop

Haproxy Configs

80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.

Stars: ✭ 106 (-93.54%)

Mutual labels: hadoop, hbase

Ignite

Apache Ignite

Stars: ✭ 4,027 (+145.25%)

Mutual labels: big-data, hadoop

hadoop-docker-lite

Docker build project to set up a lightweight Hadoop cluster containing Hadoop, Pig, ZooKeeper, HBase, Phoenix, Storm, Kafka, and Kafka Manager

Stars: ✭ 24 (-98.54%)

Mutual labels: hadoop, hbase

Orc

Apache ORC - the smallest, fastest columnar storage for Hadoop workloads

Stars: ✭ 389 (-76.31%)

Mutual labels: big-data, hadoop

Gremlin Scala

Scala wrapper for Apache TinkerPop 3 Graph DSL

Stars: ✭ 462 (-71.86%)

Mutual labels: graph, graph-database

Eliasdb

EliasDB is a graph-based database.

Stars: ✭ 611 (-62.79%)

Mutual labels: graph, graph-database

Awesome Graph

A curated list of resources for graph databases and graph computing tools

Stars: ✭ 717 (-56.33%)

Mutual labels: graph, graph-database

Feast

Feature Store for Machine Learning

Stars: ✭ 2,576 (+56.88%)

Mutual labels: spark, big-data

Waterdrop

Production Ready Data Integration Product, documentation：

Stars: ✭ 1,856 (+13.03%)

Mutual labels: spark, hadoop

Big data architect skills

Skills a big data architect should master

Stars: ✭ 400 (-75.64%)

Mutual labels: spark, hadoop

Hdfs Shell

HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS

Stars: ✭ 117 (-92.87%)

Mutual labels: big-data, hadoop

Bigdata

💎🔥 Big data study notes

Stars: ✭ 488 (-70.28%)

Mutual labels: hadoop, hbase

Kylo

Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.

Stars: ✭ 916 (-44.21%)

Mutual labels: spark, hadoop

Zeppelin

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Stars: ✭ 5,513 (+235.75%)

Mutual labels: spark, big-data

Pdf

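Programming e-books, e-books, and programming books, including C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithm series, computer science, design patterns, software testing, refactoring and optimization, and more categories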

Stars: ✭ 12,009 (+631.36%)

Mutual labels: spark, hadoop

Useractionanalyzeplatform

Big data platform for e-commerce user behavior analysis

Stars: ✭ 645 (-60.72%)

Mutual labels: spark, hadoop

Flink Learning

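Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink introduction, concepts, principles, hands-on practice, performance tuning, source code analysis, and more. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus case studies of large projects that put Flink into production (PVUV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). You are welcome to support my column "Big Data Real-Time Computing Engine Flink: Practice and Performance Optimization".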

Stars: ✭ 11,378 (+592.94%)

Mutual labels: spark, hbase

Sparkjni

A heterogeneous Apache Spark framework.

Stars: ✭ 11 (-99.33%)

Mutual labels: spark, big-data

Spark Movie Lens

An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset

Stars: ✭ 745 (-54.63%)

Mutual labels: spark, big-data

bigtable

TypeScript Bigtable Client with 🔋🔋 included.

Stars: ✭ 13 (-99.21%)

Mutual labels: big-data, hbase

Parquet Generator

Parquet file generator

Stars: ✭ 16 (-99.03%)

Mutual labels: spark, parquet

Data Algorithms Book

MapReduce, Spark, Java, and Scala for Data Algorithms Book

Stars: ✭ 949 (-42.2%)

Mutual labels: spark, hadoop

Hadoop For Geoevent

ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.

Stars: ✭ 5 (-99.7%)

Mutual labels: big-data, hadoop

Interview Questions Collection

Interview questions organized by knowledge area, including C++, Java, Hadoop, machine learning, and more

Stars: ✭ 21 (-98.72%)

Mutual labels: spark, hadoop

Pucket

Bucketing and partitioning system for Parquet

Stars: ✭ 29 (-98.23%)

Mutual labels: spark, parquet

Heracles

High performance HBase / Spark SQL engine

Stars: ✭ 27 (-98.36%)

Mutual labels: spark, hbase

Spark

Apache Spark - A unified analytics engine for large-scale data processing

Stars: ✭ 31,618 (+1825.58%)

Mutual labels: spark, big-data

Indradb

A graph database written in Rust

Stars: ✭ 1,035 (-36.97%)

Mutual labels: graph, graph-database

Docker Hadoop

A Docker container with a full Hadoop cluster setup with Spark and Zeppelin

Stars: ✭ 54 (-96.71%)

Mutual labels: spark, hadoop

Moosefs

MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)

Stars: ✭ 1,025 (-37.58%)

Mutual labels: big-data, hadoop

Waimak

Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.

Stars: ✭ 60 (-96.35%)

Mutual labels: spark, hadoop

Rumble

⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more

Stars: ✭ 58 (-96.47%)

Mutual labels: spark, parquet

Movies Javascript Bolt

Neo4j Movies Example with webpack-in-browser app using the neo4j-javascript-driver

Stars: ✭ 123 (-92.51%)

Mutual labels: graph, graph-database

Movies Java Bolt

Neo4j Movies Example application with SparkJava backend using the neo4j-java-driver

Stars: ✭ 66 (-95.98%)

Mutual labels: graph, graph-database

Rsparkling

RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)

Stars: ✭ 65 (-96.04%)

Mutual labels: spark, big-data

Atsd

Axibase Time Series Database Documentation

Stars: ✭ 68 (-95.86%)

Mutual labels: hadoop, hbase

Spring Boot Quick

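🌿 Quick-start learning examples based on Spring Boot, integrating open source frameworks I have come across, such as rabbitmq (delay queue), Kafka, jpa, redis, oauth2, swagger, jsp, docker, spring-batch, exception handling, log output, multi-module development, multi-environment packaging, cache, web crawler, jwt, GraphQL, dubbo, zookeeper, Async, and more 📌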

Stars: ✭ 1,819 (+10.78%)

Mutual labels: spark, hbase

Nagios Plugins

450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...

Stars: ✭ 1,000 (-39.1%)

Mutual labels: hadoop, hbase

Spark Doc Zh

Chinese edition of the official Apache Spark documentation

Stars: ✭ 1,126 (-31.43%)

Mutual labels: spark, big-data

Big Data Engineering Coursera Yandex

Big Data for Data Engineers Coursera Specialization from Yandex

Stars: ✭ 71 (-95.68%)

Mutual labels: spark, big-data

Amazon S3 Find And Forget

Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)

Stars: ✭ 115 (-93%)

Mutual labels: big-data, parquet

Docker Spark

🚢 Docker image for Apache Spark

Stars: ✭ 78 (-95.25%)

Mutual labels: spark, hadoop

Spark Website

Apache Spark Website

Stars: ✭ 75 (-95.43%)

Mutual labels: spark, big-data

Asakusafw

Asakusa Framework

Stars: ✭ 114 (-93.06%)

Mutual labels: big-data, hadoop

Dataspherestudio

DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.

Stars: ✭ 1,195 (-27.22%)

Mutual labels: spark, hadoop

Setl

A simple Spark-powered ETL framework that just works 🍺

Stars: ✭ 79 (-95.19%)

Mutual labels: spark, big-data

Neo4j

Graphs for Everyone

Stars: ✭ 9,582 (+483.56%)

Mutual labels: graph, graph-database

Cog

A Persistent Embedded Graph Database for Python