1458 open source projects that are alternatives to or similar to Gaffer

Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (+178.99%)
Mutual labels:  big-data, hadoop
Succinct
Enabling queries on compressed data.
Stars: ✭ 257 (-84.35%)
Mutual labels:  spark, big-data
Cloudbreak
A tool for provisioning and managing Apache Hadoop clusters in the cloud. Cloudbreak, as part of the Hortonworks Data Platform, makes it easy to provision, configure and elastically grow HDP clusters on cloud infrastructure. Cloudbreak can be used to provision Hadoop across cloud infrastructure providers including AWS, Azure, GCP and OpenStack.
Stars: ✭ 301 (-81.67%)
Mutual labels:  big-data, hadoop
Elasticluster
Create clusters of VMs on the cloud and configure them with Ansible.
Stars: ✭ 298 (-81.85%)
Mutual labels:  spark, hadoop
Xlearning Xdml
extremely distributed machine learning
Stars: ✭ 113 (-93.12%)
Mutual labels:  spark, hadoop
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-98.48%)
Mutual labels:  spark, hadoop
Ytk Learn
Ytk-learn is a distributed machine learning library that implements most popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).
Stars: ✭ 337 (-79.48%)
Mutual labels:  spark, hadoop
Ozone
Scalable, redundant, and distributed object store for Apache Hadoop
Stars: ✭ 330 (-79.9%)
Mutual labels:  big-data, hadoop
Oap
Optimized Analytics Package for Spark* Platform
Stars: ✭ 343 (-79.11%)
Mutual labels:  spark, parquet
Tez
Apache Tez
Stars: ✭ 313 (-80.94%)
Mutual labels:  big-data, hadoop
Haproxy Configs
80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.
Stars: ✭ 106 (-93.54%)
Mutual labels:  hadoop, hbase
Ignite
Apache Ignite
Stars: ✭ 4,027 (+145.25%)
Mutual labels:  big-data, hadoop
hadoop-docker-lite
Docker build project to setup a lightweight hadoop cluster containing hadoop, pig, zookeeper, hbase, phoenix, storm, kafka, kafka manager
Stars: ✭ 24 (-98.54%)
Mutual labels:  hadoop, hbase
Orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Stars: ✭ 389 (-76.31%)
Mutual labels:  big-data, hadoop
Gremlin Scala
Scala wrapper for Apache TinkerPop 3 Graph DSL
Stars: ✭ 462 (-71.86%)
Mutual labels:  graph, graph-database
Eliasdb
EliasDB is a graph-based database.
Stars: ✭ 611 (-62.79%)
Mutual labels:  graph, graph-database
Awesome Graph
A curated list of resources for graph databases and graph computing tools
Stars: ✭ 717 (-56.33%)
Mutual labels:  graph, graph-database
Feast
Feature Store for Machine Learning
Stars: ✭ 2,576 (+56.88%)
Mutual labels:  spark, big-data
Waterdrop
Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+13.03%)
Mutual labels:  spark, hadoop
Big data architect skills
Skills a big data architect should master
Stars: ✭ 400 (-75.64%)
Mutual labels:  spark, hadoop
Hdfs Shell
HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS
Stars: ✭ 117 (-92.87%)
Mutual labels:  big-data, hadoop
Bigdata
💎🔥 Big data study notes
Stars: ✭ 488 (-70.28%)
Mutual labels:  hadoop, hbase
Kylo
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Stars: ✭ 916 (-44.21%)
Mutual labels:  spark, hadoop
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+235.75%)
Mutual labels:  spark, big-data
Pdf
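Programming e-books, e-books, and programming books, covering C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithms, computer science, design patterns, software testing, refactoring and optimization, and more categories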
Stars: ✭ 12,009 (+631.36%)
Mutual labels:  spark, hadoop
Useractionanalyzeplatform
Big data platform for e-commerce user behavior analysis
Stars: ✭ 645 (-60.72%)
Mutual labels:  spark, hadoop
Flink Learning
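Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus case studies of large production Flink projects (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for my column "Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization" is welcome.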
Stars: ✭ 11,378 (+592.94%)
Mutual labels:  spark, hbase
Sparkjni
A heterogeneous Apache Spark framework.
Stars: ✭ 11 (-99.33%)
Mutual labels:  spark, big-data
Spark Movie Lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (-54.63%)
Mutual labels:  spark, big-data
bigtable
TypeScript Bigtable Client with 🔋🔋 included.
Stars: ✭ 13 (-99.21%)
Mutual labels:  big-data, hbase
Parquet Generator
Parquet file generator
Stars: ✭ 16 (-99.03%)
Mutual labels:  spark, parquet
Data Algorithms Book
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (-42.2%)
Mutual labels:  spark, hadoop
Hadoop For Geoevent
ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-99.7%)
Mutual labels:  big-data, hadoop
Interview Questions Collection
Interview questions organized by knowledge area, including C++, Java, Hadoop, machine learning, and more
Stars: ✭ 21 (-98.72%)
Mutual labels:  spark, hadoop
Pucket
Bucketing and partitioning system for Parquet
Stars: ✭ 29 (-98.23%)
Mutual labels:  spark, parquet
Heracles
High performance HBase / Spark SQL engine
Stars: ✭ 27 (-98.36%)
Mutual labels:  spark, hbase
Spark
Apache Spark - A unified analytics engine for large-scale data processing
Stars: ✭ 31,618 (+1825.58%)
Mutual labels:  spark, big-data
Indradb
A graph database written in Rust
Stars: ✭ 1,035 (-36.97%)
Mutual labels:  graph, graph-database
Docker Hadoop
A Docker container with a full Hadoop cluster setup with Spark and Zeppelin
Stars: ✭ 54 (-96.71%)
Mutual labels:  spark, hadoop
Moosefs
MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (-37.58%)
Mutual labels:  big-data, hadoop
Waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-96.35%)
Mutual labels:  spark, hadoop
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-96.47%)
Mutual labels:  spark, parquet
Movies Javascript Bolt
Neo4j Movies Example with webpack-in-browser app using the neo4j-javascript-driver
Stars: ✭ 123 (-92.51%)
Mutual labels:  graph, graph-database
Movies Java Bolt
Neo4j Movies Example application with SparkJava backend using the neo4j-java-driver
Stars: ✭ 66 (-95.98%)
Mutual labels:  graph, graph-database
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-96.04%)
Mutual labels:  spark, big-data
Atsd
Axibase Time Series Database Documentation
Stars: ✭ 68 (-95.86%)
Mutual labels:  hadoop, hbase
Spring Boot Quick
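🌿 Quick-start learning examples based on Spring Boot, integrating open source frameworks the author has encountered, such as rabbitmq (delayed queues), Kafka, jpa, redis, oauth2, swagger, jsp, docker, spring-batch, exception handling, log output, multi-module development, multi-environment packaging, caching, web crawlers, jwt, GraphQL, dubbo, zookeeper, Async, and more 📌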
Stars: ✭ 1,819 (+10.78%)
Mutual labels:  spark, hbase
Nagios Plugins
450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...
Stars: ✭ 1,000 (-39.1%)
Mutual labels:  hadoop, hbase
Spark Doc Zh
Chinese edition of the official Apache Spark documentation
Stars: ✭ 1,126 (-31.43%)
Mutual labels:  spark, big-data
Big Data Engineering Coursera Yandex
Big Data for Data Engineers Coursera Specialization from Yandex
Stars: ✭ 71 (-95.68%)
Mutual labels:  spark, big-data
Amazon S3 Find And Forget
Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)
Stars: ✭ 115 (-93%)
Mutual labels:  big-data, parquet
Docker Spark
🚢 Docker image for Apache Spark
Stars: ✭ 78 (-95.25%)
Mutual labels:  spark, hadoop
Spark Website
Apache Spark Website
Stars: ✭ 75 (-95.43%)
Mutual labels:  spark, big-data
Asakusafw
Asakusa Framework
Stars: ✭ 114 (-93.06%)
Mutual labels:  big-data, hadoop
Dataspherestudio
DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (-27.22%)
Mutual labels:  spark, hadoop
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-95.19%)
Mutual labels:  spark, big-data
Neo4j
Graphs for Everyone
Stars: ✭ 9,582 (+483.56%)
Mutual labels:  graph, graph-database
Cog
A Persistent Embedded Graph Database for Python
Stars: ✭ 90 (-94.52%)
Mutual labels:  graph, graph-database
Redisgraph
A graph database as a Redis module
Stars: ✭ 1,292 (-21.32%)
Mutual labels:  graph, graph-database
Tinkerpop
Apache TinkerPop - a graph computing framework
Stars: ✭ 1,309 (-20.28%)
Mutual labels:  graph, graph-database