2100 open source projects that are alternatives to or similar to Bigdata Notes

A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.

Stars: ✭ 412 (-96.25%)

Mutual labels: spark, yarn

Moonbox

Moonbox is a DVtaaS (Data Virtualization as a Service) Platform

Stars: ✭ 424 (-96.14%)

Mutual labels: spark, hive

Listenbrainz Server

Server for the ListenBrainz project

Stars: ✭ 420 (-96.18%)

Mutual labels: spark, big-data

Cortx

CORTX Community Object Storage is 100% open source object storage uniquely optimized for mass capacity storage devices.

Stars: ✭ 426 (-96.12%)

Mutual labels: big-data, bigdata

Metorikku

A simplified, lightweight ETL Framework based on Apache Spark

Stars: ✭ 361 (-96.72%)

Mutual labels: spark, big-data

Cookbook

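🎉🎉🎉 Java senior architect technology stack: any skill can be mastered through "deliberate practice", just like cooking. Here is a Java development handbook; all you need to do is practice more. 🏃🏃🏃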

Stars: ✭ 428 (-96.11%)

Mutual labels: zookeeper, kafka

Agile data code 2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

Stars: ✭ 413 (-96.24%)

Mutual labels: kafka, spark

Yanagishima

Web UI for Trino, Presto, Hive, Elasticsearch, SparkSQL

Stars: ✭ 424 (-96.14%)

Mutual labels: spark, hive

Dataengineeringproject

Example end to end data engineering project.

Stars: ✭ 82 (-99.25%)

Mutual labels: kafka, big-data

Tf Yarn

Train TensorFlow models on YARN in just a few lines of code!

Stars: ✭ 76 (-99.31%)

Mutual labels: hadoop, yarn

Docker Hadoop

A Docker container with a full Hadoop cluster setup with Spark and Zeppelin

Stars: ✭ 54 (-99.51%)

Mutual labels: spark, hadoop

Spark Kafka Writer

Write your Spark data to Kafka seamlessly

Stars: ✭ 175 (-98.41%)

Mutual labels: kafka, spark

Moosefs

MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)

Stars: ✭ 1,025 (-90.67%)

Mutual labels: big-data, hadoop

Devops Bash Tools

550+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Kafka, Docker, APIs, Hadoop, SQL, PostgreSQL, MySQL, Hive, Impala, Travis CI, Jenkins, Concourse, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, .tmux.conf, .psqlrc ...

Stars: ✭ 226 (-97.94%)

Mutual labels: kafka, hadoop

Java Sourcecode Blogs

Java source code analysis [Source Code Notes]: focused on source code analysis of Java back-end frameworks, with new analysis articles published every week.

Stars: ✭ 448 (-95.92%)

Mutual labels: zookeeper, kafka

Bigslice

A serverless cluster computing system for the Go programming language

Stars: ✭ 469 (-95.73%)

Mutual labels: bigdata, mapreduce

Pdf

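Programming e-books, e-books, and programming books, covering C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithms, computer science, design patterns, software testing, refactoring and optimization, and more categories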

Stars: ✭ 12,009 (+9.26%)

Mutual labels: spark, hadoop

Superman

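What is Superman: building a knowledge system of advanced Java development techniques, a path of leveling up from the basics to become a superman (updating...)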

Stars: ✭ 106 (-99.04%)

Mutual labels: zookeeper, kafka

Whatsmars

Java ecosystem research (Spring Boot + Redis + Dubbo + RocketMQ + Elasticsearch) 🔥🔥🔥🔥🔥

Stars: ✭ 1,389 (-87.36%)

Mutual labels: zookeeper, kafka

Kafka Zk Restapi

Kafka/ZooKeeper RESTful API for topic and consumer group administration, plus metric (offset/lag/message) collection and monitoring

Stars: ✭ 121 (-98.9%)

Mutual labels: zookeeper, kafka

Kafka Stack Docker Compose

docker compose files to create a fully working kafka stack

Stars: ✭ 1,836 (-83.3%)

Mutual labels: zookeeper, kafka

Cdap

An open source framework for building data analytic applications.

Stars: ✭ 509 (-95.37%)

Mutual labels: spark, mapreduce

Magellan

Geo Spatial Data Analytics on Spark

Stars: ✭ 507 (-95.39%)

Mutual labels: spark, big-data

Javakeeper

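✍️ A summary of essential architecture knowledge for Java engineers: covers architectures commonly used at internet companies, such as distributed systems, microservices, and RPC, as well as essential skills such as data storage, caching, and search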

Stars: ✭ 502 (-95.43%)

Mutual labels: zookeeper, kafka

Thunder

⚡️ Nepxion Thunder is a distributed RPC framework based on Netty + Hessian + Kafka + ActiveMQ + Tibco + Zookeeper + Redis + Spring Web MVC + Spring Boot + Docker, supporting multiple protocols, components, and serialization formats

Stars: ✭ 204 (-98.14%)

Mutual labels: zookeeper, kafka

Firecamp

Serverless Platform for the stateful services

Stars: ✭ 194 (-98.23%)

Mutual labels: zookeeper, kafka

Kafdrop

Kafka Web UI

Stars: ✭ 3,158 (-71.27%)

Mutual labels: zookeeper, kafka

Bigdataie

A collection of big data blogs, written test questions, tutorials, projects, and interview experiences

Stars: ✭ 445 (-95.95%)

Mutual labels: spark, bigdata

Books Recommendation

Advanced books (and videos) for programmers, continuously updated (Programmer Books)

Stars: ✭ 558 (-94.92%)

Mutual labels: zookeeper, kafka

Alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud

Stars: ✭ 5,379 (-51.06%)

Mutual labels: spark, hadoop

Storm Dynamic Spout

A framework for building spouts for Apache Storm and a Kafka based spout for dynamically skipping messages to be processed later.

Stars: ✭ 40 (-99.64%)

Mutual labels: kafka, storm

Operators

Collection of Kubernetes Operators built with KUDO.

Stars: ✭ 175 (-98.41%)

Mutual labels: zookeeper, kafka

Cloud Note

A distributed cloud note application (modeled on an existing cloud note product), with data stored in Redis and HBase

Stars: ✭ 71 (-99.35%)

Mutual labels: hbase, hdfs

gomrjob

gomrjob - a Go Framework for Hadoop Map Reduce Jobs

Stars: ✭ 39 (-99.65%)

Mutual labels: hadoop, mapreduce

Freestyle

A cohesive & pragmatic framework of FP centric Scala libraries

Stars: ✭ 627 (-94.3%)

Mutual labels: kafka, spark

Clustering4Ever

C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.

Stars: ✭ 126 (-98.85%)

Mutual labels: big-data, bigdata

yarn-prometheus-exporter

Export Hadoop YARN (resource-manager) metrics in prometheus format

Stars: ✭ 44 (-99.6%)

Mutual labels: yarn, hadoop

Kafka Streams

equivalent to kafka-streams 🐙 for nodejs ✨🐢🚀✨

Stars: ✭ 613 (-94.42%)

Mutual labels: kafka, big-data

Spark Website

Apache Spark Website

Stars: ✭ 75 (-99.32%)

Mutual labels: spark, big-data

HadoopDedup

🍉 Large-scale deduplication of massive data sets based on Hadoop and HBase

Stars: ✭ 27 (-99.75%)

Mutual labels: big-data, mapreduce

pyspark-algorithms

PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2

Stars: ✭ 72 (-99.34%)

Mutual labels: big-data, mapreduce

smart-data-lake

Smart Automation Tool for building modern Data Lakes and Data Pipelines

Stars: ✭ 79 (-99.28%)

Mutual labels: hive, hadoop

Scriptis

Scriptis is for interactive data analysis with script development(SQL, Pyspark, HiveQL), task submission(Spark, Hive), UDF, function, resource management and intelligent diagnosis.

Stars: ✭ 696 (-93.67%)

Mutual labels: spark, hive

Useractionanalyzeplatform

Big data platform for e-commerce user behavior analysis

Stars: ✭ 645 (-94.13%)

Mutual labels: spark, hadoop

Cuesheet

A framework for writing Spark 2.x applications in a pretty way

Stars: ✭ 86 (-99.22%)

Mutual labels: spark, yarn

nifi

Deploy a secured, clustered, auto-scaling NiFi service in AWS.

Stars: ✭ 37 (-99.66%)

Mutual labels: big-data, zookeeper

Coding Now

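Study notes, along with eBooks I have read, video resources, and blogs, websites, and tools I have collected and found useful. Covers the major big data components, Python machine learning and data analysis, Linux, operating systems, algorithms, networking, and more.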

Stars: ✭ 750 (-93.18%)

Mutual labels: spark, bigdata

Cleanframes

type-class based data cleansing library for Apache Spark SQL

Stars: ✭ 75 (-99.32%)

Mutual labels: spark, bigdata

gan deeplearning4j

Automatic feature engineering using Generative Adversarial Networks with Deeplearning4j and Apache Spark.

Stars: ✭ 19 (-99.83%)

Mutual labels: big-data, bigdata

Zeppelin

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Stars: ✭ 5,513 (-49.84%)

Mutual labels: spark, big-data

Streamx

kafka-connect-s3 : Ingest data from Kafka to Object Stores(s3)

Stars: ✭ 96 (-99.13%)

Mutual labels: kafka, big-data

Stream Reactor

Streaming reference architecture for ETL with Kafka and Kafka-Connect. You can find more on http://lenses.io on how we provide a unified solution to manage your connectors, most advanced SQL engine for Kafka and Kafka Streams, cluster monitoring and alerting, and more.

Stars: ✭ 753 (-93.15%)

Mutual labels: kafka, hbase

Hadoop Yarn Api Python Client

Python client for Hadoop® YARN API

Stars: ✭ 91 (-99.17%)

Mutual labels: hadoop, yarn

Bandar Log

Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.

Stars: ✭ 19 (-99.83%)

Mutual labels: kafka, big-data

Kylo

Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.

Stars: ✭ 916 (-91.67%)

Mutual labels: spark, hadoop

Stormtweetssentimentd3viz

Computes and visualizes the sentiment analysis of tweets of US States in real-time using Storm.

Stars: ✭ 25 (-99.77%)

Mutual labels: hadoop, storm

Kudo

Kubernetes Universal Declarative Operator (KUDO)

Stars: ✭ 849 (-92.28%)

Mutual labels: zookeeper, kafka

Sitewhere

SiteWhere is an industrial strength open-source application enablement platform for the Internet of Things (IoT). It provides a multi-tenant microservice-based infrastructure that includes device/asset management, data ingestion, big-data storage, and integration through a modern, scalable architecture. SiteWhere provides REST APIs for all system functionality. SiteWhere provides SDKs for many common device platforms including Android, iOS, Arduino, and any Java-capable platform such as Raspberry Pi rapidly accelerating the speed of innovation.

Stars: ✭ 788 (-92.83%)

Mutual labels: zookeeper, kafka

Hazelcast Jet

Distributed Stream and Batch Processing

Stars: ✭ 855 (-92.22%)

Mutual labels: kafka, big-data

Bigdata File Viewer

A cross-platform (Windows, Mac, Linux) desktop application to view common bigdata binary formats like Parquet, ORC, AVRO, etc. Supports local file system, HDFS, AWS S3, Azure Blob Storage, etc.

Stars: ✭ 86 (-99.22%)

Mutual labels: bigdata, hdfs
