
1385 open source projects that are alternatives to or similar to Kafka Connect Hdfs

kafka-connect-fs

Kafka Connect FileSystem Connector

Stars: ✭ 107 (-73.25%)

Mutual labels: hadoop, confluent, hdfs, apache-kafka

Bigdata Notes

A beginner's guide to big data ⭐

Stars: ✭ 10,991 (+2647.75%)

Mutual labels: kafka, big-data, hadoop, hdfs

Camus

Mirror of LinkedIn's Camus

Stars: ✭ 81 (-79.75%)

Mutual labels: kafka, hadoop, confluent, hdfs

Repository

Personal study knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.

Stars: ✭ 92 (-77%)

Mutual labels: kafka, hadoop, hdfs

leaflet heatmap

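A simple visualization of Huzhou call data. Assuming the data volume is too large to draw the heatmap directly in the browser, the heatmap rendering step is moved to offline computation and analysis. The data is first computed in parallel with Apache Spark, the heatmap is then rendered with Apache Spark as well, and leafletjs loads the OpenStreetMap layer and the heatmap layer to give good interactivity. The rendering is currently implemented in Apache Spark; either Apache Spark is not well suited to this kind of computation or I did not design the algorithm well, because the parallel computation is slower than the single-machine one. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .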

Stars: ✭ 13 (-96.75%)

Mutual labels: big-data, hadoop, hdfs

Hdfs Shell

HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS

Stars: ✭ 117 (-70.75%)

Mutual labels: big-data, hadoop, hdfs

Bigdata Playground

A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache Hbase, Apache Parquet, Apache Avro, Apache Storm, Twitter Api, MongoDB, NodeJS, Angular, GraphQL

Stars: ✭ 177 (-55.75%)

Mutual labels: kafka, big-data, hadoop

Kafka Ui

Open-Source Web GUI for Apache Kafka Management

Stars: ✭ 230 (-42.5%)

Mutual labels: apache-kafka, kafka, big-data

Hadoop For Geoevent

ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.

Stars: ✭ 5 (-98.75%)

Mutual labels: big-data, hadoop, hdfs

God Of Bigdata

Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...

Stars: ✭ 6,008 (+1402%)

Mutual labels: kafka, hadoop, hdfs

Bigdata Interview

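🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from around the web, together with my own summarized answers. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.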

Stars: ✭ 857 (+114.25%)

Mutual labels: kafka, hadoop, hdfs

Kafka Connect Jdbc

Kafka Connect connector for JDBC-compatible databases

Stars: ✭ 698 (+74.5%)

Mutual labels: kafka, confluent, streaming

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (-62.5%)

Mutual labels: big-data, hadoop, hdfs

bigdata-fun

A complete (distributed) BigData stack, running in containers

Stars: ✭ 14 (-96.5%)

Mutual labels: big-data, hadoop, hdfs

Streamx

kafka-connect-s3: Ingest data from Kafka to Object Stores (s3)

Stars: ✭ 96 (-76%)

Mutual labels: kafka, big-data, streaming

Bigdata Notebook

Stars: ✭ 100 (-75%)

Mutual labels: kafka, hadoop, streaming

Sparta

Real Time Analytics and Data Pipelines based on Spark Streaming

Stars: ✭ 513 (+28.25%)

Mutual labels: kafka, hdfs, streaming

Eel Sdk

Big Data Toolkit for the JVM

Stars: ✭ 140 (-65%)

Mutual labels: kafka, big-data, hadoop

Data Accelerator

Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.

Stars: ✭ 247 (-38.25%)

Mutual labels: kafka, big-data, streaming

Kafkactl

Command Line Tool for managing Apache Kafka

Stars: ✭ 177 (-55.75%)

Mutual labels: apache-kafka, kafka

Cp All In One

docker-compose.yml files for cp-all-in-one, cp-all-in-one-community, cp-all-in-one-cloud

Stars: ✭ 239 (-40.25%)

Mutual labels: apache-kafka, kafka

Kafka Sprout

🚀 Web GUI for Kafka Cluster Management

Stars: ✭ 388 (-3%)

Mutual labels: apache-kafka, kafka

kafkaESK

An event-driven monitoring tool that can consume messages from Apache Kafka clusters and display the aggregated data on a dashboard for analysis and maintenance.

Stars: ✭ 79 (-80.25%)

Mutual labels: confluent, apache-kafka

docker-hadoop

Docker image for main Apache Hadoop components (Yarn/Hdfs)

Stars: ✭ 59 (-85.25%)

Mutual labels: hadoop, hdfs

ksql-jdbc-driver

JDBC driver for Apache Kafka

Stars: ✭ 85 (-78.75%)

Mutual labels: confluent, apache-kafka

Ignite

Apache Ignite

Stars: ✭ 4,027 (+906.75%)

Mutual labels: big-data, hadoop

Kop

Kafka-on-Pulsar - A protocol handler that brings native Kafka protocol to Apache Pulsar

Stars: ✭ 159 (-60.25%)

Mutual labels: apache-kafka, kafka

Azkarra Streams

🚀 Azkarra is a lightweight Java framework to make it easy to develop, deploy and manage cloud-native streaming microservices based on Apache Kafka Streams.

Stars: ✭ 146 (-63.5%)

Mutual labels: apache-kafka, kafka

Real Time Social Media Mining

DevOps pipeline for Real Time Social/Web Mining

Stars: ✭ 22 (-94.5%)

Mutual labels: big-data, hdfs

Kafka Tutorials

Kafka Tutorials microsite

Stars: ✭ 144 (-64%)

Mutual labels: apache-kafka, kafka

teraslice

Scalable data processing pipelines in JavaScript

Stars: ✭ 48 (-88%)

Mutual labels: hadoop, hdfs

skein

A tool and library for easily deploying applications on Apache YARN

Stars: ✭ 128 (-68%)

Mutual labels: hadoop, hdfs

Bigdl

Building Large-Scale AI Applications for Distributed Big Data

Stars: ✭ 3,813 (+853.25%)

Mutual labels: big-data, hadoop

bigdata-doc

Big data study notes, learning roadmaps, and a collection of technical case studies.

Stars: ✭ 37 (-90.75%)

Mutual labels: hadoop, hdfs

Oryx

Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning

Stars: ✭ 1,785 (+346.25%)

Mutual labels: apache-kafka, kafka

datalake-etl-pipeline

Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations

Stars: ✭ 39 (-90.25%)

Mutual labels: big-data, hadoop

iis

Information Inference Service of the OpenAIRE system

Stars: ✭ 16 (-96%)

Mutual labels: big-data, hadoop

rastercube

rastercube is a python library for big data analysis of georeferenced time series data (e.g. MODIS NDVI)

Stars: ✭ 15 (-96.25%)

Mutual labels: big-data, hadoop

hive to es

A small tool for syncing Hive data warehouse data to Elasticsearch

Stars: ✭ 21 (-94.75%)

Mutual labels: hadoop, hdfs

datasqueeze

Hadoop utility to compact small files

Stars: ✭ 18 (-95.5%)

Mutual labels: hadoop, hdfs

wasp

WASP is a framework to build complex real time big data applications. It relies on a kind of Kappa/Lambda architecture mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.

Stars: ✭ 19 (-95.25%)

Mutual labels: hadoop, hdfs

aaocp

A big data project for analyzing user behavior logs

Stars: ✭ 53 (-86.75%)

Mutual labels: hadoop, hdfs

fsbrowser

Fast desktop client for Hadoop Distributed File System

Stars: ✭ 27 (-93.25%)

Mutual labels: hadoop, hdfs

Docker Kafka

Apache Kafka on Docker

Stars: ✭ 143 (-64.25%)

Mutual labels: apache-kafka, kafka

HDFS-Netdisc

A Hadoop-based distributed cloud storage system 🌴

Stars: ✭ 56 (-86%)

Mutual labels: hadoop, hdfs

sparkucx

A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer

Stars: ✭ 32 (-92%)

Mutual labels: big-data, hadoop

Movies-Analytics-in-Spark-and-Scala

Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.

Stars: ✭ 47 (-88.25%)

Mutual labels: big-data, hadoop

clusterdock

clusterdock is a framework for creating Docker-based container clusters

Stars: ✭ 26 (-93.5%)

Mutual labels: big-data, hadoop

big-data-lite

Samples for the Oracle Big Data Lite VM

Stars: ✭ 41 (-89.75%)

Mutual labels: big-data, hadoop

df data service

DataFibers Data Service

Stars: ✭ 31 (-92.25%)

Mutual labels: streaming, hadoop

Hive

Apache Hive

Stars: ✭ 4,031 (+907.75%)

Mutual labels: big-data, hadoop

fastdata-cluster

Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)

Stars: ✭ 20 (-95%)

Mutual labels: hadoop, hdfs

py-hdfs-mount

Mount HDFS with fuse, works with kerberos!

Stars: ✭ 13 (-96.75%)

Mutual labels: hadoop, hdfs

big data

A collection of tutorials on Hadoop, MapReduce, Spark, Docker

Stars: ✭ 34 (-91.5%)

Mutual labels: big-data, hadoop

Wedatasphere

WeDataSphere is a financial level one-stop open-source suitcase for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!

Stars: ✭ 372 (-7%)

Mutual labels: kafka, hadoop

fluent-plugin-webhdfs

Hadoop WebHDFS output plugin for Fluentd

Stars: ✭ 57 (-85.75%)

Mutual labels: hadoop, hdfs

aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Stars: ✭ 111 (-72.25%)

Mutual labels: big-data, hadoop

pulsar-adapters

Apache Pulsar Adapters

Stars: ✭ 18 (-95.5%)

Mutual labels: streaming, apache-kafka

hadoop-data-ingestion-tool

OLAP and ETL of Big Data

Stars: ✭ 17 (-95.75%)

Mutual labels: big-data, hadoop

firehose

Firehose is an extensible, no-code, and cloud-native service to load real-time streaming data from Kafka to data stores, data lakes, and analytical storage systems.

Stars: ✭ 213 (-46.75%)

Mutual labels: streaming, apache-kafka

