tomwhite / hadoop-ecosystem

Licence: other
Visualizations of the Hadoop Ecosystem

Programming Languages

shell

Projects that are alternatives to or similar to hadoop-ecosystem

datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Stars: ✭ 39 (+95%)
Mutual labels:  hadoop
xxhadoop
Data Analysis Using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. These are my daily notes, code, and demos. Don't fork, just star!
Stars: ✭ 37 (+85%)
Mutual labels:  hadoop
hadoopoffice
HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (+180%)
Mutual labels:  hadoop
big-data-exploration
[Archive] Intern project - Big Data Exploration using MongoDB - This Repository is NOT a supported MongoDB product
Stars: ✭ 43 (+115%)
Mutual labels:  hadoop
corc
An ORC File Scheme for the Cascading data processing platform.
Stars: ✭ 14 (-30%)
Mutual labels:  hadoop
jmx exporter-cloudera-hadoop
Prometheus jmx_exporter configurations for Cloudera Hadoop
Stars: ✭ 33 (+65%)
Mutual labels:  hadoop
dockerfiles
Multiple Docker container images for the main Big Data tools (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ...)
Stars: ✭ 29 (+45%)
Mutual labels:  hadoop
hadoop-etl-udfs
The Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL
Stars: ✭ 17 (-15%)
Mutual labels:  hadoop
disq
A library for manipulating bioinformatics sequencing formats in Apache Spark
Stars: ✭ 29 (+45%)
Mutual labels:  hadoop
rastercube
rastercube is a Python library for big data analysis of georeferenced time series data (e.g. MODIS NDVI)
Stars: ✭ 15 (-25%)
Mutual labels:  hadoop
disk
A distributed network disk (cloud drive) system built on Hadoop, HBase, and Spring Boot
Stars: ✭ 53 (+165%)
Mutual labels:  hadoop
BigInsights-on-Apache-Hadoop
Example projects for 'BigInsights for Apache Hadoop' on IBM Bluemix
Stars: ✭ 21 (+5%)
Mutual labels:  hadoop
oci-cloudera
Terraform module to deploy Cloudera on Oracle Cloud Infrastructure (OCI)
Stars: ✭ 20 (+0%)
Mutual labels:  hadoop
LogAnalyzeHelper
Data-cleaning program for a forum log analysis system (includes an IP rule library, UDF development, MapReduce programs, and log data)
Stars: ✭ 33 (+65%)
Mutual labels:  hadoop
sparkucx
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer
Stars: ✭ 32 (+60%)
Mutual labels:  hadoop
qs-hadoop
Learning the big data ecosystem
Stars: ✭ 18 (-10%)
Mutual labels:  hadoop
skein
A tool and library for easily deploying applications on Apache YARN
Stars: ✭ 128 (+540%)
Mutual labels:  hadoop
liquibase-impala
Liquibase extension to add Impala Database support
Stars: ✭ 23 (+15%)
Mutual labels:  hadoop
memex-gate
General Architecture for Text Engineering
Stars: ✭ 47 (+135%)
Mutual labels:  hadoop
learning-spark
Tidy up Spark and Hadoop tutorials.
Stars: ✭ 28 (+40%)
Mutual labels:  hadoop
A graph of the Hadoop Ecosystem. A dot file describes the relationships between projects. Currently it covers only a small number of projects, so feel free to fork it and add more projects and relations.
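To illustrate how a dot file can describe relationships between projects, here is a minimal Python sketch that renders a list of (source, target) pairs as a Graphviz directed graph. The relations, the graph name, and the `to_dot` helper are hypothetical and chosen only for illustration; the repository's actual dot file may be organized differently, and the meaning of an edge here ("runs on or depends on") is an assumption.

```python
# Hypothetical relations for illustration only; the repository's real
# dot file may list different projects and edges.
# Each pair is read as "source runs on or depends on target" (an assumption).
RELATIONS = [
    ("Hive", "Hadoop"),
    ("Pig", "Hadoop"),
    ("HBase", "Hadoop"),
    ("HBase", "ZooKeeper"),
]


def to_dot(relations, name="hadoop_ecosystem"):
    """Render (source, target) pairs as Graphviz DOT text for a directed graph."""
    lines = [f"digraph {name} {{"]
    for source, target in relations:
        # Quote node names so names with spaces or hyphens stay valid DOT.
        lines.append(f'  "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(RELATIONS))
```

If the output is saved to a file such as `ecosystem.dot` (a hypothetical name), Graphviz can render it with `dot -Tpng ecosystem.dot -o ecosystem.png`. Adding a project then amounts to adding one more pair to the list.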