Eel Sdk: Big Data Toolkit for the JVM
Stars: ✭ 140 (+241.46%)
Mutual labels: big-data, hadoop
Calcite: Apache Calcite
Stars: ✭ 2,816 (+6768.29%)
Mutual labels: big-data, hadoop
Spark With Python: Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+265.85%)
Mutual labels: big-data, hadoop
Griffon Vm: Griffon Data Science Virtual Machine
Stars: ✭ 128 (+212.2%)
Mutual labels: big-data, hadoop
rastercube: rastercube is a Python library for big data analysis of georeferenced time series data (e.g. MODIS NDVI)
Stars: ✭ 15 (-63.41%)
Mutual labels: big-data, hadoop
Gaffer: A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+3904.88%)
Mutual labels: big-data, hadoop
Bigdata Playground: A complete example of a big data application using Kubernetes (kops/AWS), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, Node.js, Angular, GraphQL
Stars: ✭ 177 (+331.71%)
Mutual labels: big-data, hadoop
Bigdata Notes: A beginner's guide to big data ⭐
Stars: ✭ 10,991 (+26707.32%)
Mutual labels: big-data, hadoop
datalake-etl-pipeline: Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-4.88%)
Mutual labels: big-data, hadoop
iis: Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-60.98%)
Mutual labels: big-data, hadoop
Hdfs Shell: HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS
Stars: ✭ 117 (+185.37%)
Mutual labels: big-data, hadoop
Movies-Analytics-in-Spark-and-Scala: Data cleaning, pre-processing, and analytics on a million movies using Spark and Scala.
Stars: ✭ 47 (+14.63%)
Mutual labels: big-data, hadoop
Drill: Apache Drill is a distributed MPP query layer for self-describing data
Stars: ✭ 1,619 (+3848.78%)
Mutual labels: big-data, hadoop
Calcite Avatica: Mirror of Apache Calcite - Avatica
Stars: ✭ 130 (+217.07%)
Mutual labels: big-data, hadoop
Asakusafw: Asakusa Framework
Stars: ✭ 114 (+178.05%)
Mutual labels: big-data, hadoop
Presto: The official home of the Presto distributed SQL query engine for big data
Stars: ✭ 12,957 (+31502.44%)
Mutual labels: big-data, hadoop
Moosefs: MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (+2400%)
Mutual labels: big-data, hadoop
Docker Spark Cluster: A Spark cluster setup running on Docker containers
Stars: ✭ 57 (+39.02%)
Mutual labels: big-data, hadoop
Sparkrdma: RDMA-accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+424.39%)
Mutual labels: big-data, hadoop
sparkucx: A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, using the UCX communication layer
Stars: ✭ 32 (-21.95%)
Mutual labels: big-data, hadoop