Hama: Mirror of Apache Hama
Stars: ✭ 129 (-8.51%)
Cleanframes: Type-class based data cleansing library for Apache Spark SQL
Stars: ✭ 75 (-46.81%)
Sigmf: The Signal Metadata Format Specification
Stars: ✭ 120 (-14.89%)
Orc: An ORC file format reader and writer for Go.
Stars: ✭ 97 (-31.21%)
Drill: Apache Drill is a distributed MPP query layer for self-describing data
Stars: ✭ 1,619 (+1048.23%)
Bigdataclass: Two-day workshop that covers how to use R to interact with databases and Spark
Stars: ✭ 110 (-21.99%)
Cookbook: The Data Engineering Cookbook
Stars: ✭ 9,829 (+6870.92%)
Treeviz: Tree diagrams with JavaScript 🌲 📈
Stars: ✭ 95 (-32.62%)
Gaffer: A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+1064.54%)
Reef: Mirror of Apache REEF
Stars: ✭ 92 (-34.75%)
Mnemonic: Apache Mnemonic - A non-volatile hybrid memory storage oriented library
Stars: ✭ 91 (-35.46%)
Bitcoin Value Predictor: [NOT MAINTAINED] Predicting Bitcoin price using time series analysis and sentiment analysis of tweets about Bitcoin
Stars: ✭ 91 (-35.46%)
Asakusafw: Asakusa Framework
Stars: ✭ 114 (-19.15%)
Parquet Mr: Apache Parquet
Stars: ✭ 1,278 (+806.38%)
Tajo: Mirror of Apache Tajo
Stars: ✭ 128 (-9.22%)
Pythondata: Repo for code published on pythondata.com
Stars: ✭ 113 (-19.86%)
Bigdata File Viewer: A cross-platform (Windows, Mac, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: ✭ 86 (-39.01%)
Athena Cli: Presto-like CLI tool for AWS Athena
Stars: ✭ 85 (-39.72%)
Liteflow: liteflow is a distributed task-flow scheduling system built on task versions
Stars: ✭ 112 (-20.57%)
Feast: Feature Store for Machine Learning
Stars: ✭ 2,576 (+1726.95%)
Labs: Research on distributed systems
Stars: ✭ 73 (-48.23%)
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-43.97%)
Azure Event Hubs Spark: Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (-0.71%)
Lambda Arch: Applying Lambda Architecture with Spark, Kafka, and Cassandra.
Stars: ✭ 111 (-21.28%)
Richdem: High-performance Terrain and Hydrology Analysis
Stars: ✭ 127 (-9.93%)
Apache Spark Hands On: Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Stars: ✭ 74 (-47.52%)
Books: Technical books, etc.
Stars: ✭ 110 (-21.99%)
Bookkeeper: Apache BookKeeper
Stars: ✭ 1,178 (+735.46%)
Fpart: Sort files and pack them into partitions
Stars: ✭ 127 (-9.93%)
Flinkstreamsql: Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax
Stars: ✭ 1,682 (+1092.91%)
Volcano: A Cloud Native Batch System (Project under CNCF)
Stars: ✭ 2,114 (+1399.29%)
Appdocs: Application Performance Optimization Summary
Stars: ✭ 1,169 (+729.08%)
Carbondata: Mirror of Apache CarbonData
Stars: ✭ 1,158 (+721.28%)
Daudit: 🌲 Configuration flaws detector for Hadoop, MongoDB, MySQL, and more!
Stars: ✭ 108 (-23.4%)
Sparkling Graph: SparklingGraph provides an easy-to-use set of features for processing large-scale graphs using Spark and GraphX.
Stars: ✭ 139 (-1.42%)
Hadoopcryptoledger: Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as the Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (-10.64%)
Awesome Bigdata: A curated list of awesome big data frameworks, resources and other awesomeness.
Stars: ✭ 10,478 (+7331.21%)
Flink Shaded: Apache Flink shaded artifacts repository
Stars: ✭ 67 (-52.48%)
Ng Docs: Documentation well suited to Angular beginners. Covers the Angular API, RxJS, Zorro (not yet done), online quizzes (not yet done), and more.
Stars: ✭ 66 (-53.19%)
Rsparkling: RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-53.9%)
Mobydq: 🐳 Tool to automate data quality checks on data pipelines
Stars: ✭ 123 (-12.77%)
Cloud Volume: Read and write Neuroglancer datasets programmatically.
Stars: ✭ 63 (-55.32%)