AWS Auto Terminate Idle AWS EMR Clusters Framework is an AWS-based solution that uses AWS CloudWatch and an AWS Lambda function, written in Python with Boto3, to terminate AWS EMR clusters that have been idle for a specified period of time.

Stars: ✭ 21 (-99.21%)

Mutual labels: bigdata

Tennis Crystal Ball

Ultimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction

Stars: ✭ 107 (-95.96%)

Mutual labels: bigdata

Java Notes

☕️ Java basics 👫 Object-oriented thinking ✏️ Algorithms 📝 Operating systems ☁️ Networking 💾 Databases 🙊 Spring 💡 System architecture 🐘 Big data

Stars: ✭ 160 (-93.96%)

Mutual labels: bigdata

Big Data Study

🐳 big data study

Stars: ✭ 141 (-94.68%)

Mutual labels: bigdata

Liteflow

liteflow is a distributed task-flow scheduling system built on task versions

Stars: ✭ 112 (-95.78%)

Mutual labels: bigdata

Athena Cli

Presto-like CLI tool for AWS Athena

Stars: ✭ 85 (-96.79%)

Mutual labels: bigdata

Flink Training Course

Flink video course in Chinese (continuously updated...)

Stars: ✭ 3,963 (+49.49%)

Mutual labels: flink

Mobius

C# and F# language binding and extensions to Apache Spark

Stars: ✭ 929 (-64.96%)

Mutual labels: bigdata

Szt Bigdata

Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟

Stars: ✭ 826 (-68.84%)

Mutual labels: flink

Spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Stars: ✭ 1,721 (-35.08%)

Mutual labels: bigdata

Splash

Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange

Stars: ✭ 105 (-96.04%)

Mutual labels: bigdata

Coding Now

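Study notes, plus e-books and video resources I have gone through, and blogs, websites, and tools I have collected and found worthwhile. Covers the major big data components, Python machine learning and data analysis, Linux, operating systems, algorithms, networking, and more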

Stars: ✭ 750 (-71.71%)

Mutual labels: bigdata

Athenacli

AthenaCLI is a CLI tool for AWS Athena service that can do auto-completion and syntax highlighting.

Stars: ✭ 151 (-94.3%)

Mutual labels: bigdata

Spark Movie Lens

An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset

Stars: ✭ 745 (-71.9%)

Mutual labels: bigdata

Running Elasticsearch Fun Profit

A book about running Elasticsearch

Stars: ✭ 664 (-74.95%)

Mutual labels: bigdata

Volcano

A Cloud Native Batch System (Project under CNCF)

Stars: ✭ 2,114 (-20.26%)

Mutual labels: bigdata

Flink Forward China 2018

Flink Forward China 2018 Slides

Stars: ✭ 583 (-78.01%)

Mutual labels: flink

Spark Py Notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 1,338 (-49.53%)

Mutual labels: bigdata

Bigdata practice

Big data analysis and visualization in practice

Stars: ✭ 166 (-93.74%)

Mutual labels: bigdata

Jigsaw

Jigsaw (七巧板) provides a set of web components based on Angular 5/8/9+. Its main purpose is to help application developers build complex, interaction-intensive, user-friendly web pages. Jigsaw supports the development of all applications of ZTE's Big Data Product.

Stars: ✭ 354 (-86.65%)

Mutual labels: bigdata

Cds

Data syncing in golang for ClickHouse.

Stars: ✭ 501 (-81.1%)

Mutual labels: bigdata

Eagle

Real time data processing system based on flink and CEP

Stars: ✭ 95 (-96.42%)

Mutual labels: flink

Yauaa

Yet Another UserAgent Analyzer

Stars: ✭ 472 (-82.2%)

Mutual labels: flink

Pulsar Flink

Elastic data processing with Apache Pulsar and Apache Flink

Stars: ✭ 126 (-95.25%)

Mutual labels: flink

Bdp Dataplatform

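A data platform for big data ecosystem solutions: a big data solution built on big data, data platforms, microservices, machine learning, e-commerce, automated operations, DevOps, a container deployment platform, and data platform collection, storage, computing, development, and application building.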

Stars: ✭ 456 (-82.8%)

Mutual labels: flink

Mnemonic

Apache Mnemonic - A non-volatile hybrid memory storage oriented library

Stars: ✭ 91 (-96.57%)

Mutual labels: bigdata

Poli

An easy-to-use BI server built for SQL lovers. Power data analysis in SQL and gain faster business insights.

Stars: ✭ 1,850 (-30.22%)

Mutual labels: bigdata

Hudi Resources

A collection of Apache Hudi related resources

Stars: ✭ 79 (-97.02%)

Mutual labels: bigdata

Api.rss

RSS as RESTful. This service allows you to transform RSS feed into an awesome API.

Stars: ✭ 340 (-87.17%)

Mutual labels: bigdata

Circosjs

d3 library to build circular graphs

Stars: ✭ 436 (-83.55%)

Mutual labels: bigdata

Ignite Book Code Samples

All code samples, scripts, and more in-depth examples for the book "High Performance In-Memory Computing with Apache Ignite". Please use the repository "the-apache-ignite-book" for Ignite version 2.6 or above.

Stars: ✭ 86 (-96.76%)

Mutual labels: bigdata

Featran

A Scala feature transformation library for data science and machine learning

Stars: ✭ 420 (-84.16%)

Mutual labels: flink

Flink Docker

Docker packaging for Apache Flink

Stars: ✭ 118 (-95.55%)

Mutual labels: flink

Big data architect skills

Skills a big data architect should master

Stars: ✭ 400 (-84.91%)

Mutual labels: bigdata

Mlsql

The Programming Language Designed For Big Data and AI

Stars: ✭ 1,262 (-52.4%)

Mutual labels: bigdata

Sidekick

High Performance HTTP Sidecar Load Balancer

Stars: ✭ 366 (-86.19%)

Mutual labels: bigdata

Javainterview

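The most complete collection of Java technical knowledge points, plus Java source code analysis. Doing my part to contribute to open source.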

Stars: ✭ 154 (-94.19%)

Mutual labels: bigdata

Sylph

Stream computing platform for bigdata

Stars: ✭ 362 (-86.34%)

Mutual labels: flink

Hops Examples

Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops

Stars: ✭ 84 (-96.83%)

Mutual labels: flink

Datawave

DataWave is an ingest/query framework that leverages Apache Accumulo to provide fast, secure data access.

Stars: ✭ 347 (-86.91%)

Mutual labels: bigdata

Genie

Distributed Big Data Orchestration Service

Stars: ✭ 1,544 (-41.76%)

Mutual labels: bigdata

Datafaker

Datafaker is a generation tool for large-scale test data and flow test data. It fakes data and inserts it into a variety of data sources.

Stars: ✭ 327 (-87.67%)

Mutual labels: bigdata

Uproot4

ROOT I/O in pure Python and NumPy.

Stars: ✭ 80 (-96.98%)

Mutual labels: bigdata

Uproot3

ROOT I/O in pure Python and NumPy.

Stars: ✭ 312 (-88.23%)

Mutual labels: bigdata

Spline

Data Lineage Tracking And Visualization Solution

Stars: ✭ 306 (-88.46%)

Mutual labels: bigdata

Azure Event Hubs Spark

Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs

Stars: ✭ 140 (-94.72%)

Mutual labels: bigdata

Lambda Arch

Applying Lambda Architecture with Spark, Kafka, and Cassandra.

Stars: ✭ 111 (-95.81%)

Mutual labels: bigdata

Cleanframes

type-class based data cleansing library for Apache Spark SQL

Stars: ✭ 75 (-97.17%)

Mutual labels: bigdata

Flink

Apache Flink is an open source project of The Apache Software Foundation (ASF). The Apache Flink project originated from the Stratosphere research project.

Stars: ✭ 17,781 (+570.73%)

Mutual labels: flink

Cloudflow

Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.

Stars: ✭ 278 (-89.51%)

Mutual labels: flink

Waterdrop

Production-ready data integration product

Stars: ✭ 1,856 (-29.99%)

Mutual labels: flink

61-120 of 271 similar projects
