A cross-platform (Windows, Mac, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.

Stars: ✭ 86 (+100%)

Mutual labels: bigdata

Shifu

An end-to-end machine learning and data mining framework on Hadoop

Stars: ✭ 207 (+381.4%)

Mutual labels: bigdata

Volcano

A Cloud Native Batch System (Project under CNCF)

Stars: ✭ 2,114 (+4816.28%)

Mutual labels: bigdata

Aws Etl Orchestrator

A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.

Stars: ✭ 245 (+469.77%)

Mutual labels: bigdata

Lambda Arch

Applying Lambda Architecture with Spark, Kafka, and Cassandra.

Stars: ✭ 111 (+158.14%)

Mutual labels: bigdata

Flinkx

Based on Apache Flink. Supports data synchronization/integration and streaming SQL computation.

Stars: ✭ 2,651 (+6065.12%)

Mutual labels: bigdata

Daudit

🌲 Configuration flaws detector for Hadoop, MongoDB, MySQL, and more!

Stars: ✭ 108 (+151.16%)

Mutual labels: bigdata

codefoundry

Examples for gauravbytes.com

Stars: ✭ 57 (+32.56%)

Mutual labels: bigdata

Flink Notes

Flink study notes

Stars: ✭ 106 (+146.51%)

Mutual labels: bigdata

Javainterview

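The most complete collection of Java technical knowledge points, plus Java source code analysis. Doing my part to contribute to open source.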

Stars: ✭ 154 (+258.14%)

Mutual labels: bigdata

Bigdata Notes

A beginner's guide to big data ⭐

Stars: ✭ 10,991 (+25460.47%)

Mutual labels: bigdata

Tdengine

An open-source big data platform designed and optimized for the Internet of Things (IoT).

Stars: ✭ 17,434 (+40444.19%)

Mutual labels: bigdata

Biglasso

biglasso: Extending Lasso Model Fitting to Big Data in R

Stars: ✭ 87 (+102.33%)

Mutual labels: bigdata

Poli

An easy-to-use BI server built for SQL lovers. Power data analysis in SQL and gain faster business insights.

Stars: ✭ 1,850 (+4202.33%)

Mutual labels: bigdata

Ecommercerecommendsystem

A real-time product recommendation system based on big data. Front end: Vue + TypeScript + ElementUI; back end: Spring + Spark

Stars: ✭ 139 (+223.26%)

Mutual labels: bigdata

Mlsql

The Programming Language Designed For Big Data and AI

Stars: ✭ 1,262 (+2834.88%)

Mutual labels: bigdata

Flink Boot

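The Lazy Squirrel Flink-Boot scaffold lets Flink fully embrace the Spring ecosystem, so developers can build distributed stream-processing programs in the style of Java web development. Lazy Squirrel aims to lower the onboarding cost: developers can write business code quickly without needing to understand distributed computing theory or the details of the Flink framework. To make large projects built on the scaffold more agile, it integrates the Spring framework for bean management by default, along with frameworks commonly used in microservice and web development, such as the MyBatis ORM framework, the Hibernate Validator validation framework, and the Spring Retry framework. See the scaffold's feature list for details.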

Stars: ✭ 209 (+386.05%)

Mutual labels: bigdata

Tipdm

TipDM modeling platform, an open-source data mining tool.

Stars: ✭ 130 (+202.33%)

Mutual labels: bigdata

Every Single Day I Tldr

A daily digest of the articles or videos I've found interesting, that I want to share with you.

Stars: ✭ 249 (+479.07%)

Mutual labels: bigdata

Fpart

Sort files and pack them into partitions

Stars: ✭ 127 (+195.35%)

Mutual labels: bigdata

Javaorbigdata Interview

A compilation of interview knowledge points for Java developers and big data developers

Stars: ✭ 203 (+372.09%)

Mutual labels: bigdata

Hadoopcryptoledger

Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive

Stars: ✭ 126 (+193.02%)

Mutual labels: bigdata

optimus

🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

Stars: ✭ 1,351 (+3041.86%)

Mutual labels: bigdata

Genie

Distributed Big Data Orchestration Service

Stars: ✭ 1,544 (+3490.7%)

Mutual labels: bigdata

Kotlin Spark Api

This project provides Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x

Stars: ✭ 183 (+325.58%)

Mutual labels: bigdata

Books

Technical books and more

Stars: ✭ 110 (+155.81%)

Mutual labels: bigdata

Dpark

Python clone of Spark, a MapReduce-like framework in Python

Stars: ✭ 2,668 (+6104.65%)

Mutual labels: bigdata

Spark R Notebooks

R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 109 (+153.49%)

Mutual labels: bigdata

Bigdata practice

Big data analysis and visualization in practice

Stars: ✭ 166 (+286.05%)

Mutual labels: bigdata

Awesome Bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

Stars: ✭ 10,478 (+24267.44%)

Mutual labels: bigdata

NLog.Targets.Syslog

A Syslog server target for NLog

Stars: ✭ 63 (+46.51%)

Mutual labels: syslog

Griddb

GridDB is a next-generation open source database that makes time series IoT and big data fast and easy.

Stars: ✭ 1,587 (+3590.7%)

Mutual labels: bigdata

Nmflibrary

MATLAB library for non-negative matrix factorization (NMF): Version 1.8.1

Stars: ✭ 153 (+255.81%)

Mutual labels: bigdata

Sparktutorial

Source code for James Lee's Apache Spark with Java course

Stars: ✭ 105 (+144.19%)

Mutual labels: bigdata

Hadoop Attack Library

A collection of pentest tools and resources targeting Hadoop environments

Stars: ✭ 228 (+430.23%)

Mutual labels: bigdata

Bigdata Notebook

Stars: ✭ 100 (+132.56%)

Mutual labels: bigdata

Hudi

Upserts, Deletes And Incremental Processing on Big Data.

Stars: ✭ 2,586 (+5913.95%)

Mutual labels: bigdata

Spark Py Notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 1,338 (+3011.63%)

Mutual labels: bigdata

workflUX

An open-source, cloud-ready web application for simplified deployment of big data workflows.

Stars: ✭ 26 (-39.53%)

Mutual labels: bigdata

Mnemonic

Apache Mnemonic - A non-volatile hybrid memory storage oriented library

Stars: ✭ 91 (+111.63%)

Mutual labels: bigdata

Avro

Apache Avro is a data serialization system.

Stars: ✭ 2,005 (+4562.79%)

Mutual labels: bigdata

Ignite Book Code Samples

All code samples, scripts and more in-depth examples for the book "High Performance In-Memory Computing with Apache Ignite". Please use the repository "the-apache-ignite-book" for Ignite version 2.6 or above.

Stars: ✭ 86 (+100%)

Mutual labels: bigdata

Node Hbase

Asynchronous HBase client for Node.js using REST

Stars: ✭ 226 (+425.58%)

Mutual labels: bigdata

Big Data Study

🐳 big data study

Stars: ✭ 141 (+227.91%)

Mutual labels: bigdata

Syslog

An Arduino library for logging to Syslog server in IETF format (RFC 5424) and BSD format (RFC 3164)