166 open source projects that are alternatives to or similar to Spark-MLlib-Tutorial

Books

Technical books and more

Stars: ✭ 110 (+243.75%)

Mutual labels: bigdata

Sparktutorial

Source code for James Lee's Apache Spark with Java course

Stars: ✭ 105 (+228.13%)

Mutual labels: bigdata

Hudi

Upserts, Deletes And Incremental Processing on Big Data.

Stars: ✭ 2,586 (+7981.25%)

Mutual labels: bigdata

Hadoopcryptoledger

Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive

Stars: ✭ 126 (+293.75%)

Mutual labels: bigdata

Ignite Book Code Samples

All code samples, scripts and more in-depth examples for the book "High Performance In-Memory Computing with Apache Ignite". Please use the repository "the-apache-ignite-book" for Ignite version 2.6 or above.

Stars: ✭ 86 (+168.75%)

Mutual labels: bigdata

Bigdata practice

Big data analysis and visualization practice

Stars: ✭ 166 (+418.75%)

Mutual labels: bigdata

Awesome Bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

Stars: ✭ 10,478 (+32643.75%)

Mutual labels: bigdata

Node Hbase

Asynchronous HBase client for Node.js using REST

Stars: ✭ 226 (+606.25%)

Mutual labels: bigdata

Spark Py Notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 1,338 (+4081.25%)

Mutual labels: bigdata

Big Data Study

🐳 big data study

Stars: ✭ 141 (+340.63%)

Mutual labels: bigdata

Fpart

Sort files and pack them into partitions

Stars: ✭ 127 (+296.88%)

Mutual labels: bigdata

Hudi Resources

A collection of Apache Hudi-related resources

Stars: ✭ 79 (+146.88%)

Mutual labels: bigdata

Kotlin Spark Api

This project provides Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x

Stars: ✭ 183 (+471.88%)

Mutual labels: bigdata

Genie

Distributed Big Data Orchestration Service

Stars: ✭ 1,544 (+4725%)

Mutual labels: bigdata

Hadoop Attack Library

A collection of pentest tools and resources targeting Hadoop environments

Stars: ✭ 228 (+612.5%)

Mutual labels: bigdata

Spark R Notebooks

R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 109 (+240.63%)

Mutual labels: bigdata

Nmflibrary

MATLAB library for non-negative matrix factorization (NMF): Version 1.8.1

Stars: ✭ 153 (+378.13%)

Mutual labels: bigdata

Griddb

GridDB is a next-generation open source database that makes time series IoT and big data fast and easy.

Stars: ✭ 1,587 (+4859.38%)

Mutual labels: bigdata

Every Single Day I Tldr

A daily digest of the articles or videos I've found interesting, that I want to share with you.

Stars: ✭ 249 (+678.13%)

Mutual labels: bigdata

Bigdata Notebook

Stars: ✭ 100 (+212.5%)

Mutual labels: bigdata

Avro

Apache Avro is a data serialization system.

Stars: ✭ 2,005 (+6165.63%)

Mutual labels: bigdata

Mnemonic

Apache Mnemonic - A non-volatile hybrid memory storage oriented library

Stars: ✭ 91 (+184.38%)

Mutual labels: bigdata

Flink Boot

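The Lazy Squirrel Flink-Boot scaffold lets Flink fully embrace the Spring ecosystem, so developers can build distributed stream-processing programs using the Java web development model. Lazy Squirrel aims to let developers write business code quickly at a lower onboarding cost, without needing to understand distributed computing theory or the details of the Flink framework. To further improve agility when developing large projects, the scaffold integrates the Spring framework for bean management by default, along with frameworks commonly used in microservices and web development, such as the MyBatis ORM framework, the Hibernate Validator validation framework, and the Spring Retry framework. See the project's scaffold features for details.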

Stars: ✭ 209 (+553.13%)

Mutual labels: bigdata

Mlsql

The Programming Language Designed For Big Data and AI

Stars: ✭ 1,262 (+3843.75%)

Mutual labels: bigdata

Ecommercerecommendsystem

Real-time big data product recommendation system. Frontend: Vue + TypeScript + ElementUI; backend: Spring + Spark

Stars: ✭ 139 (+334.38%)

Mutual labels: bigdata

Spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Stars: ✭ 1,721 (+5278.13%)

Mutual labels: bigdata

Uproot4

ROOT I/O in pure Python and NumPy.

Stars: ✭ 80 (+150%)

Mutual labels: bigdata

Awesome Learning

Practice source code repository: https://github.com/jast90/bigdata . Search for Jast on WeChat and follow the official account for the latest technical posts 😯.

Stars: ✭ 197 (+515.63%)

Mutual labels: bigdata

Volcano

A Cloud Native Batch System (Project under CNCF)

Stars: ✭ 2,114 (+6506.25%)

Mutual labels: bigdata

Simple It English

Simple-IT-English: smart wordbook from community for community

Stars: ✭ 233 (+628.13%)

Mutual labels: bigdata

Liteflow

liteflow is a distributed task-flow scheduling system built on task versions

Stars: ✭ 112 (+250%)

Mutual labels: bigdata

Flinkx

Based on Apache Flink. Supports data synchronization/integration and streaming SQL computation.

Stars: ✭ 2,651 (+8184.38%)

Mutual labels: bigdata

Lambda Arch

Applying Lambda Architecture with Spark, Kafka, and Cassandra.

Stars: ✭ 111 (+246.88%)

Mutual labels: bigdata

bigdatatutorial

Stars: ✭ 34 (+6.25%)

Mutual labels: bigdata

Flinkstreamsql

Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax

Stars: ✭ 1,682 (+5156.25%)

Mutual labels: bigdata

Java Notes

☕️ Java basics 👫 Object-oriented thinking ✏️ Algorithms 📝 Operating systems ☁️ Networking 💾 Databases 🙊 Spring 💡 System architecture 🐘 Big data

Stars: ✭ 160 (+400%)

Mutual labels: bigdata

Daudit

🌲 Configuration flaws detector for Hadoop, MongoDB, MySQL, and more!

Stars: ✭ 108 (+237.5%)

Mutual labels: bigdata

Tdengine

An open-source big data platform designed and optimized for the Internet of Things (IoT).

Stars: ✭ 17,434 (+54381.25%)

Mutual labels: bigdata

Tennis Crystal Ball

Ultimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction

Stars: ✭ 107 (+234.38%)

Mutual labels: bigdata

Javainterview

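The most complete collection of Java technical knowledge points, plus Java source code analysis. Doing my part to contribute to open source.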

Stars: ✭ 154 (+381.25%)

Mutual labels: bigdata

Flink Notes

Flink study notes

Stars: ✭ 106 (+231.25%)

Mutual labels: bigdata

codefoundry

Examples for gauravbytes.com

Stars: ✭ 57 (+78.13%)

Mutual labels: bigdata

Splash

Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange

Stars: ✭ 105 (+228.13%)

Mutual labels: bigdata

Athenacli

AthenaCLI is a CLI tool for AWS Athena service that can do auto-completion and syntax highlighting.

Stars: ✭ 151 (+371.88%)

Mutual labels: bigdata

Bigdata Notes

A getting-started guide to big data ⭐

Stars: ✭ 10,991 (+34246.88%)

Mutual labels: bigdata

Sparkrdma

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

Stars: ✭ 215 (+571.88%)

Mutual labels: bigdata

Covid19 Market Waiting Times

A project to help people stand in line at the market as little as possible

Stars: ✭ 95 (+196.88%)

Mutual labels: bigdata

Poli

An easy-to-use BI server built for SQL lovers. Power data analysis in SQL and gain faster business insights.

Stars: ✭ 1,850 (+5681.25%)

Mutual labels: bigdata

Biglasso

biglasso: Extending Lasso Model Fitting to Big Data in R

Stars: ✭ 87 (+171.88%)

Mutual labels: bigdata

Aws Etl Orchestrator

A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.

Stars: ✭ 245 (+665.63%)

Mutual labels: bigdata

Bigdata File Viewer

A cross-platform (Windows, Mac, Linux) desktop application to view common big data binary formats like Parquet, ORC, Avro, etc. Supports local file system, HDFS, AWS S3, Azure Blob Storage, etc.

Stars: ✭ 86 (+168.75%)

Mutual labels: bigdata

Azure Event Hubs Spark

Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs

Stars: ✭ 140 (+337.5%)

Mutual labels: bigdata

Athena Cli

Presto-like CLI tool for AWS Athena

Stars: ✭ 85 (+165.63%)

Mutual labels: bigdata

Shifu

An end-to-end machine learning and data mining framework on Hadoop