1270 open source projects that are alternatives to or similar to Featran

Data science blogs

A repository to keep track of all the code that I end up writing for my blog posts.

Stars: ✭ 139 (-66.9%)

Mutual labels: spark, data, xgboost

Spark Tda

SparkTDA is a package for Apache Spark providing Topological Data Analysis Functionalities.

Stars: ✭ 45 (-89.29%)

Mutual labels: spark, ml

Codesearchnet

Datasets, tools, and benchmarks for representation learning of code.

Stars: ✭ 1,378 (+228.1%)

Mutual labels: data, ml

Model Serving Tutorial

Code and presentation for Strata Model Serving tutorial

Stars: ✭ 57 (-86.43%)

Mutual labels: spark, flink

Benchm Ml

A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).

Stars: ✭ 1,835 (+336.9%)

Mutual labels: spark, xgboost

Awesome Ai Ml Dl

Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it. Study notes and a curated list of awesome resources of such topics.

Stars: ✭ 831 (+97.86%)

Mutual labels: data, ml

Datafusion

DataFusion has now been donated to the Apache Arrow project

Stars: ✭ 611 (+45.48%)

Mutual labels: spark, data

Bigdataguide

Big data learning: learn big data from scratch, including study videos and interview materials for each stage of big data learning.

Stars: ✭ 817 (+94.52%)

Mutual labels: spark, flink

Ecommercerecommendsystem

Real-time big data product recommendation system. Front end: Vue + TypeScript + ElementUI; back end: Spring + Spark.

Stars: ✭ 139 (-66.9%)

Mutual labels: spark, flink

Java learning practice

The road to advanced Java: frequently asked interview algorithms, akka, multithreading, NIO, Netty, SpringBoot, Spark && Flink, and more.

Stars: ✭ 110 (-73.81%)

Mutual labels: spark, flink

Agile data code 2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

Stars: ✭ 413 (-1.67%)

Mutual labels: spark, data

Transmogrifai

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning

Stars: ✭ 2,084 (+396.19%)

Mutual labels: spark, ml

Big Whale

Scheduling of offline tasks and monitoring of real-time tasks for Spark, Flink, and similar engines.

Stars: ✭ 163 (-61.19%)

Mutual labels: spark, flink

Lexpredict Lexnlp

LexNLP by LexPredict

Stars: ✭ 439 (+4.52%)

Mutual labels: data, ml

Alink

Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.

Stars: ✭ 2,936 (+599.05%)

Mutual labels: xgboost, flink

Bdp Dataplatform

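Big data ecosystem solution data platform: a big data solution built on big data, data platform, microservices, machine learning, e-commerce mall, automated operations, DevOps, container deployment platform, data platform collection, data platform storage, data platform computing, data platform development, and data platform application building.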

Stars: ✭ 456 (+8.57%)

Mutual labels: spark, flink

Sparklearning

Learning Apache Spark, including code and data. Most parts can run locally.

Stars: ✭ 558 (+32.86%)

Mutual labels: spark, ml

Data Ingestion Platform

Stars: ✭ 39 (-90.71%)

Mutual labels: spark, flink

Audioowl

Fast and simple music and audio analysis using RNN in Python 🕵️‍♀️ 🥁

Stars: ✭ 151 (-64.05%)

Mutual labels: data, ml

Bigdata Notebook

Stars: ✭ 100 (-76.19%)

Mutual labels: spark, flink

Flink Learning

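Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, as well as large-scale project case studies of Flink in production (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). You are welcome to support my column "Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization".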

Stars: ✭ 11,378 (+2609.05%)

Mutual labels: spark, flink

Quicksql

A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources

Stars: ✭ 1,821 (+333.57%)

Mutual labels: spark, flink

Waterdrop

Production Ready Data Integration Product, documentation：

Stars: ✭ 1,856 (+341.9%)

Mutual labels: spark, flink

Mmlspark

Simple and Distributed Machine Learning

Stars: ✭ 2,899 (+590.24%)

Mutual labels: spark, ml

Sparkstreaming

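💥 🚀 Wraps sparkstreaming to dynamically adjust batch time (computation runs whenever data is available); 🚀 supports adding and removing topics at runtime; 🚀 wraps sparkstreaming 1.6 - kafka 010 to support SSL.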

Stars: ✭ 179 (-57.38%)

Mutual labels: spark, flink

fastdata-cluster

Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)

Stars: ✭ 20 (-95.24%)

Mutual labels: spark, flink

Dataspherestudio

DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.

Stars: ✭ 1,195 (+184.52%)

Mutual labels: spark, flink

Sk Dist

Distributed scikit-learn meta-estimators in PySpark

Stars: ✭ 260 (-38.1%)

Mutual labels: spark, ml

neptune-client

📒 Experiment tracking tool and model registry

Stars: ✭ 348 (-17.14%)

Mutual labels: ml, xgboost

Hadoopcryptoledger

Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive

Stars: ✭ 126 (-70%)

Mutual labels: spark, flink

Automl alex

State-of-the-art Automated Machine Learning Python library for Tabular Data

Stars: ✭ 132 (-68.57%)

Mutual labels: ml, xgboost

Hyperparameter hunter

Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries

Stars: ✭ 648 (+54.29%)

Mutual labels: ml, xgboost

Pycm

Multi-class confusion matrix library in Python

Stars: ✭ 1,076 (+156.19%)

Mutual labels: data, ml

Feast

Feature Store for Machine Learning

Stars: ✭ 2,576 (+513.33%)

Mutual labels: spark, ml

God Of Bigdata

Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...

Stars: ✭ 6,008 (+1330.48%)

Mutual labels: spark, flink

Athenax

SQL-based streaming analytics platform at scale

Stars: ✭ 1,178 (+180.48%)

Mutual labels: flink, data

Zeppelin

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Stars: ✭ 5,513 (+1212.62%)

Mutual labels: spark, flink

Scio

A Scala API for Apache Beam and Google Cloud Dataflow.

Stars: ✭ 2,247 (+435%)

Mutual labels: data, ml

Bigdata Interview

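🎯 🌟 [Big data interview questions] Sharing big data interview questions I collected online along with summaries of my own answers. Currently includes interview knowledge summaries for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.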

Stars: ✭ 857 (+104.05%)

Mutual labels: spark, flink

Szt Bigdata

Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟

Stars: ✭ 826 (+96.67%)

Mutual labels: spark, flink

Pulsar Spark

When Apache Pulsar meets Apache Spark

Stars: ✭ 55 (-86.9%)

Mutual labels: spark, flink

Home

ApacheCN open source organization: announcements, introduction, members, events, and contact channels

Stars: ✭ 1,199 (+185.48%)

Mutual labels: spark, ml

Repository

Personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.

Stars: ✭ 92 (-78.1%)

Mutual labels: spark, flink

Hops Examples

Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops

Stars: ✭ 84 (-80%)

Mutual labels: spark, flink

Pyspark Cheatsheet

🐍 Quick reference guide to common patterns & functions in PySpark.

Stars: ✭ 108 (-74.29%)

Mutual labels: spark, data

Kamu Cli

Next generation tool for decentralized exchange and transformation of semi-structured data

Stars: ✭ 69 (-83.57%)

Mutual labels: spark, flink

Datacompy

Pandas and Spark DataFrame comparison for humans

Stars: ✭ 147 (-65%)

Mutual labels: spark, data

awesome-AI-kubernetes

❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc

Stars: ✭ 95 (-77.38%)

Mutual labels: spark, ml

Cloudflow

Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.

Stars: ✭ 278 (-33.81%)

Mutual labels: spark, flink

Flink Ai Extended

Stars: ✭ 377 (-10.24%)

Mutual labels: flink

Open Solution Home Credit

Open solution to the Home Credit Default Risk challenge 🏡

Stars: ✭ 397 (-5.48%)

Mutual labels: xgboost

Dataframe Js

A JavaScript library providing a new data structure for data scientists and developers

Stars: ✭ 376 (-10.48%)

Mutual labels: data

Keypathkit

KeyPathKit is a library that provides the standard functions to manipulate data along with a call-syntax that relies on typed keypaths to make the call sites as short and clean as possible.

Stars: ✭ 376 (-10.48%)

Mutual labels: data

Marmaray

Generic Data Ingestion & Dispersal Library for Hadoop

Stars: ✭ 414 (-1.43%)

Mutual labels: spark

React Spreadsheet

Simple, customizable yet performant spreadsheet for React

Stars: ✭ 393 (-6.43%)

Mutual labels: data

Stm32 Usart Uart Dma Rx Tx

STM32 examples for USART using DMA for efficient RX and TX transmission

Stars: ✭ 372 (-11.43%)

Mutual labels: data

Tensorflowonspark

TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.

Stars: ✭ 3,748 (+792.38%)

Mutual labels: spark

Iceberg

Iceberg is a table format for large, slow-moving tabular data

Stars: ✭ 393 (-6.43%)

Mutual labels: spark

Meza

A Python toolkit for processing tabular data

Stars: ✭ 374 (-10.95%)

Mutual labels: data

Wedatasphere

WeDataSphere is a financial level one-stop open-source suitcase for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!

Stars: ✭ 372 (-11.43%)

Mutual labels: spark

1-60 of 1270 similar projects
