1234 open source projects that are alternatives to or similar to Feast

Gimel

Big Data Processing Framework - Unified Data API or SQL on Any Storage

Stars: ✭ 216 (-91.61%)

Mutual labels: spark, big-data

SynapseML

Simple and Distributed Machine Learning

Stars: ✭ 3,355 (+30.24%)

Mutual labels: big-data, ml

Labs

Research on distributed systems

Stars: ✭ 73 (-97.17%)

Mutual labels: spark, big-data

Spark Website

Apache Spark Website

Stars: ✭ 75 (-97.09%)

Mutual labels: spark, big-data

incubator-liminal

Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful experiment to an automated pipeline of model training, validation, deployment and inference in production. Liminal provides a Domain Specific Language to build ML workflows on top of Apache Airflow.

Stars: ✭ 117 (-95.46%)

Mutual labels: big-data, ml

Spark

Apache Spark - A unified analytics engine for large-scale data processing

Stars: ✭ 31,618 (+1127.41%)

Mutual labels: spark, big-data

Logisland

Scalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink being in the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready to use processors, data sources and sinks are available.

Stars: ✭ 97 (-96.23%)

Mutual labels: spark, big-data

hamilton

A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.

Stars: ✭ 612 (-76.24%)

Mutual labels: data-engineering, feature-engineering

Hub

Dataset format for AI. Build, manage, & visualize datasets for deep learning. Stream data real-time to PyTorch/TensorFlow & version-control it. https://activeloop.ai

Stars: ✭ 4,003 (+55.4%)

Mutual labels: ml, mlops

Sk Dist

Distributed scikit-learn meta-estimators in PySpark

Stars: ✭ 260 (-89.91%)

Mutual labels: spark, ml

neptune-client

📒 Experiment tracking tool and model registry

Stars: ✭ 348 (-86.49%)

Mutual labels: ml, mlops

Spark Py Notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 1,338 (-48.06%)

Mutual labels: spark, big-data

Goodreads etl pipeline

An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.

Stars: ✭ 793 (-69.22%)

Mutual labels: spark, data-engineering

Bentoml

Model Serving Made Easy

Stars: ✭ 3,064 (+18.94%)

Mutual labels: ml, mlops

Sparklearning

Learning Apache Spark, including code and data. Most parts can run locally.

Stars: ✭ 558 (-78.34%)

Mutual labels: spark, ml

Datasciencevm

Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)

Stars: ✭ 153 (-94.06%)

Mutual labels: big-data, ml

Spark Doc Zh

Chinese translation of the official Apache Spark documentation

Stars: ✭ 1,126 (-56.29%)

Mutual labels: spark, big-data

Sparkling Graph

SparklingGraph provides an easy-to-use set of features that let you process large-scale graphs using Spark and GraphX.

Stars: ✭ 139 (-94.6%)

Mutual labels: spark, big-data

cli

Polyaxon Core Client & CLI to streamline MLOps

Stars: ✭ 18 (-99.3%)

Mutual labels: ml, mlops

Magellan

Geo Spatial Data Analytics on Spark

Stars: ✭ 507 (-80.32%)

Mutual labels: spark, big-data

Big Data Engineering Coursera Yandex

Big Data for Data Engineers Coursera Specialization from Yandex

Stars: ✭ 71 (-97.24%)

Mutual labels: spark, big-data

Home

ApacheCN open source organization: announcements, introduction, members, events, and contact channels

Stars: ✭ 1,199 (-53.45%)

Mutual labels: spark, ml

Dataengineeringproject

Example end to end data engineering project.

Stars: ✭ 82 (-96.82%)

Mutual labels: big-data, data-engineering

Waterdrop

Production Ready Data Integration Product, documentation:

Stars: ✭ 1,856 (-27.95%)

Mutual labels: spark

Auto ml

[UNMAINTAINED] Automated machine learning for analytics & production

Stars: ✭ 1,559 (-39.48%)

Mutual labels: feature-engineering

Java learning practice

The road to advanced Java: frequently asked interview algorithms, Akka, multithreading, NIO, Netty, SpringBoot, Spark, Flink, and more

Stars: ✭ 110 (-95.73%)

Mutual labels: spark

Mobydq

🐳 Tool to automate data quality checks on data pipelines

Stars: ✭ 123 (-95.23%)

Mutual labels: big-data

Ros2learn

ROS 2 enabled Machine Learning algorithms

Stars: ✭ 119 (-95.38%)

Mutual labels: ml

Responsible Ai Widgets

This project provides responsible AI user interfaces for Fairlearn, interpret-community, and Error Analysis, as well as foundational building blocks that they rely on.

Stars: ✭ 107 (-95.85%)

Mutual labels: ml

Spark R Notebooks

R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 109 (-95.77%)

Mutual labels: big-data

Ml Dl Scripts

The repository provides useful Python scripts for ML and data analysis

Stars: ✭ 119 (-95.38%)

Mutual labels: ml

Parquet Index

Spark SQL index for Parquet tables

Stars: ✭ 109 (-95.77%)

Mutual labels: spark

Anndotnet

ANNdotNET - deep learning tool on .NET Platform.

Stars: ✭ 109 (-95.77%)

Mutual labels: ml

Butterfree

A tool for building feature stores.

Stars: ✭ 126 (-95.11%)

Mutual labels: data-engineering

Hazelcast Nodejs Client

Hazelcast IMDG Node.js Client

Stars: ✭ 124 (-95.19%)

Mutual labels: big-data

Hdfs Shell

HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS

Stars: ✭ 117 (-95.46%)

Mutual labels: big-data

Distributed Dataset

A distributed data processing framework in Haskell.

Stars: ✭ 108 (-95.81%)

Mutual labels: spark

Pyspark Cheatsheet

🐍 Quick reference guide to common patterns & functions in PySpark.

Stars: ✭ 108 (-95.81%)

Mutual labels: spark

Drill

Apache Drill is a distributed MPP query layer for self-describing data

Stars: ✭ 1,619 (-37.15%)

Mutual labels: big-data

Hnswlib

Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs

Stars: ✭ 108 (-95.81%)

Mutual labels: spark

Superset

Apache Superset is a Data Visualization and Data Exploration Platform

Stars: ✭ 42,634 (+1555.05%)

Mutual labels: data-engineering

Spark Infotheoretic Feature Selection

This package contains a generic implementation of greedy Information Theoretic Feature Selection (FS) methods. The implementation is based on the common theoretic framework presented by Gavin Brown. Implementations of mRMR, InfoGain, JMI and other commonly used FS filters are provided.

Stars: ✭ 123 (-95.23%)

Mutual labels: spark

D6t Python

Accelerate data science

Stars: ✭ 118 (-95.42%)

Mutual labels: data-engineering

Flink Learning

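Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, internals, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, and Table API & SQL, plus case studies of large production Flink projects (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). You are welcome to support my column "Big Data Real-Time Computing Engine Flink: Practice and Performance Optimization".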

Stars: ✭ 11,378 (+341.69%)

Mutual labels: spark

Yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Stars: ✭ 19,914 (+673.06%)

Mutual labels: ml

Deephyper

DeepHyper: Scalable Asynchronous Neural Architecture and Hyperparameter Search for Deep Neural Networks

Stars: ✭ 117 (-95.46%)

Mutual labels: ml

Seldon Server

Machine Learning Platform and Recommendation Engine built on Kubernetes