1103 open source projects that are alternatives to or similar to Bigdl

Mmlspark

Simple and Distributed Machine Learning

Stars: ✭ 2,899 (-23.97%)

Mutual labels: ai, spark, big-data

Gaffer

A large-scale entity and relation database supporting aggregation of properties

Stars: ✭ 1,642 (-56.94%)

Mutual labels: spark, big-data, hadoop

H2o 3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Stars: ✭ 5,656 (+48.33%)

Mutual labels: spark, big-data, hadoop

Analytics Zoo

Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray

Stars: ✭ 2,448 (-35.8%)

Mutual labels: bigdl, distributed-deep-learning, analytics-zoo

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (-96.07%)

Mutual labels: spark, big-data, hadoop

Bigdata Notes

A getting-started guide to big data ⭐

Stars: ✭ 10,991 (+188.25%)

Mutual labels: spark, big-data, hadoop

Xlearning Xdml

extremely distributed machine learning

Stars: ✭ 113 (-97.04%)

Mutual labels: ai, spark, hadoop

Sparkrdma

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

Stars: ✭ 215 (-94.36%)

Mutual labels: spark, big-data, hadoop

Data Science Ipython Notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+478.23%)

Mutual labels: spark, big-data, hadoop

Docker Spark Cluster

A Spark cluster setup running on Docker containers

Stars: ✭ 57 (-98.51%)

Mutual labels: spark, big-data, hadoop

Hadoop cookbook

Cookbook to install Hadoop 2.0+ using Chef

Stars: ✭ 82 (-97.85%)

Mutual labels: spark, hadoop

Spark Website

Apache Spark Website

Stars: ✭ 75 (-98.03%)

Mutual labels: spark, big-data

Docker Spark

🚢 Docker image for Apache Spark

Stars: ✭ 78 (-97.95%)

Mutual labels: spark, hadoop

Repository

A personal study knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.

Stars: ✭ 92 (-97.59%)

Mutual labels: spark, hadoop

Ytk Learn

Ytk-learn is a distributed machine learning library that implements most of the popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).

Stars: ✭ 337 (-91.16%)

Mutual labels: spark, hadoop

Hive

Apache Hive

Stars: ✭ 4,031 (+5.72%)

Mutual labels: big-data, hadoop

Sparkler

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.

Stars: ✭ 362 (-90.51%)

Mutual labels: spark, big-data

Hadoopcryptoledger

Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive

Stars: ✭ 126 (-96.7%)

Mutual labels: spark, hadoop

Feast

Feature Store for Machine Learning

Stars: ✭ 2,576 (-32.44%)

Mutual labels: spark, big-data

Airflow Pipeline

An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR

Stars: ✭ 128 (-96.64%)

Mutual labels: spark, hadoop

Sparkling Graph

SparklingGraph provides an easy-to-use set of features for processing large-scale graphs using Spark and GraphX.

Stars: ✭ 139 (-96.35%)

Mutual labels: spark, big-data

Bigdata Notebook

Stars: ✭ 100 (-97.38%)

Mutual labels: spark, hadoop

Aliyun Emapreduce Datasources

Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.

Stars: ✭ 132 (-96.54%)

Mutual labels: spark, hadoop

Spark.jl

Julia binding for Apache Spark

Stars: ✭ 153 (-95.99%)

Mutual labels: spark, big-data

Wedatasphere

WeDataSphere is a financial level one-stop open-source suitcase for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!

Stars: ✭ 372 (-90.24%)

Mutual labels: spark, hadoop

Javaorbigdata Interview

A compilation of interview topics for Java developers and big data developers

Stars: ✭ 203 (-94.68%)

Mutual labels: spark, hadoop

Dataspherestudio

DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.

Stars: ✭ 1,195 (-68.66%)

Mutual labels: spark, hadoop

Apache Spark Hands On

Educational notes and hands-on problems with solutions for the Hadoop ecosystem

Stars: ✭ 74 (-98.06%)

Mutual labels: spark, hadoop

Setl

A simple Spark-powered ETL framework that just works 🍺

Stars: ✭ 79 (-97.93%)

Mutual labels: spark, big-data

Labs

Research on distributed system

Stars: ✭ 73 (-98.09%)

Mutual labels: spark, big-data

Logisland

Scalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink being in the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready to use processors, data sources and sinks are available.

Stars: ✭ 97 (-97.46%)

Mutual labels: spark, big-data

Spark Py Notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 1,338 (-64.91%)

Mutual labels: spark, big-data

Big Data Engineering Coursera Yandex

Big Data for Data Engineers Coursera Specialization from Yandex

Stars: ✭ 71 (-98.14%)

Mutual labels: spark, big-data

Gimel

Big Data Processing Framework - Unified Data API or SQL on Any Storage

Stars: ✭ 216 (-94.34%)

Mutual labels: spark, big-data

Vespa

The open big data serving engine. https://vespa.ai

Stars: ✭ 3,747 (-1.73%)

Mutual labels: ai, big-data

Ozone

Scalable, redundant, and distributed object store for Apache Hadoop

Stars: ✭ 330 (-91.35%)

Mutual labels: big-data, hadoop

Waterdrop

Production Ready Data Integration Product, documentation：

Stars: ✭ 1,856 (-51.32%)

Mutual labels: spark, hadoop

Bigdataclass

Two-day workshop that covers how to use R to interact with databases and Spark

Stars: ✭ 110 (-97.12%)

Mutual labels: spark, big-data

Spark On Lambda

Apache Spark on AWS Lambda

Stars: ✭ 137 (-96.41%)

Mutual labels: spark, big-data

Devops Roadmap

DevOps methodology & roadmap for a devops developer in 2019. Interesting books to learn new technologies.

Stars: ✭ 349 (-90.85%)

Mutual labels: ai, big-data

Geni

A Clojure dataframe library that runs on Spark

Stars: ✭ 152 (-96.01%)

Mutual labels: spark, big-data

Rsparkling

RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)

Stars: ✭ 65 (-98.3%)

Mutual labels: spark, big-data

Deeplearning4j

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learni…

Stars: ✭ 12,277 (+221.98%)

Mutual labels: spark, hadoop

Data Accelerator

Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.

Stars: ✭ 247 (-93.52%)

Mutual labels: spark, big-data

Xlearning

AI on Hadoop

Stars: ✭ 1,709 (-55.18%)

Mutual labels: ai, hadoop

Datasciencevm

Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)

Stars: ✭ 153 (-95.99%)

Mutual labels: ai, big-data

Hyperspace

An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.

Stars: ✭ 246 (-93.55%)

Mutual labels: spark, big-data

Geopyspark

GeoTrellis for PySpark

Stars: ✭ 167 (-95.62%)

Mutual labels: spark, big-data

Autodl

Automated Deep Learning without ANY human intervention. 1'st Solution for AutoDL [email protected]

Stars: ✭ 854 (-77.6%)

Mutual labels: ai, big-data

Koalas

Koalas: pandas API on Apache Spark

Stars: ✭ 3,044 (-20.17%)

Mutual labels: spark, big-data

Big Whale

Scheduling of offline tasks and monitoring of real-time tasks for Spark, Flink, and similar engines

Stars: ✭ 163 (-95.73%)

Mutual labels: spark, hadoop

Trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Stars: ✭ 4,581 (+20.14%)

Mutual labels: big-data, hadoop

Succinct

Enabling queries on compressed data.

Stars: ✭ 257 (-93.26%)

Mutual labels: spark, big-data

Oie Resources

A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.

Stars: ✭ 283 (-92.58%)

Mutual labels: ai, big-data

Cloudbreak

A tool for provisioning and managing Apache Hadoop clusters in the cloud. Cloudbreak, as part of the Hortonworks Data Platform, makes it easy to provision, configure and elastically grow HDP clusters on cloud infrastructure. Cloudbreak can be used to provision Hadoop across cloud infrastructure providers including AWS, Azure, GCP and OpenStack.

Stars: ✭ 301 (-92.11%)

Mutual labels: big-data, hadoop

Elasticluster

Create clusters of VMs on the cloud and configure them with Ansible.

Stars: ✭ 298 (-92.18%)

Mutual labels: spark, hadoop

Spline

Data Lineage Tracking And Visualization Solution

Stars: ✭ 306 (-91.97%)

Mutual labels: spark, hadoop

Tez

Apache Tez