996 Open source projects that are alternatives to or similar to Ibis

BigData-News

Real-time big data system project for a news site, based on Spark 2.2

Stars: ✭ 36 (-97.79%)

Mutual labels: spark, hadoop

Spline

Data Lineage Tracking And Visualization Solution

Stars: ✭ 306 (-81.23%)

Mutual labels: hadoop, spark

Szt Bigdata

Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟

Stars: ✭ 826 (-49.33%)

Mutual labels: hadoop, spark

Yandex Big Data Engineering

Stars: ✭ 17 (-98.96%)

Mutual labels: hdfs, spark

Dockerfiles

50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu

Stars: ✭ 847 (-48.04%)

Mutual labels: hadoop, spark

Big Data Engineering Coursera Yandex

Big Data for Data Engineers Coursera Specialization from Yandex

Stars: ✭ 71 (-95.64%)

Mutual labels: hdfs, spark

H2o 3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Stars: ✭ 5,656 (+246.99%)

Mutual labels: hadoop, spark

Pucket

Bucketing and partitioning system for Parquet

Stars: ✭ 29 (-98.22%)

Mutual labels: hdfs, spark

Jsr203 Hadoop

A Java NIO file system provider for HDFS

Stars: ✭ 35 (-97.85%)

Mutual labels: hadoop, hdfs

Hdfs Shell

HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS

Stars: ✭ 117 (-92.82%)

Mutual labels: hadoop, hdfs

Data Algorithms Book

MapReduce, Spark, Java, and Scala for Data Algorithms Book

Stars: ✭ 949 (-41.78%)

Mutual labels: hadoop, spark

Gaffer

A large-scale entity and relation database supporting aggregation of properties

Stars: ✭ 1,642 (+0.74%)

Mutual labels: hadoop, spark

Airflow Pipeline

An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR

Stars: ✭ 128 (-92.15%)

Mutual labels: hadoop, spark

Datacompy

Pandas and Spark DataFrame comparison for humans

Stars: ✭ 147 (-90.98%)

Mutual labels: pandas, spark

Cape Python

Collaborate on privacy-preserving policy for data science projects in Pandas and Apache Spark

Stars: ✭ 125 (-92.33%)

Mutual labels: pandas, spark

docker-hadoop

Docker image for main Apache Hadoop components (Yarn/Hdfs)

Stars: ✭ 59 (-96.38%)

Mutual labels: hadoop, hdfs

Sparkrdma

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

Stars: ✭ 215 (-86.81%)

Mutual labels: hadoop, spark

HDFS-Netdisc

Distributed cloud storage system based on Hadoop 🌴

Stars: ✭ 56 (-96.56%)

Mutual labels: hadoop, hdfs

bigdata-doc

Big data study notes, learning roadmap, and a collection of technical case studies.

Stars: ✭ 37 (-97.73%)

Mutual labels: hadoop, hdfs

Javaorbigdata Interview

Collection of interview topics for Java developers and big data developers

Stars: ✭ 203 (-87.55%)

Mutual labels: hadoop, spark

Docker Spark Cluster

A Spark cluster setup running on Docker containers

Stars: ✭ 57 (-96.5%)

Mutual labels: hadoop, spark

Camus

Mirror of Linkedin's Camus

Stars: ✭ 81 (-95.03%)

Mutual labels: hadoop, hdfs

implyr

SQL backend to dplyr for Impala

Stars: ✭ 74 (-95.46%)

Mutual labels: hadoop, impala

ros hadoop

Hadoop splittable InputFormat for ROS. Process rosbag with Hadoop, Spark, and other HDFS-compatible systems.

Stars: ✭ 92 (-94.36%)

Mutual labels: hadoop, hdfs

datasqueeze

Hadoop utility to compact small files

Stars: ✭ 18 (-98.9%)

Mutual labels: hadoop, hdfs

spark-util

low-level helpers for Apache Spark libraries and tests

Stars: ✭ 16 (-99.02%)

Mutual labels: spark, hadoop

swordfish

Open-source distributed workflow scheduling tool that also supports streaming tasks.

Stars: ✭ 35 (-97.85%)

Mutual labels: spark, hadoop

Docker Hadoop

A Docker container with a full Hadoop cluster setup with Spark and Zeppelin

Stars: ✭ 54 (-96.69%)

Mutual labels: hadoop, spark

Wifi

Big data query and analysis system based on information captured over Wi-Fi

Stars: ✭ 93 (-94.29%)

Mutual labels: hadoop, hdfs

Docker Spark

🚢 Docker image for Apache Spark

Stars: ✭ 78 (-95.21%)

Mutual labels: hadoop, spark

aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Stars: ✭ 111 (-93.19%)

Mutual labels: spark, hadoop

incubator-linkis

Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.

Stars: ✭ 2,459 (+50.86%)

Mutual labels: spark, impala

Addax

Addax is an open source universal ETL tool that supports most of the RDBMS and NoSQL databases on the planet, helping you transfer data from any one place to another.

Stars: ✭ 615 (-62.27%)

Mutual labels: hadoop, impala

Alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud

Stars: ✭ 5,379 (+230%)

Mutual labels: hadoop, spark

Weblogsanalysissystem

A big data platform for analyzing web access logs

Stars: ✭ 37 (-97.73%)

Mutual labels: hadoop, spark

Rumble

⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more

Stars: ✭ 58 (-96.44%)

Mutual labels: hdfs, spark

Waimak

Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.

Stars: ✭ 60 (-96.32%)

Mutual labels: hadoop, spark

Hadoop Yarn Api Python Client

Python client for Hadoop® YARN API

Stars: ✭ 91 (-94.42%)

Mutual labels: hadoop

Distributed Dataset

A distributed data processing framework in Haskell.

Stars: ✭ 108 (-93.37%)

Mutual labels: spark

Data Mining Python

Project practice and extensions for the book "Python Data Analysis and Mining in Action" (《Python数据分析与挖掘实战》)

Stars: ✭ 92 (-94.36%)

Mutual labels: pandas

Spark On Kubernetes Helm

Spark on Kubernetes infrastructure Helm charts repo

Stars: ✭ 92 (-94.36%)

Mutual labels: spark

Haproxy Configs

80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.

Stars: ✭ 106 (-93.5%)

Mutual labels: hadoop

Danfojs

danfo.js is an open source, JavaScript library providing high performance, intuitive, and easy to use data structures for manipulating and processing structured data.

Stars: ✭ 1,304 (-20%)

Mutual labels: pandas

Pymc Example Project

Example PyMC3 project for performing Bayesian data analysis using a probabilistic programming approach to machine learning.

Stars: ✭ 90 (-94.48%)

Mutual labels: pandas

Pyspark Cheatsheet

🐍 Quick reference guide to common patterns & functions in PySpark.

Stars: ✭ 108 (-93.37%)

Mutual labels: spark

Moonshot

Vectorized backtester and trading engine for QuantRocket

Stars: ✭ 88 (-94.6%)

Mutual labels: pandas

Credit Risk Modelling

Credit Risk analysis by using Python and ML

Stars: ✭ 91 (-94.42%)

Mutual labels: pandas

Tensorflowonyarn

Support TensorFlow on YARN

Stars: ✭ 114 (-93.01%)

Mutual labels: hadoop

Ni Pyt

Materials for the NI-PYT course at FIT ČVUT

Stars: ✭ 112 (-93.13%)

Mutual labels: pandas

Hnswlib

Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs

Stars: ✭ 108 (-93.37%)

Mutual labels: spark

Udacity Data Engineering

Udacity Data Engineering Nano Degree (DEND)

Stars: ✭ 89 (-94.54%)

Mutual labels: spark

Hadoop Mapreduce

Mirror of Apache Hadoop MapReduce

Stars: ✭ 88 (-94.6%)

Mutual labels: hadoop

Flink Learning

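Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink introduction, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus case studies of large-scale Flink production projects (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for the author's column "Big Data Real-Time Computing Engine: Flink in Action and Performance Optimization" (《大数据实时计算引擎 Flink 实战与性能优化》) is welcome.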

Stars: ✭ 11,378 (+598.04%)

Mutual labels: spark

Ammonite Spark

Run spark calculations from Ammonite

Stars: ✭ 88 (-94.6%)

Mutual labels: spark

Spark Nlp Models

Models and Pipelines for the Spark NLP library

Stars: ✭ 88 (-94.6%)

Mutual labels: spark

Python Bigdata

Data science and Big Data with Python