399 open source projects that are alternatives to or similar to Iql

Prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

Stars: ✭ 54 (-84.16%)

Mutual labels: spark

spark-extension

A library that provides useful extensions to Apache Spark and PySpark.

Stars: ✭ 25 (-92.67%)

Mutual labels: spark

Sk Dist

Distributed scikit-learn meta-estimators in PySpark

Stars: ✭ 260 (-23.75%)

Mutual labels: spark

dllib

dllib is a distributed deep learning library running on Apache Spark

Stars: ✭ 32 (-90.62%)

Mutual labels: spark

incubator-linkis

Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java ...), and provides multi-tenancy, high performance, and resource control.

Stars: ✭ 2,459 (+621.11%)

Mutual labels: spark

Datavec

ETL Library for Machine Learning - data pipelines, data munging and wrangling

Stars: ✭ 272 (-20.23%)

Mutual labels: spark

blog

blog entries

Stars: ✭ 39 (-88.56%)

Mutual labels: spark

Zat

Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark

Stars: ✭ 303 (-11.14%)

Mutual labels: spark

visions

Type System for Data Analysis in Python

Stars: ✭ 136 (-60.12%)

Mutual labels: spark

spark-structured-streaming-examples

Spark Structured Streaming examples using version 3.0.0

Stars: ✭ 23 (-93.26%)

Mutual labels: spark

basin

Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser

Stars: ✭ 25 (-92.67%)

Mutual labels: spark

leaflet heatmap

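A simple visualization of Huzhou call data. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap drawing step is moved to offline computation and analysis. The data is first computed in parallel with Apache Spark, the heatmap is then drawn with Apache Spark, and leafletjs loads the OpenStreetMap layer and the heatmap layer to give good interactivity. The drawing is currently implemented with Apache Spark; either because Apache Spark is not well suited to this kind of computation or because I did not design the algorithm well, the parallel computation is slower than single-machine computation. The Apache Spark heatmap drawing and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .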

Stars: ✭ 13 (-96.19%)

Mutual labels: spark

Cloudflow

Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.

Stars: ✭ 278 (-18.48%)

Mutual labels: spark

spark learning

Spark learning materials from the Atguigu (尚硅谷) big data Spark course, latest 2019 edition

Stars: ✭ 42 (-87.68%)

Mutual labels: spark

Delta

An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.

Stars: ✭ 3,903 (+1044.57%)

Mutual labels: spark

confluent-spark-avro

Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry.

Stars: ✭ 18 (-94.72%)

Mutual labels: spark

Docker Spark Cluster

A simple Spark standalone cluster for your testing environment purposes

Stars: ✭ 261 (-23.46%)

Mutual labels: spark

data-algorithms-with-spark

O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian

Stars: ✭ 34 (-90.03%)

Mutual labels: spark

Clickhouse Native Jdbc

ClickHouse Native Protocol JDBC implementation

Stars: ✭ 310 (-9.09%)

Mutual labels: spark

Casper

A compiler for automatically re-targeting sequential Java code to Apache Spark.

Stars: ✭ 45 (-86.8%)

Mutual labels: spark

Succinct

Enabling queries on compressed data.

Stars: ✭ 257 (-24.63%)

Mutual labels: spark

aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Stars: ✭ 111 (-67.45%)

Mutual labels: spark

Elasticluster

Create clusters of VMs on the cloud and configure them with Ansible.

Stars: ✭ 298 (-12.61%)

Mutual labels: spark

Spark-PMoF

Spark Shuffle Optimization with RDMA+AEP

Stars: ✭ 28 (-91.79%)

Mutual labels: spark

sparkProjectTemplate.g8

Template for Spark Projects

Stars: ✭ 77 (-77.42%)

Mutual labels: spark

spark-http-stream

Spark Structured Streaming via HTTP communication

Stars: ✭ 17 (-95.01%)

Mutual labels: spark

kafka-compose

🎼 Docker compose files for various kafka stacks

Stars: ✭ 32 (-90.62%)

Mutual labels: spark

Spark Druid Olap

Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.

Stars: ✭ 282 (-17.3%)

Mutual labels: spark

daf-kylo

Kylo integration with PDND (previously DAF).

Stars: ✭ 20 (-94.13%)

Mutual labels: spark

Crayon

Simple framework agnostic UI router for SPAs

Stars: ✭ 310 (-9.09%)

Mutual labels: spark

Spotify-Song-Recommendation-ML

UC Berkeley team's submission for RecSys Challenge 2018

Stars: ✭ 70 (-79.47%)

Mutual labels: spark

Hbase Rdd

Spark RDD to read, write and delete from HBase

Stars: ✭ 277 (-18.77%)

Mutual labels: spark

spark-data-sources

Developing Spark External Data Sources using the V2 API

Stars: ✭ 36 (-89.44%)

Mutual labels: spark

Cook

Fair job scheduler on Kubernetes and Mesos for batch workloads and Spark

Stars: ✭ 314 (-7.92%)

Mutual labels: spark

bigkube

Minikube for big data with Scala and Spark

Stars: ✭ 16 (-95.31%)

Mutual labels: spark

Helk

The Hunting ELK

Stars: ✭ 3,097 (+808.21%)

Mutual labels: spark

Covid19Tracker

A Robinhood style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.

Stars: ✭ 65 (-80.94%)

Mutual labels: spark

Spline

Data Lineage Tracking And Visualization Solution

Stars: ✭ 306 (-10.26%)

Mutual labels: spark

SparkV

🤖⚡ | The most POWERFUL multipurpose chat/meme bot that will boost the activity in your server.

Stars: ✭ 24 (-92.96%)

Mutual labels: spark

Around Dataengineering

A Data Engineering & Machine Learning Knowledge Hub

Stars: ✭ 257 (-24.63%)

Mutual labels: spark

trembita

Model complex data transformation pipelines easily

Stars: ✭ 44 (-87.1%)

Mutual labels: spark

Wirbelsturm

Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.

Stars: ✭ 332 (-2.64%)

Mutual labels: spark

bigdata-fun

A complete (distributed) BigData stack, running in containers

Stars: ✭ 14 (-95.89%)

Mutual labels: spark

Spark Jupyter Aws

A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support

Stars: ✭ 259 (-24.05%)

Mutual labels: spark

smolder

HL7 Apache Spark Datasource

Stars: ✭ 33 (-90.32%)

Mutual labels: spark

Awesome Ada

A curated list of awesome resources related to the Ada and SPARK programming languages

Stars: ✭ 299 (-12.32%)

Mutual labels: spark

spark-demos

Collection of different demo applications using Apache Spark

Stars: ✭ 15 (-95.6%)

Mutual labels: spark

Big Data Rosetta Code

Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code

Stars: ✭ 254 (-25.51%)

Mutual labels: spark

tpch-spark

TPC-H queries in Apache Spark SQL using native DataFrames API

Stars: ✭ 63 (-81.52%)

Mutual labels: spark

Coolplayspark

Cool Play Spark (酷玩 Spark): Spark source code analysis, Spark libraries, and more

Stars: ✭ 3,318 (+873.02%)

Mutual labels: spark

frovedis

Framework of vectorized and distributed data analytics

Stars: ✭ 59 (-82.7%)

Mutual labels: spark

laravel-spark-camera

Profile Photo Camera support for Laravel Spark

Stars: ✭ 30 (-91.2%)

Mutual labels: spark

BigData-News

A real-time big data system project for a news website, based on Spark 2.2

Stars: ✭ 36 (-89.44%)

Mutual labels: spark

Spark Hbase Connector

Connect Spark to HBase for reading and writing data with ease

Stars: ✭ 299 (-12.32%)

Mutual labels: spark

Book

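This project collects some good books I have read or heard about over the years. I came across them while organizing my files: deleting them seemed a pity, but keeping them took up too much space, so in the spirit of sharing I am making them available. On the one hand this provides the books to readers who need them, and on the other it serves as a kind of accumulated knowledge base.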

Stars: ✭ 47 (-86.22%)

Mutual labels: spark

Ytk Learn

Ytk-learn is a distributed machine learning library which implements most popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).

Stars: ✭ 337 (-1.17%)

Mutual labels: spark

Sparklint

A tool for monitoring and tuning Spark jobs for efficiency.

Stars: ✭ 316 (-7.33%)

Mutual labels: spark

Learningsparkv2

This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

Stars: ✭ 307 (-9.97%)

Mutual labels: spark

Spark Notebook

Interactive and Reactive Data Science using Scala and Spark.