All Projects → Scanns → Similar Projects or Alternatives

443 Open source projects that are alternatives of or similar to Scanns

An improved implementation of the classical feature selection method: minimum Redundancy and Maximum Relevance (mRMR).

Stars: ✭ 67 (-64.74%)

Mutual labels: spark

Ytk-learn is a distributed machine learning library which implements most of popular machine learning algorithms(GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).

Stars: ✭ 337 (+77.37%)

Mutual labels: spark

Benchm Ml

A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).

Stars: ✭ 1,835 (+865.79%)

Mutual labels: spark

Sparklint

A tool for monitoring and tuning Spark jobs for efficiency.

Stars: ✭ 316 (+66.32%)

Mutual labels: spark

Thingsboard

Open-source IoT Platform - Device management, data collection, processing and visualization.

Stars: ✭ 10,526 (+5440%)

Mutual labels: spark

Clickhouse Native Jdbc

ClickHouse Native Protocol JDBC implementation

Stars: ✭ 310 (+63.16%)

Mutual labels: spark

Spark Alchemy

Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive

Stars: ✭ 122 (-35.79%)

Mutual labels: spark

Learningsparkv2

This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

Stars: ✭ 307 (+61.58%)

Mutual labels: spark

Rsparkling

RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)

Stars: ✭ 65 (-65.79%)

Mutual labels: spark

Delta

An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.

Stars: ✭ 3,903 (+1954.21%)

Mutual labels: spark

Roaringbitmap

A better compressed bitset in Java

Stars: ✭ 2,460 (+1194.74%)

Mutual labels: spark

Zat

Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark

Stars: ✭ 303 (+59.47%)

Mutual labels: spark

W2v

Word2Vec models with Twitter data using Spark. Blog:

Stars: ✭ 64 (-66.32%)

Mutual labels: spark

Elasticluster

Create clusters of VMs on the cloud and configure them with Ansible.

Stars: ✭ 298 (+56.84%)

Mutual labels: spark

Zparkio

Boiler plate framework to use Spark and ZIO together.

Stars: ✭ 121 (-36.32%)

Mutual labels: spark

Spark Notebook

Interactive and Reactive Data Science using Scala and Spark.

Stars: ✭ 3,081 (+1521.58%)

Mutual labels: spark

Pysparkgeoanalysis

🌐 Interactive Workshop on GeoAnalysis using PySpark

Stars: ✭ 63 (-66.84%)

Mutual labels: spark

Cloudflow

Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.

Stars: ✭ 278 (+46.32%)

Mutual labels: spark

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (-21.05%)

Mutual labels: spark

Datavec

ETL Library for Machine Learning - data pipelines, data munging and wrangling

Stars: ✭ 272 (+43.16%)

Mutual labels: spark

Roffildlibrary

Library for MQL5 (MetaTrader) with Python, Java, Apache Spark, AWS

Stars: ✭ 63 (-66.84%)

Mutual labels: spark

Docker Spark Cluster

A simple spark standalone cluster for your testing environment purposses

Stars: ✭ 261 (+37.37%)

Mutual labels: spark

Example Spark Kafka

Apache Spark and Apache Kafka integration example

Stars: ✭ 120 (-36.84%)

Mutual labels: spark

Sk Dist

Distributed scikit-learn meta-estimators in PySpark

Stars: ✭ 260 (+36.84%)

Mutual labels: spark

Silex

something to help you spark

Stars: ✭ 61 (-67.89%)

Mutual labels: spark

Succinct

Enabling queries on compressed data.

Stars: ✭ 257 (+35.26%)

Mutual labels: spark

Big Whale

Spark、Flink等离线任务的调度以及实时任务的监控

Stars: ✭ 163 (-14.21%)

Mutual labels: spark

spark-structured-streaming-examples

Spark structured streaming examples with using of version 3.0.0

Stars: ✭ 23 (-87.89%)

Mutual labels: spark

Waimak

Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.

Stars: ✭ 60 (-68.42%)

Mutual labels: spark

sparkProjectTemplate.g8

Template for Spark Projects

Stars: ✭ 77 (-59.47%)

Mutual labels: spark

Kinesis Sql

Kinesis Connector for Structured Streaming

Stars: ✭ 120 (-36.84%)

Mutual labels: spark

kafka-spark-streaming-zeppelin-docker

One click deploy docker-compose with Kafka, Spark Streaming, Zeppelin UI and Monitoring (Grafana + Kafka Manager)

Stars: ✭ 82 (-56.84%)

Mutual labels: spark

Zemberek Nlp Server

Zemberek Türkçe NLP Java Kütüphanesi üzerine REST Docker Sunucu

Stars: ✭ 60 (-68.42%)

Mutual labels: spark

basin

Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser

Stars: ✭ 25 (-86.84%)

Mutual labels: spark

Pyspark Learning

Updated repository

Stars: ✭ 147 (-22.63%)

Mutual labels: spark

dllib

dllib is a distributed deep learning library running on Apache Spark

Stars: ✭ 32 (-83.16%)

Mutual labels: spark

Rumble

⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more

Stars: ✭ 58 (-69.47%)

Mutual labels: spark

spark learning

尚硅谷大数据Spark-2019版最新 Spark 学习

Stars: ✭ 42 (-77.89%)

Mutual labels: spark

Ibis

A pandas-like deferred expression system, with first-class SQL support

Stars: ✭ 1,630 (+757.89%)

Mutual labels: spark

prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

Stars: ✭ 54 (-71.58%)

Mutual labels: spark

Model Serving Tutorial

Code and presentation for Strata Model Serving tutorial

Stars: ✭ 57 (-70%)

Mutual labels: spark

confluent-spark-avro

Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry.

Stars: ✭ 18 (-90.53%)

Mutual labels: spark

Kraps Rpc

A RPC framework leveraging Spark RPC module

Stars: ✭ 175 (-7.89%)

Mutual labels: spark

blog

blog entries

Stars: ✭ 39 (-79.47%)

Mutual labels: spark

Net.jgp.labs.spark

Apache Spark examples exclusively in Java

Stars: ✭ 55 (-71.05%)

Mutual labels: spark

data-algorithms-with-spark

O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian

Stars: ✭ 34 (-82.11%)

Mutual labels: spark

Spark Lucenerdd

Spark RDD with Lucene's query and entity linkage capabilities

Stars: ✭ 114 (-40%)

Mutual labels: spark

trembita

Model complex data transformation pipelines easily

Stars: ✭ 44 (-76.84%)

Mutual labels: spark

Docker Hadoop

A Docker container with a full Hadoop cluster setup with Spark and Zeppelin

Stars: ✭ 54 (-71.58%)

Mutual labels: spark

pgvector

Open-source vector similarity search for Postgres

Stars: ✭ 482 (+153.68%)

Mutual labels: nearest-neighbor-search

Spark Cassandra Connector

DataStax Spark Cassandra Connector

Stars: ✭ 1,816 (+855.79%)

Mutual labels: spark

Utils4s

scala、spark使用过程中，各种测试用例以及相关资料整理

Stars: ✭ 1,070 (+463.16%)

Mutual labels: spark

Js Spark

Realtime calculation distributed system. AKA distributed lodash

Stars: ✭ 187 (-1.58%)

Mutual labels: spark

Kotlin Spark Api

This projects gives Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x

Stars: ✭ 183 (-3.68%)

Mutual labels: spark

Sparkstreaming

💥 🚀 封装sparkstreaming动态调节batch time(有数据就执行计算)；🚀 支持运行过程中增删topic；🚀 封装sparkstreaming 1.6 - kafka 010 用以支持 SSL。

Stars: ✭ 179 (-5.79%)

Mutual labels: spark

Transmogrifai

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning

Stars: ✭ 2,084 (+996.84%)

Mutual labels: spark

Learningapachespark

LearningApacheSpark

Stars: ✭ 155 (-18.42%)

Mutual labels: spark

Knn Matting

Source Code for KNN Matting, CVPR 2012 / TPAMI 2013. MATLAB code ready to run. Simple and robust implementation under 40 lines.

Stars: ✭ 130 (-31.58%)

Mutual labels: nearest-neighbor-search

Spark python ml examples

Spark 2.0 Python Machine Learning examples