All Projects → Scanns → Similar Projects or Alternatives

443 Open source projects that are alternatives of or similar to Scanns

Fast Mrmr
An improved implementation of the classical feature selection method: minimum Redundancy and Maximum Relevance (mRMR).
Stars: ✭ 67 (-64.74%)
Mutual labels:  spark
Ytk Learn
Ytk-learn is a distributed machine learning library which implements most of popular machine learning algorithms(GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).
Stars: ✭ 337 (+77.37%)
Mutual labels:  spark
Benchm Ml
A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
Stars: ✭ 1,835 (+865.79%)
Mutual labels:  spark
Sparklint
A tool for monitoring and tuning Spark jobs for efficiency.
Stars: ✭ 316 (+66.32%)
Mutual labels:  spark
Thingsboard
Open-source IoT Platform - Device management, data collection, processing and visualization.
Stars: ✭ 10,526 (+5440%)
Mutual labels:  spark
Clickhouse Native Jdbc
ClickHouse Native Protocol JDBC implementation
Stars: ✭ 310 (+63.16%)
Mutual labels:  spark
Spark Alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-35.79%)
Mutual labels:  spark
Learningsparkv2
This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Stars: ✭ 307 (+61.58%)
Mutual labels:  spark
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-65.79%)
Mutual labels:  spark
Delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Stars: ✭ 3,903 (+1954.21%)
Mutual labels:  spark
Roaringbitmap
A better compressed bitset in Java
Stars: ✭ 2,460 (+1194.74%)
Mutual labels:  spark
Zat
Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark
Stars: ✭ 303 (+59.47%)
Mutual labels:  spark
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-66.32%)
Mutual labels:  spark
Elasticluster
Create clusters of VMs on the cloud and configure them with Ansible.
Stars: ✭ 298 (+56.84%)
Mutual labels:  spark
Zparkio
Boiler plate framework to use Spark and ZIO together.
Stars: ✭ 121 (-36.32%)
Mutual labels:  spark
Spark Notebook
Interactive and Reactive Data Science using Scala and Spark.
Stars: ✭ 3,081 (+1521.58%)
Mutual labels:  spark
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-66.84%)
Mutual labels:  spark
Cloudflow
Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
Stars: ✭ 278 (+46.32%)
Mutual labels:  spark
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-21.05%)
Mutual labels:  spark
Datavec
ETL Library for Machine Learning - data pipelines, data munging and wrangling
Stars: ✭ 272 (+43.16%)
Mutual labels:  spark
Roffildlibrary
Library for MQL5 (MetaTrader) with Python, Java, Apache Spark, AWS
Stars: ✭ 63 (-66.84%)
Mutual labels:  spark
Docker Spark Cluster
A simple spark standalone cluster for your testing environment purposses
Stars: ✭ 261 (+37.37%)
Mutual labels:  spark
Example Spark Kafka
Apache Spark and Apache Kafka integration example
Stars: ✭ 120 (-36.84%)
Mutual labels:  spark
Sk Dist
Distributed scikit-learn meta-estimators in PySpark
Stars: ✭ 260 (+36.84%)
Mutual labels:  spark
Silex
something to help you spark
Stars: ✭ 61 (-67.89%)
Mutual labels:  spark
Succinct
Enabling queries on compressed data.
Stars: ✭ 257 (+35.26%)
Mutual labels:  spark
Big Whale
Spark、Flink等离线任务的调度以及实时任务的监控
Stars: ✭ 163 (-14.21%)
Mutual labels:  spark
spark-structured-streaming-examples
Spark structured streaming examples with using of version 3.0.0
Stars: ✭ 23 (-87.89%)
Mutual labels:  spark
Waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-68.42%)
Mutual labels:  spark
sparkProjectTemplate.g8
Template for Spark Projects
Stars: ✭ 77 (-59.47%)
Mutual labels:  spark
Kinesis Sql
Kinesis Connector for Structured Streaming
Stars: ✭ 120 (-36.84%)
Mutual labels:  spark
kafka-spark-streaming-zeppelin-docker
One click deploy docker-compose with Kafka, Spark Streaming, Zeppelin UI and Monitoring (Grafana + Kafka Manager)
Stars: ✭ 82 (-56.84%)
Mutual labels:  spark
Zemberek Nlp Server
Zemberek Türkçe NLP Java Kütüphanesi üzerine REST Docker Sunucu
Stars: ✭ 60 (-68.42%)
Mutual labels:  spark
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-86.84%)
Mutual labels:  spark
Pyspark Learning
Updated repository
Stars: ✭ 147 (-22.63%)
Mutual labels:  spark
dllib
dllib is a distributed deep learning library running on Apache Spark
Stars: ✭ 32 (-83.16%)
Mutual labels:  spark
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-69.47%)
Mutual labels:  spark
spark learning
尚硅谷大数据Spark-2019版最新 Spark 学习
Stars: ✭ 42 (-77.89%)
Mutual labels:  spark
Ibis
A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+757.89%)
Mutual labels:  spark
prosto
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Stars: ✭ 54 (-71.58%)
Mutual labels:  spark
Model Serving Tutorial
Code and presentation for Strata Model Serving tutorial
Stars: ✭ 57 (-70%)
Mutual labels:  spark
confluent-spark-avro
Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry.
Stars: ✭ 18 (-90.53%)
Mutual labels:  spark
Kraps Rpc
A RPC framework leveraging Spark RPC module
Stars: ✭ 175 (-7.89%)
Mutual labels:  spark
blog
blog entries
Stars: ✭ 39 (-79.47%)
Mutual labels:  spark
Net.jgp.labs.spark
Apache Spark examples exclusively in Java
Stars: ✭ 55 (-71.05%)
Mutual labels:  spark
data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Stars: ✭ 34 (-82.11%)
Mutual labels:  spark
Spark Lucenerdd
Spark RDD with Lucene's query and entity linkage capabilities
Stars: ✭ 114 (-40%)
Mutual labels:  spark
trembita
Model complex data transformation pipelines easily
Stars: ✭ 44 (-76.84%)
Mutual labels:  spark
Docker Hadoop
A Docker container with a full Hadoop cluster setup with Spark and Zeppelin
Stars: ✭ 54 (-71.58%)
Mutual labels:  spark
pgvector
Open-source vector similarity search for Postgres
Stars: ✭ 482 (+153.68%)
Mutual labels:  nearest-neighbor-search
Spark Cassandra Connector
DataStax Spark Cassandra Connector
Stars: ✭ 1,816 (+855.79%)
Mutual labels:  spark
Utils4s
scala、spark使用过程中,各种测试用例以及相关资料整理
Stars: ✭ 1,070 (+463.16%)
Mutual labels:  spark
Js Spark
Realtime calculation distributed system. AKA distributed lodash
Stars: ✭ 187 (-1.58%)
Mutual labels:  spark
Kotlin Spark Api
This projects gives Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x
Stars: ✭ 183 (-3.68%)
Mutual labels:  spark
Sparkstreaming
💥 🚀 封装sparkstreaming动态调节batch time(有数据就执行计算);🚀 支持运行过程中增删topic;🚀 封装sparkstreaming 1.6 - kafka 010 用以支持 SSL。
Stars: ✭ 179 (-5.79%)
Mutual labels:  spark
Transmogrifai
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
Stars: ✭ 2,084 (+996.84%)
Mutual labels:  spark
Learningapachespark
LearningApacheSpark
Stars: ✭ 155 (-18.42%)
Mutual labels:  spark
Knn Matting
Source Code for KNN Matting, CVPR 2012 / TPAMI 2013. MATLAB code ready to run. Simple and robust implementation under 40 lines.
Stars: ✭ 130 (-31.58%)
Mutual labels:  nearest-neighbor-search
Spark python ml examples
Spark 2.0 Python Machine Learning examples
Stars: ✭ 87 (-54.21%)
Mutual labels:  spark
Datafusion
DataFusion has now been donated to the Apache Arrow project
Stars: ✭ 611 (+221.58%)
Mutual labels:  spark
301-360 of 443 similar projects