All Projects → spark-extension → Similar Projects or Alternatives

456 Open source projects that are alternatives of or similar to spark-extension

Mare
MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.
Stars: ✭ 11 (-56%)
Mutual labels:  spark
Spark Authorizer
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark
Stars: ✭ 141 (+464%)
Mutual labels:  spark
Bigdata Interview
🎯 🌟[大数据面试题]分享自己在网络上收集的大数据相关的面试题以及自己的答案总结.目前包含Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper框架的面试题知识总结
Stars: ✭ 857 (+3328%)
Mutual labels:  spark
experiments
Code examples for my blog posts
Stars: ✭ 21 (-16%)
Mutual labels:  spark
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (+4%)
Mutual labels:  spark
Data science blogs
A repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (+456%)
Mutual labels:  spark
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+3616%)
Mutual labels:  spark
spark3D
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteorology, …
Stars: ✭ 23 (-8%)
Mutual labels:  pyspark
Ecommercerecommendsystem
商品大数据实时推荐系统。前端:Vue + TypeScript + ElementUI,后端 Spring + Spark
Stars: ✭ 139 (+456%)
Mutual labels:  spark
smolder
HL7 Apache Spark Datasource
Stars: ✭ 33 (+32%)
Mutual labels:  spark
Luigi Warehouse
A luigi powered analytics / warehouse stack
Stars: ✭ 72 (+188%)
Mutual labels:  spark
Yandex Big Data Engineering
Stars: ✭ 17 (-32%)
Mutual labels:  spark
Isolation Forest
A Spark/Scala implementation of the isolation forest unsupervised outlier detection algorithm.
Stars: ✭ 139 (+456%)
Mutual labels:  spark
Big Data Scala Spark
Coursera's big data course with Scala and Spark
Stars: ✭ 16 (-36%)
Mutual labels:  spark
workshop-spark
Código para workshops Spark com ambiente de desenvolvimento em docker
Stars: ✭ 27 (+8%)
Mutual labels:  pyspark
Szt Bigdata
深圳地铁大数据客流分析系统🚇🚄🌟
Stars: ✭ 826 (+3204%)
Mutual labels:  spark
Spark On Lambda
Apache Spark on AWS Lambda
Stars: ✭ 137 (+448%)
Mutual labels:  spark
Goodreads etl pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+3072%)
Mutual labels:  spark
splink
Implementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including EM algorithm to estimate parameters
Stars: ✭ 181 (+624%)
Mutual labels:  spark
Sparklyr
R interface for Apache Spark
Stars: ✭ 775 (+3000%)
Mutual labels:  spark
Horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Stars: ✭ 11,943 (+47672%)
Mutual labels:  spark
Coding Now
学习记录的一些笔记,以及所看得一些电子书eBooks、视频资源和平常收纳的一些自己认为比较好的博客、网站、工具。涉及大数据几大组件、Python机器学习和数据分析、Linux、操作系统、算法、网络等
Stars: ✭ 750 (+2900%)
Mutual labels:  spark
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+12076%)
Mutual labels:  spark
Sparkctr
CTR prediction model based on spark(LR, GBDT, DNN)
Stars: ✭ 740 (+2860%)
Mutual labels:  spark
Iot Traffic Monitor
Stars: ✭ 131 (+424%)
Mutual labels:  spark
Kafka Storm Starter
Code examples that show to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (+2812%)
Mutual labels:  spark
spark-word2vec
A parallel implementation of word2vec based on Spark
Stars: ✭ 24 (-4%)
Mutual labels:  spark
Hail
Scalable genomic data analysis.
Stars: ✭ 706 (+2724%)
Mutual labels:  spark
Opaque
An encrypted data analytics platform
Stars: ✭ 129 (+416%)
Mutual labels:  spark
pyspark-ML-in-Colab
Pyspark in Google Colab: A simple machine learning (Linear Regression) model
Stars: ✭ 32 (+28%)
Mutual labels:  pyspark
Spark Structured Streaming Examples
Spark Structured Streaming / Kafka / Cassandra / Elastic
Stars: ✭ 168 (+572%)
Mutual labels:  spark
Big Data Engineering Coursera Yandex
Big Data for Data Engineers Coursera Specialization from Yandex
Stars: ✭ 71 (+184%)
Mutual labels:  spark
Every Single Day I Tldr
A daily digest of the articles or videos I've found interesting, that I want to share with you.
Stars: ✭ 249 (+896%)
Mutual labels:  spark
Spark
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+6784%)
Mutual labels:  spark
Dev Setup
macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.
Stars: ✭ 5,590 (+22260%)
Mutual labels:  spark
visualize-data-with-python
A Jupyter notebook using some standard techniques for data science and data engineering to analyze data for the 2017 flooding in Houston, TX.
Stars: ✭ 60 (+140%)
Mutual labels:  spark
Datafusion
DataFusion has now been donated to the Apache Arrow project
Stars: ✭ 611 (+2344%)
Mutual labels:  spark
Airflow Pipeline
An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (+412%)
Mutual labels:  spark
Mongo Spark
The MongoDB Spark Connector
Stars: ✭ 588 (+2252%)
Mutual labels:  spark
Data Accelerator
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (+888%)
Mutual labels:  spark
Sparklearning
Learning Apache spark,including code and data .Most part can run local.
Stars: ✭ 558 (+2132%)
Mutual labels:  spark
Openuba
A robust, and flexible open source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]
Stars: ✭ 127 (+408%)
Mutual labels:  spark
Justenoughscalaforspark
A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+2052%)
Mutual labels:  spark
frovedis
Framework of vectorized and distributed data analytics
Stars: ✭ 59 (+136%)
Mutual labels:  spark
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+1952%)
Mutual labels:  spark
Cape Python
Collaborate on privacy-preserving policy for data science projects in Pandas and Apache Spark
Stars: ✭ 125 (+400%)
Mutual labels:  spark
Magellan
Geo Spatial Data Analytics on Spark
Stars: ✭ 507 (+1928%)
Mutual labels:  spark
Neo4j Spark Connector
Neo4j Connector for Apache Spark, which provides bi-directional read/write access to Neo4j from Spark, using the Spark DataSource APIs
Stars: ✭ 245 (+880%)
Mutual labels:  spark
Pdf
编程电子书,电子书,编程书籍,包括C,C#,Docker,Elasticsearch,Git,Hadoop,HeadFirst,Java,Javascript,jvm,Kafka,Linux,Maven,MongoDB,MyBatis,MySQL,Netty,Nginx,Python,RabbitMQ,Redis,Scala,Solr,Spark,Spring,SpringBoot,SpringCloud,TCPIP,Tomcat,Zookeeper,人工智能,大数据类,并发编程,数据库类,数据挖掘,新面试题,架构设计,算法系列,计算机类,设计模式,软件测试,重构优化,等更多分类
Stars: ✭ 12,009 (+47936%)
Mutual labels:  spark
Spark Bigquery Connector
BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
Stars: ✭ 126 (+404%)
Mutual labels:  spark
Bdp Dataplatform
大数据生态解决方案数据平台:基于大数据、数据平台、微服务、机器学习、商城、自动化运维、DevOps、容器部署平台、数据平台采集、数据平台存储、数据平台计算、数据平台开发、数据平台应用搭建的大数据解决方案。
Stars: ✭ 456 (+1724%)
Mutual labels:  spark
big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (+36%)
Mutual labels:  pyspark
Usersessionbehaviorofflineanalysis
四川大学拓思爱诺用户session行为数据离线分析项目
Stars: ✭ 69 (+176%)
Mutual labels:  spark
Fast Mrmr
An improved implementation of the classical feature selection method: minimum Redundancy and Maximum Relevance (mRMR).
Stars: ✭ 67 (+168%)
Mutual labels:  spark
Spark Infotheoretic Feature Selection
This package contains a generic implementation of greedy Information Theoretic Feature Selection (FS) methods. The implementation is based on the common theoretic framework presented by Gavin Brown. Implementations of mRMR, InfoGain, JMI and other commonly used FS filters are provided.
Stars: ✭ 123 (+392%)
Mutual labels:  spark
bigdata-fun
A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-44%)
Mutual labels:  spark
pyspark-asyncactions
Asynchronous actions for PySpark
Stars: ✭ 30 (+20%)
Mutual labels:  pyspark
Kontextfrei
Writing application logic for Spark jobs that can be unit-tested without a SparkContext
Stars: ✭ 67 (+168%)
Mutual labels:  spark
docker-spark
Apache Spark docker container image (Standalone mode)
Stars: ✭ 34 (+36%)
Mutual labels:  spark
awesome-AI-kubernetes
❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc
Stars: ✭ 95 (+280%)
Mutual labels:  spark
301-360 of 456 similar projects