7196 open source projects that are alternatives to or similar to Spark Movie Lens

Spark Py Notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 1,338 (+79.6%)

Mutual labels: jupyter-notebook, spark, big-data, bigdata

Big Data Engineering Coursera Yandex

Big Data for Data Engineers Coursera Specialization from Yandex

Stars: ✭ 71 (-90.47%)

Mutual labels: jupyter-notebook, spark, big-data, bigdata

Sparkrdma

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

Stars: ✭ 215 (-71.14%)

Mutual labels: spark, big-data, bigdata

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (-79.87%)

Mutual labels: jupyter-notebook, spark, big-data

Cortx

CORTX Community Object Storage is 100% open source object storage uniquely optimized for mass capacity storage devices.

Stars: ✭ 426 (-42.82%)

Mutual labels: jupyter-notebook, big-data, bigdata

Optimus

🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark

Stars: ✭ 986 (+32.35%)

Mutual labels: jupyter-notebook, spark, bigdata

leaflet heatmap

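A simple visualization of Huzhou call data. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap-drawing step is moved to offline computation and analysis. After the data is computed in parallel with Apache Spark, Apache Spark is also used to draw the heatmap, and leafletjs then loads the OpenStreetMap layer and the heatmap layer for good interactivity. The drawing is currently implemented with Apache Spark; perhaps Apache Spark is not well suited to this kind of computation, or I did not design the algorithm well, but the parallel computation is slower than single-machine computation. The Apache Spark heatmap drawing and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .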

Stars: ✭ 13 (-98.26%)

Mutual labels: big-data, spark, bigdata

Bigdata Notes

A getting-started guide to big data ⭐

Stars: ✭ 10,991 (+1375.3%)

Mutual labels: spark, big-data, bigdata

Spark R Notebooks

R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 109 (-85.37%)

Mutual labels: jupyter-notebook, big-data, bigdata

H2o 3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Stars: ✭ 5,656 (+659.19%)

Mutual labels: jupyter-notebook, spark, big-data

Mydatascienceportfolio

Applying Data Science and Machine Learning to Solve Real World Business Problems

Stars: ✭ 227 (-69.53%)

Mutual labels: jupyter-notebook, spark

Dtale

Visualizer for pandas data structures

Stars: ✭ 2,864 (+284.43%)

Mutual labels: jupyter-notebook, flask

Spark Practice

Apache Spark (PySpark) Practice on Real Data

Stars: ✭ 200 (-73.15%)

Mutual labels: jupyter-notebook, spark

Installations mac ubuntu windows

Installations for Data Science. Anaconda, RStudio, Spark, TensorFlow, AWS (Amazon Web Services).

Stars: ✭ 231 (-68.99%)

Mutual labels: jupyter-notebook, spark

awesome-coder-resources

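A refueling station on the programming road! [Continuously updated; stars welcome, come back often] [Contents: programming, learning, and reading resources, open source projects, interview questions, websites, books, blogs, tutorials, and more]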

Stars: ✭ 54 (-92.75%)

Mutual labels: big-data, bigdata

gan deeplearning4j

Automatic feature engineering using Generative Adversarial Networks with Deeplearning4j and Apache Spark.

Stars: ✭ 19 (-97.45%)

Mutual labels: big-data, bigdata

SparkProgrammingInScala

Apache Spark Course Material

Stars: ✭ 57 (-92.35%)

Mutual labels: big-data, bigdata

yuzhouwan

Code Library for My Blog

Stars: ✭ 39 (-94.77%)

Mutual labels: spark, bigdata

spark-acid

ACID Data Source for Apache Spark based on Hive ACID

Stars: ✭ 91 (-87.79%)

Mutual labels: big-data, spark

NiFi-Rule-engine-processor

Drools processor for Apache NiFi

Stars: ✭ 34 (-95.44%)

Mutual labels: big-data, bigdata

Succinct

Enabling queries on compressed data.

Stars: ✭ 257 (-65.5%)

Mutual labels: spark, big-data

v6.dooring.public

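A large-screen visualization dashboard solution that provides a visual editing engine, helping individuals or companies easily customize their own large-screen visualization applications.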

Stars: ✭ 323 (-56.64%)

Mutual labels: big-data, bigdata

aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Stars: ✭ 111 (-85.1%)

Mutual labels: big-data, spark

Spark Jupyter Aws

A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support

Stars: ✭ 259 (-65.23%)

Mutual labels: jupyter-notebook, spark

Helk

The Hunting ELK

Stars: ✭ 3,097 (+315.7%)

Mutual labels: jupyter-notebook, spark

Sparkler

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.

Stars: ✭ 362 (-51.41%)

Mutual labels: spark, big-data

Azure Cosmosdb Spark

Apache Spark Connector for Azure Cosmos DB

Stars: ✭ 165 (-77.85%)

Mutual labels: jupyter-notebook, spark

Bigdata docker

Big Data Ecosystem Docker

Stars: ✭ 161 (-78.39%)

Mutual labels: jupyter-notebook, spark

Snap N Eat

Food detection and recommendation with deep learning

Stars: ✭ 229 (-69.26%)

Mutual labels: jupyter-notebook, flask

Hey Jetson

Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.

Stars: ✭ 161 (-78.39%)

Mutual labels: jupyter-notebook, flask

Clustering4Ever

C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.

Stars: ✭ 126 (-83.09%)

Mutual labels: big-data, bigdata

twitter-archive-reader

Full featured TypeScript Twitter archive reader and browser

Stars: ✭ 43 (-94.23%)

Mutual labels: big-data, bigdata

meetups-archivos

Slides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to disseminate and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through MeetUps, Workshops, and talent incubators …

Stars: ✭ 60 (-91.95%)

Mutual labels: big-data, bigdata

Scalable Data Science Platform

Content for architecting a data science platform for products using Luigi, Spark & Flask.

Stars: ✭ 158 (-78.79%)

Mutual labels: jupyter-notebook, spark

data processing course

Some class materials for a data processing course using PySpark

Stars: ✭ 50 (-93.29%)

Mutual labels: spark, bigdata

awesome-AI-kubernetes

❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc

Stars: ✭ 95 (-87.25%)

Mutual labels: big-data, spark

Metorikku

A simplified, lightweight ETL Framework based on Apache Spark

Stars: ✭ 361 (-51.54%)

Mutual labels: spark, big-data

Bigdl

Building Large-Scale AI Applications for Distributed Big Data

Stars: ✭ 3,813 (+411.81%)

Mutual labels: spark, big-data

big data

A collection of tutorials on Hadoop, MapReduce, Spark, Docker

Stars: ✭ 34 (-95.44%)

Mutual labels: big-data, bigdata

Big Data Rosetta Code

Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code

Stars: ✭ 254 (-65.91%)

Mutual labels: spark, bigdata

bigdata-fun

A complete (distributed) BigData stack, running in containers

Stars: ✭ 14 (-98.12%)

Mutual labels: big-data, spark

Docker Spark Cluster

A simple Spark standalone cluster for your testing environment purposes

Stars: ✭ 261 (-64.97%)

Mutual labels: spark, bigdata

Handyspark

HandySpark - bringing pandas-like capabilities to Spark dataframes

Stars: ✭ 158 (-78.79%)

Mutual labels: jupyter-notebook, spark

Beeva Best Practices

Best Practices and Style Guides in BEEVA

Stars: ✭ 335 (-55.03%)

Mutual labels: jupyter-notebook, big-data

Uproot3

ROOT I/O in pure Python and NumPy.

Stars: ✭ 312 (-58.12%)

Mutual labels: big-data, bigdata

Sidekick

High Performance HTTP Sidecar Load Balancer

Stars: ✭ 366 (-50.87%)

Mutual labels: spark, bigdata

Delta

An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.

Stars: ✭ 3,903 (+423.89%)

Mutual labels: spark, big-data

Opendata.cern.ch

Source code for the CERN Open Data portal

Stars: ✭ 411 (-44.83%)

Mutual labels: big-data, flask

Big data architect skills

Skills a big data architect should master

Stars: ✭ 400 (-46.31%)

Mutual labels: spark, bigdata

Enterprise gateway

A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.

Stars: ✭ 412 (-44.7%)

Mutual labels: jupyter-notebook, spark

Spline

Data Lineage Tracking And Visualization Solution

Stars: ✭ 306 (-58.93%)

Mutual labels: spark, bigdata

Pytorch classification

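A complete codebase for image classification with PyTorch: training, prediction, TTA, model ensembling, model deployment, CNN feature extraction, classification with SVM or random forest, and model distillation.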

Stars: ✭ 395 (-46.98%)

Mutual labels: jupyter-notebook, flask

Agile data code 2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

Stars: ✭ 413 (-44.56%)

Mutual labels: jupyter-notebook, spark

Listenbrainz Server

Server for the ListenBrainz project

Stars: ✭ 420 (-43.62%)

Mutual labels: spark, big-data

Data Science Ipython Notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+2859.46%)

Mutual labels: spark, big-data

Bigdataie

A collection of big data blogs, written-test questions, tutorials, projects, and interview experiences

Stars: ✭ 445 (-40.27%)

Mutual labels: spark, bigdata

Courses

Quiz & Assignment of Coursera

Stars: ✭ 454 (-39.06%)

Mutual labels: jupyter-notebook, big-data

Justenoughscalaforspark

A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.