666 open source projects that are alternatives or similar to pyspark-ML-in-Colab

calcuMLator
An intelligently dumb calculator that uses machine learning
Stars: ✭ 30 (-6.25%)
big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (+6.25%)
Mutual labels:  hadoop, pyspark
Springboard-Data-Science-Immersive
No description or website provided.
Stars: ✭ 52 (+62.5%)
Mutual labels:  hadoop, pyspark
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+21.88%)
Mutual labels:  hadoop, pyspark
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+246.88%)
Mutual labels:  hadoop, pyspark
pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (+125%)
Mutual labels:  pyspark, rdd
data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Stars: ✭ 34 (+6.25%)
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+368.75%)
Mutual labels:  hadoop, pyspark
Sales-Prediction
In-depth analysis and forecasting of product sales based on items, stores, transactions, and other dependent variables such as holidays and oil prices.
Stars: ✭ 56 (+75%)
Movies-Analytics-in-Spark-and-Scala
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
Stars: ✭ 47 (+46.88%)
Mutual labels:  hadoop, rdd
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-21.87%)
Mutual labels:  hadoop, pyspark
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+1168.75%)
Mutual labels:  hadoop, pyspark
regression-python
In this repository you can find many small projects that demonstrate regression techniques using the Python programming language
Stars: ✭ 15 (-53.12%)
awesome-computer-vision-models
A list of popular deep learning models related to classification, segmentation and detection problems
Stars: ✭ 419 (+1209.38%)
Soomvaar
Soomvaar is a repo containing a collection of 👨‍💻 Python code and 💫 machine learning algorithms 📗 written during my practice and learning of ML and Python ✨
Stars: ✭ 41 (+28.13%)
hive to es
A small tool for syncing Hive data warehouse data to Elasticsearch
Stars: ✭ 21 (-34.37%)
Mutual labels:  hadoop
glmnetUtils
Utilities for glmnet
Stars: ✭ 60 (+87.5%)
Mutual labels:  regression-models
oshinko-s2i
This is a place to put s2i images and utilities for spark application builders for openshift
Stars: ✭ 16 (-50%)
Mutual labels:  pyspark
mnist-neural-network-deeplearnjs
🍃 Using a Neural Network to recognize MNIST digits in JavaScript.
Stars: ✭ 26 (-18.75%)
MLDemos
Machine Learning Demonstrations: A graphical interface to draw data, apply a diverse array of machine learning tools to it, and directly see the results in a visual and understandable manner.
Stars: ✭ 46 (+43.75%)
smart-data-lake
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Stars: ✭ 79 (+146.88%)
Mutual labels:  hadoop
mlreef
The collaboration workspace for Machine Learning
Stars: ✭ 1,409 (+4303.13%)
sparsereg
a collection of modern sparse (regularized) linear regression algorithms.
Stars: ✭ 55 (+71.88%)
Mutual labels:  regression-models
rankpruning
🧹 Formerly for binary classification with noisy labels. Replaced by cleanlab.
Stars: ✭ 81 (+153.13%)
bihm
Bidirectional Helmholtz Machines
Stars: ✭ 40 (+25%)
qs-hadoop
Learning the big data ecosystem
Stars: ✭ 18 (-43.75%)
Mutual labels:  hadoop
srqm
An introductory statistics course for social scientists, using Stata
Stars: ✭ 43 (+34.38%)
Mutual labels:  regression-models
broomExtra
Helpers for regression analyses using `{broom}` & `{easystats}` packages 📈 🔍
Stars: ✭ 45 (+40.63%)
Mutual labels:  regression-models
cheapml
Machine Learning algorithms coded from scratch
Stars: ✭ 17 (-46.87%)
flask-spark-docker
Just a boilerplate for PySpark and Flask
Stars: ✭ 32 (+0%)
Mutual labels:  pyspark
greycat
GreyCat - Data Analytics, Temporal data, What-if, Live machine learning
Stars: ✭ 104 (+225%)
sia-cog
Various cognitive APIs for machine learning, vision, and language intent analysis. Covers traditional as well as deep learning model design and training.
Stars: ✭ 34 (+6.25%)
HDFS-Netdisc
A distributed cloud storage system based on Hadoop 🌴
Stars: ✭ 56 (+75%)
Mutual labels:  hadoop
Data-pipeline-project
Data pipeline project
Stars: ✭ 18 (-43.75%)
Mutual labels:  hadoop
pycobra
Python library implementing ensemble methods for regression and classification, plus visualisation tools including Voronoi tessellations.
Stars: ✭ 111 (+246.88%)
big-data-exploration
[Archive] Intern project - Big Data Exploration using MongoDB - This Repository is NOT a supported MongoDB product
Stars: ✭ 43 (+34.38%)
Mutual labels:  hadoop
the-apache-ignite-book
All code samples, scripts and more in-depth examples for The Apache Ignite Book. Include Apache Ignite 2.6 or above
Stars: ✭ 65 (+103.13%)
Mutual labels:  hadoop
kafka-twitter-spark-streaming
Counting Tweets Per User in Real-Time
Stars: ✭ 38 (+18.75%)
Mutual labels:  pyspark
learning-hadoop-and-spark
Companion to Learning Hadoop and Learning Spark courses on LinkedIn Learning
Stars: ✭ 146 (+356.25%)
Mutual labels:  hadoop
openPDC
Open Source Phasor Data Concentrator
Stars: ✭ 109 (+240.63%)
Mutual labels:  hadoop
OLSTEC
OnLine Low-rank Subspace tracking by TEnsor CP Decomposition in Matlab: Version 1.0.1
Stars: ✭ 30 (-6.25%)
pyspark-for-data-processing
Code for my presentation: Using PySpark to Process Boat Loads of Data
Stars: ✭ 20 (-37.5%)
Mutual labels:  pyspark
bigdata-doc
A collection of big data study notes, learning roadmaps, and technical case studies.
Stars: ✭ 37 (+15.63%)
Mutual labels:  hadoop
hive-bigquery-storage-handler
Hive Storage Handler for interoperability between BigQuery and Apache Hive
Stars: ✭ 16 (-50%)
Mutual labels:  hadoop
anovos
Anovos - An Open Source Library for Scalable feature engineering Using Apache-Spark
Stars: ✭ 77 (+140.63%)
Mutual labels:  pyspark
Torrent-To-Google-Drive-Downloader
Simple notebook to stream torrent files to Google Drive using Google Colab and python3.
Stars: ✭ 256 (+700%)
Mutual labels:  colab-notebook
books-ML-and-DL
.pdf Format Books for Machine and Deep Learning
Stars: ✭ 105 (+228.13%)
ml course
"Learning Machine Learning" Course, Bogotá, Colombia 2019 #LML2019
Stars: ✭ 22 (-31.25%)
webhdfs
Node.js WebHDFS REST API client
Stars: ✭ 88 (+175%)
Mutual labels:  hadoop
Multi-Type-TD-TSR
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition
Stars: ✭ 174 (+443.75%)
dpkb
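A compilation of big data topics, including distributed storage engines, distributed compute engines, and data warehouse construction. Keywords: Hadoop, HBase, ES, Kudu, Hive, Presto, Spark, Flink, Kylin, ClickHouse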
Stars: ✭ 123 (+284.38%)
Mutual labels:  hadoop
deep-blueberry
If you've always wanted to learn about deep-learning but don't know where to start, then you might have stumbled upon the right place!
Stars: ✭ 17 (-46.87%)
LogAnalyzeHelper
Data-cleaning program for a forum log analysis system (includes an IP rule library, UDF development, MapReduce programs, and log data)
Stars: ✭ 33 (+3.13%)
Mutual labels:  hadoop
bbai
Set model hyperparameters using deterministic, exact algorithms.
Stars: ✭ 19 (-40.62%)
Mutual labels:  regression-models
xgboost-smote-detect-fraud
Can we predict accurately on skewed data? Which sampling techniques can be used? Which models and techniques can be used in this scenario? Find the answers in this code pattern!
Stars: ✭ 59 (+84.38%)
spark-twitter-sentiment-analysis
Sentiment Analysis of a Twitter Topic with Spark Structured Streaming
Stars: ✭ 55 (+71.88%)
Mutual labels:  pyspark
Statistical-Learning-using-R
A statistical learning application consisting of various machine learning algorithms, my implementations of them in R, and in-depth interpretations. Documents and reports related to these techniques can be found on my RPubs profile.
Stars: ✭ 27 (-15.62%)
iis
Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-50%)
Mutual labels:  hadoop
MineColab
Run Minecraft Server on Google Colab.
Stars: ✭ 135 (+321.88%)
Mutual labels:  colab-notebook
learn-by-examples
Real-world Spark pipelines examples
Stars: ✭ 84 (+162.5%)
Mutual labels:  pyspark