1517 open source projects that are alternatives to or similar to Pyspark Example Project

Learn Data Science For Free
This repository is a combination of different resources scattered all over the internet. The reason for making such a repository is to combine all the valuable resources in a sequential manner, so that it helps every beginner who is in search of a free and structured learning resource for Data Science. For Constant Updates Follow me in …
Stars: ✭ 4,757 (+651.5%)
Mutual labels:  data-science
Cryptocurrency Price Prediction
Cryptocurrency Price Prediction Using LSTM neural network
Stars: ✭ 271 (-57.19%)
Mutual labels:  data-science
D2l Vn
An interactive deep learning book with code, math, and discussions. Covers several popular frameworks (TensorFlow, Pytorch & MXNet) and is used at 175 universities.
Stars: ✭ 402 (-36.49%)
Mutual labels:  data-science
Gophernotes
The Go kernel for Jupyter notebooks and nteract.
Stars: ✭ 3,100 (+389.73%)
Mutual labels:  data-science
Sparklearning
Learning Apache Spark, including code and data. Most parts can run locally.
Stars: ✭ 558 (-11.85%)
Mutual labels:  spark
Facet
Human-explainable AI.
Stars: ✭ 269 (-57.5%)
Mutual labels:  data-science
User Machine Learning Tutorial
useR! 2016 Tutorial: Machine Learning Algorithmic Deep Dive http://user2016.org/tutorials/10.html
Stars: ✭ 393 (-37.91%)
Mutual labels:  data-science
Shogun
Shōgun
Stars: ✭ 2,859 (+351.66%)
Mutual labels:  data-science
Palladium
Framework for setting up predictive analytics services
Stars: ✭ 481 (-24.01%)
Mutual labels:  data-science
Awesome Mlops
😎 A curated list of awesome MLOps tools
Stars: ✭ 258 (-59.24%)
Mutual labels:  data-science
Boltons
🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.
Stars: ✭ 5,671 (+795.89%)
Mutual labels:  data-science
Hub
Dataset format for AI. Build, manage, & visualize datasets for deep learning. Stream data real-time to PyTorch/TensorFlow & version-control it. https://activeloop.ai
Stars: ✭ 4,003 (+532.39%)
Mutual labels:  data-science
Open source demos
A collection of demos showcasing automated feature engineering and machine learning in diverse use cases
Stars: ✭ 391 (-38.23%)
Mutual labels:  data-science
Combo
(AAAI' 20) A Python Toolbox for Machine Learning Model Combination
Stars: ✭ 481 (-24.01%)
Mutual labels:  data-science
Datasciencepython
Common data analysis and machine learning tasks using Python
Stars: ✭ 4,442 (+601.74%)
Mutual labels:  data-science
Artificial Adversary
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Stars: ✭ 348 (-45.02%)
Mutual labels:  data-science
Polyaxon
Machine Learning Platform for Kubernetes (MLOps tools for experimentation and automation)
Stars: ✭ 2,966 (+368.56%)
Mutual labels:  data-science
Awesome Opensource Data Engineering
An Awesome List of Open-Source Data Engineering Projects
Stars: ✭ 381 (-39.81%)
Mutual labels:  data-engineering
Baby Names Analysis
Data ETL & Analysis on the dataset 'Baby Names from Social Security Card Applications - National Data'.
Stars: ✭ 557 (-12.01%)
Mutual labels:  etl
Atlas
An Open Source, Self-Hosted Platform For Applied Deep Learning Development
Stars: ✭ 259 (-59.08%)
Mutual labels:  data-science
Dash Table
A First-Class Interactive DataTable for Dash
Stars: ✭ 382 (-39.65%)
Mutual labels:  data-science
Succinct
Enabling queries on compressed data.
Stars: ✭ 257 (-59.4%)
Mutual labels:  spark
Pdf
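Programming e-books, e-books, and programming books, covering C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithms, computer science, design patterns, software testing, refactoring and optimization, and more categories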
Stars: ✭ 12,009 (+1797.16%)
Mutual labels:  spark
Big Data Rosetta Code
Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code
Stars: ✭ 254 (-59.87%)
Mutual labels:  spark
Stats Maths With Python
General statistics, mathematical programming, and numerical/scientific computing scripts and notebooks in Python
Stars: ✭ 381 (-39.81%)
Mutual labels:  data-science
Smile
Statistical Machine Intelligence & Learning Engine
Stars: ✭ 5,412 (+754.98%)
Mutual labels:  data-science
Holoclean
A Machine Learning System for Data Enrichment.
Stars: ✭ 344 (-45.66%)
Mutual labels:  data-science
sparkProjectTemplate.g8
Template for Spark Projects
Stars: ✭ 77 (-87.84%)
Mutual labels:  spark
Redash
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Stars: ✭ 20,147 (+3082.78%)
Mutual labels:  spark
Book
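This project collects some good books I have read or heard about over the years. I came across them while tidying up my files; deleting them seemed a pity, but keeping them wasted too much space, so in the spirit of sharing I am making them available. On the one hand this provides the books to readers who need them, and on the other it serves as a kind of knowledge-base accumulation.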
Stars: ✭ 47 (-92.58%)
Mutual labels:  spark
Pandapy
PandaPy has the speed of NumPy and the usability of Pandas, 10x to 50x faster (by @firmai)
Stars: ✭ 474 (-25.12%)
Mutual labels:  data-science
bandar-log
Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 20 (-96.84%)
Mutual labels:  etl
Bigdl
Building Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (+502.37%)
Mutual labels:  spark
Ananas Desktop
A hackable data integration & analysis tool to enable non technical users to edit data processing jobs and visualise data on demand.
Stars: ✭ 551 (-12.95%)
Mutual labels:  etl
Awesome Twitter Data
A list of Twitter datasets and related resources.
Stars: ✭ 533 (-15.8%)
Mutual labels:  data-science
Bdp Dataplatform
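Big data ecosystem solution data platform: a big data solution built on big data, data platform, microservices, machine learning, e-commerce mall, automated operations, DevOps, container deployment platform, data platform collection, data platform storage, data platform computation, data platform development, and data platform application building.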
Stars: ✭ 456 (-27.96%)
Mutual labels:  spark
Thesemicolon
This repository contains IPython notebooks and datasets for the data analytics YouTube tutorials on The Semicolon.
Stars: ✭ 345 (-45.5%)
Mutual labels:  data-science
daf-kylo
Kylo integration with PDND (previously DAF).
Stars: ✭ 20 (-96.84%)
Mutual labels:  spark
Prettypandas
A Pandas Styler class for making beautiful tables
Stars: ✭ 376 (-40.6%)
Mutual labels:  data-science
Spotify-Song-Recommendation-ML
UC Berkeley team's submission for RecSys Challenge 2018
Stars: ✭ 70 (-88.94%)
Mutual labels:  spark
Rio
A Swiss-Army Knife for Data I/O
Stars: ✭ 467 (-26.22%)
Mutual labels:  data-science
mmtf-workshop-2018
Structural Bioinformatics Training Workshop & Hackathon 2018
Stars: ✭ 50 (-92.1%)
Mutual labels:  pyspark
Michael S Data Science Curriculum
This is the companion curriculum to my guide to becoming a data scientist.
Stars: ✭ 375 (-40.76%)
Mutual labels:  data-science
spark-data-sources
Developing Spark External Data Sources using the V2 API
Stars: ✭ 36 (-94.31%)
Mutual labels:  spark
Engsoccerdata
English and European soccer results 1871-2020
Stars: ✭ 615 (-2.84%)
Mutual labels:  data-science
bigkube
Minikube for big data with Scala and Spark
Stars: ✭ 16 (-97.47%)
Mutual labels:  spark
Tensorflowonspark
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Stars: ✭ 3,748 (+492.1%)
Mutual labels:  spark
ETW2JSON
Tool and library to convert ETW logs to JSON files
Stars: ✭ 66 (-89.57%)
Mutual labels:  etl
Cookiecutter Data Science
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
Stars: ✭ 5,271 (+732.7%)
Mutual labels:  data-science
mqtt-to-kafka-bridge
Move your messages from MQTT to Apache Kafka in real-time 🚀
Stars: ✭ 21 (-96.68%)
Mutual labels:  etl
Wptools
Wikipedia tools (for Humans): easily extract data from Wikipedia, Wikidata, and other MediaWikis
Stars: ✭ 371 (-41.39%)
Mutual labels:  data-science
SparkV
🤖⚡ | The most POWERFUL multipurpose chat/meme bot that will boost the activity in your server.
Stars: ✭ 24 (-96.21%)
Mutual labels:  spark
Probabilistic Programming And Bayesian Methods For Hackers
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
Stars: ✭ 23,912 (+3677.57%)
Mutual labels:  data-science
Oap
Optimized Analytics Package for Spark* Platform
Stars: ✭ 343 (-45.81%)
Mutual labels:  spark
Pglogical
Logical Replication extension for PostgreSQL 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
Stars: ✭ 455 (-28.12%)
Mutual labels:  etl
Csinva.github.io
Slides, paper notes, class notes, blog posts, and research on ML 📉, statistics 📊, and AI 🤖.
Stars: ✭ 342 (-45.97%)
Mutual labels:  data-science
Sparklens
Qubole Sparklens tool for performance tuning Apache Spark
Stars: ✭ 345 (-45.5%)
Mutual labels:  spark
Dev Setup
macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.
Stars: ✭ 5,590 (+783.1%)
Mutual labels:  spark
Getting Started With Genomics Tools And Resources
Unix, R and Python tools for genomics and data science
Stars: ✭ 587 (-7.27%)
Mutual labels:  data-science
Data Science Your Way
Ways of doing Data Science Engineering and Machine Learning in R and Python
Stars: ✭ 530 (-16.27%)
Mutual labels:  data-science