977 open source projects that are alternatives to or similar to Scalable Data Science

Ml Email Clustering
Email clustering with machine learning
Stars: ✭ 116 (-18.31%)
Mutual labels:  data-science
Real Time Stream Processing Engine
An example of real-time stream processing using Spark Streaming, Kafka, and Elasticsearch.
Stars: ✭ 37 (-73.94%)
Mutual labels:  apache-spark
H2o Tutorials
Tutorials and training material for the H2O Machine Learning Platform
Stars: ✭ 1,305 (+819.01%)
Mutual labels:  data-science
Minerva Training Materials
Learn advanced data science on real-life, curated problems
Stars: ✭ 37 (-73.94%)
Mutual labels:  data-science
Datascicomp
A collection of popular Data Science Challenges/Competitions || Countdown timers to keep track of the entry deadlines.
Stars: ✭ 1,636 (+1052.11%)
Mutual labels:  data-science
Mlj.jl
A Julia machine learning framework
Stars: ✭ 982 (+591.55%)
Mutual labels:  data-science
Sci Pype
A Machine Learning API with native Redis caching and export + import using S3. Analyze entire datasets using an API for building, training, testing, analyzing, extracting, importing, and archiving. This repository can run from a Docker container or from the repository.
Stars: ✭ 90 (-36.62%)
Mutual labels:  data-science
Dataconfs
A list of conferences connected with data worldwide.
Stars: ✭ 36 (-74.65%)
Mutual labels:  data-science
Keras Contrib
Keras community contributions
Stars: ✭ 1,532 (+978.87%)
Mutual labels:  data-science
Machinelearningcourse
A collection of notebooks from my Machine Learning class, written in Python 3
Stars: ✭ 35 (-75.35%)
Mutual labels:  data-science
Starcraft2 Replay Analysis
A Jupyter notebook that provides analysis for StarCraft 2 replays
Stars: ✭ 90 (-36.62%)
Mutual labels:  data-science
Dvc
🦉Data Version Control | Git for Data & Models | ML Experiments Management
Stars: ✭ 9,004 (+6240.85%)
Mutual labels:  data-science
Coffee Quality Database
Building the Coffee Quality Institute Database
Stars: ✭ 141 (-0.7%)
Mutual labels:  data-science
Freeml
A List of Data Science/Machine Learning Resources (Mostly Free)
Stars: ✭ 974 (+585.92%)
Mutual labels:  data-science
Epfl
EPFL summaries & cheatsheets over 5 years (computer science, communication systems, data science and computational neuroscience).
Stars: ✭ 90 (-36.62%)
Mutual labels:  data-science
Open Solution Value Prediction
Open solution to the Santander Value Prediction Challenge 🐠
Stars: ✭ 34 (-76.06%)
Mutual labels:  data-science
Truvisory
This project collects good LinkedIn posts that contain resources for learning about technology, design, self-branding, motivation, etc. You can visit the project by:
Stars: ✭ 116 (-18.31%)
Mutual labels:  data-science
Feagen
(deprecated) A fast and memory-efficient Python data engineering framework for machine learning.
Stars: ✭ 33 (-76.76%)
Mutual labels:  data-science
Repo2docker Action
GitHub Action for repo2docker
Stars: ✭ 88 (-38.03%)
Mutual labels:  data-science
Mljar Supervised
Automated Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning 🚀
Stars: ✭ 961 (+576.76%)
Mutual labels:  data-science
Stats337
Readings in applied data science
Stars: ✭ 1,625 (+1044.37%)
Mutual labels:  data-science
Tensorflow object counting api
🚀 The TensorFlow Object Counting API is an open source framework built on top of TensorFlow and Keras that makes it easy to develop object counting systems!
Stars: ✭ 956 (+573.24%)
Mutual labels:  data-science
Stocker
Financial Web Scraper & Sentiment Classifier
Stars: ✭ 87 (-38.73%)
Mutual labels:  data-science
Docker Iocaml Datascience
Dockerfile of Jupyter (IPython notebook) and IOCaml (OCaml kernel) with libraries for data science and machine learning
Stars: ✭ 30 (-78.87%)
Mutual labels:  data-science
Scipy 2017 Cython Tutorial
Material for the SciPy 2017 Cython tutorial
Stars: ✭ 114 (-19.72%)
Mutual labels:  data-science
Spark Flamegraph
Easy CPU Profiling for Apache Spark applications
Stars: ✭ 30 (-78.87%)
Mutual labels:  apache-spark
Cuesheet
A framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (-39.44%)
Mutual labels:  apache-spark
Arcgis Python Api
Documentation and samples for ArcGIS API for Python
Stars: ✭ 954 (+571.83%)
Mutual labels:  data-science
Spark On Lambda
Apache Spark on AWS Lambda
Stars: ✭ 137 (-3.52%)
Mutual labels:  apache-spark
Python for ml
Brief introduction to Python for machine learning
Stars: ✭ 29 (-79.58%)
Mutual labels:  data-science
Kaggle Competitions
There are plenty of courses and tutorials that can help you learn machine learning from scratch, but here on GitHub I want to solve some Kaggle competitions as a comprehensive workflow with Python packages. After reading, you can use this workflow as a template for solving other real problems.
Stars: ✭ 86 (-39.44%)
Mutual labels:  data-science
Mlnet Workshop
ML.NET Workshop to predict car sales prices
Stars: ✭ 29 (-79.58%)
Mutual labels:  data-science
Mlr
Machine Learning in R
Stars: ✭ 1,542 (+985.92%)
Mutual labels:  data-science
Machine Learning Open Source
Monthly Series - Machine Learning Top 10 Open Source Projects
Stars: ✭ 943 (+564.08%)
Mutual labels:  data-science
Topic Modeling Tool
A point-and-click tool for creating and analyzing topic models produced by MALLET.
Stars: ✭ 85 (-40.14%)
Mutual labels:  data-science
Steppy Toolkit
Curated set of transformers that make your work with steppy faster and more effective 🔭
Stars: ✭ 21 (-85.21%)
Mutual labels:  data-science
Spark
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+1111.97%)
Mutual labels:  apache-spark
Crime Analysis
Association Rule Mining from Spatial Data for Crime Analysis
Stars: ✭ 20 (-85.92%)
Mutual labels:  data-science
Pymrmr
Python3 binding to mRMR Feature Selection algorithm (currently not maintained)
Stars: ✭ 85 (-40.14%)
Mutual labels:  data-science
Spark Streaming Monitoring With Lightning
Plot live stats as a graph from an Apache Spark application using Lightning-viz
Stars: ✭ 15 (-89.44%)
Mutual labels:  apache-spark
Pythondata
Repo for code published on pythondata.com
Stars: ✭ 113 (-20.42%)
Mutual labels:  data-science
Live log analyzer spark
Spark application for analyzing Apache access logs and detecting anomalies, with an accompanying Medium article.
Stars: ✭ 14 (-90.14%)
Mutual labels:  apache-spark
Jupytemplate
Templates for Jupyter notebooks
Stars: ✭ 85 (-40.14%)
Mutual labels:  data-science
Ripser.py
A Lean Persistent Homology Library for Python
Stars: ✭ 139 (-2.11%)
Mutual labels:  data-science
Data Science On Gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+508.45%)
Mutual labels:  data-science
Sortingalgorithm.hayateshiki
Hayate-Shiki is an improved merge sort algorithm with the goal of "faster than quick sort".
Stars: ✭ 84 (-40.85%)
Mutual labels:  data-science
Scanpy
Single-Cell Analysis in Python. Scales to >1M cells.
Stars: ✭ 858 (+504.23%)
Mutual labels:  data-science
Kaggle Houseprices
Kaggle Kernel for House Prices competition https://www.kaggle.com/massquantity/all-you-need-is-pca-lb-0-11421-top-4
Stars: ✭ 113 (-20.42%)
Mutual labels:  data-science
Autodl
Automated Deep Learning without ANY human intervention. 1st solution for AutoDL.
Stars: ✭ 854 (+501.41%)
Mutual labels:  data-science
Conferences
List of Machine Learning & Data Science Conferences
Stars: ✭ 83 (-41.55%)
Mutual labels:  data-science
Chrispher.github.com
Data Science
Stars: ✭ 8 (-94.37%)
Mutual labels:  data-science
Torchbear
🔥🐻 The Speakeasy Scripting Engine Which Combines Speed, Safety, and Simplicity
Stars: ✭ 128 (-9.86%)
Mutual labels:  data-science
Mri Analysis Pytorch
MRI analysis using PyTorch and MedicalTorch
Stars: ✭ 55 (-61.27%)
Mutual labels:  data-science
Recommenders
Best Practices on Recommendation Systems
Stars: ✭ 11,818 (+8222.54%)
Mutual labels:  data-science
Pycm
Multi-class confusion matrix library in Python
Stars: ✭ 1,076 (+657.75%)
Mutual labels:  data-science
Dltk
Deep Learning Toolkit for Medical Image Analysis
Stars: ✭ 1,249 (+779.58%)
Mutual labels:  data-science
Stumpy
STUMPY is a powerful and scalable Python library for modern time series analysis
Stars: ✭ 2,019 (+1321.83%)
Mutual labels:  data-science
Raspberryturk
The Raspberry Turk is a robot that can play chess—it's entirely open source, based on Raspberry Pi, and inspired by the 18th century chess playing machine, the Mechanical Turk.
Stars: ✭ 140 (-1.41%)
Mutual labels:  data-science
Azure Event Hubs Spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (-1.41%)
Mutual labels:  apache-spark
Python For Data Science
A blog for data analytics using data science technologies
Stars: ✭ 139 (-2.11%)
Mutual labels:  data-science