Top 1642 data-science open source projects

Awesome Datascience
📝 An awesome Data Science repository to learn from and apply to real-world problems.
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
Koalas: pandas API on Apache Spark
Deep Learning Book
Repository for "Introduction to Artificial Neural Networks and Deep Learning: A Practical Guide with Applications in Python"
Dash for Julia - A Julia interface to the Dash ecosystem for creating analytic web applications in Julia. No JavaScript required.
Golang RServe client. Use R from Go
Real-time sentiment analysis in Python using Twitter's Streaming API
The data journalism platform with built-in training
Quickly download, clean up, and install public datasets into a database management system
OpenDS4All project, hosted by LF AI & Data
Python implementation of elastic-net regularized generalized linear models
Ntm One Shot Tf
One Shot Learning using Memory-Augmented Neural Networks (MANN) based on Neural Turing Machine architecture in TensorFlow
Data Mining Conferences
Ranking, acceptance rate, deadline, and publication tips
A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
NER toolkit for HTML data
Applying Data Science and Machine Learning to Solve Real World Business Problems
R4ds Exercise Solutions
Exercise solutions to "R for Data Science"
R client for the Elasticsearch HTTP API
Gspread Pandas
A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
A package that makes it trivial to create and evaluate machine learning pipeline architectures.
The code repository for projects and tutorials in R and Python that cover a variety of topics in data visualization, statistics, sports analytics, and general applications of probability theory.
Statistical Learning
Lecture Slides and R Sessions for Trevor Hastie and Rob Tibshirani's "Statistical Learning" Stanford course
Ml Workspace
Machine Learning (Beginners Hub): information (courses, books, cheat sheets, live sessions) related to machine learning, data science, and Python.
Amazing Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
CardIO is a library for data science research of heart signals
Python package for creating beautiful interactive Chord Diagrams. Pro version available at
Reddit Hyped Stocks
A web application to explore currently hyped stocks on Reddit
CARTO Python package for data scientists
Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa
A library for debugging/inspecting machine learning classifiers and explaining their predictions
A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels for supervised learning.
Source code and data analyses for the Sci-Hub Coverage Study
An intuitive library to extract features from time series
