
Top 1642 data-science open source projects

Interpretable machine learning with python
Examples of techniques for training interpretable ML models, explaining ML models, and debugging ML models for accuracy, discrimination, and security.
Lets Plot
An open-source plotting library for statistical data.
Rumale
Rumale is a machine learning library in Ruby
Moderndive book
Statistical Inference via Data Science: A ModernDive into R and the Tidyverse
Dapy
Easy-to-use data analysis / manipulation framework for humans
Disk.frame
Fast Disk-Based Parallelized Data Manipulation Framework for Larger-than-RAM Data
Glue
Linked Data Visualizations Across Multiple Files
Facebook data analyzer
Analyze your copy of your Facebook data with Ruby. Download the zip file from Facebook and get info about friends ranked by messages, vocabulary, contacts, friends-added statistics, and more.
Knowledge Repo
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
Heamy
A set of useful tools for competitive data science.
Spacy Stanza
💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
Atm
Auto Tune Models - A multi-tenant, multi-data system for automated machine learning (model selection and tuning).
Edward
A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
Awesome Learn Datascience
📈 Curated list of resources to help you get started with Data Science
Awesome R
A curated list of awesome R packages, frameworks and software.
Dataframe Go
DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration
Learn Data Science For Free
This repository combines resources scattered all over the internet. It arranges these valuable resources in sequence to help every beginner searching for a free, structured learning resource for Data Science. For Constant Updates Follow me in …
Machine Learning Roadmap
A roadmap connecting many of the most important concepts in machine learning, how to learn them and what tools to use to perform them.
Palladium
Framework for setting up predictive analytics services
Combo
(AAAI '20) A Python Toolbox for Machine Learning Model Combination
Pandapy
PandaPy has the speed of NumPy and the usability of Pandas, 10x to 50x faster (by @firmai)
Rio
A Swiss-Army Knife for Data I/O
Cookiecutter Data Science
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
Mlr3
mlr3: Machine Learning in R - next generation
Poutyne
A simplified framework and utilities for PyTorch
Gop
GoPlus - The Go+ language for engineering, STEM education, and data science
Datasciencepython
Common data analysis and machine learning tasks using Python
Python Causality Handbook
Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and sensitivity analysis.
Tensor House
A collection of reference machine learning and optimization models for enterprise operations: marketing, pricing, supply chain
Turbodbc
Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with the Python Database API Specification 2.0.
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Numpycnn
Building Convolutional Neural Networks From Scratch using NumPy
Awesome
Awesome resources on bioinformatics, data science, machine learning, programming languages (Python, Golang, R, Perl), and miscellaneous stuff.
Seglearn
Python module for machine learning time series
Code search
Code For Medium Article: "How To Create Natural Language Semantic Search for Arbitrary Objects With Deep Learning"
Awesome Feature Engineering
A curated list of resources dedicated to Feature Engineering Techniques for Machine Learning
Opensource Roadmap Datascience
A path to a self-taught education in Data Science!
Jupyter pivottablejs
Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js
Data Science Learning Resources
A collection of data science and machine learning resources that I've found helpful (I only post what I've read!)
Fivethirtyeight
R package of data and code behind the stories and interactives at FiveThirtyEight