Top 1642 data-science open source projects

Metriculous
Measure and visualize machine learning model performance without the usual boilerplate.
Etl with python
ETL with Python - Taught at DWH course 2017 (TAU)
Linkedingiveaway
👨🏽‍🏫 You can learn about anything here: which giveaways I do and why they matter in today's world. Are you interested in giveaways? 🔋
Graphia
A visualisation tool for the creation and analysis of graphs
Seaborn
Statistical data visualization in Python
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
W2v
Word2Vec models with Twitter data using Spark.
Autowrap
Wrap existing D code for use in Python, Excel, C#
Terpene Profile Parser For Cannabis Strains
Parser and database to index the terpene profile of different strains of Cannabis from online databases
Python Hierarchical Clustering Exercises
Exercises for hierarchical clustering with Python 3 and scipy as Jupyter Notebooks
Oreilly Ai K8s Tutorial
Materials for the "AI on Kubernetes" tutorial at O'Reilly AI SF 2018
Ntds 2017
Material for the EPFL master course "A Network Tour of Data Science", edition 2017.
Collaborative Deep Learning For Recommender Systems
A hybrid model combining a stacked denoising autoencoder with matrix factorization, applied to predict customer purchase behavior in the following month from the purchase history and user information in the Santander dataset.
Storytelling With Data
Course materials for Dartmouth Course: Storytelling with Data (PSYC 81.09).
Verticapy
VerticaPy is a Python library that exposes scikit-like functionality to conduct data science projects on data stored in Vertica, thus taking advantage of Vertica’s speed and built-in analytics and machine learning capabilities.
Datacomparer
dataCompareR is an R package that allows users to compare two datasets and view a report on the similarities and differences.
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Datascience Projects
A collection of personal data science projects
Etherscan Ml
Python Data Science and Machine Learning Library for the Ethereum and ERC-20 Blockchain
Lifetimes
Lifetime value in Python
Zarr.js
JavaScript implementation of Zarr
Pythondataanalysis
The data and code used in my book.
Fasttext
Unofficial implementation of the paper "Bag of Tricks for Efficient Text Classification" by Joulin et al.
Data Privacy For Data Scientists
A workshop on data privacy methods for data scientists.
Ml Template Azure
Template for getting started with automated ML Ops on Azure Machine Learning
Datumbox Framework
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
Numerical Linear Algebra
Free online textbook of Jupyter notebooks for fast.ai Computational Linear Algebra course
Skoot
A package for data science practitioners. This library implements a number of helpful, common data transformations with a scikit-learn friendly interface in an effort to expedite the modeling process.
Mckinsey Smartcities Traffic Prediction
An adventure in using multi-attention recurrent neural networks for time series (city traffic) in the 2017-11-18 McKinsey IronMan (24h non-stop) prediction challenge
Causalnex
A Python library that helps data scientists to infer causation rather than observing correlation.
Python data analysis and mining action
Code notes for the book "Python Data Analysis and Mining in Action" (《Python数据分析与挖掘实战》)
10 Simple Hacks To Speed Up Your Data Analysis In Python
Some useful tips and tricks to speed up the data analysis process in Python.
Zenml
ZenML 🙏: MLOps framework to create reproducible ML pipelines for production machine learning.
Diffgram
Data Annotation, Data Labeling, Annotation Tooling, Training Data for Machine Learning
Tadw
An implementation of "Network Representation Learning with Rich Text Information" (IJCAI '15).
Tidyverse
Easily install and load packages from the tidyverse
Sklearn Porter
Transpile trained scikit-learn estimators to C, Java, JavaScript and others.
Rcongresso
R package for accessing data from the national congress.