885 open source projects that are alternatives to or similar to Dirty_cat

Boltzmannclean
Fill missing values in Pandas DataFrames using Restricted Boltzmann Machines
Stars: ✭ 23 (-91.12%)
Mutual labels:  data-science, data-cleaning
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+280.69%)
Mutual labels:  data-science, data-cleaning
My Journey In The Data Science World
📢 Ready to learn or review your knowledge!
Stars: ✭ 1,175 (+353.67%)
Mutual labels:  data-science, data-cleaning
Pandas Videos
Jupyter notebook and datasets from the pandas Q&A video series
Stars: ✭ 1,716 (+562.55%)
Mutual labels:  data-science, data-cleaning
Janitor
simple tools for data cleaning in R
Stars: ✭ 981 (+278.76%)
Mutual labels:  data-science, data-cleaning
Dat8
General Assembly's 2015 Data Science course in Washington, DC
Stars: ✭ 1,516 (+485.33%)
Mutual labels:  data-science, data-cleaning
Klib
Easy to use Python library of customized functions for cleaning and analyzing data.
Stars: ✭ 192 (-25.87%)
Mutual labels:  data-science, data-cleaning
Tweetfeels
Real-time sentiment analysis in Python using Twitter's streaming API
Stars: ✭ 249 (-3.86%)
Mutual labels:  data-science
HoloClean-Legacy-deprecated
A Machine Learning System for Data Enrichment.
Stars: ✭ 75 (-71.04%)
Mutual labels:  data-cleaning
Learningx
Deep & Classical Reinforcement Learning + Machine Learning Examples in Python
Stars: ✭ 241 (-6.95%)
Mutual labels:  data-science
Plotly Graphing Library For Matlab
Plotly Graphing Library for MATLAB®
Stars: ✭ 234 (-9.65%)
Mutual labels:  data-science
Roger
Golang RServe client. Use R from Go
Stars: ✭ 248 (-4.25%)
Mutual labels:  data-science
Cleaner.jl
A toolbox of simple solutions for common data cleaning problems.
Stars: ✭ 21 (-91.89%)
Mutual labels:  data-cleaning
Igel
a delightful machine learning tool that allows you to train, test, and use models without writing code
Stars: ✭ 2,956 (+1041.31%)
Mutual labels:  data-science
allie
🤖 A machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers).
Stars: ✭ 93 (-64.09%)
Mutual labels:  data-cleaning
Pyglmnet
Python implementation of elastic-net regularized generalized linear models
Stars: ✭ 235 (-9.27%)
Mutual labels:  data-science
R-Learning-Journey
Some of the projects I made when starting to learn R for Data Science at the university
Stars: ✭ 19 (-92.66%)
Mutual labels:  data-cleaning
Atlas
An Open Source, Self-Hosted Platform For Applied Deep Learning Development
Stars: ✭ 259 (+0%)
Mutual labels:  data-science
Datascience
Curated list of Python resources for data science.
Stars: ✭ 3,051 (+1077.99%)
Mutual labels:  data-science
OpenRefine-ecology-lesson
Data Cleaning with OpenRefine for Ecologists
Stars: ✭ 20 (-92.28%)
Mutual labels:  data-cleaning
Awesome Datascience
📝 An awesome Data Science repository to learn and apply for real world problems.
Stars: ✭ 17,520 (+6664.48%)
Mutual labels:  data-science
Deepgraph
Analyze Data with Pandas-based Networks.
Stars: ✭ 232 (-10.42%)
Mutual labels:  data-science
Pysparkling
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Stars: ✭ 231 (-10.81%)
Mutual labels:  data-science
Ipython
Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
Stars: ✭ 15,107 (+5732.82%)
Mutual labels:  data-science
Webstruct
NER toolkit for HTML data
Stars: ✭ 230 (-11.2%)
Mutual labels:  data-science
Mydatascienceportfolio
Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (-12.36%)
Mutual labels:  data-science
Orange3
🍊 📊 💡 Orange: Interactive data analysis
Stars: ✭ 3,152 (+1116.99%)
Mutual labels:  data-science
exemplary-ml-pipeline
Exemplary, annotated machine learning pipeline for any tabular data problem.
Stars: ✭ 23 (-91.12%)
Mutual labels:  data-cleaning
Voice Gender
Gender recognition by voice and speech analysis
Stars: ✭ 248 (-4.25%)
Mutual labels:  data-science
nepali-translator
Neural Machine Translation on the Nepali-English language pair
Stars: ✭ 29 (-88.8%)
Mutual labels:  data-cleaning
Cjworkbench
The data journalism platform with built in training
Stars: ✭ 244 (-5.79%)
Mutual labels:  data-science
FIFA-2019-Analysis
A project based on the FIFA World Cup 2019 that analyzes the performance and efficiency of teams, players, countries, and related topics using data analysis and data visualization
Stars: ✭ 28 (-89.19%)
Mutual labels:  data-cleaning
Retriever
Quickly download, clean up, and install public datasets into a database management system
Stars: ✭ 241 (-6.95%)
Mutual labels:  data-science
Sk Dist
Distributed scikit-learn meta-estimators in PySpark
Stars: ✭ 260 (+0.39%)
Mutual labels:  data-science
Opends4all
OpenDS4All project, hosted by LF AI & Data
Stars: ✭ 240 (-7.34%)
Mutual labels:  data-science
optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Stars: ✭ 1,351 (+421.62%)
Mutual labels:  data-cleaning
Ntm One Shot Tf
One Shot Learning using Memory-Augmented Neural Networks (MANN) based on Neural Turing Machine architecture in Tensorflow
Stars: ✭ 238 (-8.11%)
Mutual labels:  data-science
foofah
Foofah: programming-by-example data transformation program synthesizer
Stars: ✭ 24 (-90.73%)
Mutual labels:  data-cleaning
Data Mining Conferences
Ranking, acceptance rate, deadline, and publication tips
Stars: ✭ 236 (-8.88%)
Mutual labels:  data-science
Keras
Deep Learning for humans
Stars: ✭ 53,476 (+20547.1%)
Mutual labels:  data-science
Ploomber
A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.
Stars: ✭ 221 (-14.67%)
Mutual labels:  data-science
Dowhy
DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
Stars: ✭ 3,480 (+1243.63%)
Mutual labels:  data-science
Data Science Free
Free Resources For Data Science created by Shubham Kumar
Stars: ✭ 232 (-10.42%)
Mutual labels:  data-science
Ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Stars: ✭ 18,547 (+7061%)
Mutual labels:  data-science
Prodigy Recipes
🍳 Recipes for Prodigy, our fully scriptable annotation tool
Stars: ✭ 229 (-11.58%)
Mutual labels:  data-science
bumblebee
🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)
Stars: ✭ 120 (-53.67%)
Mutual labels:  data-cleaning
Tablesaw
Java dataframe and visualization library
Stars: ✭ 2,785 (+975.29%)
Mutual labels:  data-science
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+1075.29%)
Mutual labels:  data-science
Course Nlp
A Code-First Introduction to NLP course
Stars: ✭ 3,029 (+1069.5%)
Mutual labels:  data-science
R4ds Exercise Solutions
Exercise solutions to "R for Data Science"
Stars: ✭ 226 (-12.74%)
Mutual labels:  data-science
Udacity-Data-Analyst-Nanodegree
Repository for the projects needed to complete the Data Analyst Nanodegree.
Stars: ✭ 31 (-88.03%)
Mutual labels:  data-cleaning
Deep Learning Book
Repository for "Introduction to Artificial Neural Networks and Deep Learning: A Practical Guide with Applications in Python"
Stars: ✭ 2,705 (+944.4%)
Mutual labels:  data-science
Functional intro to python
[tutorial] A functional, Data Science focused introduction to Python
Stars: ✭ 228 (-11.97%)
Mutual labels:  data-science
Alphatools
Quantitative finance research tools in Python
Stars: ✭ 226 (-12.74%)
Mutual labels:  data-science
Datacamp Python Data Science Track
All the slides, accompanying code and exercises all stored in this repo. 🎈
Stars: ✭ 250 (-3.47%)
Mutual labels:  data-science
Dash
Analytical Web Apps for Python, R, Julia, and Jupyter. No JavaScript Required.
Stars: ✭ 15,592 (+5920.08%)
Mutual labels:  data-science
Streamlit
Streamlit — The fastest way to build data apps in Python
Stars: ✭ 16,906 (+6427.41%)
Mutual labels:  data-science
errorlocate
Find and replace erroneous fields in data using validation rules
Stars: ✭ 19 (-92.66%)
Mutual labels:  data-cleaning
Deep Learning Machine Learning Stock
Stock for Deep Learning and Machine Learning
Stars: ✭ 240 (-7.34%)
Mutual labels:  data-science
Elastic
R client for the Elasticsearch HTTP API
Stars: ✭ 227 (-12.36%)
Mutual labels:  data-science