885 open source projects that are alternatives to or similar to Dirty_cat

exemplary-ml-pipeline
Exemplary, annotated machine learning pipeline for any tabular data problem.
Stars: ✭ 23 (-91.12%)
Mutual labels:  data-cleaning
Amazing Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Stars: ✭ 218 (-15.83%)
Mutual labels:  data-science
Voice Gender
Gender recognition by voice and speech analysis
Stars: ✭ 248 (-4.25%)
Mutual labels:  data-science
Cardio
CardIO is a library for data science research of heart signals
Stars: ✭ 218 (-15.83%)
Mutual labels:  data-science
nepali-translator
Neural Machine Translation on the Nepali-English language pair
Stars: ✭ 29 (-88.8%)
Mutual labels:  data-cleaning
Tutorials
AI-related tutorials. Access any of them for free → https://towardsai.net/editorial
Stars: ✭ 204 (-21.24%)
Mutual labels:  data-science
Cjworkbench
The data journalism platform with built-in training
Stars: ✭ 244 (-5.79%)
Mutual labels:  data-science
Sc17
SuperComputing 2017 Deep Learning Tutorial
Stars: ✭ 211 (-18.53%)
Mutual labels:  data-science
FIFA-2019-Analysis
A project based on the FIFA World Cup 2019 that analyzes the performance and efficiency of teams, players, countries, and related topics using data analysis and data visualization
Stars: ✭ 28 (-89.19%)
Mutual labels:  data-cleaning
Covid19za
Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa
Stars: ✭ 208 (-19.69%)
Mutual labels:  data-science
Retriever
Quickly download, clean up, and install public datasets into a database management system
Stars: ✭ 241 (-6.95%)
Mutual labels:  data-science
Eli5
A library for debugging/inspecting machine learning classifiers and explaining their predictions
Stars: ✭ 2,477 (+856.37%)
Mutual labels:  data-science
Sk Dist
Distributed scikit-learn meta-estimators in PySpark
Stars: ✭ 260 (+0.39%)
Mutual labels:  data-science
Scihub
Source code and data analyses for the Sci-Hub Coverage Study
Stars: ✭ 205 (-20.85%)
Mutual labels:  data-science
Opends4all
OpenDS4All project, hosted by LF AI & Data
Stars: ✭ 240 (-7.34%)
Mutual labels:  data-science
Python For Data Science
A collection of Jupyter Notebooks for learning Python for Data Science.
Stars: ✭ 205 (-20.85%)
Mutual labels:  data-science
optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Stars: ✭ 1,351 (+421.62%)
Mutual labels:  data-cleaning
Estadistica Con R
Personal notes on statistics, machine learning, and the R programming language
Stars: ✭ 201 (-22.39%)
Mutual labels:  data-science
Ntm One Shot Tf
One Shot Learning using Memory-Augmented Neural Networks (MANN) based on Neural Turing Machine architecture in Tensorflow
Stars: ✭ 238 (-8.11%)
Mutual labels:  data-science
Instascrape
Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
Stars: ✭ 202 (-22.01%)
Mutual labels:  data-science
foofah
Foofah: programming-by-example data transformation program synthesizer
Stars: ✭ 24 (-90.73%)
Mutual labels:  data-cleaning
Fastpages
An easy-to-use blogging platform, with enhanced support for Jupyter Notebooks.
Stars: ✭ 2,888 (+1015.06%)
Mutual labels:  data-science
Data Mining Conferences
Ranking, acceptance rate, deadline, and publication tips
Stars: ✭ 236 (-8.88%)
Mutual labels:  data-science
Achoo
Achoo uses a Raspberry Pi to predict if my son will need his inhaler on any given day using weather, pollen, and air quality data. If the prediction for a given day is above a specified threshold, the Pi will email his school nurse, and myself, notifying her that he may need preemptive treatment. Community-sourced health monitoring!
Stars: ✭ 200 (-22.78%)
Mutual labels:  data-science
Keras
Deep Learning for humans
Stars: ✭ 53,476 (+20547.1%)
Mutual labels:  data-science
Ml Auto Baseball Pitching Overlay
⚾🤖⚾ Automatic baseball pitching overlay in realtime
Stars: ✭ 200 (-22.78%)
Mutual labels:  data-science
Ploomber
A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.
Stars: ✭ 221 (-14.67%)
Mutual labels:  data-science
Pytorch Geometric Yoochoose
This is a tutorial for PyTorch Geometric on the YooChoose dataset
Stars: ✭ 198 (-23.55%)
Mutual labels:  data-science
Dowhy
DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
Stars: ✭ 3,480 (+1243.63%)
Mutual labels:  data-science
Cql
Categorical Query Language IDE
Stars: ✭ 196 (-24.32%)
Mutual labels:  data-science
Data Science Free
Free Resources For Data Science created by Shubham Kumar
Stars: ✭ 232 (-10.42%)
Mutual labels:  data-science
Tad
A desktop application for viewing and analyzing tabular data
Stars: ✭ 2,275 (+778.38%)
Mutual labels:  data-science
Ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Stars: ✭ 18,547 (+7061%)
Mutual labels:  data-science
Imodels
Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
Stars: ✭ 194 (-25.1%)
Mutual labels:  data-science
Prodigy Recipes
🍳 Recipes for the Prodigy, our fully scriptable annotation tool
Stars: ✭ 229 (-11.58%)
Mutual labels:  data-science
Gophernet
A simple from-scratch neural net written in Go
Stars: ✭ 194 (-25.1%)
Mutual labels:  data-science
bumblebee
🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)
Stars: ✭ 120 (-53.67%)
Mutual labels:  data-cleaning
Machinelearningnotebooks
Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft
Stars: ✭ 2,790 (+977.22%)
Mutual labels:  data-science
Tablesaw
Java dataframe and visualization library
Stars: ✭ 2,785 (+975.29%)
Mutual labels:  data-science
Plynx
PLynx is a domain agnostic platform for managing reproducible experiments and data-oriented workflows.
Stars: ✭ 192 (-25.87%)
Mutual labels:  data-science
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+1075.29%)
Mutual labels:  data-science
Speedml
Speedml is a Python package to speed start machine learning projects.
Stars: ✭ 192 (-25.87%)
Mutual labels:  data-science
R4ds Exercise Solutions
Exercise solutions to "R for Data Science"
Stars: ✭ 226 (-12.74%)
Mutual labels:  data-science
Uci Ml Api
Simple API for UCI Machine Learning Dataset Repository (search, download, analyze)
Stars: ✭ 190 (-26.64%)
Mutual labels:  data-science
Course Nlp
A Code-First Introduction to NLP course
Stars: ✭ 3,029 (+1069.5%)
Mutual labels:  data-science
Deep Learning Machine Learning Stock
Stock for Deep Learning and Machine Learning
Stars: ✭ 240 (-7.34%)
Mutual labels:  data-science
Elastic
R client for the Elasticsearch HTTP API
Stars: ✭ 227 (-12.36%)
Mutual labels:  data-science
Virgilio
Virgilio is developed and maintained by these awesome people. You can email us at virgilio.datascience (at) gmail.com or join the Discord chat.
Stars: ✭ 13,200 (+4996.53%)
Mutual labels:  data-science
Alphatools
Quantitative finance research tools in Python
Stars: ✭ 226 (-12.74%)
Mutual labels:  data-science
Delbot
It understands your voice commands, searches news and knowledge sources, and summarizes and reads out content to you.
Stars: ✭ 191 (-26.25%)
Mutual labels:  data-science
Datacamp Python Data Science Track
All the slides, accompanying code, and exercises are stored in this repo. 🎈
Stars: ✭ 250 (-3.47%)
Mutual labels:  data-science
Observations
Tools for loading standard data sets in machine learning
Stars: ✭ 190 (-26.64%)
Mutual labels:  data-science
Streamlit
Streamlit — The fastest way to build data apps in Python
Stars: ✭ 16,906 (+6427.41%)
Mutual labels:  data-science
Vec4ir
Word Embeddings for Information Retrieval
Stars: ✭ 188 (-27.41%)
Mutual labels:  data-science
errorlocate
Find and replace erroneous fields in data using validation rules
Stars: ✭ 19 (-92.66%)
Mutual labels:  data-cleaning
Pytorch Lightning
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
Stars: ✭ 16,641 (+6325.1%)
Mutual labels:  data-science
Darwinexlabs
Datasets, tools and more from Darwinex Labs - Prop Investing Arm & Quant Team @ Darwinex
Stars: ✭ 248 (-4.25%)
Mutual labels:  data-science
Full Stack Data Science
Full Stack Data Science in Python
Stars: ✭ 227 (-12.36%)
Mutual labels:  data-science
Machine Learning Resources
A curated list of awesome machine learning frameworks, libraries, courses, books and many more.
Stars: ✭ 226 (-12.74%)
Mutual labels:  data-science
Gspread Pandas
A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Stars: ✭ 226 (-12.74%)
Mutual labels:  data-science