
885 open source projects that are alternatives to or similar to Dirty_cat

Exemplary, annotated machine learning pipeline for any tabular data problem.

Stars: ✭ 23 (-91.12%)

Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.

Stars: ✭ 218 (-15.83%)

Mutual labels: data-science

Voice Gender

Gender recognition by voice and speech analysis

Stars: ✭ 248 (-4.25%)

Mutual labels: data-science

Cardio

CardIO is a library for data science research of heart signals

Stars: ✭ 218 (-15.83%)

Mutual labels: data-science

nepali-translator

Neural Machine Translation on the Nepali-English language pair

Stars: ✭ 29 (-88.8%)

Mutual labels: data-cleaning

Tutorials

AI-related tutorials. Access any of them for free → https://towardsai.net/editorial

Stars: ✭ 204 (-21.24%)

Mutual labels: data-science

Cjworkbench

The data journalism platform with built-in training

Stars: ✭ 244 (-5.79%)

Mutual labels: data-science

Sc17

SuperComputing 2017 Deep Learning Tutorial

Stars: ✭ 211 (-18.53%)

Mutual labels: data-science

FIFA-2019-Analysis

A project based on the FIFA World Cup 2019 that analyzes the performance and efficiency of teams, players, countries, and related factors using data analysis and data visualization

Stars: ✭ 28 (-89.19%)

Mutual labels: data-cleaning

Covid19za

Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa

Stars: ✭ 208 (-19.69%)

Mutual labels: data-science

Retriever

Quickly download, clean up, and install public datasets into a database management system

Stars: ✭ 241 (-6.95%)

Mutual labels: data-science

Eli5

A library for debugging/inspecting machine learning classifiers and explaining their predictions

Stars: ✭ 2,477 (+856.37%)

Mutual labels: data-science

Sk Dist

Distributed scikit-learn meta-estimators in PySpark

Stars: ✭ 260 (+0.39%)

Mutual labels: data-science

Scihub

Source code and data analyses for the Sci-Hub Coverage Study

Stars: ✭ 205 (-20.85%)

Mutual labels: data-science

Opends4all

OpenDS4All project, hosted by LF AI & Data

Stars: ✭ 240 (-7.34%)

Mutual labels: data-science

Python For Data Science

A collection of Jupyter Notebooks for learning Python for Data Science.

Stars: ✭ 205 (-20.85%)

Mutual labels: data-science

optimus

🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

Stars: ✭ 1,351 (+421.62%)

Mutual labels: data-cleaning

Estadistica Con R

Apuntes personales sobre estadística, machine learning y lenguaje de programación R

Stars: ✭ 201 (-22.39%)

Mutual labels: data-science

Ntm One Shot Tf

One Shot Learning using Memory-Augmented Neural Networks (MANN) based on Neural Turing Machine architecture in Tensorflow

Stars: ✭ 238 (-8.11%)

Mutual labels: data-science

Instascrape

Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically

Stars: ✭ 202 (-22.01%)

Mutual labels: data-science

foofah

Foofah: programming-by-example data transformation program synthesizer

Stars: ✭ 24 (-90.73%)

Mutual labels: data-cleaning

Fastpages

An easy to use blogging platform, with enhanced support for Jupyter Notebooks.

Stars: ✭ 2,888 (+1015.06%)

Mutual labels: data-science

Data Mining Conferences

Ranking, acceptance rate, deadline, and publication tips

Stars: ✭ 236 (-8.88%)

Mutual labels: data-science

Achoo

Achoo uses a Raspberry Pi to predict whether my son will need his inhaler on any given day using weather, pollen, and air quality data. If the prediction for a given day is above a specified threshold, the Pi will email his school nurse and me, notifying us that he may need preemptive treatment. Community-sourced health monitoring!

Stars: ✭ 200 (-22.78%)

Mutual labels: data-science

Keras

Deep Learning for humans

Stars: ✭ 53,476 (+20547.1%)

Mutual labels: data-science

Ml Auto Baseball Pitching Overlay

⚾🤖⚾ Automatic baseball pitching overlay in realtime

Stars: ✭ 200 (-22.78%)

Mutual labels: data-science

Ploomber

A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.

Stars: ✭ 221 (-14.67%)

Mutual labels: data-science

Pytorch Geometric Yoochoose

This is a tutorial for PyTorch Geometric on the YooChoose dataset

Stars: ✭ 198 (-23.55%)

Mutual labels: data-science

Dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.

Stars: ✭ 3,480 (+1243.63%)

Mutual labels: data-science

Cql

Categorical Query Language IDE

Stars: ✭ 196 (-24.32%)

Mutual labels: data-science

Data Science Free

Free Resources For Data Science created by Shubham Kumar

Stars: ✭ 232 (-10.42%)

Mutual labels: data-science

Tad

A desktop application for viewing and analyzing tabular data

Stars: ✭ 2,275 (+778.38%)

Mutual labels: data-science

Ray

An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

Stars: ✭ 18,547 (+7061%)

Mutual labels: data-science

Imodels

Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).

Stars: ✭ 194 (-25.1%)

Mutual labels: data-science

Prodigy Recipes

🍳 Recipes for Prodigy, our fully scriptable annotation tool

Stars: ✭ 229 (-11.58%)

Mutual labels: data-science

Gophernet

A simple from-scratch neural net written in Go

Stars: ✭ 194 (-25.1%)

Mutual labels: data-science

bumblebee

🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)

Stars: ✭ 120 (-53.67%)

Mutual labels: data-cleaning

Machinelearningnotebooks

Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft

Stars: ✭ 2,790 (+977.22%)

Mutual labels: data-science

Tablesaw

Java dataframe and visualization library

Stars: ✭ 2,785 (+975.29%)

Mutual labels: data-science

Plynx

PLynx is a domain agnostic platform for managing reproducible experiments and data-oriented workflows.

Stars: ✭ 192 (-25.87%)

Mutual labels: data-science

Koalas

Koalas: pandas API on Apache Spark

Stars: ✭ 3,044 (+1075.29%)

Mutual labels: data-science

Speedml

Speedml is a Python package to speed start machine learning projects.

Stars: ✭ 192 (-25.87%)

Mutual labels: data-science

R4ds Exercise Solutions

Exercise solutions to "R for Data Science"

Stars: ✭ 226 (-12.74%)

Mutual labels: data-science

Uci Ml Api

Simple API for UCI Machine Learning Dataset Repository (search, download, analyze)

Stars: ✭ 190 (-26.64%)

Mutual labels: data-science

Course Nlp

A Code-First Introduction to NLP course

Stars: ✭ 3,029 (+1069.5%)

Mutual labels: data-science

Deep Learning Machine Learning Stock

Stock for Deep Learning and Machine Learning

Stars: ✭ 240 (-7.34%)

Mutual labels: data-science

Elastic

R client for the Elasticsearch HTTP API

Stars: ✭ 227 (-12.36%)

Mutual labels: data-science

Virgilio

Virgilio is developed and maintained by these awesome people. You can email us at virgilio.datascience (at) gmail.com or join the Discord chat.

Stars: ✭ 13,200 (+4996.53%)

Mutual labels: data-science

Alphatools

Quantitative finance research tools in Python

Stars: ✭ 226 (-12.74%)

Mutual labels: data-science

Delbot

It understands your voice commands, searches news and knowledge sources, and summarizes and reads out content to you.

Stars: ✭ 191 (-26.25%)

Mutual labels: data-science

Datacamp Python Data Science Track

All the slides, accompanying code, and exercises are stored in this repo. 🎈

Stars: ✭ 250 (-3.47%)

Mutual labels: data-science

Observations

Tools for loading standard data sets in machine learning

Stars: ✭ 190 (-26.64%)

Mutual labels: data-science

Streamlit

Streamlit — The fastest way to build data apps in Python

Stars: ✭ 16,906 (+6427.41%)

Mutual labels: data-science

Vec4ir

Word Embeddings for Information Retrieval

Stars: ✭ 188 (-27.41%)

Mutual labels: data-science

errorlocate

Find and replace erroneous fields in data using validation rules

Stars: ✭ 19 (-92.66%)

Mutual labels: data-cleaning

Pytorch Lightning

The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

Stars: ✭ 16,641 (+6325.1%)

Mutual labels: data-science

Darwinexlabs

Datasets, tools and more from Darwinex Labs - Prop Investing Arm & Quant Team @ Darwinex

Stars: ✭ 248 (-4.25%)

Mutual labels: data-science

Full Stack Data Science

Full Stack Data Science in Python

Stars: ✭ 227 (-12.36%)

Mutual labels: data-science

Machine Learning Resources

A curated list of awesome machine learning frameworks, libraries, courses, books and many more.

Stars: ✭ 226 (-12.74%)

Mutual labels: data-science

Gspread Pandas

A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.

Stars: ✭ 226 (-12.74%)

Mutual labels: data-science

61-120 of 885 similar projects
