Dat8: General Assembly's 2015 Data Science course in Washington, DC
Stars: ✭ 1,516 (+7480%)
Refinr: Cluster and merge similar character values, an R implementation of the OpenRefine clustering algorithms
Stars: ✭ 91 (+355%)
Bumblebee: 🚕 A spreadsheet-like data preparation web app that works over Optimus (pandas, dask, cuDF, dask-cuDF and PySpark)
Stars: ✭ 86 (+330%)
Clean: Fast and Easy Data Cleaning (in R)
Stars: ✭ 49 (+145%)
Optimus: 🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+4830%)
Janitor: Simple tools for data cleaning in R
Stars: ✭ 981 (+4805%)
Drugs Recommendation Using Reviews: Analyzes drug descriptions, conditions, and reviews, then uses deep learning models to recommend drugs for each patient health condition.
Stars: ✭ 35 (+75%)
Boltzmannclean: Fill missing values in Pandas DataFrames using Restricted Boltzmann Machines
Stars: ✭ 23 (+15%)
Pandera: A light-weight, flexible, and expressive pandas data validation library
Stars: ✭ 506 (+2430%)
Nonechucks: Deal with bad samples in your dataset dynamically, use Transforms as Filters, and more!
Stars: ✭ 304 (+1420%)
Validate: Professional data validation for the R environment
Stars: ✭ 268 (+1240%)
Dirty cat: Encoding methods for dirty categorical variables
Stars: ✭ 259 (+1195%)
covid-19-data-cleanup: Scripts to clean up data from https://github.com/CSSEGISandData/COVID-19
Stars: ✭ 25 (+25%)
nepali-translator: Neural Machine Translation on the Nepali-English language pair
Stars: ✭ 29 (+45%)
allie: 🤖 A machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers).
Stars: ✭ 93 (+365%)
foofah: A programming-by-example data transformation program synthesizer
Stars: ✭ 24 (+20%)