Data-Analyst-NanodegreeThis repo consists of the projects that I completed as a part of the Udacity's Data Analyst Nanodegree's curriculum.
Stars: ✭ 13 (-48%)
openrefine-batchShell script to run OpenRefine in batch mode (import, transform, export). It orchestrates OpenRefine (server) and a python client that communicates with the OpenRefine API.
Stars: ✭ 76 (+204%)
qsvCSVs sliced, diced & analyzed.
Stars: ✭ 438 (+1652%)
optimus🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Stars: ✭ 1,351 (+5304%)
openrefine-dockerOpenRefine is a free, open source power tool for working with messy data and improving it. This repository contains Dockerbuild files for automated builds.
Stars: ✭ 19 (-24%)
openrefine-clientThe OpenRefine Python Client from Paul Makepeace provides a library for communicating with an OpenRefine server. This fork extends the command line interface (CLI) and is distributed as a convenient one-file-executable (Windows, Linux, Mac). It is also available via Docker Hub, PyPI and Binder.
Stars: ✭ 67 (+168%)
conciliatorOpenRefine reconciliation services for VIAF, ORCID, and Open Library + framework for creating more.
Stars: ✭ 95 (+280%)
DatatestTools for test driven data-wrangling and data validation.
Stars: ✭ 238 (+852%)
R Ecology LessonData Analysis and Visualization in R for Ecologists
Stars: ✭ 218 (+772%)
QsacnpjPacote que trata e organiza os dados do Cadastro Nacional da Pessoa Jurídica (CNPJ)
Stars: ✭ 187 (+648%)
SjmiscData transformation and utility functions for R
Stars: ✭ 141 (+464%)
Data Forge JsJavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
Stars: ✭ 139 (+456%)
HypertoolsA Python toolbox for gaining geometric insights into high-dimensional data
Stars: ✭ 1,678 (+6612%)
Uc R.github.ioMain repository for R programming courses @ University of Cincinnati, courses and tutorials that focus on data wrangling, exploration, visualization, and analysis with R.
Stars: ✭ 76 (+204%)
OpenrefineOpenRefine is a free, open source power tool for working with messy data and improving it
Stars: ✭ 8,531 (+34024%)
Optimus🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+3844%)
Data Forge TsThe JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
Stars: ✭ 967 (+3768%)
Moderndive bookStatistical Inference via Data Science: A ModernDive into R and the Tidyverse
Stars: ✭ 527 (+2008%)
ProseMicrosoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Microsoft Program Synthesis using Example SDK.
Stars: ✭ 470 (+1780%)
SqawkLike Awk but with SQL and table joins
Stars: ✭ 263 (+952%)
prostoProsto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Stars: ✭ 54 (+116%)
mimirData-ish exploration through SQL+Uncertainty
Stars: ✭ 26 (+4%)
foofahFoofah: programming-by-example data transformation program synthesizer
Stars: ✭ 24 (-4%)
Chapter-2Code examples for Chapter 2 of Data Wrangling with JavaScript
Stars: ✭ 16 (-36%)
xploreA python package built for data scientist/analysts, AI/ML engineers for exploring features of a dataset in minimal number of lines of code for quick analysis before data wrangling and feature extraction.
Stars: ✭ 21 (-16%)
pandas-workshopAn introductory workshop on pandas with notebooks and exercises for following along.
Stars: ✭ 161 (+544%)
Data-Science-101Notes and tutorials on how to use python, pandas, seaborn, numpy, matplotlib, scipy for data science.
Stars: ✭ 19 (-24%)
Data-Wrangling-with-PythonSimplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices
Stars: ✭ 90 (+260%)
whyqddata wrangling simplicity, complete audit transparency, and at speed
Stars: ✭ 16 (-36%)