jezcope / pyrefine

Licence: MIT license
Execute OpenRefine JSON scripts without OpenRefine (or Java)

Programming languages: Python, Makefile

PyRefine

OpenRefine is a great tool for exploring and cleaning datasets prior to analysing them. It also records an undo history of all actions that you can export as a sort of script in JSON format. However, in order to execute that script on a new dataset, you need to manually import it through the graphical interface or set up a BatchRefine server, neither of which is quick.

PyRefine allows you to execute OpenRefine JSON scripts against datasets without firing up a full Java/OpenRefine server. It has a command-line tool for quick use, or you can use it as a library and integrate it into your pandas-based data analysis pipeline.

More details in this blog post.

Please note: PyRefine is still very much alpha-quality, and it probably doesn't work exactly how you're expecting right now. That said, please try it out, and consider contributing!

Features

  • Execute OpenRefine JSON against a dataset from the command line
  • Execute OpenRefine JSON from a Python script
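
To make the idea of "executing OpenRefine JSON" concrete, here is a minimal sketch of replaying an exported operation history in plain Python. It is not PyRefine's implementation or API: the names `apply_script`, `rename_column`, `remove_column` and `HANDLERS` are hypothetical, the sample data is invented, and only two operation types (`core/column-rename` and `core/column-removal`) are handled. To stay dependency-free it uses lists of dictionaries rather than a pandas DataFrame.

```python
import json

# A small operation history in the JSON shape that OpenRefine exports.
# Only two operation types appear here; a real history can contain many more.
SCRIPT = """
[
  {"op": "core/column-rename", "oldColumnName": "nm", "newColumnName": "name"},
  {"op": "core/column-removal", "columnName": "notes"}
]
"""


def rename_column(rows, op):
    """Rename one column, keeping its position within each row."""
    old, new = op["oldColumnName"], op["newColumnName"]
    return [
        {(new if key == old else key): value for key, value in row.items()}
        for row in rows
    ]


def remove_column(rows, op):
    """Drop one column from every row."""
    name = op["columnName"]
    return [
        {key: value for key, value in row.items() if key != name}
        for row in rows
    ]


# Hypothetical dispatch table: operation name -> function that applies it.
HANDLERS = {
    "core/column-rename": rename_column,
    "core/column-removal": remove_column,
}


def apply_script(rows, script_text):
    """Apply each operation in order, failing loudly on unsupported ones."""
    for op in json.loads(script_text):
        handler = HANDLERS.get(op["op"])
        if handler is None:
            raise NotImplementedError(f"unsupported operation: {op['op']}")
        rows = handler(rows, op)
    return rows


# Invented sample data, for illustration only.
rows = [
    {"nm": "Ada", "notes": "x", "year": 1815},
    {"nm": "Alan", "notes": "y", "year": 1912},
]
cleaned = apply_script(rows, SCRIPT)
print(cleaned)
```

Raising on an unknown operation, rather than skipping it silently, is a deliberate choice in this sketch: a partially applied cleaning script is hard to notice in the output data.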

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
