Top 23 data-quality open source projects

Django-Data-quality-system
Data governance and data quality checking/monitoring platform (Django+jQuery+MySQL)
qamd
QAMyData, a data quality assurance tool for SPSS, STATA, SAS and CSV files.
DataQualityDashboard
A tool to help improve data quality standards in observational data science.
TracIn
Implementation of Estimating Training Data Influence by Tracing Gradient Descent (NeurIPS 2020)
hooqu
hooqu is a library built on top of Pandas-like DataFrames for defining "unit tests for data". It is a spiritual port of Apache Deequ to Python.
Data-Quality-Analysis
The PEDSnet Data Quality Assessment Toolkit (OMOP CDM)
check-engine
Data validation library for PySpark 3.0.0
versatile-data-kit
Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
leila
Library for data quality assessment and for interacting with the datos.gov.co data portal
great expectations action
A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows.
contessa
Easy way to define, execute and store quality rules for your data.
soda-spark
Soda Spark is a PySpark library that helps you test your data in Spark DataFrames.
NBi
NBi is a testing framework (add-on to NUnit) for Business Intelligence and Data Access. Its main goal is to let users create tests with a declarative approach based on an XML syntax. With NBi, you don't need to develop C# or Java code to specify your tests! Nor do you need Visual Studio or Eclipse to compile y…
hive compared bq
hive_compared_bq compares and validates two SQL-like tables and graphically shows the rows and columns that differ.