datatile: A library for managing, validating, summarizing, and visualizing data.
Stars: ✭ 419 (+193.01%)
Mutual labels: data-quality-checks, data-quality, data-quality-monitoring
re-data: re_data - fix data issues before your users & CEO discover them 😊
Stars: ✭ 955 (+567.83%)
Mutual labels: data-quality-checks, data-quality, data-quality-monitoring
NBi: NBi is a testing framework (add-on to NUnit) for Business Intelligence and Data Access. The main goal of this framework is to let users create tests with a declarative approach based on an XML syntax. With NBi, you don't need to develop C# or Java code to specify your tests! Nor do you need Visual Studio or Eclipse to compile y…
Stars: ✭ 102 (-28.67%)
Mutual labels: data-quality-checks, data-quality
hooqu: hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to Python.
Stars: ✭ 17 (-88.11%)
Mutual labels: data-quality-checks, data-quality
Data-Quality-Analysis: The PEDSnet Data Quality Assessment Toolkit (OMOP CDM)
Stars: ✭ 19 (-86.71%)
Mutual labels: data-quality-checks, data-quality
penguin-datalayer-collect: A data layer quality monitoring and validation module that is part of the Raft Suite ecosystem.
Stars: ✭ 19 (-86.71%)
Mutual labels: data-quality, data-quality-monitoring
soda-spark: Soda Spark is a PySpark library that helps you test your data in Spark Dataframes.
Stars: ✭ 58 (-59.44%)
Mutual labels: data-quality
osm-data-classification: Migrated to https://gitlab.com/Oslandia/osm-data-classification
Stars: ✭ 23 (-83.92%)
Mutual labels: data-quality
ohsome-quality-analyst: Data quality estimations for OpenStreetMap
Stars: ✭ 28 (-80.42%)
Mutual labels: data-quality
versatile-data-kit: Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (+0.7%)
Mutual labels: data-quality
Applied Ml: 📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Stars: ✭ 17,824 (+12364.34%)
Mutual labels: data-quality
pyrad: Python Radar Data Processing
Stars: ✭ 42 (-70.63%)
Mutual labels: data-quality-monitoring
contessa: Easy way to define, execute and store quality rules for your data.
Stars: ✭ 17 (-88.11%)
Mutual labels: data-quality
check-engine: Data validation library for PySpark 3.0.0
Stars: ✭ 29 (-79.72%)
Mutual labels: data-quality
TracIn: Implementation of Estimating Training Data Influence by Tracing Gradient Descent (NeurIPS 2020)
Stars: ✭ 165 (+15.38%)
Mutual labels: data-quality
hive compared bq: hive_compared_bq compares/validates 2 (SQL like) tables, and graphically shows the rows/columns that are different.
Stars: ✭ 27 (-81.12%)
Mutual labels: data-quality
leila: Library for evaluating data quality and interacting with the datos.gov.co data portal
Stars: ✭ 56 (-60.84%)
Mutual labels: data-quality
great expectations action: A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows.
Stars: ✭ 66 (-53.85%)
Mutual labels: data-quality
qamd: QAMyData, a data quality assurance tool for SPSS, STATA, SAS and CSV files.
Stars: ✭ 16 (-88.81%)
Mutual labels: data-quality
DataQualityDashboard: A tool to help improve data quality standards in observational data science.
Stars: ✭ 62 (-56.64%)
Mutual labels: data-quality