datatile: A library for managing, validating, summarizing, and visualizing data.
Stars: ✭ 419 (+2364.71%)
NBi: NBi is a testing framework (add-on to NUnit) for Business Intelligence and Data Access. The main goal of this framework is to let users create tests with a declarative approach based on an XML syntax. With NBi, you don't need to write C# or Java code to specify your tests, nor do you need Visual Studio or Eclipse to compile y…
Stars: ✭ 102 (+500%)
re-data: re_data - fix data issues before your users & CEO discover them 😊
Stars: ✭ 955 (+5517.65%)
check-engine: Data validation library for PySpark 3.0.0
Stars: ✭ 29 (+70.59%)
versatile-data-kit: Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (+747.06%)
leila: Library for data quality assessment and interaction with the datos.gov.co portal.
Stars: ✭ 56 (+229.41%)
great expectations action: A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows.
Stars: ✭ 66 (+288.24%)
dqlab-career-track: A collection of scripts written to complete the DQLab Data Analyst Career Track 📊
Stars: ✭ 53 (+211.76%)
penguin-datalayer-collect: A data layer quality monitoring and validation module that is part of the Raft Suite ecosystem.
Stars: ✭ 19 (+11.76%)
contessa: An easy way to define, execute, and store quality rules for your data.
Stars: ✭ 17 (+0%)
soda-spark: Soda Spark is a PySpark library that helps you test your data in Spark DataFrames.
Stars: ✭ 58 (+241.18%)
hive compared bq: hive_compared_bq compares and validates two SQL-like tables and graphically shows the rows and columns that differ.
Stars: ✭ 27 (+58.82%)
Applied Ml: 📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Stars: ✭ 17,824 (+104747.06%)
Pandas Profiling: Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+48894.12%)
qamd: QAMyData, a data quality assurance tool for SPSS, STATA, SAS and CSV files.
Stars: ✭ 16 (-5.88%)
DataQualityDashboard: A tool to help improve data quality standards in observational data science.
Stars: ✭ 62 (+264.71%)
TracIn: Implementation of "Estimating Training Data Influence by Tracing Gradient Descent" (NeurIPS 2020)
Stars: ✭ 165 (+870.59%)