The main abnormal behaviors that this project can detect are violence, covering the camera, choking, lying down, running, and motion in restricted areas. It offers flexibility by letting users choose which abnormal behaviors to detect, and it logs every abnormal event for later review. We used three methods to detect abnorma…

Stars: ✭ 35 (-78.79%)

Mutual labels: influence

soda-spark

Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes

Stars: ✭ 58 (-64.85%)

Mutual labels: data-quality

great expectations action

A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows.

Stars: ✭ 66 (-60%)

Mutual labels: data-quality

Great expectations

Always know what to expect from your data.

Stars: ✭ 5,808 (+3420%)

Mutual labels: data-quality

Data-Quality-Analysis

The PEDSnet Data Quality Assessment Toolkit (OMOP CDM)

Stars: ✭ 19 (-88.48%)

Mutual labels: data-quality

contessa

Easy way to define, execute and store quality rules for your data.

Stars: ✭ 17 (-89.7%)

Mutual labels: data-quality

versatile-data-kit

Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.

Stars: ✭ 144 (-12.73%)

Mutual labels: data-quality

hive compared bq

hive_compared_bq compares/validates 2 (SQL like) tables, and graphically shows the rows/columns that are different.

Stars: ✭ 27 (-83.64%)

Mutual labels: data-quality

NBi

NBi is a testing framework (an add-on to NUnit) for Business Intelligence and Data Access. Its main goal is to let users create tests declaratively, using an XML-based syntax. With NBi, you don't need to write C# or Java code to specify your tests, nor do you need Visual Studio or Eclipse to compile y…

Stars: ✭ 102 (-38.18%)

Mutual labels: data-quality

popular-github-template

📗 Repo Template: Make Your GitHub Repos More Popular

Stars: ✭ 16 (-90.3%)

Mutual labels: influence

roguelike-universe

Understanding game design inspiration of roguelike games via web scraping and network analysis.

Stars: ✭ 17 (-89.7%)

Mutual labels: influence

check-engine

Data validation library for PySpark 3.0.0

Stars: ✭ 29 (-82.42%)

Mutual labels: data-quality

Pandas Profiling

Create HTML profiling reports from pandas DataFrame objects

Stars: ✭ 8,329 (+4947.88%)

Mutual labels: data-quality

dqlab-career-track

A collection of scripts written to complete DQLab Data Analyst Career Track 📊

Stars: ✭ 53 (-67.88%)

Mutual labels: data-quality

hooqu

hooqu is a library built on top of Pandas-like DataFrames for defining "unit tests for data". It is a spiritual port of Apache Deequ to Python.

Stars: ✭ 17 (-89.7%)

Mutual labels: data-quality

datatile

A library for managing, validating, summarizing, and visualizing data.

Stars: ✭ 419 (+153.94%)

Mutual labels: data-quality

leila

Library for data quality assessment and for interacting with the datos.gov.co data portal.

Stars: ✭ 56 (-66.06%)

Mutual labels: data-quality


TracIn

Implementation of Estimating Training Data Influence by Tracing Gradient Descent

Goal: Identify the influence of training data points on F(data point at inference time).

Idea: Trace stochastic gradient descent, using the loss function as F.

Equation
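
For reference, the checkpoint-based estimator from the TracIn paper (TracInCP) can be written as below. The notation is an assumption made here for illustration: $w_{t_i}$ is the $i$-th of $k$ saved checkpoints, $\eta_i$ is the learning rate in effect around that checkpoint, $\ell$ is the loss, $z$ is a training point, and $z'$ is the point seen at inference time.

```latex
\mathrm{TracInCP}(z, z') \;=\; \sum_{i=1}^{k} \eta_i \,
  \nabla_{w}\, \ell(w_{t_i}, z) \cdot \nabla_{w}\, \ell(w_{t_i}, z')
```

Each term is the dot product of the two loss gradients at one checkpoint, so a training point whose gradient aligns with that of $z'$ receives a positive score.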

Broader Impact

This work proposes a practical technique to understand the influence of training data points on loss functions/predictions/differentiable metrics. The technique is easier to apply than previously proposed techniques, and we hope it is widely used to understand the quality and influence of training data. For most real world applications, the impact of improving the quality of training data is simply to improve the quality of the model. In this sense, we expect the broader impact to be positive.

Most of the implementation in this repo will be in the form of colabs. Consider reading the FAQ before adapting to your own data.

Terminology

Proponents are training points with positive scores, proportional to how much they reduce the loss.
Opponents are training points with negative scores, proportional to how much they increase the loss.
Self-influence is the influence of a training point on its own loss.
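
The terms above can be made concrete with a minimal NumPy sketch. It is not the code from the repo's colabs: it assumes a toy linear model with squared loss, a fixed learning rate, and one checkpoint per epoch, and every function name (`grad_squared_loss`, `train_with_checkpoints`, `tracin_cp`) is hypothetical.

```python
import numpy as np

def grad_squared_loss(w, x, y):
    """Gradient of 0.5 * (w.x - y)**2 with respect to w."""
    return (x @ w - y) * x

def train_with_checkpoints(X, Y, lr=0.05, epochs=5, seed=0):
    """Plain SGD on a linear model; the weights are saved after each epoch."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    checkpoints = []
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            w = w - lr * grad_squared_loss(w, X[i], Y[i])
        checkpoints.append(w.copy())
    return checkpoints

def tracin_cp(checkpoints, lr, z_a, z_b):
    """Sum over checkpoints of lr * <grad loss(z_a), grad loss(z_b)>."""
    (xa, ya), (xb, yb) = z_a, z_b
    return sum(
        lr * float(grad_squared_loss(w, xa, ya) @ grad_squared_loss(w, xb, yb))
        for w in checkpoints
    )

# Toy data (assumed for illustration): noisy samples of y = 2*x0 - x1.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 2))
Y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=20)

lr = 0.05
checkpoints = train_with_checkpoints(X, Y, lr=lr)

# Influence of every training point on one point seen at inference time.
z_test = (np.array([1.0, 1.0]), 1.0)
scores = np.array(
    [tracin_cp(checkpoints, lr, (X[i], Y[i]), z_test) for i in range(len(X))]
)

proponents = np.flatnonzero(scores > 0)  # positive score
opponents = np.flatnonzero(scores < 0)   # negative score

# Self-influence: a training point scored against itself.
self_influence = np.array(
    [tracin_cp(checkpoints, lr, (X[i], Y[i]), (X[i], Y[i])) for i in range(len(X))]
)
```

Because self-influence sums squared gradient norms, it is never negative; only scores against a different point can take either sign.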


frederick0329 / TracIn
