datatile
A library for managing, validating, summarizing, and visualizing data.
Stars: ✭ 419 (-56.13%)
soda-spark
Soda Spark is a PySpark library that helps you test your data in Spark DataFrames.
Stars: ✭ 58 (-93.93%)
NBi
NBi is a testing framework (add-on to NUnit) for Business Intelligence and Data Access. The main goal of this framework is to let users create tests with a declarative approach based on an XML syntax. With NBi, you don't need to develop C# or Java code to specify your tests! Nor do you need Visual Studio or Eclipse to compile y…
Stars: ✭ 102 (-89.32%)
Pandas Profiling
Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+772.15%)
hooqu
hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to Python
Stars: ✭ 17 (-98.22%)
penguin-datalayer-collect
A data layer quality monitoring and validation module. This solution is part of the Raft Suite ecosystem.
Stars: ✭ 19 (-98.01%)
dbt ad reporting
Fivetran's ad reporting dbt package. Combine your Facebook, Google, Pinterest, LinkedIn, Twitter, Snapchat and Microsoft advertising spend using this package.
Stars: ✭ 68 (-92.88%)
pyglotaran
A Python library for Global and Target Analysis of time-resolved spectroscopy data
Stars: ✭ 33 (-96.54%)
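tieba-zhuaqu
A distributed crawler for Baidu Tieba, used for Tieba data mining. Analyzes data along both the forum dimension and the user dimension.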
Stars: ✭ 56 (-94.14%)
dflib
In-memory Java DataFrame library
Stars: ✭ 50 (-94.76%)
kuwala
Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…
Stars: ✭ 474 (-50.37%)
versatile-data-kit
Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (-84.92%)
akshare
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
Stars: ✭ 5,155 (+439.79%)
uetai
Custom ML tracking experiment and debugging tools.
Stars: ✭ 17 (-98.22%)
LeTourDataSet
Every cyclist and stage of the Tour de France in two CSV files.
Stars: ✭ 61 (-93.61%)
genie
Genie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)
Stars: ✭ 21 (-97.8%)
TracIn
Implementation of Estimating Training Data Influence by Tracing Gradient Descent (NeurIPS 2020)
Stars: ✭ 165 (-82.72%)
taller SparkR
SparkR workshop for the Jornadas de Usuarios de R (R Users Conference)
Stars: ✭ 12 (-98.74%)
Fraud-Detection-in-Online-Transactions
Detecting fraud in online transactions using anomaly detection techniques such as oversampling and undersampling. Because the ratio of frauds is less than 0.00005, simply applying a classification algorithm may result in overfitting.
Stars: ✭ 41 (-95.71%)
covidviz
Professional visualizations of COVID-19, emulating NYT, The Guardian, Washington Post, The Economist & others, using only Python & Altair.
Stars: ✭ 24 (-97.49%)
ipython-notebooks
A collection of Jupyter notebooks exploring different datasets.
Stars: ✭ 43 (-95.5%)
check-engine
Data validation library for PySpark 3.0.0
Stars: ✭ 29 (-96.96%)
ria-jit
Lightweight and performant dynamic binary translation for RISC-V code on x86-64
Stars: ✭ 38 (-96.02%)
meta-csv
A Clojure smart reader for CSV files
Stars: ✭ 20 (-97.91%)
PythonTipsDS
Python tips for data scientists
Stars: ✭ 23 (-97.59%)
golearn
🔥 Golang basics and hands-on practice (including crawler, distributed systems, data analysis, redis, etcd, raft, crontab-task)
Stars: ✭ 36 (-96.23%)
online-course-recommendation-system
Built on data fetched from Pluralsight's course API. Works with a model trained with the K-means unsupervised clustering algorithm.
Stars: ✭ 31 (-96.75%)
FDBeye
R tools for eyetracker workflows.
Stars: ✭ 101 (-89.42%)
vinum
Vinum is a SQL processor for Python, designed for data analysis workflows and in-memory analytics.
Stars: ✭ 57 (-94.03%)
metrics
📈 What to measure, how to measure it.
Stars: ✭ 14 (-98.53%)
dbt-ml-preprocessing
A SQL port of Python's scikit-learn preprocessing module, provided as cross-database dbt macros.
Stars: ✭ 128 (-86.6%)
stats
📈 Useful notes and personal collections on statistics.
Stars: ✭ 16 (-98.32%)
hotmap
WebGL Heatmap Viewer for Big Data and Bioinformatics
Stars: ✭ 13 (-98.64%)
computational-neuroscience
Short undergraduate course taught at the University of Pennsylvania on computational and theoretical neuroscience. Provides an introduction to programming in MATLAB, single-neuron models, ion channel models, basic neural networks, and neural decoding.
Stars: ✭ 36 (-96.23%)
ggshakeR
An analysis and visualization R package that works with publicly available soccer data
Stars: ✭ 69 (-92.77%)
RepSeP
Reproducible Self-Publishing - Demo Publications in the Most Common Formats
Stars: ✭ 14 (-98.53%)
elucidate
Convenience functions to help researchers elucidate patterns in their data
Stars: ✭ 26 (-97.28%)
advanced-pandas
Pandas is a powerful tool for data exploration and analysis (including timeseries).
Stars: ✭ 22 (-97.7%)
dataViz CADi
Materials for the "Data Visualization" CADi workshop @ "Tecnológico de Monterrey"
Stars: ✭ 14 (-98.53%)
Naive-Resume-Matching
Text similarity applied to resumes, comparing them with job descriptions and producing a score to rank them. Similar to an ATS.
Stars: ✭ 27 (-97.17%)
dbt-clickhouse
The ClickHouse plugin for dbt (data build tool)
Stars: ✭ 77 (-91.94%)
iMOKA
interactive Multi Objective K-mer Analysis
Stars: ✭ 19 (-98.01%)
mixedvines
Python package for canonical vine copula trees with mixed continuous and discrete marginals
Stars: ✭ 36 (-96.23%)
8-Week-SQL-Challenge
Case study solutions for #8WeekSQLChallenge at https://8weeksqlchallenge.com
Stars: ✭ 43 (-95.5%)
Moose
MOOSE - Platform for software and data analysis.
Stars: ✭ 110 (-88.48%)
ospi
Open Source Presence Infographic of Indian Startups
Stars: ✭ 25 (-97.38%)
lightdash
An open source alternative to Looker built using dbt. Made for analysts ❤️
Stars: ✭ 1,082 (+13.3%)
open-digger
Open source analysis tools
Stars: ✭ 193 (-79.79%)