splink: Implementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including EM algorithm to estimate parameters
Stars: ✭ 181 (-72.37%)
yadf: Yet Another Dupes Finder
Stars: ✭ 32 (-95.11%)
Dedupe: 🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution.
Stars: ✭ 3,241 (+394.81%)
mail-deduplicate: 📧 CLI to deduplicate mails from mailboxes.
Stars: ✭ 134 (-79.54%)
dduper: Fast block-level out-of-band BTRFS deduplication tool.
Stars: ✭ 108 (-83.51%)
naas: ⚙️ Schedule notebooks, run them like APIs, securely expose your assets: Jupyter as a viable ⚡️ production environment
Stars: ✭ 219 (-66.56%)
Talisman: Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
Stars: ✭ 584 (-10.84%)
record-linkage-resources: Resources for tackling record linkage / deduplication / data matching problems
Stars: ✭ 67 (-89.77%)
entity-embed: PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.
Stars: ✭ 96 (-85.34%)
DQCS: Data quality control system
Stars: ✭ 34 (-94.81%)
Restic: Fast, secure, efficient backup program
Stars: ✭ 15,105 (+2206.11%)
datalake-etl-pipeline: Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-94.05%)
gallia-core: A schema-aware Scala library for data transformation
Stars: ✭ 44 (-93.28%)
YaEtl: Yet Another ETL in PHP
Stars: ✭ 60 (-90.84%)
FlutterIOT: Visit our website for more Mobile and Web applications
Stars: ✭ 66 (-89.92%)
BETL-old: BETL. Metadata-driven ETL generation using T-SQL
Stars: ✭ 17 (-97.4%)
neptune-client: 📒 Experiment tracking tool and model registry
Stars: ✭ 348 (-46.87%)
dask-sql: Distributed SQL Engine in Python using Dask
Stars: ✭ 271 (-58.63%)
dlink: Dinky is an out-of-the-box, one-stop real-time computing platform dedicated to the construction and practice of Unified Streaming & Batch and Unified Data Lake & Data Warehouse. Based on Apache Flink, Dinky can connect to many big data frameworks, including OLAP and Data Lake.
Stars: ✭ 1,535 (+134.35%)
RE-VERB: Speaker diarization system using an LSTM
Stars: ✭ 22 (-96.64%)
google-sheets-etl: Live import all your Google Sheets to your data warehouse
Stars: ✭ 15 (-97.71%)
lm-scorer: 📃 Language-model-based sentence scoring library
Stars: ✭ 264 (-59.69%)
zdh server: Data collection platform zdh, ETL processing service
Stars: ✭ 53 (-91.91%)
zpaqfranz: Deduplicating archiver with encryption and paranoid-level tests. Swiss army knife for the serious backup and disaster recovery manager. Ransomware neutralizer. Win/Linux/Unix
Stars: ✭ 86 (-86.87%)
neural inverse knitting: Code for Neural Inverse Knitting: From Images to Manufacturing Instructions
Stars: ✭ 30 (-95.42%)
blockstack.js-old: The Blockstack JS library for identity and authentication
Stars: ✭ 20 (-96.95%)
cogito: Cogito Identity Management https://cogito.mobi
Stars: ✭ 14 (-97.86%)
fuzzychinese: A small package to fuzzy match Chinese words
Stars: ✭ 50 (-92.37%)
Hacktoberfest-2k19: Just add pull requests to this repo and stand a chance to win a limited edition Hacktoberfest T-shirt.
Stars: ✭ 33 (-94.96%)
django-data-migration: Data migration framework for Django that migrates legacy data into your new Django app
Stars: ✭ 18 (-97.25%)
fuzzywuzzy: Fuzzy string matching for PHP
Stars: ✭ 60 (-90.84%)
DeepBump: Normal & height map generation from single pictures
Stars: ✭ 185 (-71.76%)
apiary-data-lake: Terraform scripts for deploying Apiary Data Lake
Stars: ✭ 15 (-97.71%)
poa-popa: DApp for proof of physical address (PoPA) attestation for validators of POA Network
Stars: ✭ 22 (-96.64%)
socrates: PHP package to validate and extract information from National Identification Numbers.
Stars: ✭ 46 (-92.98%)
morph-kgc: Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (-88.24%)
winter: WInte.r is a Java framework for end-to-end data integration. The WInte.r framework implements well-known methods for data pre-processing, schema matching, identity resolution, data fusion, and result evaluation.
Stars: ✭ 101 (-84.58%)
Yoyo-leaf: Yoyo-leaf is an awesome command-line fuzzy finder.
Stars: ✭ 49 (-92.52%)
card-scanner-flutter: A Flutter package for fast, accurate and secure credit card and debit card scanning
Stars: ✭ 82 (-87.48%)
mlapp: MLApp is a Python library for building scalable data science solutions that meet modern software engineering standards.
Stars: ✭ 42 (-93.59%)
CustomVisionMicrosoftToCoreMLDemoApp: This app recognises 3 hand signs (fist, high five and victory hand; basically rock, paper, scissors) from a live camera feed. It uses a HandSigns.mlmodel trained with Custom Vision from Microsoft.
Stars: ✭ 25 (-96.18%)
Learning-Resources: This repository contains curated, useful resources drafted by DSC Domain Leads.
Stars: ✭ 21 (-96.79%)
openrefine-batch: Shell script to run OpenRefine in batch mode (import, transform, export). It orchestrates OpenRefine (server) and a Python client that communicates with the OpenRefine API.
Stars: ✭ 76 (-88.4%)
active-directory-android: An Android app that uses Azure AD and the ADAL library for authenticating the user and calling a web API using OAuth 2.0 access tokens.
Stars: ✭ 33 (-94.96%)
r2inference: RidgeRun Inference Framework
Stars: ✭ 22 (-96.64%)
DevSoc21: Official website for DEVSOC 21, our annual flagship hackathon.
Stars: ✭ 15 (-97.71%)
facematch: Facematch is a tool that verifies whether two photos contain the same person.
Stars: ✭ 62 (-90.53%)
ml-graphlab-boilerplate: Machine learning boilerplate to get you started in minutes (graphlab + sframe + jupyter + docker)
Stars: ✭ 17 (-97.4%)
Authentication: Authentication examples for AspNetCore 3.1
Stars: ✭ 37 (-94.35%)
FlowMaster: ETL flow framework based on YAML configs in Python
Stars: ✭ 19 (-97.1%)
zdh web: Big data collection and extraction platform
Stars: ✭ 292 (-55.42%)
identity-site: This is the Login.gov main website, where the public can learn about their one account for government.
Stars: ✭ 28 (-95.73%)
identityazuretable: This project provides a high-performance cloud solution for ASP.NET Identity Core using Azure Table storage, replacing the Entity Framework / MSSQL provider.
Stars: ✭ 97 (-85.19%)
leetspeek: Open and collaborative content from leet hackers!
Stars: ✭ 11 (-98.32%)
opus: No description or website provided.
Stars: ✭ 22 (-96.64%)