Libpostal: A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
Stars: ✭ 3,312 (+1719.78%)
Mutual labels: deduplication
Dupandas: 📊 Python package for performing deduplication using flexible text matching and cleaning in pandas DataFrames
Stars: ✭ 20 (-89.01%)
Mutual labels: deduplication
Vdo: Userspace tools for managing VDO volumes.
Stars: ✭ 138 (-24.18%)
Mutual labels: deduplication
Kopia: Cross-platform backup tool for Windows, macOS & Linux with fast, incremental backups, client-side end-to-end encryption, compression and data deduplication. CLI and GUI included.
Stars: ✭ 507 (+178.57%)
Mutual labels: deduplication
Jdupes: A powerful duplicate file finder and an enhanced fork of 'fdupes'.
Stars: ✭ 790 (+334.07%)
Mutual labels: deduplication
Rmlint: Extremely fast tool to remove duplicates and other lint from your filesystem
Stars: ✭ 996 (+447.25%)
Mutual labels: deduplication
UMICollapse: Accelerating the deduplication and collapsing process for reads with Unique Molecular Identifiers (UMI). Heavily optimized for scalability and orders of magnitude faster than a previous tool.
Stars: ✭ 31 (-82.97%)
Mutual labels: deduplication
Kvdo: A pair of kernel modules which provide pools of deduplicated and/or compressed block storage.
Stars: ✭ 168 (-7.69%)
Mutual labels: deduplication
Borgmatic: Simple, configuration-driven backup software for servers and workstations
Stars: ✭ 902 (+395.6%)
Mutual labels: deduplication
Spark Lucenerdd: Spark RDD with Lucene's query and entity linkage capabilities
Stars: ✭ 114 (-37.36%)
Mutual labels: deduplication
Recordlinkage: A toolkit for record linkage and duplicate detection in Python
Stars: ✭ 532 (+192.31%)
Mutual labels: deduplication
Rdedup: Data deduplication engine, supporting optional compression and public key encryption.
Stars: ✭ 690 (+279.12%)
Mutual labels: deduplication
Rltk: Record Linkage ToolKit (Find and link entities)
Stars: ✭ 71 (-60.99%)
Mutual labels: deduplication
Alertmanager: Prometheus Alertmanager
Stars: ✭ 4,574 (+2413.19%)
Mutual labels: deduplication
Dejavu: Quickly detect already witnessed data.
Stars: ✭ 151 (-17.03%)
Mutual labels: deduplication
lieu: Dedupe/batch geocode addresses and venues around the world with libpostal
Stars: ✭ 73 (-59.89%)
Mutual labels: deduplication
Fastcdc Rs: FastCDC implementation in Rust
Stars: ✭ 31 (-82.97%)
Mutual labels: deduplication
Restic: Fast, secure, efficient backup program
Stars: ✭ 15,105 (+8199.45%)
Mutual labels: deduplication
Dupeguru: Find duplicate files
Stars: ✭ 2,385 (+1210.44%)
Mutual labels: deduplication
Fingerprints: Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation.
Stars: ✭ 91 (-50%)
Mutual labels: deduplication