62 open source projects that are alternatives to or similar to IntraArchiveDeduplicator

Spark Lucenerdd
Spark RDD with Lucene's query and entity linkage capabilities
Stars: ✭ 114 (+31.03%)
Mutual labels:  deduplication
Rdedup
Data deduplication engine, supporting optional compression and public key encryption.
Stars: ✭ 690 (+693.1%)
Mutual labels:  deduplication
Person reid baseline pytorch
Pytorch ReID: A tiny, friendly, strong PyTorch implementation of an object re-identification baseline. Tutorial 👉 https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial
Stars: ✭ 2,963 (+3305.75%)
Mutual labels:  image-search
Kvdo
A pair of kernel modules which provide pools of deduplicated and/or compressed block storage.
Stars: ✭ 168 (+93.1%)
Mutual labels:  deduplication
RocketMQDedupListener
Idempotent, deduplicating RocketMQ message consumer; supports MySQL or Redis as the idempotency table and works out of the box
Stars: ✭ 132 (+51.72%)
Mutual labels:  deduplication
pqlite
⚡ A fast embedded library for approximate nearest neighbor search
Stars: ✭ 141 (+62.07%)
Mutual labels:  image-search
Fastcdc Rs
FastCDC implementation in Rust
Stars: ✭ 31 (-64.37%)
Mutual labels:  deduplication
web-image-crawler
Code to download web-images
Stars: ✭ 15 (-82.76%)
Mutual labels:  image-search
Alertmanager
Prometheus Alertmanager
Stars: ✭ 4,574 (+5157.47%)
Mutual labels:  deduplication
Trace.moe
Anime Scene Search by Image
Stars: ✭ 3,231 (+3613.79%)
Mutual labels:  image-search
Lsh
Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents
Stars: ✭ 182 (+109.2%)
Mutual labels:  deduplication
splink
Implementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including EM algorithm to estimate parameters
Stars: ✭ 181 (+108.05%)
Mutual labels:  deduplication
dedupsqlfs
Deduplicating filesystem via Python3, FUSE and SQLite
Stars: ✭ 24 (-72.41%)
Mutual labels:  deduplication
Dejavu
Quickly detect already witnessed data.
Stars: ✭ 151 (+73.56%)
Mutual labels:  deduplication
img classification deep learning
No description or website provided.
Stars: ✭ 19 (-78.16%)
Mutual labels:  image-search
Rltk
Record Linkage ToolKit (Find and link entities)
Stars: ✭ 71 (-18.39%)
Mutual labels:  deduplication
Fergun
A utility Discord bot written in C# using Discord.Net
Stars: ✭ 26 (-70.11%)
Mutual labels:  image-search
Borgmatic
Simple, configuration-driven backup software for servers and workstations
Stars: ✭ 902 (+936.78%)
Mutual labels:  deduplication
yadf
Yet Another Dupes Finder
Stars: ✭ 32 (-63.22%)
Mutual labels:  deduplication
Recordlinkage
A toolkit for record linkage and duplicate detection in Python
Stars: ✭ 532 (+511.49%)
Mutual labels:  deduplication
Google Images Download
Python script to download hundreds of images from 'Google Images'. It is ready-to-run code!
Stars: ✭ 7,815 (+8882.76%)
Mutual labels:  image-search
lieu
Dedupe/batch geocode addresses and venues around the world with libpostal
Stars: ✭ 73 (-16.09%)
Mutual labels:  deduplication
Neural-Scam-Artist
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
Stars: ✭ 18 (-79.31%)
Mutual labels:  deduplication
gencore
Generate duplex/single consensus reads to reduce sequencing noise and remove duplicates
Stars: ✭ 91 (+4.6%)
Mutual labels:  deduplication
py-image-search-engine
Python Image Search Engine with OpenCV
Stars: ✭ 37 (-57.47%)
Mutual labels:  image-search
Data Matching Software
A list of free data matching and record linkage software.
Stars: ✭ 206 (+136.78%)
Mutual labels:  deduplication
dduper
Fast block-level out-of-band BTRFS deduplication tool.
Stars: ✭ 108 (+24.14%)
Mutual labels:  deduplication
Frost
A backup program that does deduplication, compression, encryption
Stars: ✭ 25 (-71.26%)
Mutual labels:  deduplication
Restic
Fast, secure, efficient backup program
Stars: ✭ 15,105 (+17262.07%)
Mutual labels:  deduplication
zpaqfranz
Deduplicating archiver with encryption and paranoid-level tests. Swiss army knife for the serious backup and disaster recovery manager. Ransomware neutralizer. Win/Linux/Unix
Stars: ✭ 86 (-1.15%)
Mutual labels:  deduplication
Dupeguru
Find duplicate files
Stars: ✭ 2,385 (+2641.38%)
Mutual labels:  deduplication
SmartImage
Reverse image search tool (SauceNao, ImgOps, trace.moe, and more)
Stars: ✭ 346 (+297.7%)
Mutual labels:  image-search
Vdo
Userspace tools for managing VDO volumes.
Stars: ✭ 138 (+58.62%)
Mutual labels:  deduplication
pupyl
🧿 Pupyl is a really fast image search library with which you can index your own (millions of) images and find similar images in milliseconds.
Stars: ✭ 83 (-4.6%)
Mutual labels:  image-search
Fingerprints
Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation.
Stars: ✭ 91 (+4.6%)
Mutual labels:  deduplication
nomenklatura
Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources
Stars: ✭ 158 (+81.61%)
Mutual labels:  deduplication
Rmlint
Extremely fast tool to remove duplicates and other lint from your filesystem
Stars: ✭ 996 (+1044.83%)
Mutual labels:  deduplication
weapp-saucenao
WeChat Mini Program: 识图娘
Stars: ✭ 19 (-78.16%)
Mutual labels:  image-search
Dupandas
📊 Python package for performing deduplication using flexible text matching and cleaning in pandas DataFrames
Stars: ✭ 20 (-77.01%)
Mutual labels:  deduplication
Jina
Cloud-native neural search framework for 𝙖𝙣𝙮 kind of data
Stars: ✭ 12,618 (+14403.45%)
Mutual labels:  image-search
Jdupes
A powerful duplicate file finder and an enhanced fork of 'fdupes'.
Stars: ✭ 790 (+808.05%)
Mutual labels:  deduplication
fuzzysearch
A site that allows you to reverse image search millions of furry images in under a second
Stars: ✭ 34 (-60.92%)
Mutual labels:  image-search
Talisman
Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
Stars: ✭ 584 (+571.26%)
Mutual labels:  deduplication
Fast Reid
SOTA Re-identification Methods and Toolbox
Stars: ✭ 2,287 (+2528.74%)
Mutual labels:  image-search
Kopia
Cross-platform backup tool for Windows, macOS & Linux with fast, incremental backups, client-side end-to-end encryption, compression and data deduplication. CLI and GUI included.
Stars: ✭ 507 (+482.76%)
Mutual labels:  deduplication
mail-deduplicate
📧 CLI to deduplicate mails from mail boxes.
Stars: ✭ 134 (+54.02%)
Mutual labels:  deduplication
Libpostal
A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
Stars: ✭ 3,312 (+3706.9%)
Mutual labels:  deduplication
Milvus
An open-source vector database for embedding similarity search and AI applications.
Stars: ✭ 9,015 (+10262.07%)
Mutual labels:  image-search
UMICollapse
Accelerating the deduplication and collapsing process for reads with Unique Molecular Identifiers (UMI). Heavily optimized for scalability and orders of magnitude faster than a previous tool.
Stars: ✭ 31 (-64.37%)
Mutual labels:  deduplication
cargo-limit
Cargo with less noise: warnings are skipped until errors are fixed, Neovim integration, etc.
Stars: ✭ 105 (+20.69%)
Mutual labels:  deduplication
record-linkage-resources
Resources for tackling record linkage / deduplication / data matching problems
Stars: ✭ 67 (-22.99%)
Mutual labels:  deduplication
trace.moe-www
Anime Scene Search by Image
Stars: ✭ 16 (-81.61%)
Mutual labels:  image-search
entity-embed
PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.
Stars: ✭ 96 (+10.34%)
Mutual labels:  deduplication
deduplication
Fast multi-threaded content-dependent chunking deduplication for Buffers in C++ with a reference implementation in Javascript. Ships with extensive tests, a fuzz test and a benchmark.
Stars: ✭ 59 (-32.18%)
Mutual labels:  deduplication
EfficientIR
"Artificial stupidity" local image retrieval tool | An EfficientNet-based image retrieval tool
Stars: ✭ 64 (-26.44%)
Mutual labels:  image-search
iqdb tagger
Search IQDB from CLI
Stars: ✭ 18 (-79.31%)
Mutual labels:  image-search
MoTIS
Mobile (iOS) text-to-image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP). Accepted at NAACL 2022.
Stars: ✭ 60 (-31.03%)
Mutual labels:  image-search
zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Stars: ✭ 655 (+652.87%)
Mutual labels:  deduplication
natural-language-joint-query-search
Search photos on Unsplash based on OpenAI's CLIP model; supports search with joint image+text queries and attention visualization.
Stars: ✭ 143 (+64.37%)
Mutual labels:  image-search
google-this
🔎 A simple yet powerful module to retrieve organic search results and much more from Google.
Stars: ✭ 88 (+1.15%)
Mutual labels:  image-search