467 open source projects that are alternatives to or similar to UMICollapse

eddie
No description or website provided.
Stars: ✭ 18 (-41.94%)
Mutual labels:  string-similarity, hamming
algoexpert-data-structures-algorithms
A collection of solutions for all problem statements on the AlgoExpert Coding Interview platform.
Stars: ✭ 134 (+332.26%)
Mutual labels:  data-structures
bioinf-commons
Bioinformatics library in Kotlin
Stars: ✭ 21 (-32.26%)
Mutual labels:  fastq
yadf
Yet Another Dupes Finder
Stars: ✭ 32 (+3.23%)
Mutual labels:  deduplication
stringbench
String matching algorithm benchmark
Stars: ✭ 31 (+0%)
Mutual labels:  string-search
Daily-Coding-DS-ALGO-Practice
An open source project🚀 for bringing all interview💥💥 and competitive📘 programming💥💥 questions under one repo📐📐
Stars: ✭ 255 (+722.58%)
Mutual labels:  data-structures
strsim
String similarity based on Dice's coefficient in Go
Stars: ✭ 39 (+25.81%)
Mutual labels:  string-similarity
Interview-Prep-DS-Algo
No description or website provided.
Stars: ✭ 14 (-54.84%)
Mutual labels:  data-structures
fastq utils
Validation and manipulation of FASTQ files, scRNA-seq barcode pre-processing and UMI quantification.
Stars: ✭ 25 (-19.35%)
Mutual labels:  fastq
stringdistance
A fuzzy matching string distance library for Scala and Java that includes Levenshtein distance, Jaro distance, Jaro-Winkler distance, Dice coefficient, N-Gram similarity, Cosine similarity, Jaccard similarity, Longest common subsequence, Hamming distance, and more.
Stars: ✭ 60 (+93.55%)
Mutual labels:  string-similarity
Pairfq
Sync paired-end FASTA/Q files and keep singleton reads
Stars: ✭ 18 (-41.94%)
Mutual labels:  fastq
multi string replace
A fast multiple string replace library for ruby. Uses a C implementation of the Aho–Corasick Algorithm based on https://github.com/morenice/ahocorasick while adding support for on the fly multiple string replacement. Faster alternative to String.gsub when dealing with non-regex (exact match) use cases
Stars: ✭ 16 (-48.39%)
Mutual labels:  string-search
Java-Questions-and-Solutions
This repository aims to solve and create new problems from different spheres of coding. A path to help students to get access to solutions and discuss their doubts.
Stars: ✭ 34 (+9.68%)
Mutual labels:  data-structures
readfq
A simple tool to calculate reads number and total base count in FASTQ file
Stars: ✭ 19 (-38.71%)
Mutual labels:  fastq
Data-Structures-and-Algorithms
Data Structures and Algorithms implementation in Python
Stars: ✭ 31 (+0%)
Mutual labels:  data-structures
fastq-and-furious
Efficient handling of FASTQ files from Python
Stars: ✭ 49 (+58.06%)
Mutual labels:  fastq
baps-bgd.github.io
This repository is used to maintain the site of BAPS. Please read the README if you are willing to contribute.
Stars: ✭ 17 (-45.16%)
Mutual labels:  data-structures
strutil
Golang metrics for calculating string similarity and other string utility functions
Stars: ✭ 114 (+267.74%)
Mutual labels:  string-similarity
pheniqs
Fast and accurate sequence demultiplexing
Stars: ✭ 14 (-54.84%)
Mutual labels:  fastq
textics
📉 JavaScript Text Statistics that counts lines, words, chars, and spaces.
Stars: ✭ 36 (+16.13%)
Mutual labels:  string-search
pysdsl
Python bindings to Succinct Data Structure Library 2.0
Stars: ✭ 23 (-25.81%)
Mutual labels:  data-structures
zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Stars: ✭ 655 (+2012.9%)
Mutual labels:  deduplication
DS Algo
A repository to maintain various data structures and algorithms
Stars: ✭ 23 (-25.81%)
Mutual labels:  data-structures
splink
Implementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including EM algorithm to estimate parameters
Stars: ✭ 181 (+483.87%)
Mutual labels:  deduplication
bin
My bioinfo toolbox
Stars: ✭ 42 (+35.48%)
Mutual labels:  fastq
Neural-Scam-Artist
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
Stars: ✭ 18 (-41.94%)
Mutual labels:  deduplication
stance
Learned string similarity for entity names using optimal transport.
Stars: ✭ 27 (-12.9%)
Mutual labels:  string-similarity
record-linkage-resources
Resources for tackling record linkage / deduplication / data matching problems
Stars: ✭ 67 (+116.13%)
Mutual labels:  deduplication
acid-store
A library for secure, deduplicated, transactional, and verifiable data storage
Stars: ✭ 48 (+54.84%)
Mutual labels:  deduplication
Data-structures
Data Structures in Java
Stars: ✭ 13 (-58.06%)
Mutual labels:  data-structures
beda
Beda is a Golang library for detecting how similar two strings are
Stars: ✭ 34 (+9.68%)
Mutual labels:  string-similarity
js-data-structures-and-algorithms
JavaScript implementations of common data structure and algorithm concepts.
Stars: ✭ 31 (+0%)
Mutual labels:  data-structures
fuc
Frequently used commands in bioinformatics
Stars: ✭ 23 (-25.81%)
Mutual labels:  fastq
Data-Structures-and-Algorithm-C-
Hi folks🖐🏻, I'm maintaining this repository, feel free to open a pull request and contribute! :)
Stars: ✭ 39 (+25.81%)
Mutual labels:  data-structures
IntraArchiveDeduplicator
Tool for managing data-deduplication within extant compressed archive files, along with a relatively performant BK tree implementation for fuzzy image searching.
Stars: ✭ 87 (+180.65%)
Mutual labels:  deduplication
OOP-In-CPlusPlus
An Awesome Repository On Object Oriented Programming In C++ Language. Ideal For Computer Science Undergraduates, This Repository Holds All The Resources Created And Used By Me - Code & Theory For One To Master Object Oriented Programming. Filled With Theory Slides, Number Of Programs, Concept-Clearing Projects And Beautifully Explained, Well Doc…
Stars: ✭ 27 (-12.9%)
Mutual labels:  data-structures
string-similarity-js
Lightweight string similarity function for javascript
Stars: ✭ 29 (-6.45%)
Mutual labels:  string-similarity
Python
Repository for learning Python programming in Indonesian
Stars: ✭ 79 (+154.84%)
Mutual labels:  data-structures
Levenshtein
The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity
Stars: ✭ 38 (+22.58%)
Mutual labels:  string-similarity
gencore
Generate duplex/single consensus reads to reduce sequencing noise and remove duplicates
Stars: ✭ 91 (+193.55%)
Mutual labels:  deduplication
fq
Command line utility for manipulating Illumina-generated FastQ files.
Stars: ✭ 31 (+0%)
Mutual labels:  fastq
coding-interview-guide
A systematic coding interview guide
Stars: ✭ 76 (+145.16%)
Mutual labels:  data-structures
cargo-limit
Cargo with less noise: warnings are skipped until errors are fixed, Neovim integration, etc.
Stars: ✭ 105 (+238.71%)
Mutual labels:  deduplication
SPLiT-Seq demultiplexing
An unofficial demultiplexing strategy for SPLiT-seq RNA-Seq data
Stars: ✭ 20 (-35.48%)
Mutual labels:  fastq
ngs pipeline
Exome/Capture/RNASeq Pipeline Implementation using snakemake
Stars: ✭ 40 (+29.03%)
Mutual labels:  fastq
swift-algorithms-data-structs
📒 Algorithms and Data Structures in Swift. The used approach attempts to fully utilize the Swift Standard Library and Protocol-Oriented paradigm.
Stars: ✭ 42 (+35.48%)
Mutual labels:  data-structures
nullarbor
💾 📃 "Reads to report" for public health and clinical microbiology
Stars: ✭ 111 (+258.06%)
Mutual labels:  fastq
AhoCorasick
Aho-Corasick multi-string search for .NET and SQL Server.
Stars: ✭ 39 (+25.81%)
Mutual labels:  string-search
zpaqfranz
Deduplicating archiver with encryption and paranoid-level tests. Swiss army knife for the serious backup and disaster recovery manager. Ransomware neutralizer. Win/Linux/Unix
Stars: ✭ 86 (+177.42%)
Mutual labels:  deduplication
Data-Structures-and-Algorithms
Fundamentals of Data structures and algorithms in C++
Stars: ✭ 34 (+9.68%)
Mutual labels:  data-structures
entity-embed
PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.
Stars: ✭ 96 (+209.68%)
Mutual labels:  deduplication
deduplication
Fast multi-threaded content-dependent chunking deduplication for Buffers in C++ with a reference implementation in Javascript. Ships with extensive tests, a fuzz test and a benchmark.
Stars: ✭ 59 (+90.32%)
Mutual labels:  deduplication
mail-deduplicate
📧 CLI to deduplicate mails from mail boxes.
Stars: ✭ 134 (+332.26%)
Mutual labels:  deduplication
landscape-of-programming
This repo aims to show you what to learn on the way to excellence.
Stars: ✭ 67 (+116.13%)
Mutual labels:  data-structures
dduper
Fast block-level out-of-band BTRFS deduplication tool.
Stars: ✭ 108 (+248.39%)
Mutual labels:  deduplication
cs-resources
Curated Computer Science and Programming Resource Guide
Stars: ✭ 42 (+35.48%)
Mutual labels:  data-structures
RocketMQDedupListener
Idempotent, deduplicating RocketMQ message consumer; supports MySQL or Redis as the idempotency table and works out of the box
Stars: ✭ 132 (+325.81%)
Mutual labels:  deduplication
py-algorithms
Algorithms and Data Structures, solutions to common CS problems.
Stars: ✭ 26 (-16.13%)
Mutual labels:  data-structures
algorithm-study
Strawberry Milkshake's algorithm study notes (TypeScript/Python)
Stars: ✭ 29 (-6.45%)
Mutual labels:  data-structures
naf
Nucleotide Archival Format - Compressed file format for DNA/RNA/protein sequences
Stars: ✭ 35 (+12.9%)
Mutual labels:  fastq