Rust edit distance routines accelerated using SIMD. Supports fast Hamming, Levenshtein, restricted Damerau-Levenshtein, etc. distance calculations and string search.
Accelerating the deduplication and collapsing process for reads with Unique Molecular Identifiers (UMI). Heavily optimized for scalability and orders of magnitude faster than a previous tool.
Linguist is a PHP library for parsing strings, it can extract and manipulate prefixed words in content ideal for working with @mentions, #topics and custom tags!
A fast multiple string replace library for ruby. Uses a C implementation of the Aho–Corasick Algorithm based on https://github.com/morenice/ahocorasick while adding support for on the fly multiple string replacement. Faster alternative to String.gsub when dealing with non-regex (exact match) use cases
From 2011: Quickly search for files in NTFS volumes parsing the Master File Table (MFT). A decent amount of how NTFS and MFT work was painstakingly reverse-engineered since it's undocumented.