Deep dives on how we built each rewrite, what we measured, and the exact reason .NET pulls ahead.
At GPT-2 scale .NET and NumPy are tied. Scale to Llama-3's 128k vocab and .NET pulls 1.53× ahead — because TensorPrimitives reuses a single SIMD buffer while NumPy keeps allocating.
Python's intervaltree stores intervals as Python objects in a balanced BST. An augmented interval tree in .NET backed by flat int[] arrays queries the same ranges 31× faster.
Python's difflib uses Ratcliff/Obershelp with per-call matrix allocation. We replaced it with a preallocated DP LCS buffer in .NET and eliminated GC pressure entirely.
dateutil.parser.parse handles any date format automatically using a heuristic ML-style tokenizer. DateTimeOffset.TryParseExact handles a known set of formats with a compiled lookup — 19× faster at 1M timestamps.
Python's qrcode library generates QR codes through a pure-Python Reed-Solomon encoder. QRCoder in .NET compiles the same error correction math to native code — 14× faster at 50,000 codes.
mistune is Python's fastest Markdown parser. Markdig is .NET's equivalent. Both parse CommonMark-compatible Markdown to HTML — Markdig does it 11× faster on 10,000 real-world documents.
Python's textdistance allocates a fresh O(m×n) matrix for every pair. One preallocated ArrayPool row in .NET, reused across all calls, cuts runtime from 14 seconds to under 200 ms at 100k pairs.
Whoosh is a pure-Python BM25 search engine. Lucene.NET is the same algorithm in C#. 1,000 queries across 100,000 documents — Lucene.NET is 9× faster on indexing, 22× on search throughput.
NLTK's Punkt tokenizer runs a trained ML model for sentence boundaries — smart but slow. A compiled regex pair in .NET gives equivalent quality 8× faster on 100 MB of plain text.
pypdf is a pure-Python PDF parser — every byte of every page goes through the CPython interpreter. PdfPig is a pure C# equivalent. Same algorithm, same data, 4–6× faster.
NetworkX stores graphs as Python dicts — every iteration dispatches millions of attribute lookups. A CSR sparse matrix in .NET with SIMD normalization cuts the same algorithm to a fraction of the time.