Umbraco-CMS/tests/Umbraco.Tests.UnitTests/Umbraco.Core/Strings
yv01p 1102b34e88 feat(strings): implement SIMD-optimized Utf8ToAsciiConverterNew with golden file tests
Implements Task 4 of the Utf8ToAsciiConverter refactor plan.

Key features:
- SIMD-optimized ASCII detection using SearchValues (AVX-512 capable)
- Unicode normalization for accented characters (FormD decomposition)
- FrozenDictionary for ligatures, Cyrillic, and special Latin mappings
- Span-based API for zero-allocation scenarios
- ArrayPool usage for temporary buffers
- Comprehensive test coverage (21 unit tests, all passing)
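
As a rough illustration of the SearchValues-based ASCII detection listed above, the following is a minimal sketch, not the code from this commit. The class and method names (`AsciiScanSketch`, `IndexOfFirstNonAscii`) are hypothetical, and it assumes .NET 8 or later, where `SearchValues<T>` is available.

```csharp
using System;
using System.Buffers;

// Hypothetical helper for illustration; not the actual Utf8ToAsciiConverterNew code.
public static class AsciiScanSketch
{
    // All 128 ASCII code points. SearchValues chooses a vectorized search
    // strategy at creation time when the hardware supports one.
    private static readonly SearchValues<char> AsciiChars =
        SearchValues.Create(CreateAsciiTable());

    private static char[] CreateAsciiTable()
    {
        var table = new char[128];
        for (int i = 0; i < table.Length; i++)
        {
            table[i] = (char)i;
        }
        return table;
    }

    // Returns the index of the first non-ASCII char, or -1 for pure ASCII input.
    public static int IndexOfFirstNonAscii(ReadOnlySpan<char> input)
        => input.IndexOfAnyExcept(AsciiChars);
}
```

A result of -1 means the caller can return the input unchanged, which is the fast path described under the implementation details.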

Implementation details:
- Fast path for pure ASCII input (no conversion needed)
- Dictionary lookup for special cases (ligatures, Cyrillic, etc.)
- Unicode normalization fallback for accented characters
- Control character stripping and whitespace normalization
- Proper surrogate pair handling
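
The first three steps above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the implementation in this commit: `AsciiFoldSketch`, `Fold`, and `MaxExpansion` are hypothetical names, the mapping table holds only a few entries taken from the test coverage list, returning an empty string for null input is an assumption, and control character stripping and whitespace normalization are left out.

```csharp
using System;
using System.Buffers;
using System.Collections.Frozen;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch; not the actual Utf8ToAsciiConverterNew code.
public static class AsciiFoldSketch
{
    // Small illustrative table; the real mapping set is assumed to be larger.
    private static readonly FrozenDictionary<char, string> Special =
        new Dictionary<char, string>
        {
            ['Æ'] = "AE", ['ß'] = "ss", ['Œ'] = "OE",
            ['Ł'] = "L", ['Ø'] = "O", ['Þ'] = "TH", ['Щ'] = "Shch",
        }.ToFrozenDictionary();

    // Longest replacement in the table above ("Shch").
    private const int MaxExpansion = 4;

    public static string Fold(string? input)
    {
        // Assumption: null and empty input both yield an empty string.
        if (string.IsNullOrEmpty(input)) return string.Empty;

        // Fast path: pure ASCII needs no conversion.
        if (Ascii.IsValid(input)) return input;

        char[] buffer = ArrayPool<char>.Shared.Rent(input.Length * MaxExpansion);
        try
        {
            int written = 0;
            // Enumerating runes keeps surrogate pairs together.
            foreach (Rune rune in input.EnumerateRunes())
            {
                if (rune.IsAscii)
                {
                    buffer[written++] = (char)rune.Value;
                }
                else if (rune.IsBmp && Special.TryGetValue((char)rune.Value, out string? mapped))
                {
                    // Dictionary lookup for ligatures, Cyrillic, special Latin.
                    mapped.AsSpan().CopyTo(buffer.AsSpan(written));
                    written += mapped.Length;
                }
                else
                {
                    // Fallback: FormD splits base letters from combining marks;
                    // keep only the ASCII part and drop the rest.
                    foreach (char d in rune.ToString().Normalize(NormalizationForm.FormD))
                    {
                        if (d < 128) buffer[written++] = d;
                    }
                }
            }
            return new string(buffer, 0, written);
        }
        finally
        {
            ArrayPool<char>.Shared.Return(buffer);
        }
    }
}
```

In this sketch the dictionary is consulted before normalization because characters such as Ø and Ł have no canonical decomposition, so FormD alone would drop them.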

Test coverage:
- Null/empty string handling
- ASCII fast path verification
- Accented character normalization (café → cafe)
- Ligature expansion (Æ → AE, ß → ss, Œ → OE)
- Cyrillic transliteration (Москва → Moskva, Щ → Shch)
- Special Latin characters (Ł → L, Ø → O, Þ → TH)
- Span API for zero-allocation scenarios
- Mixed content handling

Golden file tests for regression testing against the original implementation
are included, but they will not run until the test data file is configured.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-13 00:13:11 +00:00