Improve string normalization, and use that for airport name matching

Authored by vkrause on Sep 8 2018, 12:16 PM.

Description

So far we only did case folding; now we also apply Unicode
decomposition and strip the resulting diacritic marks. This reduces
the airport string table size by ~5% without compromising quality.
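
A minimal sketch of the idea, in Python rather than the project's own
code: case-fold the string, apply canonical decomposition (NFD), and
drop the combining marks it produces. The function name `normalize` and
the sample airport names are illustrative assumptions, not taken from
this commit.

```python
import unicodedata


def normalize(text: str) -> str:
    """Case-fold, decompose (NFD), and drop combining marks (hypothetical helper)."""
    # NFD splits a precomposed letter such as "ü" into "u" + U+0308.
    decomposed = unicodedata.normalize("NFD", text.casefold())
    # Combining marks have a non-zero canonical combining class.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    # Spelling variants of one name collapse to a single key.
    for name in ("Zürich", "ZURICH", "Málaga", "malaga"):
        print(name, "->", normalize(name))
```

Because variants that differ only in case or accents map to the same
key, a lookup table keyed on the normalized form needs fewer entries,
which fits the size reduction mentioned above. Letters without a
canonical decomposition (for example "ł" or "ø") pass through
unchanged in this sketch.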

This approach should also help with matching non-ASCII names in
IATA boarding passes to their normal spelling.

Details

Committed
vkrause, Sep 8 2018, 12:16 PM
Parents
R1003:4cb73c4497dd: Further improve timezone disambiguation in border regions