Add a tool for generating character width tables
Summary:
The uni2characterwidth tool converts Unicode Character Database files
into character width lookup tables. It uses a template file to place
the tables in a source code file together with a function for finding
the width of a specified character. It can also generate several kinds
of lists with width data for debugging and testing, or for future use
as a replacement for the Unicode files.
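
For illustration only, a generated table plus lookup function could look
roughly like the sketch below. The WidthRange struct, the kSampleTable
entries and the sampleCharacterWidth name are hypothetical and are not the
tool's actual output. The sketch also assumes that -1 marks non-printable
characters and that unlisted code points default to width 1.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>

// Hypothetical layout: a sorted, non-overlapping list of code point ranges.
struct WidthRange {
    uint32_t first; // first code point in the range (inclusive)
    uint32_t last;  // last code point in the range (inclusive)
    int8_t width;   // width assigned to every code point in the range
};

// Hand-picked sample entries; a real table would be generated from the
// Unicode Character Database files.
static const WidthRange kSampleTable[] = {
    {0x0000, 0x001F, -1}, // C0 control characters (assumed non-printable)
    {0x0020, 0x007E, 1},  // printable ASCII
    {0x0300, 0x036F, 0},  // combining diacritical marks
    {0x4E00, 0x9FFF, 2},  // CJK unified ideographs
};

// Binary search for the range containing ucs4.
int sampleCharacterWidth(uint32_t ucs4)
{
    const WidthRange *end = std::end(kSampleTable);
    // Find the first range whose upper bound is not below ucs4.
    const WidthRange *it = std::lower_bound(
        std::begin(kSampleTable), end, ucs4,
        [](const WidthRange &range, uint32_t c) { return range.last < c; });
    if (it != end && it->first <= ucs4) {
        return it->width;
    }
    return 1; // assumed default for code points missing from the table
}
```

Keeping the ranges sorted lets the lookup run in logarithmic time without
storing one entry per code point.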
Set the KONSOLE_BUILD_UNI2CHARACTERWIDTH CMake flag to build the tool.
Use the --help argument for more detailed usage information.
A separate "width" can be generated for Ambiguous characters. This can
be used to make the width of those characters configurable in Konsole
settings.
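
As a rough sketch of how a consumer might use such a separate value, the
hypothetical helper below maps the special ambiguous width (-2) to a
user-configured width and leaves all other widths untouched. The names
kAmbiguousWidth and resolveAmbiguousWidth are assumptions made for this
example and are not part of the tool or of Konsole.

```cpp
// Special non-standard width that marks ambiguous characters.
constexpr int kAmbiguousWidth = -2;

// Hypothetical helper: replace the ambiguous marker with the width chosen
// in the settings (expected to be 1 or 2); pass other widths through.
int resolveAmbiguousWidth(int tableWidth, int configuredAmbiguousWidth)
{
    return tableWidth == kAmbiguousWidth ? configuredAmbiguousWidth
                                         : tableWidth;
}
```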
The example.template file contains all possible named tags, and some
additional tags to show how to use them.
CCBUG: 396435
Depends on D15756
Test Plan:
Download the files listed below from the 11.0.0 and emoji/11.0
directories on https://unicode.org/Public/. You can also use URLs to the
files directly.
- UnicodeData.txt
- EastAsianWidth.txt
- emoji-data.txt
Generate any available list except compact-ranges (e.g. details):
    uni2characterwidth \
        -U UnicodeData.txt -A EastAsianWidth.txt -E emoji-data.txt \
        -g details result.txt
The list should contain ranges for all possible widths
(-2, -1, 0, 1, 2). You can pick some characters whose width you know
and check how they were classified. -2 is a special non-standard width
for ambiguous characters, which can be overridden by adding the -a 1 or
-a 2 parameter. With this flag, all ranges from the -2 group should
disappear and be assigned to the selected width (1 or 2).
Generate output using a template:
    uni2characterwidth \
        -U UnicodeData.txt -A EastAsianWidth.txt -E emoji-data.txt \
        -g code,./template.example result.txt
Reviewers: Konsole, hindenburg
Reviewed By: Konsole, hindenburg
Subscribers: hindenburg, konsole-devel
Tags: Konsole
Differential Revision: https://phabricator.kde.org/D15757