Add a tool for generating character width tables

Authored by mglb on Sep 30 2018, 3:36 PM.

Description

Add a tool for generating character width tables

Summary:
The uni2characterwidth tool converts Unicode Character Database files
into character width lookup tables. It uses a template file to place
the tables in a source code file together with a function for finding
the width of a specified character. It can also generate several forms
of lists with width data for debugging and testing, or for future use
as a replacement for the Unicode files.
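
To illustrate what a generated table plus lookup function could look like,
here is a minimal C++ sketch. It is not the code the tool emits: the names
WidthRange, kWidthTable and characterWidth are hypothetical, the five ranges
are a hand-picked sample rather than real tool output, and the default width
of 1 for unlisted characters is an assumption. The value -1 for control
characters is likewise assumed, following the wcwidth convention.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical table entry: an inclusive code point range and its width.
struct WidthRange {
    char32_t first;
    char32_t last;
    std::int8_t width;
};

// Illustrative sample only, sorted by `first` and non-overlapping.
// A real table would be generated from the Unicode data files.
constexpr std::array<WidthRange, 5> kWidthTable{{
    {0x0000, 0x001F, -1}, // C0 control characters (assumed non-printable)
    {0x00A1, 0x00A1, -2}, // an East Asian Ambiguous character
    {0x0300, 0x036F, 0},  // combining diacritical marks
    {0x1100, 0x115F, 2},  // Hangul Jamo initial consonants (wide)
    {0x4E00, 0x9FFF, 2},  // CJK Unified Ideographs (wide)
}};

// Binary search for the range containing `ucs4`.
// Characters not covered by any range get width 1 (an assumption).
int characterWidth(char32_t ucs4)
{
    // Find the first range that starts after `ucs4` ...
    auto it = std::upper_bound(
        kWidthTable.begin(), kWidthTable.end(), ucs4,
        [](char32_t value, const WidthRange &range) {
            return value < range.first;
        });
    if (it == kWidthTable.begin()) {
        return 1;
    }
    // ... then step back to the last range that starts at or before it.
    --it;
    return ucs4 <= it->last ? it->width : 1;
}
```

Keeping the ranges sorted lets the lookup run in logarithmic time, which
matters because the function is called for every character drawn.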

Set the KONSOLE_BUILD_UNI2CHARACTERWIDTH CMake flag to build the tool.
Run it with the --help argument for more detailed usage.

The tool can generate a separate "width" for Ambiguous characters.
This can be used to make the width of these characters configurable
in Konsole settings.

The example.template file contains all possible named tags, and some
additional tags to show how to use them.

CCBUG: 396435

Depends on D15756

Test Plan:
Download the files listed below from the 11.0.0 and emoji/11.0
directories on https://unicode.org/Public/. You can also use URLs to
the files directly.

  • UnicodeData.txt
  • EastAsianWidth.txt
  • emoji-data.txt

Generate any available list except compact-ranges (e.g. details):

uni2characterwidth \
    -U UnicodeData.txt  -A EastAsianWidth.txt  -E emoji-data.txt \
    -g details  result.txt

The list should contain ranges for all possible widths
(-2, -1, 0, 1, 2). You can pick some characters whose width you know
and check how they were classified. -2 is a special non-standard width
for ambiguous characters, which can be overridden by adding the -a 1 or
-a 2 parameter. With this flag, all ranges in the -2 group should
disappear and be assigned to the selected width (1 or 2).
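
The effect of the -a parameter described above can be pictured with a small
C++ sketch. This is not the tool's implementation: resolveWidth and
kAmbiguousWidth are hypothetical names, and rejecting values other than
1 and 2 is an assumption based on the values named in this test plan.

```cpp
#include <optional>
#include <stdexcept>

// Special non-standard width used for ambiguous characters.
constexpr int kAmbiguousWidth = -2;

// Returns the width to store for a character. Without an override the
// ambiguous marker is kept; with one, -2 is replaced by the chosen width.
// All other widths pass through unchanged.
int resolveWidth(int tableWidth, std::optional<int> ambiguousOverride)
{
    if (tableWidth != kAmbiguousWidth || !ambiguousOverride) {
        return tableWidth;
    }
    if (*ambiguousOverride != 1 && *ambiguousOverride != 2) {
        throw std::invalid_argument("ambiguous width must be 1 or 2");
    }
    return *ambiguousOverride;
}
```

With an override of 1 or 2, no character keeps the width -2, which is why
the -2 group should be empty in the generated list.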

Generate output using a template:

uni2characterwidth \
    -U UnicodeData.txt  -A EastAsianWidth.txt  -E emoji-data.txt \
    -g code,./template.example  result.txt

Reviewers: Konsole, hindenburg

Reviewed By: Konsole, hindenburg

Subscribers: hindenburg, konsole-devel

Tags: Konsole

Differential Revision: https://phabricator.kde.org/D15757

Details

Committed
hindenburg Sep 30 2018, 4:22 PM
Reviewer
Konsole
Differential Revision
D15757: Add a tool for generating character width tables
Parents
R319:0f33ee504bc2: Move character width functions to Character class