Use libc's wcwidth() instead of own function
AbandonedPublic

Authored by mglb on Apr 15 2018, 11:22 PM.

Details

Reviewers
None
Group Reviewers
Konsole
Summary

Libc's wcwidth(), while not perfect, is used by most programs.
Implementation compatibility is necessary to avoid problems with string
width calculation and proper cursor movement.

The original wcwidth implementation files:

  • COPYING.Unicode
  • src/konsole_wcwidth.cpp
  • src/konsole_wcwidth.h

are not removed yet.

BUG: 378124
BUG: 392171
BUG: 339439

Depends on D12236

Test Plan

Vertical lines below should align

Code     Ch  Source bug
U+26A1  |⚑| 378124
U+2615  |β˜•| 392171
U+26EA  |β›ͺ| 392171
U+1D11E |π„ž | 339439

Diff Detail

Repository
R319 Konsole
Branch
arc/Use-libcs-wcwidth-instead-of-own-function (branched from master)
Lint
No Linters Available
Unit
No Unit Test Coverage
mglb requested review of this revision.Apr 15 2018, 11:22 PM
mglb created this revision.
mglb updated this revision to Diff 34164.May 14 2018, 10:54 PM

Update to master

sorry can you rebase this? it doesn't apply cleanly anylonger - also, can you not delete konsole_wcwidth.* for now

mglb updated this revision to Diff 34400.May 17 2018, 11:13 PM
  • git rebase master
  • restore original implementation files
  • mention original files in description
Restricted Application added a project: Konsole. Β· View Herald TranscriptMay 17 2018, 11:13 PM
Restricted Application added a subscriber: konsole-devel. Β· View Herald Transcript
mglb updated this revision to Diff 34401.May 17 2018, 11:14 PM
mglb edited the summary of this revision. (Show Details)
mglb removed a subscriber: konsole-devel.

Push new description

hindenburg added a comment.EditedMay 18 2018, 12:33 AM

Thanks, I wish there was a better way of rebasing other than asking owner to do it.

On the last line, the right | doesn't line up w/ the other rows

In vim, they line up - not git and not when I cat a file containing them.

mglb added a comment.May 18 2018, 2:33 AM

On the last line, the right | doesn't line up w/ the other rows

Sounds like glibc <2.26 (Unicode <9), where first three were not "wide".

Vim uses its own wcwidth.

QTest::addRow is only in Qt5.9 - you'll need to add a check - it fails on freebsd

Also, I'm a little concerned that we're going to start having to request what version of libc users's have in bug reports. And then have to research how that version uses wcwidth. Where as now, we know everyone uses konsole_wcwidth.

mglb updated this revision to Diff 34449.May 18 2018, 8:08 PM

Use QTest::newRow instead of QTest::addRow

mglb added a comment.May 18 2018, 9:34 PM

Also, I'm a little concerned that we're going to start having to request what version of libc users's have in bug reports. And then have to research how that version uses wcwidth.

You are right here. I have wcwidth maps from several glibcs, so I am able to just grep a code point and verify if it has different width in some version. Sadly, this is not easily downloadable from some reliable source, so it is not an option to recommend.
On the other hand, were there other bugs than "character width is wrong" or "cursor disappears on character/backspaces too much" (where it is obvious the width is wrong for the user), related to character width?

BTW. Why there is no "Please attach konsole --show-info-for-a-bug-report output to the bug" and respective switch in Konsole, and custom additional info defined by a program in "report bug" feature? Could help in some cases, not only with this.

Where as now, we know everyone uses konsole_wcwidth.

Sure, but we'll get less bugs than with konsole_wcwidth - people with newer libc won't have them, people with older version will get about as much as with konsole_wcwidth. Also, using system's wcwidth fixes bugs like "cursor in bash moves wrong" (392171) and similar problems with programs using system's wcwidth, for everyone.

The ultimate (and complex) solution I was thinking about would be something like:

  • width maps specified in files; user can create or download new ones
  • possibility to specify default map in profile (e.g. system's wcwidth-based, or generated straight from Unicode files)
  • possibility to specify separate map for app-screen programs (vim-based map for vim, tig-based for tig, etc)

This would solve problems with custom implementations, different libc, user preferences (including "Unicode is wrong, I want it this way" or improved powerline-like features). But due to complexity it probably won't appear soon.

OK, I agree that a debug/info command line would be a good idea. I wonder if that can be done in the crash system/dialog. We'd need it outside crashes as well.

On my FreeBSD, the width tests fail 20000-10 actual 1, expected 2

mglb abandoned this revision.May 21 2018, 8:29 PM

On my FreeBSD, the width tests fail 20000-10 actual 1, expected 2

Thanks for the info. I did a quick research and looks like width data is not provided for code points above 0xffff since FreeBSD 11 (it defaults to 1, or -1 when code point is not assigned). This is obviously not acceptable.

I am abandoning this revision. I'll generate width data from Unicode 10 and UTS#51 v5, and create a new review request.