Update display widths as part of updating Unicode
author John Naylor <john.naylor@postgresql.org>
Thu, 26 Aug 2021 14:53:56 +0000 (10:53 -0400)
committer John Naylor <john.naylor@postgresql.org>
Thu, 26 Aug 2021 14:53:56 +0000 (10:53 -0400)
commit bab982161e0590746a2fd2a03043b27108b23ac6
tree fd77c5e80bbc1f83da0379a6427b99dc5bb26393
parent 1563ecbc1be8b8e5c57651cf5c87f90dea9aea8f

The hardcoded "wide character" set in ucs_wcwidth() was last updated
around the Unicode 5.0 era.  This led to misalignment when printing
emojis and other codepoints that have since been designated
wide or full-width.
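
As a rough illustration of how a table-driven width lookup of this kind
can work, the sketch below binary-searches a sorted list of codepoint
ranges. All names (sketch_interval, sketch_wide, sketch_display_width)
are hypothetical and are not the identifiers used in src/common/wchar.c,
and the table is a small hand-picked sample, not the generated one. The
sketch also ignores control and zero-width characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: one inclusive range of codepoints. */
typedef struct
{
	uint32_t	first;
	uint32_t	last;
} sketch_interval;

/* Illustrative sample of wide/full-width ranges, sorted by 'first'. */
static const sketch_interval sketch_wide[] = {
	{0x1100, 0x115F},			/* Hangul Jamo initial consonants */
	{0x3041, 0x3096},			/* Hiragana */
	{0x4E00, 0x9FFF},			/* CJK Unified Ideographs */
	{0xFF01, 0xFF60},			/* Fullwidth ASCII variants */
	{0x1F600, 0x1F64F},			/* Emoticons */
};

/* Binary search: returns 1 if ucs falls inside any range of the table. */
static int
sketch_in_table(uint32_t ucs, const sketch_interval *table, size_t n)
{
	size_t		lo = 0;
	size_t		hi = n;

	assert(table != NULL || n == 0);

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (ucs < table[mid].first)
			hi = mid;
		else if (ucs > table[mid].last)
			lo = mid + 1;
		else
			return 1;
	}
	return 0;
}

/* Simplified width: 2 columns for listed ranges, 1 for everything else. */
static int
sketch_display_width(uint32_t ucs)
{
	size_t		n = sizeof(sketch_wide) / sizeof(sketch_wide[0]);

	return sketch_in_table(ucs, sketch_wide, n) ? 2 : 1;
}
```

With a table that stops at older ranges, a codepoint such as U+1F600
would fall through to width 1, which is the kind of misalignment
described above.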

To fix this and keep the set up to date, extend update-unicode to
download the list of wide and full-width codepoints from the official
sources.
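
For illustration only, here is a minimal C sketch of extracting wide (W)
and full-width (F) ranges from one line of such a source file. It
assumes the input looks like the Unicode Consortium's EastAsianWidth.txt
("1100..115F;W  # ..." or "3000;F # ..."); that file format is an
assumption here, and the function name eaw_parse_line is hypothetical.
The generator added by this commit is a Perl script, so this is not its
code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse one line of an EastAsianWidth.txt-style file.
 * Returns 1 and fills *first / *last if the line describes a wide (W) or
 * full-width (F) codepoint or range; returns 0 for comments, blank lines,
 * and every other width class.
 */
static int
eaw_parse_line(const char *line, uint32_t *first, uint32_t *last)
{
	unsigned long a;
	unsigned long b;
	char		cls[3] = "";

	assert(line != NULL && first != NULL && last != NULL);

	if (sscanf(line, "%lx..%lx ; %2[A-Za-z]", &a, &b, cls) == 3)
	{
		/* range form: "XXXX..YYYY;CLASS" */
	}
	else if (sscanf(line, "%lx ; %2[A-Za-z]", &a, cls) == 2)
	{
		/* single-codepoint form: "XXXX;CLASS" */
		b = a;
	}
	else
		return 0;				/* comment, blank, or unrecognized line */

	if (strcmp(cls, "W") != 0 && strcmp(cls, "F") != 0)
		return 0;				/* narrow, ambiguous, neutral, ... */

	*first = (uint32_t) a;
	*last = (uint32_t) b;
	return 1;
}
```

Ranges collected this way could then be written out, sorted, as a C
array suitable for a binary-search lookup.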

In passing, remove some comments about non-spacing characters that
haven't been accurate since we removed the former hardcoded logic.

Jacob Champion

Reported and reviewed by Pavel Stehule
Discussion: https://www.postgresql.org/message-id/flat/CAFj8pRCeX21O69YHxmykYySYyprZAqrKWWg0KoGKdjgqcGyygg@mail.gmail.com
src/common/unicode/.gitignore
src/common/unicode/Makefile
src/common/unicode/generate-unicode_east_asian_fw_table.pl [new file with mode: 0644]
src/common/wchar.c
src/include/common/unicode_east_asian_fw_table.h [new file with mode: 0644]