Changed isPrint for U+00AD SOFT HYPHEN to return true.

Summary:
This is consistent with the Mac OS X implementation, and most terminals
actually display this character (checked on gnome-terminal, lxterminal, lxterm,
Terminal.app, and iTerm2). It is also in line with the ISO Latin 1 standard
(ISO 8859-1), which defines the character differently from the Unicode
Standard. More information here: http://www.cs.tut.fi/~jkorpela/shy.html

Reviewers: gribozavr, jordan_rose

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187949 91177308-0d34-0410-b5e6-96231b3b80d8
Alexander Kornienko
2013-08-08 01:10:50 +00:00
parent bf473e2240
commit 6cd4f2a2b3
2 changed files with 193 additions and 187 deletions
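
The tests in this change refer to SOFT HYPHEN in two forms: as the code point 0xAD (passed to isPrint) and as the UTF-8 byte sequence "\302\255" (passed to columnWidth). As a minimal sketch of how the two relate, the standalone helper below encodes a code point as UTF-8. The name encodeUtf8 is hypothetical and is not part of LLVM; the sketch assumes a valid Unicode scalar value and skips surrogate and range checks.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not an LLVM API): encodes one code point as UTF-8.
// Assumes cp is a valid Unicode scalar value; no surrogate or range checks.
std::string encodeUtf8(std::uint32_t cp) {
  std::string out;
  if (cp < 0x80) {
    // 1 byte: 0xxxxxxx
    out += static_cast<char>(cp);
  } else if (cp < 0x800) {
    // 2 bytes: 110xxxxx 10xxxxxx
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else if (cp < 0x10000) {
    // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    out += static_cast<char>(0xE0 | (cp >> 12));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else {
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    out += static_cast<char>(0xF0 | (cp >> 18));
    out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  }
  return out;
}
```

With this helper, encodeUtf8(0xAD) yields the two bytes 0xC2 0xAD, which is the octal-escaped literal "\302\255" used in the columnWidth test. The other escaped literals in that test (200B, 0300, 0E01, 4E00) follow the same scheme.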

@@ -32,6 +32,12 @@ TEST(Locale, columnWidth) {
   EXPECT_EQ(-1, columnWidth("aaaaaaaaaa\x01"));
   EXPECT_EQ(-1, columnWidth("\342\200\213")); // 200B ZERO WIDTH SPACE
+  // 00AD SOFT HYPHEN is displayed on most terminals as a space or a dash. Some
+  // text editors display it only when a line is broken at it, some use it as a
+  // line-break hint, but don't display. We choose terminal-oriented
+  // interpretation.
+  EXPECT_EQ(1, columnWidth("\302\255"));
   EXPECT_EQ(0, columnWidth("\314\200")); // 0300 COMBINING GRAVE ACCENT
   EXPECT_EQ(1, columnWidth("\340\270\201")); // 0E01 THAI CHARACTER KO KAI
   EXPECT_EQ(2, columnWidth("\344\270\200")); // CJK UNIFIED IDEOGRAPH-4E00
@@ -72,10 +78,8 @@ TEST(Locale, isPrint) {
   EXPECT_EQ(false, isPrint(0x9F));
   EXPECT_EQ(true, isPrint(0xAC));
-  // FIXME: Figure out if we want to treat SOFT HYPHEN as printable character.
-#ifndef __APPLE__
-  EXPECT_EQ(false, isPrint(0xAD)); // SOFT HYPHEN
-#endif // __APPLE__
+  EXPECT_EQ(true, isPrint(0xAD)); // SOFT HYPHEN is displayed on most terminals
+                                  // as either a space or a dash.
   EXPECT_EQ(true, isPrint(0xAE));
   // MacOS implementation doesn't think it's printable.