RESOLVED DUPLICATE of bug 178960 149170
Honor isolate bidi unicode codepoints
https://bugs.webkit.org/show_bug.cgi?id=149170
Summary Honor isolate bidi unicode codepoints
Myles C. Maxfield
Reported 2015-09-15 10:51:12 PDT
Currently, our isolation logic only works on a RenderElement level. It should work inside a single RenderText too.
Attachments
WIP (3.99 KB, patch)
2016-06-14 00:59 PDT, Myles C. Maxfield
no flags
Example (129 bytes, text/html)
2016-06-14 01:33 PDT, Myles C. Maxfield
no flags
Example 2 (187 bytes, text/html)
2016-06-14 01:38 PDT, Myles C. Maxfield
no flags
Patch (10.58 KB, patch)
2016-06-14 02:35 PDT, Myles C. Maxfield
simon.fraser: review+
Patch for committing (11.02 KB, patch)
2016-06-14 20:34 PDT, Myles C. Maxfield
no flags
Radar WebKit Bug Importer
Comment 1 2015-09-15 10:51:51 PDT
Myles C. Maxfield
Comment 2 2016-06-13 16:31:09 PDT
InlineBidiResolver::appendRun() will need to be fixed up, and potentially much more.
Myles C. Maxfield
Comment 3 2016-06-14 00:39:08 PDT
Currently, BidiResolverBase::createBidiRunsForLine() has no concept of U_RIGHT_TO_LEFT_ISOLATE. Because it doesn't appear as a case in any switch statements, the default is getting hit, which means we seem to ignore it completely (for the purposes of bidi).
Myles C. Maxfield
Comment 4 2016-06-14 00:45:15 PDT
These isolate characters don't force the complex text code path.
Myles C. Maxfield
Comment 5 2016-06-14 00:59:55 PDT
Myles C. Maxfield
Comment 6 2016-06-14 01:33:00 PDT
Myles C. Maxfield
Comment 7 2016-06-14 01:38:16 PDT
Created attachment 281247 [details] Example 2
Myles C. Maxfield
Comment 8 2016-06-14 02:19:10 PDT
Example 2 is explained by us not understanding how to deal with isolate characters inside BidiResolverBase::createBidiRunsForLine(). In particular, we have this general structure: UCharDirection last = ... while(true) { UCharDirection direction = u_charDirection(c); switch (direction) { case U_LEFT_TO_RIGHT: switch (last) { default: break; } case U_RIGHT_TO_LEFT: ... case U_WHITE_SPACE_NEUTRAL: default: break; } last = direction; } So, if the pattern of directions is: U_RIGHT_TO_LEFT U_WHITE_SPACE_NEUTRAL U_FIRST_STRONG_ISOLATE U_LEFT_TO_RIGHT When we hit the U_FIRST_STRONG_ISOLATE, we hit the outer default:. Then we set last = U_FIRST_STRONG_ISOLATE, and the next character is U_LEFT_TO_RIGHT, so we end up hitting the inner default:. So then we keep going, and never emit the run for the U_RIGHT_TO_LEFT parts. Compare this to what you would get if you removed the ISOLATE characters. In that case, it would work correctly because the default:s wouldn't get hit.
Myles C. Maxfield
Comment 9 2016-06-14 02:35:42 PDT
Myles C. Maxfield
Comment 10 2016-06-14 02:37:45 PDT
Simon Fraser (smfr)
Comment 11 2016-06-14 11:18:35 PDT
Comment on attachment 281249 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=281249&action=review > Source/WTF/ChangeLog:3 > + Honor unicode bidi unicode codepoints One too many "unicode"s here? > Source/WebCore/ChangeLog:12 > + Because BidiResolver doesn't have any concept of isolate Unicode code points, > + its output is unexpected when they are present in content. Instead of this, > + we should consider the code points the same as whitespace for the purposes > + of the bidi algorithm. This is a stop-gap measure until we can support the > + codepoints fully in our Bidi algorithm. This reads awkwardly. "BidiResolver doesn't have any concept of isolate Unicode code points, so produces unexpected output when they are present". Fix by considering such code points as whitespace in the bidi algorithm. This is..." You should reference a follow-up bug to fix this the correct way. > LayoutTests/fast/text/isolate-ignore-expected.html:4 > +This test makes sure that isolate codepoints are ignored (for the time being). "for the time being" is never a good thing to put into code or a test. It quickly becomes anachronistic.
Myles C. Maxfield
Comment 12 2016-06-14 11:58:15 PDT
Ryan Haddad
Comment 13 2016-06-14 12:42:51 PDT
Ryan Haddad
Comment 14 2016-06-14 17:29:59 PDT
(In reply to comment #13) > This change broke the Windows build: > <https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/ > builds/78422> The change was rolled out in <http://trac.webkit.org/changeset/202059>
Darin Adler
Comment 15 2016-06-14 18:03:06 PDT
Can we add a regression test?
Myles C. Maxfield
Comment 16 2016-06-14 19:41:13 PDT
This patch broke the Windows build because we're building against ICU 6.1 on Windows, which doesn't know what any of the U_*_ISOLATE constants are.
Myles C. Maxfield
Comment 17 2016-06-14 19:42:02 PDT
Myles C. Maxfield
Comment 18 2016-06-14 20:23:35 PDT
(In reply to comment #16) > This patch broke the Windows build because we're building against ICU 6.1 on > Windows, which doesn't know what any of the U_*_ISOLATE constants are. Even though the header file on Windows doesn't include the U_*_ISOLATE symbols, the library on Windows does correctly return the correct constants from u_charDirection().
Myles C. Maxfield
Comment 19 2016-06-14 20:26:35 PDT
(In reply to comment #15) > Can we add a regression test? I thought the test I added (fast/text/isolate-ignore.html) is a regression test. Are you referring to something else?
Myles C. Maxfield
Comment 20 2016-06-14 20:34:28 PDT
Created attachment 281320 [details] Patch for committing
WebKit Commit Bot
Comment 21 2016-06-14 20:36:50 PDT
Attachment 281320 [details] did not pass style-queue: ERROR: Source/WebCore/platform/text/BidiResolver.h:635: U_FIRST_STRONG_ISOLATE is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4] ERROR: Source/WebCore/platform/text/BidiResolver.h:636: U_LEFT_TO_RIGHT_ISOLATE is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4] ERROR: Source/WebCore/platform/text/BidiResolver.h:637: U_RIGHT_TO_LEFT_ISOLATE is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4] ERROR: Source/WebCore/platform/text/BidiResolver.h:638: U_POP_DIRECTIONAL_ISOLATE is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4] Total errors found: 4 in 8 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Commit Bot
Comment 22 2016-06-14 21:42:24 PDT
Comment on attachment 281320 [details] Patch for committing Clearing flags on attachment: 281320 Committed r202083: <http://trac.webkit.org/changeset/202083>
Darin Adler
Comment 23 2016-06-15 09:23:28 PDT
(In reply to comment #19) > I thought the test I added (fast/text/isolate-ignore.html) is a regression > test. Sorry, somehow I missed it!
mitz
Comment 24 2017-10-17 17:04:29 PDT
Can you retitle this bug so that it’s clear that it’s about isolate codepoints? Embed and override codepoints had been working since KHTML.
Myles C. Maxfield
Comment 25 2018-08-27 11:47:04 PDT
*** This bug has been marked as a duplicate of bug 178960 ***
Note You need to log in before you can comment on or make changes to this bug.