Bug 213253

Summary: Fix the case about NFC normalization in wpt/FileAPI/unicode.html
Product: WebKit Reporter: Tetsuharu Ohzeki [UTC+9] <tetsuharu.ohzeki>
Component: DOMAssignee: Nobody <webkit-unassigned>
Status: RESOLVED DUPLICATE    
Severity: Normal CC: ap, ysuzuki
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 213254    
Bug Blocks:    
Attachments:
Description Flags
Patch none

Description Tetsuharu Ohzeki [UTC+9] 2020-06-16 09:36:57 PDT
I'll attach a patch
Comment 1 Tetsuharu Ohzeki [UTC+9] 2020-06-16 09:50:19 PDT
Created attachment 402016 [details]
Patch
Comment 2 Alexey Proskuryakov 2020-06-17 18:12:03 PDT
Comment on attachment 402016 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=402016&action=review

> Source/WebCore/fileapi/BlobBuilder.cpp:74
> -    auto bytes = UTF8Encoding().encodeWithNormalization(text, UnencodableHandling::Entities);
> +    auto bytes = UTF8Encoding().encode(text, UnencodableHandling::Entities);

While this is correct in abstract, I don't think that we can do this before the normalization is performed in all code paths where decomposed text from macOS can enter WebKit. And once we do that, encode can stop normalizing.
Comment 3 Tetsuharu Ohzeki [UTC+9] 2020-06-17 18:36:24 PDT
(In reply to Alexey Proskuryakov from comment #2)
> Comment on attachment 402016 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=402016&action=review
> 
> > Source/WebCore/fileapi/BlobBuilder.cpp:74
> > -    auto bytes = UTF8Encoding().encodeWithNormalization(text, UnencodableHandling::Entities);
> > +    auto bytes = UTF8Encoding().encode(text, UnencodableHandling::Entities);
> 
> While this is correct in abstract, I don't think that we can do this before
> the normalization is performed in all code paths where decomposed text from
> macOS can enter WebKit. And once we do that, encode can stop normalizing.


This effect only for constructing `Blob` API from string on JavaScript.

The exist `TextEncoding::encode()` do normalization in it and it is widely used in evenrywhere in WebKit.
https://trac.webkit.org/browser/webkit/trunk/Source/WebCore/platform/text/TextEncoding.cpp#L77

I had planned to add `TextEncoding::encodeWithoutNormalization()` but I also thought from the comment that it might be better to rename the exist implementation once, I filed bug 213254 and I created this patch built on it.
Comment 4 Alexey Proskuryakov 2020-06-18 09:24:17 PDT
Since blobs can be sent over the wire and/or processed by algorithms that were only tested on Windows, their content should not include decomposed strings coming from macOS. It is true that the test is accurate and we need to make it pass, but disabling normalization would be the last step for making it happen.
Comment 5 Tetsuharu Ohzeki [UTC+9] 2020-06-18 11:41:04 PDT
Umm..... it's a long way. Thank you for your explanation.
Comment 6 Tetsuharu Ohzeki [UTC+9] 2020-10-06 05:07:37 PDT

*** This bug has been marked as a duplicate of bug 217327 ***