Bug 216219

Summary: TextCodec refinements
Product: WebKit Reporter: Darin Adler <darin>
Component: TextAssignee: Darin Adler <darin>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, beidson, benjamin, cdumez, cmarcelo, esprehn+autocc, ews-watchlist, kangil.han, keith_miller, mark.lam, mmaxfield, msaboff, saam, sam, tzagallo, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: All   
OS: All   
See Also: https://bugs.webkit.org/show_bug.cgi?id=216280
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch none

Description Darin Adler 2020-09-05 14:44:42 PDT
TextCodec refinements
Comment 1 Darin Adler 2020-09-05 15:15:20 PDT Comment hidden (obsolete)
Comment 2 Darin Adler 2020-09-05 15:30:39 PDT Comment hidden (obsolete)
Comment 3 Darin Adler 2020-09-05 18:20:01 PDT Comment hidden (obsolete)
Comment 4 Darin Adler 2020-09-05 18:34:24 PDT
Earlier versions of the patch had the BOM logic exactly backwards! Should be all better now.
Comment 5 Darin Adler 2020-09-05 22:06:03 PDT
Created attachment 408115 [details]
Patch
Comment 6 Darin Adler 2020-09-05 22:16:39 PDT
Comment on attachment 408115 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=408115&action=review

> Source/WebCore/platform/text/TextCodecCJK.cpp:195
> +    auto range = std::equal_range(jis0212, jis0212 + std::size(jis0212), std::pair<uint16_t, UChar>(pointer, 0), [](const auto& a, const auto& b) {

Should use std::begin/end here instead of std::size.
Comment 7 Darin Adler 2020-09-06 10:30:50 PDT
All green EWS now, so I guess it’s time for review.
Comment 8 Sam Weinig 2020-09-06 11:13:03 PDT
Comment on attachment 408115 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=408115&action=review

> Source/WebCore/platform/text/TextCodecCJK.cpp:542
> +            static constexpr std::array<UChar32, 63> iso2022JPKatakana {

Annoying that you have to provide the 63 here, but it looks like that is necessary if you want to use UChar32 explicitly, otherwise, you could do 

`static constexpr std::array iso2022JPKatakana { 0x3002, 0x300C, 0x300D ... }

and it would deduce the size for you.

> Source/WebCore/platform/text/TextCodecCJK.cpp:690
> +        auto table = new EUCKREncodingIndex;
> +        for (size_t i = 0; i < std::size(eucKRDecodingIndex); i++)
>              (*table)[i] = { eucKRDecodingIndex[i].second, eucKRDecodingIndex[i].first };

One of these days I will get around to adding a zip iterator to make this a bit nicer.
Comment 9 Darin Adler 2020-09-06 11:56:40 PDT
Comment on attachment 408115 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=408115&action=review

>> Source/WebCore/platform/text/TextCodecCJK.cpp:542
>> +            static constexpr std::array<UChar32, 63> iso2022JPKatakana {
> 
> Annoying that you have to provide the 63 here, but it looks like that is necessary if you want to use UChar32 explicitly, otherwise, you could do 
> 
> `static constexpr std::array iso2022JPKatakana { 0x3002, 0x300C, 0x300D ... }
> 
> and it would deduce the size for you.

I suspect that instead we could use a standard C array instead of a std::array here. Or maybe we should write this:

    std::array<UChar32, (0xFF9F - 0xFF61) + 1> ...

Or this:

    static_assert(std::size(iso2022JPKatakana) == (0xFF9F - 0xFF61) + 1);
Comment 10 EWS 2020-09-06 12:10:16 PDT
Committed r266681: <https://trac.webkit.org/changeset/266681>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 408115 [details].
Comment 11 Radar WebKit Bug Importer 2020-09-06 12:11:15 PDT
<rdar://problem/68431646>