Bug 245690 - string_utils decoder assumes UTF-8 and can crash when encountering other encodings.
Status: RESOLVED DUPLICATE of bug 245742
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2022-09-26 12:40 PDT by Ryan Reno
Modified: 2022-09-29 09:32 PDT
CC List: 3 users

See Also:


Description Ryan Reno 2022-09-26 12:40:43 PDT
The micro sign (µ) is encoded differently in Latin-1 than in UTF-8: Latin-1 uses the single byte 0xB5, while UTF-8 uses the two-byte sequence 0xC2 0xB5. Because our decoder assumes UTF-8 by default, the Python UTF-8 codec fails when it encounters the Latin-1 byte. In particular, I've hit this failure when the commit message generator and style checker tools try to parse https://github.com/web-platform-tests/wpt/blob/master/content-security-policy/script-src/hash-always-converted-to-utf-8/iso-8859-1.html#L13 on import (LayoutTests/imported/w3c/web-platform-tests/content-security-policy/script-src/hash-always-converted-to-utf-8/iso-8859-1.html).

This causes crashes due to an unhandled UnicodeDecodeError in our string_utils decoder implementation.
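
For illustration, here is a minimal sketch that reproduces the failure and shows one possible way to tolerate it. The helper names (decode_with_fallback, strict_utf8_fails) are hypothetical and are not the string_utils API; the fallback order is an assumption, not a description of the fix that eventually landed.

```python
# The micro sign (U+00B5) as raw bytes in each encoding.
MICRO_LATIN1 = "\u00b5".encode("latin-1")  # single byte 0xB5
MICRO_UTF8 = "\u00b5".encode("utf-8")      # two bytes 0xC2 0xB5


def strict_utf8_fails(data):
    """Return True if a strict UTF-8 decode of `data` raises UnicodeDecodeError."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def decode_with_fallback(data, encodings=("utf-8", "latin-1")):
    """Hypothetical helper: try each encoding in order and return the first
    successful decode. If every encoding fails, decode with the first one
    and substitute U+FFFD for undecodable bytes instead of raising."""
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    return data.decode(encodings[0], errors="replace")


if __name__ == "__main__":
    # A lone 0xB5 byte is a UTF-8 continuation byte, so strict decoding fails.
    print(strict_utf8_fails(MICRO_LATIN1))
    # The fallback recovers the same character from either byte sequence.
    print(decode_with_fallback(MICRO_LATIN1) == decode_with_fallback(MICRO_UTF8))
```

Note that Latin-1 maps every byte value to a code point, so with the default encoding list the final errors="replace" branch is never reached; it only matters if a caller passes a stricter list of encodings.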
Comment 1 Radar WebKit Bug Importer 2022-09-26 12:41:08 PDT
<rdar://problem/100423966>
Comment 2 Alexey Proskuryakov 2022-09-29 09:13:01 PDT
Is this what 254967@main fixed? Not entirely sure of the context in this bug.
Comment 3 Ryan Reno 2022-09-29 09:32:04 PDT
(In reply to Alexey Proskuryakov from comment #2)
> Is this what 254967@main fixed? Not entirely sure of the context in this bug.

Yes.

*** This bug has been marked as a duplicate of bug 245742 ***