Bug 254889 - RESOLVED DUPLICATE of bug 176225
Support all of HTML's character entities in WebVTT
https://bugs.webkit.org/show_bug.cgi?id=254889
Ahmad Saleem
Reported 2023-04-02 08:15:32 PDT
Hi Team,

While going through Blink's commits, I came across another one that could be explored in WebKit.

Blink Commit - https://chromium.googlesource.com/chromium/src.git/+/80ccfaf557f5ad07e5de8bcc08e1aba84190b2a0
WPT Test Link - http://wpt.live/webvtt/parsing/cue-text-parsing/tests/entities.html

Just wanted to raise this so we can track it.

Thanks!
____
@ap - if you can help: who should be informed about this and CC'd? It would be good for me to know who looks into WebVTT in WebKit.
Alexey Proskuryakov
Comment 1 2023-04-02 18:12:53 PDT
*** This bug has been marked as a duplicate of bug 176225 ***
Karl Dubost
Comment 2 2023-04-02 18:32:36 PDT
Ahmad,

Darin seems to have been the "recent" (2015) editor of this piece of code:
https://searchfox.org/wubkat/rev/64453e226bbd56f49b248f0f8816a72e5547e456/Source/WebCore/html/track/WebVTTTokenizer.cpp#120

The latest improvements to HTML tokenization were done in Bug 140166.

The spec is not obviously clear about it. Here's an example which shows that, yes, HTML entities are possible:
https://www.w3.org/TR/webvtt1/#example-4a66a3ef

> To change that line to left-to-right base direction, start the line with an U+200E LEFT-TO-RIGHT MARK character (it can be escaped as "&lrm;").

But it's only an example.

The test at
http://wpt.live/webvtt/parsing/cue-text-parsing/tests/entities.html
https://wpt.fyi/results/webvtt/parsing/cue-text-parsing/tests/entities.html?label=master&label=experimental&aligned
also shows Firefox failing the same test.

Let's find the commit for the test; maybe there is more information.
https://github.com/web-platform-tests/wpt/commit/3c01711d2b0dffe60bea034340a83a40dbf17cc1

Ha, yes, it's in the spec. I was looking for "HTML entities" instead of "HTML character reference".

> HTML character reference in data state
> Attempt to consume an HTML character reference, with no additional allowed character.
>
> If nothing is returned, append a U+0026 AMPERSAND character (&) to result.
>
> Otherwise, append the data of the character tokens that were returned to result.
>
> Then, in any case, set tokenizer state to the WebVTT data state, and jump to the step labeled next.
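
For illustration, the quoted spec step can be sketched in Python roughly as below. This is only a minimal sketch, not WebKit's or Blink's code: `consume_character_reference` and `decode_cue_text` are hypothetical names, the named-reference table is borrowed from Python's `html.entities.html5`, and the sketch ignores cue tags (`<...>`) as well as the HTML rules for remapping invalid numeric code points.

```python
from html.entities import html5  # maps names such as "lrm;" or "amp" to characters

# Longest name in the table, used to bound the lookup window.
_MAX_NAME = max(len(name) for name in html5)


def consume_character_reference(text, pos):
    """Try to consume a character reference whose '&' sits just before text[pos].

    Returns (decoded, new_pos), or None when nothing could be consumed.
    Hypothetical helper; simplified compared with the HTML algorithm.
    """
    if text.startswith("#", pos):
        # Numeric reference: this sketch requires the terminating ';'.
        end = text.find(";", pos)
        if end == -1:
            return None
        body = text[pos + 1:end]
        if body[:1] in ("x", "X"):
            digits, base, allowed = body[1:], 16, "0123456789abcdefABCDEF"
        else:
            digits, base, allowed = body, 10, "0123456789"
        if not digits or any(c not in allowed for c in digits):
            return None
        code = int(digits, base)
        if code > 0x10FFFF:
            return None
        return chr(code), end + 1

    # Named reference: prefer the longest name that matches.
    for length in range(min(_MAX_NAME, len(text) - pos), 0, -1):
        name = text[pos:pos + length]
        if name in html5:
            return html5[name], pos + length
    return None


def decode_cue_text(text):
    """Apply only the character-reference branch of the data state."""
    result = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        pos += 1
        if ch != "&":
            result.append(ch)
            continue
        consumed = consume_character_reference(text, pos)
        if consumed is None:
            # "If nothing is returned, append a U+0026 AMPERSAND character."
            result.append("&")
        else:
            # "Otherwise, append the data of the character tokens returned."
            decoded, pos = consumed
            result.append(decoded)
    return "".join(result)
```

With this sketch, a cue line starting with `&lrm;` decodes to one starting with U+200E, while a bare `&` that does not begin a reference is passed through unchanged.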