WebKit Bugzilla
RESOLVED DUPLICATE of
bug 176225
254889
Support all of HTML's character entities in WebVTT
https://bugs.webkit.org/show_bug.cgi?id=254889
Ahmad Saleem
Reported
2023-04-02 08:15:32 PDT
Hi Team, while going through Blink's commits, I came across another one that could be explored in WebKit. Blink commit:
https://chromium.googlesource.com/chromium/src.git/+/80ccfaf557f5ad07e5de8bcc08e1aba84190b2a0
WPT Test Link -
http://wpt.live/webvtt/parsing/cue-text-parsing/tests/entities.html
Just wanted to raise this so we can track it. Thanks!

@ap - if you can help: who should be informed about this and CC'd? It would also be good for me to know who looks into WebVTT in WebKit.
Alexey Proskuryakov
Comment 1
2023-04-02 18:12:53 PDT
*** This bug has been marked as a duplicate of
bug 176225
***
Karl Dubost
Comment 2
2023-04-02 18:32:36 PDT
Ahmad, Darin seems to have been the "recent" (2015) editor of this piece of code
https://searchfox.org/wubkat/rev/64453e226bbd56f49b248f0f8816a72e5547e456/Source/WebCore/html/track/WebVTTTokenizer.cpp#120
The latest improvements to HTML tokenization were made in
Bug 140166
The spec is not obviously clear about this. Here is an example showing that HTML entities are indeed possible:
https://www.w3.org/TR/webvtt1/#example-4a66a3ef
> To change that line to left-to-right base direction, start the line with an U+200E LEFT-TO-RIGHT MARK character (it can be escaped as "&lrm;").
But that is only an example. The test at
http://wpt.live/webvtt/parsing/cue-text-parsing/tests/entities.html
https://wpt.fyi/results/webvtt/parsing/cue-text-parsing/tests/entities.html?label=master&label=experimental&aligned
also shows Firefox failing the same test. Let's find the commit that added the test; maybe it has more information.
https://github.com/web-platform-tests/wpt/commit/3c01711d2b0dffe60bea034340a83a40dbf17cc1
Ah yes, it is in the spec. I was searching for "HTML entities" instead of "HTML character reference".
> HTML character reference in data state
> Attempt to consume an HTML character reference, with no additional allowed character.
>
> If nothing is returned, append a U+0026 AMPERSAND character (&) to result.
>
> Otherwise, append the data of the character tokens that were returned to result.
>
> Then, in any case, set tokenizer state to the WebVTT data state, and jump to the step labeled next.
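To make the quoted step concrete, here is a rough Python sketch of it. This is not WebKit's or Blink's code. The function names `consume_character_reference` and `decode_cue_text` are made up for illustration, and the sketch assumes Python's `html.entities.html5` table is an acceptable stand-in for the HTML named character reference list. It is deliberately simplified: cue tags (`<...>`) are not tokenized, numeric references must end in `;`, and the HTML spec's remapping of certain control code points is skipped.

```python
import re
from html.entities import html5  # maps names such as "lrm;" or "amp" to characters

# Longest name in the table, used to bound the longest-match search.
MAX_NAME_LEN = max(len(name) for name in html5)

# Numeric reference body after '&': "#123;" or "#x1F;" (simplified: ';' required).
NUMERIC = re.compile(r"#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def consume_character_reference(text, pos):
    """Try to consume a character reference; text[pos] follows the '&'.

    Returns (decoded_text, new_pos), or None if nothing was consumed.
    """
    match = NUMERIC.match(text, pos)
    if match:
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits, 10)
        # Null, surrogates, and out-of-range values become U+FFFD.
        if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            return "\ufffd", match.end()
        return chr(code), match.end()

    # Named reference: take the longest name present in the table.
    for length in range(min(MAX_NAME_LEN, len(text) - pos), 0, -1):
        candidate = text[pos:pos + length]
        if candidate in html5:
            return html5[candidate], pos + length
    return None


def decode_cue_text(text):
    """Apply the 'HTML character reference in data state' step to plain cue text."""
    result = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        pos += 1
        if ch != "&":
            result.append(ch)
            continue
        consumed = consume_character_reference(text, pos)
        if consumed is None:
            # Nothing returned: append a literal U+0026 AMPERSAND.
            result.append("&")
        else:
            # Otherwise append the decoded characters and skip past the reference.
            decoded, pos = consumed
            result.append(decoded)
    return "".join(result)
```

With this sketch, `decode_cue_text("&lrm;Hello")` yields the cue text prefixed by U+200E, while an unknown reference such as `&zzz;` passes through unchanged because only the ampersand is appended and scanning resumes after it.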