RESOLVED FIXED 81123
cue.text fails for some track element cues
https://bugs.webkit.org/show_bug.cgi?id=81123
Summary cue.text fails for some track element cues
Sam Dutton
Reported 2012-03-14 09:20:04 PDT
cue.text appears to lead to 'Unexpected end of input' error. Go to http://samdutton.net/track/bug/cueText.html View console output. Does not appear to be any problem with .vtt file -- have tried resaving, etc. -- sometimes fails on different cues. Have also tried adding quotes to all values. 19.0.1065.0 canary
Attachments
Proposed patch (24.06 KB, patch)
2012-04-06 18:15 PDT, Eric Carlson
webkit.review.bot: commit-queue-
Archive of layout-test-results from ec2-cr-linux-04 (6.26 MB, application/zip)
2012-04-09 07:06 PDT, WebKit Review Bot
no flags
Proposed patch (27.48 KB, patch)
2012-04-13 15:22 PDT, Eric Carlson
no flags
Archive of layout-test-results from ec2-cr-linux-03 (6.40 MB, application/zip)
2012-04-14 04:45 PDT, WebKit Review Bot
no flags
Rebase patch (26.91 KB, patch)
2013-09-26 15:02 PDT, Brendan Long
no flags
Archive of layout-test-results from webkit-ews-16 for mac-mountainlion-wk2 (502.04 KB, application/zip)
2013-09-26 15:54 PDT, Build Bot
no flags
Archive of layout-test-results from webkit-ews-03 for mac-mountainlion (778.57 KB, application/zip)
2013-09-26 16:18 PDT, Build Bot
no flags
captions-long.vtt (9.37 KB, application/octet-stream)
2013-09-26 17:01 PDT, Eric Carlson
no flags
Archive of layout-test-results from webkit-ews-07 for mac-mountainlion (527.59 KB, application/zip)
2013-09-26 17:24 PDT, Build Bot
no flags
Buffer data in WebVTTParser (23.20 KB, patch)
2013-10-07 17:47 PDT, Brendan Long
no flags
Archive of layout-test-results from webkit-ews-16 for mac-mountainlion-wk2 (991.22 KB, application/zip)
2013-10-07 19:51 PDT, Build Bot
no flags
Archive of layout-test-results from webkit-ews-01 for mac-mountainlion (689.50 KB, application/zip)
2013-10-07 20:13 PDT, Build Bot
no flags
Archive of layout-test-results from webkit-ews-09 for mac-mountainlion-wk2 (1.09 MB, application/zip)
2013-10-07 20:30 PDT, Build Bot
no flags
Archive of layout-test-results from webkit-ews-08 for mac-mountainlion (648.62 KB, application/zip)
2013-10-07 20:33 PDT, Build Bot
no flags
Fix buffer logic (22.99 KB, patch)
2013-10-08 13:11 PDT, Brendan Long
no flags
Archive of layout-test-results from webkit-ews-13 for mac-mountainlion-wk2 (672.42 KB, application/zip)
2013-10-08 13:51 PDT, Build Bot
no flags
Archive of layout-test-results from webkit-ews-07 for mac-mountainlion (674.93 KB, application/zip)
2013-10-08 14:00 PDT, Build Bot
no flags
Fix BOM and add NULL check for m_parser (23.60 KB, patch)
2013-10-08 15:01 PDT, Brendan Long
no flags
Use m_state to simplify BOM and Finished handling. (23.75 KB, patch)
2013-10-08 19:39 PDT, Brendan Long
no flags
Only use buffer when needed (23.60 KB, patch)
2013-10-10 01:03 PDT, Brendan Long
no flags
Simplify hasRequiredFileIdentifier (24.56 KB, patch)
2013-10-10 09:18 PDT, Brendan Long
no flags
Archive of layout-test-results from webkit-ews-13 for mac-mountainlion-wk2 (551.29 KB, application/zip)
2013-10-10 10:17 PDT, Build Bot
no flags
Archive of layout-test-results from webkit-ews-04 for mac-mountainlion (567.90 KB, application/zip)
2013-10-10 10:39 PDT, Build Bot
no flags
Fix end-of-line handling in hasRequiredFileIdentifier (24.63 KB, patch)
2013-10-10 11:33 PDT, Brendan Long
no flags
Eric Carlson
Comment 1 2012-03-15 06:51:49 PDT
(In reply to comment #0) > cue.text appears to lead to 'Unexpected end of input' error. > > Go to http://samdutton.net/track/bug/cueText.html > I greatly appreciate your bug reports, but it would be extremely helpful if your test cases also included a .mp4 video file. I investigate all text track bugs and have fixed most of the ones filed in the last six months. Because I use Safari for development, before I can examine one of your test cases I have to download all of the resources it uses, export an MPEG-4 version of your video, and re-host the example. Don't get me wrong, I would *much* rather have your WebM-only test cases than nothing at all! I only mean that it will save me a significant amount of time if you are able to create test cases with both WebM and MPEG-4 videos.
Sam Dutton
Comment 2 2012-03-15 08:14:24 PDT
> I greatly appreciate your bug reports, but it would be extremely helpful if your test cases also included a .mp4 video file. Great idea -- I've added this to all the test cases in the http://samdutton.net/track/bug directory.
Eric Carlson
Comment 3 2012-03-15 08:19:39 PDT
(In reply to comment #2) > > I greatly appreciate your bug reports, but it would be extremely helpful if your test cases also included a .mp4 video file. > > Great idea -- I've added this to all the test cases in the http://samdutton.net/track/bug directory. Thank you very much!
Radar WebKit Bug Importer
Comment 4 2012-03-15 08:29:22 PDT
Eric Carlson
Comment 5 2012-04-06 18:15:35 PDT
Created attachment 136115 [details] Proposed patch
WebKit Review Bot
Comment 6 2012-04-09 07:06:16 PDT
Comment on attachment 136115 [details] Proposed patch Attachment 136115 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/12372184 New failing tests: media/track/track-long-captions-file.html media/track/track-webvtt-tc028-unsupported-markup.html media/track/track-cues-pause-on-exit.html media/track/track-cues-sorted-before-dispatch.html
WebKit Review Bot
Comment 7 2012-04-09 07:06:21 PDT
Created attachment 136218 [details] Archive of layout-test-results from ec2-cr-linux-04 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: ec2-cr-linux-04 Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'> Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Eric Carlson
Comment 8 2012-04-13 15:21:13 PDT
(In reply to comment #7) > Created an attachment (id=136218) [details] > Archive of layout-test-results from ec2-cr-linux-04 > > The attached test failures were seen while running run-webkit-tests on the chromium-ews. > Bot: ec2-cr-linux-04 Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'> Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick The failing media tests are: track-webvtt-tc028-unsupported-markup.html - will fix track-cues-pause-on-exit.html - this is https://bugs.webkit.org/show_bug.cgi?id=80067 track-cues-sorted-before-dispatch.html - this is https://bugs.webkit.org/show_bug.cgi?id=77862
Eric Carlson
Comment 9 2012-04-13 15:22:15 PDT
Created attachment 137163 [details] Proposed patch
WebKit Review Bot
Comment 10 2012-04-14 04:45:46 PDT
Comment on attachment 137163 [details] Proposed patch Attachment 137163 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/12399746 New failing tests: media/track/track-long-captions-file.html media/track/track-cues-pause-on-exit.html media/track/track-cues-sorted-before-dispatch.html
WebKit Review Bot
Comment 11 2012-04-14 04:45:54 PDT
Created attachment 137205 [details] Archive of layout-test-results from ec2-cr-linux-03 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: ec2-cr-linux-03 Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'> Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Anna Cavender
Comment 12 2012-08-15 10:51:05 PDT
What ever happened to this bug and the associated patch? Eric, I have a hunch that this may fix the timeouts we are seeing on track-cues-sorted-before-dispatch.html. WDYT? http://test-results.appspot.com/dashboards/flakiness_dashboard.html#showExpectations=true&tests=track-cues-sorted-before-dispatch
Darin Adler
Comment 13 2013-05-14 10:33:56 PDT
Comment on attachment 137163 [details] Proposed patch This old patch looks fine. I’m setting review+ but not sure where we stand on it.
Brendan Long
Comment 14 2013-09-26 15:00:47 PDT
The patch refers to captions-webvtt/captions-long.vtt, but it doesn't seem to exist.
Brendan Long
Comment 15 2013-09-26 15:02:10 PDT
Created attachment 212756 [details] Rebase patch This isn't done yet, since captions-long.vtt doesn't seem to exist..
Build Bot
Comment 16 2013-09-26 15:54:26 PDT
Comment on attachment 212756 [details] Rebase patch Attachment 212756 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/2617023 New failing tests: media/track/track-cue-rendering-rtl.html media/track/track-webvtt-tc022-entities.html media/track/track-webvtt-tc023-markup.html media/track/track-long-captions-file.html media/track/track-webvtt-tc028-unsupported-markup.html media/track/track-webvtt-tc024-timestamp.html
Build Bot
Comment 17 2013-09-26 15:54:28 PDT
Created attachment 212763 [details] Archive of layout-test-results from webkit-ews-16 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-16 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.5
Build Bot
Comment 18 2013-09-26 16:18:23 PDT
Comment on attachment 212756 [details] Rebase patch Attachment 212756 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/2629002 New failing tests: media/track/track-cue-rendering-rtl.html media/track/track-webvtt-tc022-entities.html media/track/track-webvtt-tc023-markup.html media/track/track-long-captions-file.html media/track/track-webvtt-tc028-unsupported-markup.html media/track/track-webvtt-tc024-timestamp.html
Build Bot
Comment 19 2013-09-26 16:18:26 PDT
Created attachment 212766 [details] Archive of layout-test-results from webkit-ews-03 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-03 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Eric Carlson
Comment 20 2013-09-26 17:01:58 PDT
Created attachment 212769 [details] captions-long.vtt VTT file with 100 captions
Build Bot
Comment 21 2013-09-26 17:24:51 PDT
Comment on attachment 212756 [details] Rebase patch Attachment 212756 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/2635024 New failing tests: media/track/track-cue-rendering-rtl.html media/track/track-webvtt-tc022-entities.html media/track/track-webvtt-tc023-markup.html media/track/track-long-captions-file.html media/track/track-webvtt-tc028-unsupported-markup.html media/track/track-webvtt-tc024-timestamp.html
Build Bot
Comment 22 2013-09-26 17:24:54 PDT
Created attachment 212772 [details] Archive of layout-test-results from webkit-ews-07 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-07 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Brendan Long
Comment 23 2013-09-27 13:32:56 PDT
The way this changes tests is suspicious. Shouldn't a line break either be preserved, or be converted to a space, not just removed?
Brendan Long
Comment 24 2013-10-07 10:50:56 PDT
According to the spec[1]: > Line breaks in cues are honored. User agents will also insert extra line breaks if necessary to fit the cue in the cue's width. In general, therefore, authors are encouraged to write cues all on one line except when a line break is definitely necessary, and to not manually line-wrap for aesthetic reasons alone. So, the test change where \n is removed from the expected cues seems to be wrong.. [1] http://dev.w3.org/html5/webvtt/#cues-with-multiple-lines
Brendan Long
Comment 25 2013-10-07 17:47:09 PDT
Created attachment 213632 [details] Buffer data in WebVTTParser The running theme in a lot of bug reports (see also #95691 and possibly #97097) is that we cut off parsing in the wrong place. This patch adds a buffer. For some reason the test gets a cross-origin error on WebKitGTK though, and I'm not entirely sure if just removing deprecatedDidReceiveCachedResource is correct (although it seems to work, and the FIXME comment suggests that that's what we should do).
Build Bot
Comment 26 2013-10-07 19:51:46 PDT
Comment on attachment 213632 [details] Buffer data in WebVTTParser Attachment 213632 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/3422067 New failing tests: media/track/track-cue-rendering-rtl.html media/track/track-mode.html media/track/track-cues-pause-on-exit.html media/track/track-cue-mutable.html media/track/track-remove-by-setting-innerHTML.html media/track/track-cue-rendering-with-padding.html media/track/track-cue-rendering.html media/track/track-long-captions-file.html media/track/track-cue-nothing-to-render.html media/track/track-webvtt-tc003-newlines.html media/track/track-large-timestamp.html media/track/track-webvtt-tc002-bom.html media/track/track-active-cues.html media/track/track-cue-container-rendering-position.html
Build Bot
Comment 27 2013-10-07 19:51:49 PDT
Created attachment 213641 [details] Archive of layout-test-results from webkit-ews-16 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-16 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.5
Build Bot
Comment 28 2013-10-07 20:13:34 PDT
Comment on attachment 213632 [details] Buffer data in WebVTTParser Attachment 213632 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/3710079 New failing tests: media/track/track-cue-rendering-rtl.html media/track/track-mode.html media/track/track-cues-pause-on-exit.html media/track/track-cue-mutable.html media/track/track-remove-by-setting-innerHTML.html media/track/track-cue-rendering-with-padding.html media/track/track-cue-rendering.html media/track/track-long-captions-file.html media/track/track-cue-nothing-to-render.html media/track/track-webvtt-tc003-newlines.html media/track/track-large-timestamp.html media/track/track-webvtt-tc002-bom.html media/track/track-active-cues.html media/track/track-cue-container-rendering-position.html
Build Bot
Comment 29 2013-10-07 20:13:38 PDT
Created attachment 213643 [details] Archive of layout-test-results from webkit-ews-01 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-01 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Build Bot
Comment 30 2013-10-07 20:30:25 PDT
Comment on attachment 213632 [details] Buffer data in WebVTTParser Attachment 213632 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/3342063 New failing tests: media/track/track-cue-rendering-rtl.html media/track/track-mode.html media/track/track-cues-pause-on-exit.html media/track/track-cue-mutable.html media/track/track-remove-by-setting-innerHTML.html media/track/track-cue-rendering-with-padding.html media/track/track-cue-rendering.html media/track/track-long-captions-file.html media/track/track-cue-nothing-to-render.html media/track/track-webvtt-tc003-newlines.html media/track/track-large-timestamp.html media/track/track-webvtt-tc002-bom.html media/track/track-active-cues.html media/track/track-cue-container-rendering-position.html
Build Bot
Comment 31 2013-10-07 20:30:28 PDT
Created attachment 213645 [details] Archive of layout-test-results from webkit-ews-09 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-09 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.5
Build Bot
Comment 32 2013-10-07 20:33:46 PDT
Comment on attachment 213632 [details] Buffer data in WebVTTParser Attachment 213632 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/3746012 New failing tests: media/track/track-cue-rendering-rtl.html media/track/track-mode.html media/track/track-cues-pause-on-exit.html media/track/track-cue-mutable.html media/track/track-remove-by-setting-innerHTML.html media/track/track-cue-rendering-with-padding.html media/track/track-cue-rendering.html media/track/track-long-captions-file.html media/track/track-cue-nothing-to-render.html media/track/track-webvtt-tc003-newlines.html media/track/track-large-timestamp.html media/track/track-webvtt-tc002-bom.html media/track/track-active-cues.html media/track/track-cue-container-rendering-position.html
Build Bot
Comment 33 2013-10-07 20:33:49 PDT
Created attachment 213646 [details] Archive of layout-test-results from webkit-ews-08 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-08 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Brendan Long
Comment 34 2013-10-08 13:11:05 PDT
Created attachment 213710 [details] Fix buffer logic My original version of this patch had a lot of little problems. deprecatedDidReceiveCachedResource() is definitely needed (although maybe it can be replaced with dataReceived()?). With that in place, m_cueParser->fileFinished() needs to go in notifyFinished() (so it doesn't get called every time we get data). I also added m_finished with an ASSERT so this doesn't bite anyone else in the future. I also messed up the handling of line breaks: According to the spec, any of \r, \n, and \r\n are all allowed. The current patch should fix all of these things, and the test passes on WebKitGTK (I'm building a different patch on my Mac, but I'll go to that if this fails on EWS again).
Build Bot
Comment 35 2013-10-08 13:51:28 PDT
Comment on attachment 213710 [details] Fix buffer logic Attachment 213710 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/3751126 New failing tests: media/track/track-cues-pause-on-exit.html media/track/track-webvtt-tc000-empty.html media/track/track-mode.html media/track/track-cue-container-rendering-position.html media/track/track-cue-mutable.html media/track/track-cue-rendering-rtl.html media/track/track-cue-rendering-with-padding.html media/track/track-load-error-readyState.html media/track/track-long-captions-file.html media/track/track-cue-nothing-to-render.html media/track/track-large-timestamp.html media/track/track-webvtt-tc002-bom.html media/track/track-addtrack-kind.html media/track/track-active-cues.html
Build Bot
Comment 36 2013-10-08 13:51:31 PDT
Created attachment 213716 [details] Archive of layout-test-results from webkit-ews-13 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-13 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.5
Build Bot
Comment 37 2013-10-08 14:00:34 PDT
Comment on attachment 213710 [details] Fix buffer logic Attachment 213710 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/3760061 New failing tests: media/track/track-cues-pause-on-exit.html media/track/track-webvtt-tc000-empty.html media/track/track-mode.html media/track/track-cue-container-rendering-position.html media/track/track-cue-mutable.html media/track/track-cue-rendering-rtl.html media/track/track-cue-rendering-with-padding.html media/track/track-load-error-readyState.html media/track/track-long-captions-file.html media/track/track-cue-nothing-to-render.html media/track/track-large-timestamp.html media/track/track-webvtt-tc002-bom.html media/track/track-active-cues.html
Build Bot
Comment 38 2013-10-08 14:00:38 PDT
Created attachment 213718 [details] Archive of layout-test-results from webkit-ews-07 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-07 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Brendan Long
Comment 39 2013-10-08 15:01:27 PDT
Created attachment 213723 [details] Fix BOM and add NULL check for m_parser There were two remaining problems: Because this patch converts to a String up-front, the BOM gets converted to a UTF-16 BOM (0xFEFF instead of 0xEFBBBF). This solution handles that comparing against String::fromUTF8(xEFxBBFF). I could do a more complicated version if you think it's important to explicitly check the original data for 0xEFBBFF. The other problem was that I didn't check if m_parser existed before using it in TrackLoader::notifyFinished().
Glenn Adams
Comment 40 2013-10-08 15:10:18 PDT
(In reply to comment #39) > Created an attachment (id=213723) [details] > Fix BOM and add NULL check for m_parser > > There were two remaining problems: Because this patch converts to a String up-front, the BOM gets converted to a UTF-16 BOM (0xFEFF instead of 0xEFBBBF). This solution handles that comparing against String::fromUTF8(xEFxBBFF). I could do a more complicated version if you think it's important to explicitly check the original data for 0xEFBBFF. The other problem was that I didn't check if m_parser existed before using it in TrackLoader::notifyFinished(). What do you mean by "check the original data for 0xEFBBBF"? Do you mean check UTF-8 encoded string for the presence of the three contiguous bytes 0xEF 0xBB 0xBF
Build Bot
Comment 41 2013-10-08 15:28:15 PDT
Comment on attachment 213723 [details] Fix BOM and add NULL check for m_parser Attachment 213723 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/3759070
Brendan Long
Comment 42 2013-10-08 15:29:10 PDT
(In reply to comment #40) > What do you mean by "check the original data for 0xEFBBBF"? Do you mean check UTF-8 encoded string for the presence of the three contiguous bytes 0xEF 0xBB 0xBF I mean that technically we can't check `line` for the UTF-8 BOM because it's a UTF-16 String, and UTF-16 and UTF-8 have different BOMs. The way this was handled previously was by using a Vector<char> instead of `line`. That seems wasteful now that we buffer everything, but I could take that approach if reviewers don't like this. I'm pretty sure it makes no difference, but I figured I'd mention it since it's "surprising".
Glenn Adams
Comment 43 2013-10-08 15:35:32 PDT
(In reply to comment #42) > (In reply to comment #40) > > What do you mean by "check the original data for 0xEFBBBF"? Do you mean check UTF-8 encoded string for the presence of the three contiguous bytes 0xEF 0xBB 0xBF > > I mean that technically we can't check `line` for the UTF-8 BOM because it's a UTF-16 String, and UTF-16 and UTF-8 have different BOMs. The way this was handled previously was by using a Vector<char> instead of `line`. That seems wasteful now that we buffer everything, but I could take that approach if reviewers don't like this. > > I'm pretty sure it makes no difference, but I figured I'd mention it since it's "surprising". If the originally encoded UTF-8 content has been converted to 16-bit UTF-16, then it is pointless checking for the UTF-8 encoding of BOM since it will have been converted to 0xFEFF. After conversion, the BOM could be discarded or ignored because it no longer has a role to play.
Build Bot
Comment 44 2013-10-08 15:44:13 PDT
Comment on attachment 213723 [details] Fix BOM and add NULL check for m_parser Attachment 213723 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/3422306
Brendan Long
Comment 45 2013-10-08 15:45:59 PDT
(In reply to comment #43) > If the originally encoded UTF-8 content has been converted to 16-bit UTF-16, then it is pointless checking for the UTF-8 encoding of BOM since it will have been converted to 0xFEFF. Yes, exactly. > After conversion, the BOM could be discarded or ignored because it no longer has a role to play. We're required to check for the BOM by the WebVTT spec. What I'm saying is that my code technically converts the data to UTF-16 and checks for a UTF-16 BOM instead of directly checking for a UTF-8 BOM. I don't think this changes anything, but it's worth pointing out.
Glenn Adams
Comment 46 2013-10-08 18:24:45 PDT
(In reply to comment #45) > (In reply to comment #43) > > If the originally encoded UTF-8 content has been converted to 16-bit UTF-16, then it is pointless checking for the UTF-8 encoding of BOM since it will have been converted to 0xFEFF. > > Yes, exactly. > > > After conversion, the BOM could be discarded or ignored because it no longer has a role to play. > > We're required to check for the BOM by the WebVTT spec. What I'm saying is that my code technically converts the data to UTF-16 and checks for a UTF-16 BOM instead of directly checking for a UTF-8 BOM. I don't think this changes anything, but it's worth pointing out. First, there is no specific requirement for use of the BOM. It is simply an optional three byte prefix prior to the string "WEBVTT", and it (BOM) only applies in the UTF-8 encoded form, and further, it is optional. So, if you've already transcoded from UTF-8 to UTF-16, then if there was a BOM in the UTF-8 source stream, it has been either discarded or transcoded to 0xFEFF. At that point, there is nothing else to do with it. It could have been dropped when transcoding into UTF-16 without changing any behavior.
Glenn Adams
Comment 47 2013-10-08 18:33:36 PDT
(In reply to comment #46) > (In reply to comment #45) > > (In reply to comment #43) > > > If the originally encoded UTF-8 content has been converted to 16-bit UTF-16, then it is pointless checking for the UTF-8 encoding of BOM since it will have been converted to 0xFEFF. > > > > Yes, exactly. > > > > > After conversion, the BOM could be discarded or ignored because it no longer has a role to play. > > > > We're required to check for the BOM by the WebVTT spec. What I'm saying is that my code technically converts the data to UTF-16 and checks for a UTF-16 BOM instead of directly checking for a UTF-8 BOM. I don't think this changes anything, but it's worth pointing out. > > First, there is no specific requirement for use of the BOM. It is simply an optional three byte prefix prior to the string "WEBVTT", and it (BOM) only applies in the UTF-8 encoded form, and further, it is optional. > > So, if you've already transcoded from UTF-8 to UTF-16, then if there was a BOM in the UTF-8 source stream, it has been either discarded or transcoded to 0xFEFF. At that point, there is nothing else to do with it. It could have been dropped when transcoding into UTF-16 without changing any behavior. Perhaps I should add by quoting Unicode Section 16.8 [1]: "Systems that use the byte order mark must recognize when an initial U+FEFF signals the byte order. In those cases, it is not part of the textual content and should be removed before processing, because otherwise it may be mistaken for a legitimate zero width no-break space." So you can discard it if you want. The only function of the UTF-8 BOM in WebVTT is for signature detection, and not determining byte order. [1] http://www.unicode.org/versions/Unicode6.2.0/ch16.pdf
Brendan Long
Comment 48 2013-10-08 19:00:32 PDT
Glenn, I wasn't saying that it's a problem (I wouldn't have included it in my patch if I thought it was), I just thought it was something reviewers should be aware of. I think I have an idea of how to handle this better anyway though, so hopefully it won't actually matter.
Glenn Adams
Comment 49 2013-10-08 19:14:56 PDT
(In reply to comment #48) > Glenn, I wasn't saying that it's a problem (I wouldn't have included it in my patch if I thought it was), I just thought it was something reviewers should be aware of. I think I have an idea of how to handle this better anyway though, so hopefully it won't actually matter. Sure. Just trying to make sure we're on the same page.
Brendan Long
Comment 50 2013-10-08 19:39:24 PDT
Created attachment 213743 [details] Use m_state to simplify BOM and Finished handling. I spent some time today and got DumpRenderTree working on my Mac again (apparently it's just broken in debug builds?), so this time I can verify on my own machine that the tests pass. I removed m_finished and just added a Finished state. I also added a CheckedBOM state, and moved the BOM check to near the top of parseBytes, which lets me check for the BOM while I'm still dealing with the raw data (this is simpler, and much less confusing).
Brendan Long
Comment 51 2013-10-10 01:03:19 PDT
Created attachment 213857 [details] Only use buffer when needed Ok, here's one more improvement. It bothered me that my patch was buffering everything, even if we didn't need it. This version only copies data to the buffer if we run out of data without finding a line ending. Because of this, collectNextLine() doesn't need to look at the contents of the buffer, since we already know it doesn't contain any line endings. More importantly, since it always contains one partial line, instead of removing from the front of the vector, we can just clear it, which is significantly more efficient.
Brendan Long
Comment 52 2013-10-10 01:05:58 PDT
I'm also planning to replace the `unsigned* position` parameters with `unsigned& position`, but I'd rather do that separately.
Brendan Long
Comment 53 2013-10-10 09:18:06 PDT
Created attachment 213890 [details] Simplify hasRequiredFileIdentifier The last thing that really bothered me in this patch was the substring creation in hasRequiredFileIdentifier. The new version just walks through the string, which avoids the copies and ends up simpler anyway. I also made the test use canplaythrough since the addtrack event was making it flaky. I think I'm actually done this time.
Build Bot
Comment 54 2013-10-10 10:17:52 PDT
Comment on attachment 213890 [details] Simplify hasRequiredFileIdentifier Attachment 213890 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/3879050 New failing tests: media/track/track-cue-rendering-rtl.html media/track/track-mode.html media/track/track-cues-pause-on-exit.html media/track/track-cue-mutable.html media/track/track-remove-by-setting-innerHTML.html media/track/track-cue-rendering-with-padding.html media/track/track-cue-nothing-to-render.html media/track/track-large-timestamp.html media/track/track-webvtt-tc002-bom.html media/track/track-active-cues.html media/track/track-cue-container-rendering-position.html
Build Bot
Comment 55 2013-10-10 10:17:55 PDT
Created attachment 213898 [details] Archive of layout-test-results from webkit-ews-13 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-13 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.5
Build Bot
Comment 56 2013-10-10 10:39:10 PDT
Comment on attachment 213890 [details] Simplify hasRequiredFileIdentifier Attachment 213890 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/3874044 New failing tests: media/track/track-cue-rendering-rtl.html media/track/track-mode.html media/track/track-cues-pause-on-exit.html media/track/track-cue-mutable.html media/track/track-remove-by-setting-innerHTML.html media/track/track-cue-rendering-with-padding.html media/track/track-cue-nothing-to-render.html media/track/track-large-timestamp.html media/track/track-webvtt-tc002-bom.html media/track/track-active-cues.html media/track/track-cue-container-rendering-position.html
Build Bot
Comment 57 2013-10-10 10:39:14 PDT
Created attachment 213900 [details] Archive of layout-test-results from webkit-ews-04 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-04 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Brendan Long
Comment 58 2013-10-10 11:33:07 PDT
Created attachment 213907 [details] Fix end-of-line handling in hasRequiredFileIdentifier I forgot to increment the position variable. Since it was hard to notice, I wrote the function differently so it's always obvious what the line position should be.
Brendan Long
Comment 59 2013-10-14 13:54:15 PDT
Is there anything more I need to do on this to get it reviewed/accepted?
Eric Carlson
Comment 60 2013-10-22 11:10:11 PDT
(In reply to comment #59) > Is there anything more I need to do on this to get it reviewed/accepted? Sorry, I have been out of the office for two weeks. I will get to this in the next day or two.
WebKit Commit Bot
Comment 61 2013-10-22 11:44:24 PDT
Comment on attachment 213907 [details] Fix end-of-line handling in hasRequiredFileIdentifier Clearing flags on attachment: 213907 Committed r157801: <http://trac.webkit.org/changeset/157801>
WebKit Commit Bot
Comment 62 2013-10-22 11:44:31 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.