WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED WONTFIX
93455
Extract DOM modifications in the HTML5 parser into HTMLParserDOMMutationQueue.
https://bugs.webkit.org/show_bug.cgi?id=93455
Summary
Extract DOM modifications in the HTML5 parser into HTMLParserDOMMutationQueue.
Kwang Yul Seo
Reported
2012-08-08 02:18:10 PDT
This patch extracts DOM modifications in the parser into DOMModifier and makes the tree builder and the element stack use it. DOMModifier can change its behavior dynamically by changing its strategy. The default strategy of DOMModifier is set to ImmediateDOMModifierStrategy which calls DOM operations synchronously. This refactoring is a preparation for speculative parsing. To run the parser in speculation mode, we must be able to queue up DOM modifications. We can do this by adding DeferredDOMModifierStategy which queues up DOM operations and execute them later. Both ImmediateDOMModifierStrategy and DeferredDOMModifierStategy eventually perform the same DOM operations. So I added DOMModifierExecutor which performs the actual DOM operations. DOMModifierStrategy classes should delegate to DOMModifierExecutor to remove code duplication. Changed HTMLDocumentParser to own a DOMModifier instance and pass it to HTMLTreeBuilder, HTMLConstructionSite and HTMLElementStack. This patch is not complete. HTMLConstructionSite still has DOM modifications including element creation. I will move the remaining DOM modifications into DOMModifier in a follow-up patch.
Attachments
Patch (not for landing)
(32.06 KB, patch)
2012-08-08 02:21 PDT
,
Kwang Yul Seo
no flags
Details
Formatted Diff
Diff
Patch (not for landing)
(39.82 KB, patch)
2012-08-08 05:13 PDT
,
Kwang Yul Seo
no flags
Details
Formatted Diff
Diff
Patch (not for landing)
(37.18 KB, patch)
2012-08-08 16:31 PDT
,
Kwang Yul Seo
no flags
Details
Formatted Diff
Diff
Patch
(99.63 KB, patch)
2012-08-09 05:18 PDT
,
Kwang Yul Seo
no flags
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
Kwang Yul Seo
Comment 1
2012-08-08 02:21:02 PDT
Created
attachment 157160
[details]
Patch (not for landing)
Kwang Yul Seo
Comment 2
2012-08-08 02:26:35 PDT
Eric, I've implemented the idea we discussed in
Bug 93113
. Would you take a look at it? I am a bit worried about the indirection and virtual method overhead.
Kwang Yul Seo
Comment 3
2012-08-08 02:34:24 PDT
rniwa, I changed the name 'detachNode' to 'removeFromParent' in this patch.
Eric Seidel (no email)
Comment 4
2012-08-08 03:26:13 PDT
Interesting. Do we know how much of a regression this would be for perf?
Kwang Yul Seo
Comment 5
2012-08-08 04:02:24 PDT
(In reply to
comment #4
)
> Interesting. Do we know how much of a regression this would be for perf?
This patch causes a 0.3% performance regression on the html5-parser benchmark.
Kwang Yul Seo
Comment 6
2012-08-08 05:13:06 PDT
Created
attachment 157188
[details]
Patch (not for landing)
Kwang Yul Seo
Comment 7
2012-08-08 05:13:43 PDT
(In reply to
comment #6
)
> Created an attachment (id=157188) [details] > Patch (not for landing)
I added more DOM modifications to DOMModifier.
Kwang Yul Seo
Comment 8
2012-08-08 05:26:28 PDT
If this patch looks okay, I will move the rest of DOM modifications to DOMModifier and rebase NodeProxy patch (
https://github.com/kseo/webkit/commit/b9e6009013be65a629dcbebc7e4088c4c44f51e7
) and speculative parser implementation on top of this patch.
Gyuyoung Kim
Comment 9
2012-08-08 05:47:27 PDT
Comment on
attachment 157188
[details]
Patch (not for landing)
Attachment 157188
[details]
did not pass efl-ews (efl): Output:
http://queues.webkit.org/results/13459207
Build Bot
Comment 10
2012-08-08 05:52:59 PDT
Comment on
attachment 157188
[details]
Patch (not for landing)
Attachment 157188
[details]
did not pass win-ews (win): Output:
http://queues.webkit.org/results/13451774
Early Warning System Bot
Comment 11
2012-08-08 05:59:57 PDT
Comment on
attachment 157188
[details]
Patch (not for landing)
Attachment 157188
[details]
did not pass qt-ews (qt): Output:
http://queues.webkit.org/results/13451777
Early Warning System Bot
Comment 12
2012-08-08 06:09:31 PDT
Comment on
attachment 157188
[details]
Patch (not for landing)
Attachment 157188
[details]
did not pass qt-wk2-ews (qt): Output:
http://queues.webkit.org/results/13453624
Alexey Proskuryakov
Comment 13
2012-08-08 10:17:01 PDT
"Modifier" is not a noun that conveys much meaning. In a sense, half of WebCore code is a "DOM modifier". Is there a more specific way to describe the code you're extracting?
Ryosuke Niwa
Comment 14
2012-08-08 10:38:22 PDT
Comment on
attachment 157188
[details]
Patch (not for landing) View in context:
https://bugs.webkit.org/attachment.cgi?id=157188&action=review
> Source/WebCore/html/parser/DOMModifier.h:42 > +class DOMModifier {
The class name reflect the fact it's specific to parser.
> Source/WebCore/html/parser/DOMModifierExecutor.h:37 > +class DOMModifierExecutor {
Do we really need this class?
> Source/WebCore/html/parser/DOMModifierStrategy.h:42 > + virtual void beginParsingChildren(Node*) = 0; > + virtual void finishParsingChildren(Node*) = 0;
I am extremely concerned about the implication of all these new virtual function calls. If there will be only two strategies, it's probably better to embed if-else in the DOMModifier class itself instead of using polymorphism.
Kwang Yul Seo
Comment 15
2012-08-08 14:15:58 PDT
(In reply to
comment #13
)
> "Modifier" is not a noun that conveys much meaning. In a sense, half of WebCore code is a "DOM modifier". > > Is there a more specific way to describe the code you're extracting?
It's a "DOM modifier" specific to the HTML parser. After this patch, the rest of the HTML parser never (and must not) modifies the DOM. Why don't we call it HTMLParserDOMModifier? Does it sound better?
Ryosuke Niwa
Comment 16
2012-08-08 14:17:47 PDT
Comment on
attachment 157188
[details]
Patch (not for landing) I don't think we should be adding this many indirections and classes.
Kwang Yul Seo
Comment 17
2012-08-08 14:35:43 PDT
(In reply to
comment #16
)
> (From update of
attachment 157188
[details]
) > I don't think we should be adding this many indirections and classes.
Another option is to remove ImmediateDOMModifierStrategy and DOMModifierExecutor and embed DOMModifierExecutor as the default behavior in DOMModifier. void DOMModifier::beginParsingChildren(Node* node) { ASSERT(node); if (!m_strategy) { node->beginParsingChildren(); return; } m_strategy->beginParsingChildren(node); } Now we have only one additional NULL check in non-speculative parsing code paths. This also reduces the number of classes involved in this indirection.
Ryosuke Niwa
Comment 18
2012-08-08 14:43:38 PDT
(In reply to
comment #17
)
> (In reply to
comment #16
) > > (From update of
attachment 157188
[details]
[details]) > > I don't think we should be adding this many indirections and classes. > > Another option is to remove ImmediateDOMModifierStrategy and DOMModifierExecutor and embed DOMModifierExecutor as the default behavior in DOMModifier. > > void DOMModifier::beginParsingChildren(Node* node) > { > ASSERT(node); > if (!m_strategy) { > node->beginParsingChildren(); > return; > } > > m_strategy->beginParsingChildren(node); > } > > Now we have only one additional NULL check in non-speculative parsing code paths. This also reduces the number of classes involved in this indirection.
Why do we need this DOMModifier/Strategy classes at all? can't we just put the contents of each beginParsingChildren implementation right there?
Kwang Yul Seo
Comment 19
2012-08-08 15:05:57 PDT
(In reply to
comment #18
)
> Why do we need this DOMModifier/Strategy classes at all? can't we just put the contents of each beginParsingChildren implementation right there?
eseidel and I discussed this issue on
https://bugs.webkit.org/show_bug.cgi?id=93113
. DOMModifier is an abstraction to handle queueing of DOM operations (and cross-thread DOM mutation later) so that the parser can continue parsing speculatively while waiting for the parser blocking script to be loaded. If the speculation ends up succeeding, we perform the queued DOM operations at once. Otherwise, we throw them away. I admit DOMModifierStrategy might be an overkill here. I can certainly put the queueing of DOM operations directly in DOMModifier class. Let's listen to eseidel's opinion on this.
Kwang Yul Seo
Comment 20
2012-08-08 16:31:06 PDT
Created
attachment 157329
[details]
Patch (not for landing)
Ryosuke Niwa
Comment 21
2012-08-08 16:36:26 PDT
(In reply to
comment #19
)
> DOMModifier is an abstraction to handle queueing of DOM operations (and cross-thread DOM mutation later) so that the parser can continue parsing speculatively while waiting for the parser blocking script to be loaded. If the speculation ends up succeeding, we perform the queued DOM operations at once. Otherwise, we throw them away.
HTMLParserDOMMutationQueue? Anyhow, I would much prefer seeing code like: if (m_inSpeculativeParsing) { ~~ } else { ~~ } over m_strategy->X() where we have two X()'s. The latter pattern is nice if we're going to have more than few cases and we want it to be extensible but it appears that we only have two code paths in this case.
Kwang Yul Seo
Comment 22
2012-08-08 16:51:37 PDT
(In reply to
comment #21
)
> HTMLParserDOMMutationQueue?
It seems a bit weird to me that HTMLParserDOMMutationQueue can synchronously call DOM operations.
> Anyhow, I would much prefer seeing code like: > > if (m_inSpeculativeParsing) { > ~~ > } else { > ~~ > } > > over > > m_strategy->X() where we have two X()'s. > > The latter pattern is nice if we're going to have more than few cases and we want it to be extensible but it appears that we only have two code paths in this case.
That's exactly what I did in my previous try. At that time, I put DOM modifications into HTMLConstructionSite.
https://github.com/kseo/webkit/commit/cee9b035c7c14e73739467067fa96291d58e8082
e.g., void scheduleBeginParsingChildren(NodeProxy* node) { if (m_isSpeculating) m_taskQueue.append(bind(&HTMLConstructionSite::beginParsingChildrenTask, this, PassRefPtr<NodeProxy>(node))); else beginParsingChildrenTask(node); } The reason why I chose the latter pattern in this patch is to keep the speculative parsing code in a separate class. I thought the performance penalty caused by virtual method calls is acceptable in speculation code paths. However, I agree to your point. m_strategy->X() pattern is an overkill if there are only one or two strategies. I will see if I can do this better, or I will follow your suggestion! Thanks.
Kwang Yul Seo
Comment 23
2012-08-08 16:53:58 PDT
(In reply to
comment #20
)
> Created an attachment (id=157329) [details] > Patch (not for landing)
Changes over the previous patch: - Changed the name DOMModifier to HTMLParserDOMModifier because this class is specific to the HTML parser. - Removed ImmediateDOMModifierStrategy. Now HTMLParserDOMModifier synchronously calls DOM operations if no HTMLParserDOMModifierStrategy is provided. This removes virtual method calls in non-speculative parsing. - Kept HTMLParserDOMModifierExecutor. Otherwise, DeferredDOMModifierStategy (which will be added later) can't reuse the code.
Ryosuke Niwa
Comment 24
2012-08-08 16:59:57 PDT
(In reply to
comment #22
)
> (In reply to
comment #21
) > > HTMLParserDOMMutationQueue? > > It seems a bit weird to me that HTMLParserDOMMutationQueue can synchronously call DOM operations.
Why not? We have this pattern elsewhere in the WebKit. e.g. ScopedEventQueue. I also dislike the use of the term "modifier" here. "DOMModifier" doesn't tell us anything about the nature or the purpose of the class. All we know is that it mutates DOM, as Alexey pointed out, is what a vast majority of WebCore classes do. What we're trying to do here is to queue up possible DOM mutations when we're in the speculative parsing.
Ryosuke Niwa
Comment 25
2012-08-08 17:04:33 PDT
(In reply to
comment #22
)
> void scheduleBeginParsingChildren(NodeProxy* node) > { > if (m_isSpeculating) > m_taskQueue.append(bind(&HTMLConstructionSite::beginParsingChildrenTask, this, PassRefPtr<NodeProxy>(node))); > else > beginParsingChildrenTask(node); > }
This code is much clearer than the one we have in your patch here.
> The reason why I chose the latter pattern in this patch is to keep the speculative parsing code in a separate class. I thought the performance penalty caused by virtual method calls is acceptable in speculation code paths.
It's not about virtual function calls. It's about the clarity. By adding extra two years of classes, we're obscuring what the code does. Imagine you don't know anything about HTML parser and trying to decipher what's happening that code. After your patch, we have to go through 3 layers of classes to figure out what it does. Worse yet, just looking at the DOMModififer class is not sufficient to figure out what will happen because the actual implementation is hidden behind m_strategy. Now we have to find all subclasses of the strategy class and when and how they're associated with DOMModifier. That's A LOT more work than if we had a single boolean indicating whether we're in the speculative parsing or not. In that case, the intent of this class and what it does is self-evidently clear.
Kwang Yul Seo
Comment 26
2012-08-08 17:14:25 PDT
(In reply to
comment #24
)
> Why not? We have this pattern elsewhere in the WebKit. e.g. ScopedEventQueue.
I didn't know that. If this naming pattern is common in WebKit, I think HTMLParserDOMMutationQueue is okay!
> I also dislike the use of the term "modifier" here. "DOMModifier" doesn't tell us anything about the nature or the purpose of the class. All we know is that it mutates DOM, as Alexey pointed out, is what a vast majority of WebCore classes do.
My intention was to collect all DOM modifications into one class. Then, we can dynamically add or remove a strategy which queues DOM mutations. DOMModifier simply modifies the DOM, while DeferredDOMModifierStrategy queues up DOM mutations. But if we combine these classes into one as you suggested, then the name should be HTMLParserDOMMutationQueue.
Kwang Yul Seo
Comment 27
2012-08-08 17:17:14 PDT
(In reply to
comment #25
)
> It's not about virtual function calls. It's about the clarity. By adding extra two years of classes, we're obscuring what the code does. Imagine you don't know anything about HTML parser and trying to decipher what's happening that code. After your patch, we have to go through 3 layers of classes to figure out what it does. > > Worse yet, just looking at the DOMModififer class is not sufficient to figure out what will happen because the actual implementation is hidden behind m_strategy. Now we have to find all subclasses of the strategy class and when and how they're associated with DOMModifier. That's A LOT more work than if we had a single boolean indicating whether we're in the speculative parsing or not. In that case, the intent of this class and what it does is self-evidently clear.
I see! I will follow your suggestion and combine these classes into HTMLParserDOMMutationQueue.
Kwang Yul Seo
Comment 28
2012-08-09 05:18:01 PDT
Created
attachment 157441
[details]
Patch
Adam Barth
Comment 29
2012-08-09 10:50:17 PDT
Sorry for not commenting earlier. @skyul: I thought we were going to evaluate the speculative parser on a branch before landing more parser changes.
Adam Barth
Comment 30
2012-08-09 10:51:56 PDT
I see that you've already answered my question in
https://bugs.webkit.org/show_bug.cgi?id=90751#c10
Kwang Yul Seo
Comment 31
2012-08-09 16:28:30 PDT
(In reply to
comment #29
)
> Sorry for not commenting earlier. @skyul: I thought we were going to evaluate the speculative parser on a branch before landing more parser changes.
(In reply to
comment #30
)
> I see that you've already answered my question in
https://bugs.webkit.org/show_bug.cgi?id=90751#c10
Yeah, please take a look at these preparation patches. I won't land them until we decide to land the whole speculative parser. I should've turned off r+ flag in this case?
Eric Seidel (no email)
Comment 32
2013-03-05 02:02:50 PST
We took a different approach with the threaded parser work. I think this is now WONTFIX/dupe of
bug 106127
.
Zan Dobersek
Comment 33
2013-09-05 08:33:14 PDT
Comment on
attachment 157441
[details]
Patch The work on this bug has ceased, removing the r? flag on the patch.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug