RESOLVED FIXED 139812
REGRESSION(176609): Very high memory usage in Canvas/reuse.html performance test
https://bugs.webkit.org/show_bug.cgi?id=139812
Summary REGRESSION(176609): Very high memory usage in Canvas/reuse.html performance test
Csaba Osztrogonác
Reported 2014-12-19 00:35:39 PST
at least on EFL and GTK bots, because there is no malloc measuring on Mac bots. before: -------- https://build.webkit.org/builders/GTK%20Linux%2064-bit%20Release%20%28Perf%29/builds/1382 Running Canvas/reuse.html (33 of 141) RESULT Canvas: reuse: Runs= 70.7723901718 runs/s median= 63.5780801678 runs/s, stdev= 13.4939255648 runs/s, min= 62.5708614903 runs/s, max= 97.1289885232 runs/s RESULT Canvas: reuse: JSHeap= 267628.8 bytes median= 269721.0 bytes, stdev= 3148.69720093 bytes, min= 261820.0 bytes, max= 269721.0 bytes RESULT Canvas: reuse: Malloc= 1009948.4 bytes median= 985512.0 bytes, stdev= 75773.0039181 bytes, min= 979760.0 bytes, max= 1235248.0 bytes Finished: 25.664726 s after: ------- https://build.webkit.org/builders/GTK%20Linux%2064-bit%20Release%20%28Perf%29/builds/1383 Running Canvas/reuse.html (33 of 141) RESULT Canvas: reuse: Runs= 49.7027479102 runs/s median= 49.8827160444 runs/s, stdev= 22.7074182517 runs/s, min= 6.19658159381 runs/s, max= 124.313478825 runs/s RESULT Canvas: reuse: JSHeap= 267201.6 bytes median= 268569.0 bytes, stdev= 3111.07465609 bytes, min= 261788.0 bytes, max= 269689.0 bytes RESULT Canvas: reuse: Malloc= 11552632.0 bytes median= 9958680.0 bytes, stdev= 3720965.90803 bytes, min= 5353752.0 bytes, max= 23708440.0 bytes Finished: 42.262450 s Malloc= 1 009 948.4 bytes --> Malloc= 11 552 632.0 bytes --> 11x memory regression And after this change the EFL performance bot is very flakey, WTR consumes more than 8Gb memory for Canvas/reuse.html which causes swapping and causes OOM or timeout regularly. And of course swapping makes the whole performance measuring unreliable.
Attachments
Patch (3.07 KB, patch)
2015-02-03 12:07 PST, Chris Dumez
no flags
Csaba Osztrogonác
Comment 1 2014-12-19 00:36:18 PST
I have no time and plan to fix it myself, feel free to pick it up, if you are interested in it.
Chris Dumez
Comment 2 2014-12-19 10:14:04 PST
That is very interesting. Thanks for the report Ossy. I'll take a look at this soon.
Chris Dumez
Comment 3 2014-12-21 14:38:17 PST
This looks similar to https://bugs.webkit.org/show_bug.cgi?id=138915. However, that bug has been fixed and it seems Canvas/reuse.html has not recovered on the GTK bots.
Chris Dumez
Comment 4 2014-12-21 14:42:41 PST
I am not very familiar with these memory tests. Do these results really show that we are leaking memory or simply that we are allocating a lot more than we used to?
Csaba Osztrogonác
Comment 5 2014-12-21 14:53:09 PST
(In reply to comment #4) > I am not very familiar with these memory tests. Do these results really show > that we are leaking memory or simply that we are allocating a lot more than > we used to? These tests show that we allocates 10x more memory than previously, but nothing about leaking. But I thought that memory is leaking here, because WTR consumes more than 8G and then the machine started swapping. I checked the memory consumption in htop with EFL's WTR. I can take a look at it in january.
Csaba Osztrogonác
Comment 6 2014-12-23 09:17:20 PST
note: I skipped Canvas/reuse.html on the EFL performance bot locally as a workaround until proper fix.
Csaba Osztrogonác
Comment 7 2015-01-08 14:46:29 PST
> I can take a look at it in january. I won't have time for this bug in the near future, feel free to pick it up. Anyone from EFL/GTK maintainers?
Carlos Alberto Lopez Perez
Comment 8 2015-01-08 16:09:15 PST
(In reply to comment #6) > note: I skipped Canvas/reuse.html on the EFL performance > bot locally as a workaround until proper fix. This is interesting... How do you did it? I didn't know that the perf tests can be skipped.
Csaba Osztrogonác
Comment 9 2015-01-08 16:14:24 PST
(In reply to comment #8) > This is interesting... How do you did it? I didn't know that the perf tests > can be skipped. PerformanceTests/Skipped - but it isn't so sophisticated as TestExpectations for LayoutTests
Csaba Osztrogonác
Comment 10 2015-01-13 00:33:28 PST
(In reply to comment #6) > note: I skipped Canvas/reuse.html on the EFL performance > bot locally as a workaround until proper fix. Unfortunately I have to add this ugly local hack again and again, because the repository is wiped regularly because of some random SVN fail. I don't have resource to do it regularly and I won't do it. It would be great if the EFL maintainers could fix this serious and 1.5 months old regression. If nobody is interested in fixing this bug, I can't guarantee that the performance bot will provide valid results. If it starts swapping, the results will become useless.
Carlos Alberto Lopez Perez
Comment 11 2015-01-13 03:42:52 PST
(In reply to comment #10) > (In reply to comment #6) > > note: I skipped Canvas/reuse.html on the EFL performance > > bot locally as a workaround until proper fix. > > Unfortunately I have to add this ugly local hack again and again, > because the repository is wiped regularly because of some random > SVN fail. > > I don't have resource to do it regularly and I won't do it. > > It would be great if the EFL maintainers could fix this serious and 1.5 > months old regression. If nobody is interested in fixing this bug, I can't > guarantee that the performance bot will provide valid results. If it starts > swapping, the results will become useless. Maybe we can implement support for port-specific skipped lists. Something like: PerformanceTests/Skipped.efl that will add on top of the general list of PerformanceTests/Skipped list for the EFL port.
Csaba Osztrogonác
Comment 12 2015-01-13 03:46:56 PST
(In reply to comment #11) > Something like: PerformanceTests/Skipped.efl that will add on top of the > general list of PerformanceTests/Skipped list for the EFL port. I'd prefer fixing bugs instead of implementing a sophisticated mechanism to skip tests to hide real/serious/annoying bugs/regressions.
Csaba Osztrogonác
Comment 13 2015-01-30 01:40:34 PST
Any plan to fix this 2 months old serious regression? Or any plan to develop a sophisticated mechanism to be able to skip the failing test on EFL only? I had to skip it manually on the bot again because an SNV error caused a new checkout. :-/
Chris Dumez
Comment 14 2015-01-30 09:54:15 PST
(In reply to comment #13) > Any plan to fix this 2 months old serious regression? Or any plan to develop > a sophisticated mechanism to be able to skip the failing test on EFL only? > > I had to skip it manually on the bot again because an SNV error caused > a new checkout. :-/ I did take a look but I couldn't find what's wrong. Also, I cannot reproduce the issue on Mac so it makes it inconvenient to debug. On WebKit trunk, we are now keeping weak pointers to the canvas elements causing DOM timer throttling: Vector<WeakPtr<Element>> m_elementsCausingThrottling; My initial patch did cause leaking but the issue went away when I switched to WeakPtr later on. It seems EFK / GTK still experience leaking (or at least huge memory usage) but I have no idea why.
Csaba Osztrogonác
Comment 15 2015-02-02 04:20:00 PST
Hi Chris, thanks for checking this bug, but it would be better if EFL/GTK maintainers could debug and fix this serious regression. I checked, the bug is still valid at least on EFL, WebProcess consumes 6+ Gb memory on Canvas/reuse.html. It is a serious memory leak. Additionally I prepared a patch to skip this test on EFL - bug141152 .
Csaba Osztrogonác
Comment 16 2015-02-02 04:48:48 PST
I checked the bug is still valid on GTK too.
Carlos Garcia Campos
Comment 17 2015-02-02 09:09:46 PST
I debugged this for a while but couldn't find the problem :-(
Carlos Alberto Lopez Perez
Comment 18 2015-02-02 14:47:52 PST
I have patched run-perf-tests to allow a wrapper command: bug 141172. This is the output of: $ Tools/Scripts/run-perf-tests --no-show-results --platform gtk --release -2 --wrapper='valgrind --trace-children=yes --smc-check=all' Canvas/reuse.html http://sprunge.us/PhAb
Carlos Alberto Lopez Perez
Comment 19 2015-02-02 20:20:55 PST
I don't know what is the problem, but I think I know where is it. This is the trace I got: 85 void setScriptMadeNonUserObservableChangesToElement(Element& element) 86 { 87 m_scriptMadeNonUserObservableChanges = true; 88 m_elementsChangedOutsideViewport.add(&element); <------- http://trac.webkit.org/browser/trunk/Source/WebCore/page/DOMTimer.cpp?rev=179534#L88 89 } 90 313 void DOMTimer::scriptDidCauseElementRepaint(Element& element, bool mayRepaintNonDescendants) 314 { 315 if (!DOMTimerFireState::current) 316 return; 317 318 if (mayRepaintNonDescendants || element.mayCauseRepaintInsideViewport()) 319 DOMTimerFireState::current->setScriptMadeUserObservableChanges(); 320 else 321 DOMTimerFireState::current->setScriptMadeNonUserObservableChangesToElement(element); <------- http://trac.webkit.org/browser/trunk/Source/WebCore/page/DOMTimer.cpp?rev=179534#L321 322 } 280 void HTMLCanvasElement::notifyObserversCanvasChanged(const FloatRect& rect) 281 { 282 // Choke point for canvas drawing; notify DOMTimer of the event. 283 DOMTimer::scriptDidCauseElementRepaint(*this); <------- http://trac.webkit.org/browser/trunk/Source/WebCore/html/HTMLCanvasElement.cpp?rev=179534#L283 284 285 for (auto it = m_observers.begin(), end = m_observers.end(); it != end; ++it) 286 (*it)->canvasChanged(*this, rect); 287 } Seems that either commenting line 283 of Source/WebCore/html/HTMLCanvasElement.cpp or line 88 of Source/WebCore/page/DOMTimer.cpp fixes the problem. When running the perf test Canvas/reuse.html, the above trace gets executed around 24K times. I wonder if the memory issues have to be with adding the element so many times to m_elementsChangedOutsideViewport ?
Chris Dumez
Comment 20 2015-02-02 20:38:00 PST
(In reply to comment #19) > I don't know what is the problem, but I think I know where is it. > This is the trace I got: > > 85 void setScriptMadeNonUserObservableChangesToElement(Element& element) > 86 { > 87 m_scriptMadeNonUserObservableChanges = true; > 88 m_elementsChangedOutsideViewport.add(&element); <------- > http://trac.webkit.org/browser/trunk/Source/WebCore/page/DOMTimer. > cpp?rev=179534#L88 > 89 } > 90 > > > 313 void DOMTimer::scriptDidCauseElementRepaint(Element& element, bool > mayRepaintNonDescendants) > 314 { > 315 if (!DOMTimerFireState::current) > 316 return; > 317 > 318 if (mayRepaintNonDescendants || > element.mayCauseRepaintInsideViewport()) > 319 DOMTimerFireState::current->setScriptMadeUserObservableChanges(); > 320 else > 321 > DOMTimerFireState::current- > >setScriptMadeNonUserObservableChangesToElement(element); <------- > http://trac.webkit.org/browser/trunk/Source/WebCore/page/DOMTimer. > cpp?rev=179534#L321 > 322 } > > > 280 void HTMLCanvasElement::notifyObserversCanvasChanged(const FloatRect& > rect) > 281 { > 282 // Choke point for canvas drawing; notify DOMTimer of the event. > 283 DOMTimer::scriptDidCauseElementRepaint(*this); > <------- > http://trac.webkit.org/browser/trunk/Source/WebCore/html/HTMLCanvasElement. > cpp?rev=179534#L283 > 284 > 285 for (auto it = m_observers.begin(), end = m_observers.end(); it != > end; ++it) > 286 (*it)->canvasChanged(*this, rect); > 287 } > > > Seems that either commenting line 283 of > Source/WebCore/html/HTMLCanvasElement.cpp or line 88 of > Source/WebCore/page/DOMTimer.cpp fixes the problem. > > When running the perf test Canvas/reuse.html, the above trace gets executed > around 24K times. > > I wonder if the memory issues have to be with adding the element so many > times to m_elementsChangedOutsideViewport ? m_elementsChangedOutsideViewport is a HashSet so adding the same element many times should not blow up memory. However, m_elementsChangedOutsideViewport is ref-ing the elements. The reason this is supposed to be OK is because the HashSet gets cleared when the DOMTimerFireState object gets destroyed (which happens after the timer's callback is done executing). If m_elementsChangedOutsideViewport would not get cleared, for one reason or another, then we could keep a ref to the canvas elements and leak. It would be interesting to check if m_elementsChangedOutsideViewport gets cleared and when.
Carlos Garcia Campos
Comment 21 2015-02-03 03:10:48 PST
The problem is that it's not the same element, but a different one every time because the test is creating a lot of canvas elements. Before r176609, nobody was holding a ref of the canvases so they were released before the next one is created. And the new one was created using the same memory of the recently release one. Now the DOMTimerFireState is holding a reference of all canvases created, preventing the memory allocator to reuse the memory. DOMTimerFireState is indeed correctly destroyed, but when destroyed it contains a lot of canvases in m_elementsChangedOutsideViewport that are all correctly released as well. I wonder why this doesn't happen in mac, though.
Carlos Alberto Lopez Perez
Comment 22 2015-02-03 07:41:33 PST
(In reply to comment #21) > I wonder why this doesn't happen in mac, though. We don't know if it happens on Mac or not. Unfortunately the Mac perf bots don't report malloc stats. I don't know why. If you check the stdout from the bots, mac per bots always report "Malloc= 0.0 bytes" for every perf test: https://build.webkit.org/builders/Apple%20Mavericks%20Release%20%28Perf%29/builds/3834/steps/perf-test/logs/stdio https://build.webkit.org/builders/Apple%20Yosemite%20Release%20%28Perf%29/builds/911/steps/perf-test/logs/stdio Compare that with GTK or EFL: https://build.webkit.org/builders/GTK%20Linux%2064-bit%20Release%20%28Perf%29/builds/1861/steps/perf-test/logs/stdio https://build.webkit.org/builders/EFL%20Linux%2064-bit%20Release%20WK2%20%28Perf%29/builds/4669/steps/perf-test/logs/stdio Also the charts at perf.webkit.org don't show any data for the malloc graphs on the mac bots. https://perf.webkit.org/#mode=charts&chartList=%5B%5B%22efl%22%2C%22Canvas%2Freuse%3AMalloc%22%5D%2C%5B%22gtk%22%2C%22Canvas%2Freuse%3AMalloc%22%5D%2C%5B%22mac-mavericks%22%2C%22Canvas%2Freuse%3AMalloc%22%5D%2C%5B%22mac-yosemite%22%2C%22Canvas%2Freuse%3AMalloc%22%5D%5D https://perf.webkit.org/#mode=charts&chartList=%5B%5B%22efl%22%2C%22Canvas%2Fterrain%3AMalloc%22%5D%2C%5B%22gtk%22%2C%22Canvas%2Fterrain%3AMalloc%22%5D%2C%5B%22mac-mavericks%22%2C%22Canvas%2Fterrain%3AMalloc%22%5D%2C%5B%22mac-yosemite%22%2C%22Canvas%2Fterrain%3AMalloc%22%5D%5D Maybe this memory leak in Canvas/reuse.html is also happening on Mac and nobody noticed so far ?
Carlos Alberto Lopez Perez
Comment 23 2015-02-03 07:56:27 PST
(In reply to comment #22) > (In reply to comment #21) > > I wonder why this doesn't happen in mac, though. > > We don't know if it happens on Mac or not. > > Unfortunately the Mac perf bots don't report malloc stats. I don't know why. > On Linux the results also don't inspire much confidence. I get: $ Tools/Scripts/run-perf-tests --platform gtk --release -2 --no-show-results Canvas/reuse.html Running 1 tests Running Canvas/reuse.html (1 of 1) RESULT Canvas: reuse: Runs= 65.1449948365 runs/s median= 62.5518991775 runs/s, stdev= 23.3810442109 runs/s, min= 40.9716005343 runs/s, max= 130.779642838 runs/s RESULT Canvas: reuse: JSHeap= 259981.2 bytes median= 262303.0 bytes, stdev= 3179.22177833 bytes, min= 254434.0 bytes, max= 262399.0 bytes RESULT Canvas: reuse: Malloc= 14886120.8 bytes median= 14636376.0 bytes, stdev= 4064045.72291 bytes, min= 10082480.0 bytes, max= 23832928.0 bytes Finished: 32.045871 s And looking at htop, WebKitWebProcess sucked nearly 8GB of RAM while running.
Carlos Garcia Campos
Comment 24 2015-02-03 09:44:53 PST
(In reply to comment #22) > Maybe this memory leak in Canvas/reuse.html is also happening on Mac and > nobody noticed so far ? Note that according to my analysis, this is not a memory leak, all memory allocated is correctly freed, the difference is that since r176609 the test requires a lot of more memory because a lot of canvases are created and they are released all together.
Carlos Alberto Lopez Perez
Comment 25 2015-02-03 09:50:35 PST
(In reply to comment #24) > (In reply to comment #22) > > Maybe this memory leak in Canvas/reuse.html is also happening on Mac and > > nobody noticed so far ? > > Note that according to my analysis, this is not a memory leak, all memory > allocated is correctly freed, the difference is that since r176609 the test > requires a lot of more memory because a lot of canvases are created and they > are released all together. Right, I'm re-titling the bug.
Chris Dumez
Comment 26 2015-02-03 10:05:19 PST
Looking at the code of this test, it appears it creates 62 canvas per timer firing. After creating and drawing to those 62 canvases, it calls window.GCController.collect() at the end of the Timer callback. Unfortunately, at this point, DOMTimerFireState is still holding a ref to those canvases so they don't get garbage collected. Between DOM timer firing, DOMTimerFireState is destroyed and the canvases are unref'd. They could then we garbage-collected at any point but this doesn't seem to happen on those bots. I am still thinking about how best address this problem. We could potentially use WeakPointers in DOMTimerFireState like I already do in DOMTimer.
Carlos Garcia Campos
Comment 27 2015-02-03 10:15:29 PST
Here the test creates a lot more than 62 canvases, see: DBG: [0x7fff318ed760] DOMTimerFireState DBG: [0x7fff318ed760] ~DOMTimerFireState: elementsChangedOutsideViewport: 15562 DBG: [0x7fff318ed760] DOMTimerFireState DBG: [0x7fff318ed760] ~DOMTimerFireState: elementsChangedOutsideViewport: 2480 DBG: [0x7fff318ed760] DOMTimerFireState DBG: [0x7fff318ed760] ~DOMTimerFireState: elementsChangedOutsideViewport: 2480 DBG: [0x7fff318ed760] DOMTimerFireState DBG: [0x7fff318ed760] ~DOMTimerFireState: elementsChangedOutsideViewport: 2480 DBG: [0x7fff318ed760] DOMTimerFireState DBG: [0x7fff318ed760] ~DOMTimerFireState: elementsChangedOutsideViewport: 2480 DBG: [0x7fff318ed760] DOMTimerFireState DBG: [0x7fff318ed760] ~DOMTimerFireState: elementsChangedOutsideViewport: 2480 Content-Type: text/plain :Runs -> [48.22815776870716, 48.52301992733968, 42.84591110766597, 18.28278068295747, 49.92928764631812] runs/s :JSHeap -> [256914, 262271, 262335, 262335, 262335] bytes :Malloc -> [10129328, 10084928, 10070336, 10070264, 10070224] bytes #EOF #EOF #EOF
Chris Dumez
Comment 28 2015-02-03 10:18:55 PST
(In reply to comment #27) > Here the test creates a lot more than 62 canvases, see: > > DBG: [0x7fff318ed760] DOMTimerFireState > DBG: [0x7fff318ed760] ~DOMTimerFireState: elementsChangedOutsideViewport: > 15562 > DBG: [0x7fff318ed760] DOMTimerFireState > DBG: [0x7fff318ed760] ~DOMTimerFireState: elementsChangedOutsideViewport: > 2480 > DBG: [0x7fff318ed760] DOMTimerFireState > DBG: [0x7fff318ed760] ~DOMTimerFireState: elementsChangedOutsideViewport: > 2480 > DBG: [0x7fff318ed760] DOMTimerFireState > DBG: [0x7fff318ed760] ~DOMTimerFireState: elementsChangedOutsideViewport: > 2480 > DBG: [0x7fff318ed760] DOMTimerFireState > DBG: [0x7fff318ed760] ~DOMTimerFireState: elementsChangedOutsideViewport: > 2480 > DBG: [0x7fff318ed760] DOMTimerFireState > DBG: [0x7fff318ed760] ~DOMTimerFireState: elementsChangedOutsideViewport: > 2480 > Content-Type: text/plain > > :Runs -> [48.22815776870716, 48.52301992733968, 42.84591110766597, > 18.28278068295747, 49.92928764631812] runs/s > > :JSHeap -> [256914, 262271, 262335, 262335, 262335] bytes > > :Malloc -> [10129328, 10084928, 10070336, 10070264, 10070224] bytes > > #EOF > #EOF > #EOF I see, well it is not obvious to me how many times testCreation() is called for each DOM timer firing because the test is using PerfTestRunner.measureRunsPerSecond(). In any case, I understand the issue now. I believe the right fix is to use WeakPointers like we already do in DOMTimer. I am working on this now and will propose a patch soon. Thanks a lot for investigating this.
Chris Dumez
Comment 29 2015-02-03 12:07:26 PST
Chris Dumez
Comment 30 2015-02-03 12:08:35 PST
Before I set the patch up for review, could someone from the GTK/EFL port confirm that it indeed fixes the problem you're seeing? Thanks!
Carlos Alberto Lopez Perez
Comment 31 2015-02-03 15:09:26 PST
(In reply to comment #30) > Before I set the patch up for review, could someone from the GTK/EFL port > confirm that it indeed fixes the problem you're seeing? Thanks! Tested it. At least on GTK, It fixes completely the problem :) Some numbers: When running Canvas/reuse.html, peak memory is. Before this patch -> mapped: 15695984K writeable/private: 13557976K shared: 32K After this patch -> mapped: 2423560K writeable/private: 286872K shared: 32K Thanks!
Chris Dumez
Comment 32 2015-02-03 15:13:23 PST
(In reply to comment #31) > (In reply to comment #30) > > Before I set the patch up for review, could someone from the GTK/EFL port > > confirm that it indeed fixes the problem you're seeing? Thanks! > > Tested it. At least on GTK, It fixes completely the problem :) > > Some numbers: When running Canvas/reuse.html, peak memory is. > > Before this patch -> mapped: 15695984K writeable/private: 13557976K > shared: 32K > After this patch -> mapped: 2423560K writeable/private: 286872K > shared: 32K > > > Thanks! Excellent, thanks for confirming.
Chris Dumez
Comment 33 2015-02-03 15:21:07 PST
Comment on attachment 245952 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=245952&action=review > Source/WebCore/page/DOMTimer.cpp:119 > + HashMap<Element*, WeakPtr<Element>> m_elementsChangedOutsideViewport; FYI, the reason I did not use a WeakGCMap is because it seems to require the values to be of Weak<> type. In this case, my values have WeakPtr<Element> type.
Geoffrey Garen
Comment 34 2015-02-03 16:06:55 PST
Comment on attachment 245952 [details] Patch r=me
WebKit Commit Bot
Comment 35 2015-02-03 16:51:14 PST
Comment on attachment 245952 [details] Patch Clearing flags on attachment: 245952 Committed r179581: <http://trac.webkit.org/changeset/179581>
WebKit Commit Bot
Comment 36 2015-02-03 16:51:19 PST
All reviewed patches have been landed. Closing bug.
Carlos Garcia Campos
Comment 37 2015-02-04 00:13:10 PST
Thank you Chris!
Csaba Osztrogonác
Comment 38 2015-02-04 07:02:34 PST
Thanks for the fix, it helped for EFL too, I already unskipped the test on the bot.
Carlos Alberto Lopez Perez
Comment 39 2015-02-04 07:37:51 PST
(In reply to comment #22) > (In reply to comment #21) > > I wonder why this doesn't happen in mac, though. > > We don't know if it happens on Mac or not. > > Unfortunately the Mac perf bots don't report malloc stats. I don't know why. > > If you check the stdout from the bots, mac per bots always report "Malloc= > 0.0 bytes" for every perf test: > Seems that the cause is that bmalloc don't implements reporting of stats. I reported that on bug 141247
Note You need to log in before you can comment on or make changes to this bug.