44528 – Improve FEGaussianBlur algorithm to access image buffer memory directly

RESOLVED FIXED 44528

Improve FEGaussianBlur algorithm to access image buffer memory directly

https://bugs.webkit.org/show_bug.cgi?id=44528

Summary Improve FEGaussianBlur algorithm to access image buffer memory directly

Alejandro G. Castro

Reported 2010-08-24 09:15:58 PDT

Currently the FEGaussianBlur uses CanvasPixelArrays objects to handle pixel operations, this is very slow, we need to add support to get and handle image buffer memory directly, avoiding copies if possible.

Attachments
Proposed patch (14.42 KB, patch) 2010-08-24 09:17 PDT, Alejandro G. Castro	no flags	Details Formatted Diff Diff
Proposed patch (14.45 KB, patch) 2010-08-24 09:44 PDT, Alejandro G. Castro	no flags	Details Formatted Diff Diff
Proposed patch (14.52 KB, patch) 2010-08-24 13:41 PDT, Alejandro G. Castro	no flags	Details Formatted Diff Diff
Proposed patch (14.76 KB, patch) 2010-08-27 04:37 PDT, Alejandro G. Castro	krit: review-	Details Formatted Diff Diff
Show Obsolete (3) View All Add attachment proposed patch, testcase, etc.

Alejandro G. Castro

Comment 1 2010-08-24 09:17:40 PDT

Created attachment 65282 [details] Proposed patch

Eric Seidel (no email)

Comment 2 2010-08-24 09:31:57 PDT

Attachment 65282 [details] did not build on mac: Build output: http://queues.webkit.org/results/3768598

Alejandro G. Castro

Comment 3 2010-08-24 09:44:32 PDT

Created attachment 65284 [details] Proposed patch

WebKit Review Bot

Comment 4 2010-08-24 10:24:02 PDT

Attachment 65284 [details] did not build on chromium: Build output: http://queues.webkit.org/results/3790427

Alejandro G. Castro

Comment 5 2010-08-24 13:41:06 PDT

Created attachment 65314 [details] Proposed patch

Alejandro G. Castro

Comment 6 2010-08-27 01:28:25 PDT

After talking to dhyatt I'm going review the patch and create begin/end functions instead of data and dirty.

Alejandro G. Castro

Comment 7 2010-08-27 04:37:23 PDT

Created attachment 65698 [details] Proposed patch Replaced data/makeDirty with beginAccessData/endAccessData and reviewed some of the port functions to avoid a copy, probably someone of each port should check those functions are correct for each technology.

Dirk Schulze

Comment 8 2010-10-23 05:22:27 PDT

Comment on attachment 65698 [details] Proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=65698&action=review Have you tried to use the original code, but take the WTF::ByteArray instead of the CanvasPixelArray? I'm interested in how much perf win this could bring. Surely not as much as your code, but still. Also it is not needed here to mark the surface as dirty. This is only necessary, if you want to access the context after pixel manipulations. That isn't the case here or on any other filter effect result. You also ignored, that some platforms may use unpremultiplied colors. Would be great if we can find a way to manage pixel access with premultiplied and unpremultiplied colors, that would be a win for the other filter effects as well. I know that you were working on that to make CSS shadows faster on Cairo. But it would still be useful for SVG filter effects (that are supported by WebKitGtk as well). > WebCore/platform/graphics/wx/ImageBufferWx.cpp:72 > + notImplemented(); indention wrong.

Alejandro G. Castro

Comment 9 2010-10-25 05:13:28 PDT

Thanks for the review krit. (In reply to comment #8) > (From update of attachment 65698 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=65698&action=review > > Have you tried to use the original code, but take the WTF::ByteArray instead of the CanvasPixelArray? I'm interested in how much perf win this could bring. Surely not as much as your code, but still. Also it is not needed here to mark the surface as dirty. This is only necessary, if you want to access the context after pixel manipulations. That isn't the case here or on any other filter effect result. > You also ignored, that some platforms may use unpremultiplied colors. Would be great if we can find a way to manage pixel access with premultiplied and unpremultiplied colors, that would be a win for the other filter effects as well. > Yeah, you are right, I'll try to check if there is any option and check the performance in that case.

Dirk Schulze

Comment 10 2010-10-25 05:46:04 PDT

(In reply to comment #9) > Thanks for the review krit. > > (In reply to comment #8) > > (From update of attachment 65698 [details] [details]) > > View in context: https://bugs.webkit.org/attachment.cgi?id=65698&action=review > > > > Have you tried to use the original code, but take the WTF::ByteArray instead of the CanvasPixelArray? I'm interested in how much perf win this could bring. Surely not as much as your code, but still. Also it is not needed here to mark the surface as dirty. This is only necessary, if you want to access the context after pixel manipulations. That isn't the case here or on any other filter effect result. > > You also ignored, that some platforms may use unpremultiplied colors. Would be great if we can find a way to manage pixel access with premultiplied and unpremultiplied colors, that would be a win for the other filter effects as well. > > > > Yeah, you are right, I'll try to check if there is any option and check the performance in that case. I already checked and tested ByteArray* an unsgined char* and couldn't see a difference in performance. I took the attached SVG on bug 48174 as reference. On Mac put/ and getImageData just take 0.8% of the time, GaussianBlur algo itself took 96% of the time. On my machine I could reduce the rendering time from 31s to 21s just by eliminating the -> operators and accessing the pixel data via ByteArray instead of CanvasPixelArray. 21s is still to much, but I don't see huge improvement by implementing direct pixel data access on ImageBuffer right now. Even so I'd support it! Do you plan to investigate on this more?

Dirk Schulze

Comment 11 2010-10-25 05:48:41 PDT

(In reply to comment #10) > Do you plan to investigate on this more? Oh, I forgot to mention that the patch that makes use of ByteArray on FEGaussianBlur is already in trunk.

Alejandro G. Castro

Comment 12 2010-10-25 06:03:15 PDT

(In reply to comment #10) > > [...] > > I already checked and tested ByteArray* an unsgined char* and couldn't see a difference in performance. I took the attached SVG on bug 48174 as reference. On Mac put/ and getImageData just take 0.8% of the time, GaussianBlur algo itself took 96% of the time. On my machine I could reduce the rendering time from 31s to 21s just by eliminating the -> operators and accessing the pixel data via ByteArray instead of CanvasPixelArray. 21s is still to much, but I don't see huge improvement by implementing direct pixel data access on ImageBuffer right now. Even so I'd support it! > > Do you plan to investigate on this more? > Oh, nice, good to know. If you think it makes sense I can spend time doing some profiling and check how we could do it (integrate contextshadow I guess it is an option). Althougth I'm afraid that this operation is not going to be good enough if we can not make it in specific hardware or reduce the amount of blurring area.

Simon Fraser (smfr)

Comment 13 2010-12-11 19:30:53 PST