58293 – NRWT: doesn't support webarchives

RESOLVED FIXED 58293

NRWT: doesn't support webarchives

https://bugs.webkit.org/show_bug.cgi?id=58293

Summary NRWT: doesn't support webarchives

Chris Rogers

Reported 2011-04-11 18:00:27 PDT

Follow-up fix to DRT for audio (write Content-Length in header)

Attachments
Patch (2.39 KB, patch) 2011-04-11 18:03 PDT, Chris Rogers	no flags	Details Formatted Diff Diff
Patch (4.32 KB, patch) 2011-06-14 20:17 PDT, Dirk Pranke	eric: review+	Details Formatted Diff Diff
Show Obsolete (1) View All Add attachment proposed patch, testcase, etc.

Chris Rogers

Comment 1 2011-04-11 18:03:53 PDT

Created attachment 89133 [details] Patch

Dirk Pranke

Comment 2 2011-04-11 18:24:46 PDT

patch looks good to me.

Alexey Proskuryakov

Comment 3 2011-04-11 22:09:17 PDT

What is special about audio files that they need Content-Length? We don't write it for any other types.

Dirk Pranke

Comment 4 2011-04-11 23:00:21 PDT

(In reply to comment #3) > What is special about audio files that they need Content-Length? We don't write it for any other types. We write the Content-Length for image files. While it's true that run-webkit-tests could probably scan over the bytes and look for the #EOF sentinel, this makes the parsing on the client side trivial, and it doesn't seem to hurt much.

Alexey Proskuryakov

Comment 5 2011-04-11 23:36:34 PDT

The same dump() function that is modified here saves application/pdf and application/x-webarchive without Content-Length - why is audio different?

Dirk Pranke

Comment 6 2011-04-12 11:33:59 PDT

(In reply to comment #5) > The same dump() function that is modified here saves application/pdf and application/x-webarchive without Content-Length - why is audio different? It's a good question. the NRWT code processes text/plain a line at a time. I'm not actually sure if or how it handles PDF and webarchives, so I'll double check.

Dirk Pranke

Comment 7 2011-04-12 19:10:24 PDT

Okay, the NRWT code was skipping all of the webarchive files -> not so good. I am implementing support for them now. It appears that all of the .webarchive files manage to end with a newline, causing the subsequent "#EOF" sentinel from DRT to be on a line by itself, so NRWT continues to work. .WAV files do not (currently) make that guarantee, and so NRWT's existing output-parsing will break. I need to continue working on the patch and doing more testing (to confirm w/ PDFs and files of other content types, for example). I also need to double check the logic against ORWT to see if ORWT is doing something more sophisticated to find where the sentinels are. Thanks for asking the question that revealed this deficiency.

Dirk Pranke

Comment 8 2011-04-12 19:11:18 PDT

cancelling the review for now.

Dirk Pranke

Comment 9 2011-05-31 16:31:32 PDT

Okay, it turns out that I can make NRWT handle base64-encoded wav output without needing a content-length header, and I think it doesn't break anything else (see bug 61819). I *suspect* that all of the pdf and webarchive files end up having a newline at the end so they didn't end up tripping up NRWT, although NRWT doesn't currently support webarchives and I haven't investigated PDFs yet. I am going to change the subject and use this bug to track adding support for the other output types to NRWT so I can continue to investigate this, but this'll take this bug off the blocking path for webaudio.

Dirk Pranke

Comment 10 2011-05-31 18:21:03 PDT

In response to Alexey's question in comment #5 (why is webaudio different from webarchive or pdf) it looks like all of the webarchive files's expected output ends in a newline, and we don't actually generate any pdf output at this point. (although there appear to be tests that once did, they all generate rendertrees + images now). ORWT's code says that it expects "#EOF" to show up on a separate line, so in the absence of any other patches, we'd have to modify the base64-encoded output to also include a trailing newline. That seems like kind of a goofy change, so instead I have modified ORWT to match against any line that ends in a #EOF$ and if there is content preceding the #EOF, stick it into the returned result (see bug 57992 for the patch in question). I will still leave this open until NRWT supports webarchives correctly.

Dirk Pranke

Comment 11 2011-06-14 20:17:30 PDT

Created attachment 97225 [details] Patch

Eric Seidel (no email)

Comment 12 2011-06-14 21:08:39 PDT

Comment on attachment 97225 [details] Patch That's it? Seems simple. If this works, great.

Eric Seidel (no email)

Comment 13 2011-06-14 21:09:36 PDT

The PDF tests were added by me and immediately disabled. They were for printing. TUrns out that PDFs encode system information (including username!) so it's not possible to generate them in a system agnostic way... or at least didn't seem that way.

Dirk Pranke

Comment 14 2011-06-14 22:10:38 PDT

(In reply to comment #12) > (From update of attachment 97225 [details]) > That's it? Seems simple. If this works, great. Yeah, as far as I can tell, .webarchives are just XML, so all we had to do way make the code aware of the other extension. I've edited the subject to remove reference to PDFs.

Dirk Pranke

Comment 15 2011-06-15 17:49:58 PDT

Committed r88986: <http://trac.webkit.org/changeset/88986>

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Other

OS OS X 10.5

Product WebKit

Component New Bugs

Assignee

Dirk Pranke

Reported

2011-04-11 18:00 PDT

Modified

2011-06-15 17:49 PDT History

CC List

6 users Show

URL

Keywords

Depends on

Blocks

34984 58625

Dependencies

tree graph