RESOLVED FIXED30683
svn-apply's fixChangeLogPatch function seems broken
https://bugs.webkit.org/show_bug.cgi?id=30683
Summary svn-apply's fixChangeLogPatch function seems broken
Eric Seidel (no email)
Reported 2009-10-22 11:39:45 PDT
svn-apply's fixChangeLogPatch function seems broken I've not been able to get it to work yet in my testing. I'm going to see if I can fix it...
Attachments
Patch v1 (10.14 KB, patch)
2009-10-22 16:31 PDT, Eric Seidel (no email)
no flags
Share this code in more places (15.92 KB, patch)
2009-10-23 13:23 PDT, Eric Seidel (no email)
no flags
Updated ChangeLog (16.12 KB, patch)
2009-10-23 13:39 PDT, Eric Seidel (no email)
no flags
Fix as suggested by Dave. Still includes method moves so as not to depend on bug 30791. (19.99 KB, patch)
2009-10-26 15:25 PDT, Eric Seidel (no email)
no flags
Update after bug 30791 (10.26 KB, patch)
2009-10-27 09:01 PDT, Eric Seidel (no email)
no flags
Eric Seidel (no email)
Comment 1 2009-10-22 11:59:23 PDT
Actually, looks like it just doesn't work if more than the dateline are the same (like the reviewer line, as is the case when you have multiple unreviewed patches in ones git checkout).
Eric Seidel (no email)
Comment 2 2009-10-22 16:31:12 PDT
Created attachment 41695 [details] Patch v1
Eric Seidel (no email)
Comment 3 2009-10-22 16:32:17 PDT
I'm not quite sure how I go about testing this. I would love to test it, but we have no unit testing for our perl scripts. :( It's possible I could use the python infrastructure for testing some of this.
Eric Seidel (no email)
Comment 4 2009-10-23 13:23:02 PDT
Created attachment 41745 [details] Share this code in more places
Eric Seidel (no email)
Comment 5 2009-10-23 13:23:30 PDT
I can't believe we had FOUR copies of this function. No wonder our perl scripts are a sewer.
Eric Seidel (no email)
Comment 6 2009-10-23 13:24:36 PDT
Eh, my comment above was unfair. I can understand the desire to have these scripts be stand-alone, but I don't think that it's worth the cost of having copy/paste code like this. At least not functions as huge and complicated as this one.
Eric Seidel (no email)
Comment 7 2009-10-23 13:39:08 PDT
Created attachment 41746 [details] Updated ChangeLog
Eric Seidel (no email)
Comment 8 2009-10-23 14:25:49 PDT
*** Bug 26730 has been marked as a duplicate of this bug. ***
Eric Seidel (no email)
Comment 9 2009-10-23 14:29:17 PDT
*** Bug 26865 has been marked as a duplicate of this bug. ***
David Kilzer (:ddkilzer)
Comment 10 2009-10-23 15:45:25 PDT
(In reply to comment #6) > Eh, my comment above was unfair. I can understand the desire to have these > scripts be stand-alone, but I don't think that it's worth the cost of having > copy/paste code like this. At least not functions as huge and complicated as > this one. At the time I wrote this code, I was told that it was more important for the scripts to be standalone than to share code.
Eric Seidel (no email)
Comment 11 2009-10-23 15:48:29 PDT
(In reply to comment #10) > (In reply to comment #6) > > Eh, my comment above was unfair. I can understand the desire to have these > > scripts be stand-alone, but I don't think that it's worth the cost of having > > copy/paste code like this. At least not functions as huge and complicated as > > this one. > > At the time I wrote this code, I was told that it was more important for the > scripts to be standalone than to share code. Hum... It would that now that other code is being shared using VCSUtils.pm we should re-visit that decision. If we decide to start sharing more code there is definitely a bunch more we can share yet. I am very interested in your input on such a decision. Obviously my bias above has already been stated. :)
David Kilzer (:ddkilzer)
Comment 12 2009-10-23 16:56:21 PDT
(In reply to comment #11) > > Hum... It would that now that other code is being shared using VCSUtils.pm we > should re-visit that decision. If we decide to start sharing more code there > is definitely a bunch more we can share yet. I am very interested in your > input on such a decision. Obviously my bias above has already been stated. :) Since we now have a way to share code using VCSUtils.pm, I'm all for doing that. I originally wanted to share the code as well.
David Kilzer (:ddkilzer)
Comment 13 2009-10-23 17:31:34 PDT
Comment on attachment 41746 [details] Updated ChangeLog I'm on vacation through Oct. 28, so I can't really give a detailed review until then. However, I see issues with this patch that require an r-: - ChangeLog doesn't mention all the scripts affected. - The new method in VCSUtils.pm has a debugging print() statement in it. - The method Name added to the list of exported methods should be alphabetized. - The method in VCSUtils.pm completely breaks the contract of the original method--that the entire patch should be returned unaltered if the method can't fix it. - The method in VCSUtils.pm makes too many assumptions about the format of the patch, e.g., that the patch's first line of code is the fourth line of the text passed in. While rare, there are some patches that don't include an "Index" or "diff" preamble line. I think the best approach to fix this is to move the unaltered method into VCSUtils.pm first, then write a simple test harness in Perl that uses the VCSUtils module to read a patch from stdin and write the altered result to stdout. Then this can be used to write test cases for different types of ChangeLog patches, then the method can be fixed for the case that this bug was filed for.
Eric Seidel (no email)
Comment 14 2009-10-26 15:00:48 PDT
(In reply to comment #13) Thanks for the review! > - The new method in VCSUtils.pm has a debugging print() statement in it. Oops. Fixed. > - The method Name added to the list of exported methods should be alphabetized. Fixed. > - The method in VCSUtils.pm completely breaks the contract of the original > method--that the entire patch should be returned unaltered if the method can't > fix it. My bad. Fixed. > - The method in VCSUtils.pm makes too many assumptions about the format of the > patch, e.g., that the patch's first line of code is the fourth line of the text > passed in. While rare, there are some patches that don't include an "Index" or > "diff" preamble line. OK. Fixed. > I think the best approach to fix this is to move the unaltered method into > VCSUtils.pm first Ok. bug 30791. > then write a simple test harness in Perl that uses the > VCSUtils module to read a patch from stdin and write the altered result to > stdout. Then this can be used to write test cases for different types of > ChangeLog patches, then the method can be fixed for the case that this bug was > filed for. Eh. Given our complete lack of unit testing in our perl scripts, perl's seeming lack of built-in unit testing, I have little desire to invent a system. So I've added some tests to scm_unittest.py instead. Those drive svn-apply as part of scm.py and should provide adequate coverage for this change. Patch coming up!
Eric Seidel (no email)
Comment 15 2009-10-26 15:25:01 PDT
Created attachment 41906 [details] Fix as suggested by Dave. Still includes method moves so as not to depend on bug 30791.
David Kilzer (:ddkilzer)
Comment 16 2009-10-27 00:01:33 PDT
(In reply to comment #14) > Eh. Given our complete lack of unit testing in our perl scripts, perl's > seeming lack of built-in unit testing, I have little desire to invent a system. > > So I've added some tests to scm_unittest.py instead. Those drive svn-apply as > part of scm.py and should provide adequate coverage for this change. Perl does have built-in testing. See manpages for Test::Harness or Test::Simple--one of them comes with Perl.
Eric Seidel (no email)
Comment 17 2009-10-27 09:01:31 PDT
Created attachment 41956 [details] Update after bug 30791
Eric Seidel (no email)
Comment 18 2009-10-27 11:58:55 PDT
(In reply to comment #16) > (In reply to comment #14) > > Eh. Given our complete lack of unit testing in our perl scripts, perl's > > seeming lack of built-in unit testing, I have little desire to invent a system. > > > > So I've added some tests to scm_unittest.py instead. Those drive svn-apply as > > part of scm.py and should provide adequate coverage for this change. > > Perl does have built-in testing. See manpages for Test::Harness or > Test::Simple--one of them comes with Perl. Interesting. It would appear that both are included on Mac OS X Leopard. I do not know about other distros, but that's good to know! I think the scm_unittest.py testing should be sufficient for this fix. It covers the desired change, and has the added benefit of being a full-system test of svn-apply (which we did not have before). But it's good to know that Test::* seem to be installed on mac os x for the future.
Eric Seidel (no email)
Comment 19 2009-11-03 12:03:42 PST
Ping?
David Kilzer (:ddkilzer)
Comment 20 2009-11-03 12:57:53 PST
(In reply to comment #19) > Ping? I'll try to look at this today.
David Kilzer (:ddkilzer)
Comment 21 2009-11-04 10:35:43 PST
Comment on attachment 41956 [details] Update after bug 30791 > diff --git a/WebKitTools/Scripts/VCSUtils.pm b/WebKitTools/Scripts/VCSUtils.pm > index e1e0bc2..ebf1190 100644 > --- a/WebKitTools/Scripts/VCSUtils.pm > +++ b/WebKitTools/Scripts/VCSUtils.pm > @@ -347,56 +347,64 @@ sub gitdiff2svndiff($) > return $_; > } > > +# The diff(1) command is greedy when matching lines, so a new ChangeLog entry will > +# have lines of context at the top of a patch when the existing entry has the same > +# date and author as the new entry. Alter the ChangeLog patch so > +# that the added lines ("+") in the patch always start at the beginning of the > +# patch and there are no initial lines of context. > sub fixChangeLogPatch($) > { > + my $patch = shift; # $patch will only contain patch fragments for ChangeLog. > + > + $patch =~ /(\r?\n)/; > + my $lineEnding = $1; > + my @patchLines = split(/$lineEnding/, $patch); > + > + # e.g. 2009-06-03 Eric Seidel <eric@webkit.org> > + my $dateLineRegexpString = '^\+(\d{4}-\d{2}-\d{2})' # Consume the leading '+' and the date. > + . '\s+(.+)\s+' # Consume the name. > + . '<([^<>]+)>$'; # And finally the email address. > + > + # Figure out where the patch contents start and stop. > + my $patchHeaderIndex; > + my $firstContentIndex; > + my $trailingContextIndex; > + my $dateIndex; > + my $patchEndIndex = scalar(@patchLines); > + for (my $index = 0; $index < @patchLines; ++$index) { > + my $line = $patchLines[$index]; > + if ($line =~ /^\@\@ -\d+,\d+ \+\d+,\d+ \@\@$/) { # e.g. @@ -1,5 +1,18 @@ > + if ($patchHeaderIndex) { > + $patchEndIndex = $index; # We only bother to fix up the first patch fragment. > + last; > } > + $patchHeaderIndex = $index; > } > + $firstContentIndex = $index if ($patchHeaderIndex && !$firstContentIndex && $line =~ /^\+[^+]/); # Only match after finding patchHeaderIndex, otherwise we'd match "+++". > + $dateIndex = $index if ($line =~ /$dateLineRegexpString/); > + $trailingContextIndex = $index if ($firstContentIndex && !$trailingContextIndex && $line =~ /^ /); > } > + my $contentLineCount = $trailingContextIndex - $firstContentIndex; > + my $trailingContextLineCount = $patchEndIndex - $trailingContextIndex; > + > + # If we didn't find a date line in the content then this is not a patch we should try and fix. > + return $patch if (!$dateIndex); > + > + # We only need to do anything if the date line is not the first content line. > + return $patch if ($dateIndex == $firstContentIndex); > + > + # Write the new patch. > + my $totalNewContentLines = $contentLineCount + $trailingContextLineCount; > + $patchLines[$patchHeaderIndex] = "@@ -1,$trailingContextLineCount +1,$totalNewContentLines @@"; # Write a new header. > + my @repeatedLines = splice(@patchLines, $dateIndex, $trailingContextIndex - $dateIndex); # The date line and all the content after it that diff saw as repeated. > + splice(@patchLines, $firstContentIndex, 0, @repeatedLines); # Move the repeated content to the top. > + foreach my $line (@repeatedLines) { > + $line =~ s/^\+/ /; > + } > + splice(@patchLines, $trailingContextIndex, $patchEndIndex, @repeatedLines); # Replace trailing context with the repeated content. > + splice(@patchLines, $patchHeaderIndex + 1, $firstContentIndex - $patchHeaderIndex - 1); # Remove any leading context. > > + return join($lineEnding, @patchLines) . "\n"; # patch(1) expects an extra trailing newline. > } After thinking about this some more, there's an even easier way to mangle the patch. Basically, assuming the patch is 0-or-more unchanged lines, 1-or-more added lines (+), and then 1-or-more unchanged lines, all you really have to do is separate out the added lines from the unchanged lines, reorder the added lines starting with the date line and wrapping around to the top, then output the reordered added lines plus the first three lines of unchanged lines. (This works because the added lines always contain the complete patch, but because of the vagaries of diff, they are sometimes out of order.) r=me to unblock other issues. We can revisit this later if needed.
Eric Seidel (no email)
Comment 22 2009-11-04 22:18:36 PST
(In reply to comment #21) > After thinking about this some more, there's an even easier way to mangle the > patch. Basically, assuming the patch is 0-or-more unchanged lines, 1-or-more > added lines (+), and then 1-or-more unchanged lines, all you really have to do > is separate out the added lines from the unchanged lines, reorder the added > lines starting with the date line and wrapping around to the top, then output > the reordered added lines plus the first three lines of unchanged lines. (This > works because the added lines always contain the complete patch, but because of > the vagaries of diff, they are sometimes out of order.) > > r=me to unblock other issues. We can revisit this later if needed. I'm not sure I understand. What you describe sounds a lot like what I tried to do in the code. There must be some subtly I'm missing in your explanation.
WebKit Commit Bot
Comment 23 2009-11-04 23:07:08 PST
Comment on attachment 41956 [details] Update after bug 30791 Clearing flags on attachment: 41956 Committed r50547: <http://trac.webkit.org/changeset/50547>
WebKit Commit Bot
Comment 24 2009-11-04 23:07:15 PST
All reviewed patches have been landed. Closing bug.
David Kilzer (:ddkilzer)
Comment 25 2009-11-05 05:20:09 PST
(In reply to comment #22) > (In reply to comment #21) > > After thinking about this some more, there's an even easier way to mangle the > > patch. Basically, assuming the patch is 0-or-more unchanged lines, 1-or-more > > added lines (+), and then 1-or-more unchanged lines, all you really have to do > > is separate out the added lines from the unchanged lines, reorder the added > > lines starting with the date line and wrapping around to the top, then output > > the reordered added lines plus the first three lines of unchanged lines. (This > > works because the added lines always contain the complete patch, but because of > > the vagaries of diff, they are sometimes out of order.) > > I'm not sure I understand. What you describe sounds a lot like what I tried to > do in the code. There must be some subtly I'm missing in your explanation. Yes, but it seemed like the algorithm was more complicated than it needed to be in that it still had too many state variables. Still, it's an improvement over my previous code, which was even more complex. I don't feel any particular need to rewrite it again, though. :)
Eric Seidel (no email)
Comment 26 2009-11-10 11:34:19 PST
*** Bug 30687 has been marked as a duplicate of this bug. ***
Chris Jerdonek
Comment 27 2009-12-27 10:44:05 PST
FYI, I have a proposed revision to fixChangeLogPatch: https://bugs.webkit.org/show_bug.cgi?id=32919 I've added unit tests as suggested here.
Note You need to log in before you can comment on or make changes to this bug.