Skip to content

Fix how line numbers are counted in Patch#removed_lines #183

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions lib/git_diff_parser/patch.rb
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,7 @@ def removed_lines
patch_position: patch_position
)
lines << line
when NOT_REMOVED_LINE

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could be wrong, but I think we should only be incrementing line_number on unchanged lines, i.e. when /^\s/.

Here is a fix and a failing test: cjoudrey@a37ad98#diff-5167fa12a2c6d40aaa8884e40efa7ac4

Explanation:

Original file:

type Query {
  a: String
  b: String
  c: String
}

Patch:

@@ -1,5 +1,5 @@
 type Query {
+  d: String
   a: String
   b: String
-  c: String
 }

The number of the line being removed here is number 4 and patch position 5.

With this current pull request, the number is 5 and patch position is 5.

line_number += 1
end

Expand Down
2 changes: 1 addition & 1 deletion spec/git_diff_parser/patch_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ module GitDiffParser
patch = Patch.new(patch_body)

expect(patch.removed_lines.size).to eq(7)
expect(patch.removed_lines.map(&:number)).to eq [11, 36, 37, 38, 39, 40, 48]
expect(patch.removed_lines.map(&:number)).to eq [14, 39, 39, 39, 39, 39, 54]
Copy link

@thomthom thomthom Feb 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is patch reporting back the same lines multiple times?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at patch.diff, the original line numbers looks like what I would expect. It have 4 chunks - 3 three which removes lines. The third chunk removes five lines, which refer to the sequence [36, 37, 38, 39, 40]. To return [39, 39, 39, 39, 39] doesn't appear to make sense to me.

Also, you replaced line 11 with line 14, but according to the patch line 14 is an addition in the second chunk.

Copy link
Author

@joroshiba joroshiba Feb 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple lines in a row deleted. If you were to look at this in a graphical diff side by side you would see something similar were 5 lines point down to one line in the new version.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So really what is happening here is a block of 5 lines was deleted from line 39 if that makes sense.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would change the reported line numbers from reporting the line number of the source to reporting the line numbers of the patched result. Not what I'd expect from such an API. Given a list of the original line numbers removed it's trivial to compute the target line numbers.

Given the original source, and this diff, I would expect to be able to iterate the collection of removed line numbers and use that as indicies to access each removed line.

This new sequence only makes sense if you apply the patch line by line while iterating the array of removed line numbers.

Take for instance how GitHub display diffs;
image

If you create such a view you want the line numbers for the original source. You iterate those, mark the lines with a removed class. Then for the other view you apply the patch, and then iterate the line numbers of the added lines and mark them added.

This would become very difficult to handle if you collapse the numbers like you do in this PR.

Copy link
Author

@joroshiba joroshiba Feb 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So first of all, currently this simply reports line numbers incorrectly. It is reporting the start of the patch + number of lines removed with no regard to any changed or non changed lines. The line numbers don't match up at all.

In your pictured example above it would tell you that lines 1, 2, 3, and 4 were removed.

Secondly, the way it is setup is to track line numbers in the new file not the old file. Take for instance the below screenshot. From the fixture, I'm reporting that a line was removed which would otherwise reside at line 54. Previously it reported that it deleted a line at line 48 which is simply incorrect, and I'm not clear on what you want it to report. Do you want it to report that line 58 from the old file no longer exists? Then how do interpret it in terms of the new file? Or are you saying that if it was two lines deleted you want it to say two lines where deleted which would otherwise exist at lines 54 and 55 if they had never been deleted?
screen shot 2018-02-12 at 4 38 56 pm

One what that might make it clearer might be to make this a contain blocks of deleted lines, start line from where in current file it would otherwise exist, and an array of lines, but I'd argue that is unneeded, and would break the existing api, I would consider it less than ideal.

FWIW my use case if checking the current line of a file and seeing what line it was in the previous commit (current line number + changed line numbers before this - deleted lines before it). This would not work if there was say a 20 line block deleted before my current line and each was reported as being at a line which doesn't exist.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or are you saying that if it was two lines deleted you want it to say two lines where deleted which would otherwise exist at lines 54 and 55 if they had never been deleted?

Yes - I would expect the array of removed lines to refer to the original data. Old + Patch = New
A patch is something you apply to Old - in order to resolve what New is.

One what that might make it clearer might be to make this a contain blocks of deleted lines, start line from where in current file it would otherwise exist

Yea, Ranges would have been an option to describe sequences of changes. But as you say:

and would break the existing api,

It's also a breaking change to change how a sequence of change is reported. Currently it's an incrementing set of indicies. But this PR would change that to define ranges as a set of idential numbers in sequence. That will break logic in existing users of this gem. Which is why I don't think it's appropriate.

Fixing incorrect numbering is done thing, but changing how a range of changes is reported is an unrelated change which I don't think should be conflated into the same PR.

If the PR only accounted fixing the line numbering I think the test should expect:

[14, 38, 39, 40, 41, 42, 58]

I'm taking the starting line number of each chunk and incrementing it until I hit upon removed lines - then reporting those removed lines.

FWIW my use case if checking the current line of a file and seeing what line it was in the previous commit (current line number + changed line numbers before this - deleted lines before it). This would not work if there was say a 20 line block deleted before my current line and each was reported as being at a line which doesn't exist.

That explains the rationale for the range reporting changes in this PR - but it's still a fundamental breaking API change.

I can see the usefulness of having some mapping of line numbers, going both way in a patch. But I think that should be added as new functionality that doesn't break existing usage.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current change is not breaking because the line numbers don't work at all right now.

I don't think the current edition works quite how you think it does, the patch numbers indicate a line number within the diff for that given file (although they all seem to have an off by one error since line numbers start at 1 and the index of a loop starts at 0). A patch is per file not per change, so counting line numbers for the whole diff can be not very useful.

The line numbers take the start of a given block of changes, takes the starting line for a block of changes from the RANGE_INFORMATION_LINE for the new file, and then adds to it for each removed line, but not any other lines so, to illustrate how this doesn't work:

For the current gem with this hypothetical block in a diff (lets assume there are no removed lines previous to this):

15 @@ -15,7 +22,7 @@ def stuff
16
17     method_call
18 +   add_a_line
19 -   delete_a_line
20 -   and_another
21 +   add_in_place
22 -   one_last_remove   

The results for removed lines patch numbers would be [18, 19, 21] and the line numbers would be [22, 23, 24]. You can see how this behaivior currently works if you look at the fixture file and compare it to the previous test.

I was attempting to be consistent it doing what it seemed the original was attempting to do, which is report the line numbers the deleted lines would exist at in the new file, since it is currently using the new file line numbers, just counts the line numbers incorrectly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this so it will work the way you were expecting. I see where you are coming from where old + patch = new.

I do believe this is more inconsistent with what was originally done, but it is quite possibly more correct.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I have been using this gem for a project where I have some manual parsing done to resolve the line numbers correct. This recent version would allow me to rely on the gem instead of my own variation.

expect(patch.removed_lines.map(&:patch_position)).to eq [4, 21, 22, 23, 24, 25, 36]
end

Expand Down