-
Notifications
You must be signed in to change notification settings - Fork 15
Fix how line numbers are counted in Patch#removed_lines #183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
… non removed line
spec/git_diff_parser/patch_spec.rb
Outdated
@@ -27,7 +27,7 @@ module GitDiffParser | |||
patch = Patch.new(patch_body) | |||
|
|||
expect(patch.removed_lines.size).to eq(7) | |||
expect(patch.removed_lines.map(&:number)).to eq [11, 36, 37, 38, 39, 40, 48] | |||
expect(patch.removed_lines.map(&:number)).to eq [14, 39, 39, 39, 39, 39, 54] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is patch
reporting back the same lines multiple times?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at patch.diff
, the original line numbers looks like what I would expect. It have 4 chunks - 3 three which removes lines. The third chunk removes five lines, which refer to the sequence [36, 37, 38, 39, 40]
. To return [39, 39, 39, 39, 39]
doesn't appear to make sense to me.
Also, you replaced line 11
with line 14
, but according to the patch line 14 is an addition in the second chunk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple lines in a row deleted. If you were to look at this in a graphical diff side by side you would see something similar were 5 lines point down to one line in the new version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So really what is happening here is a block of 5 lines was deleted from line 39 if that makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would change the reported line numbers from reporting the line number of the source to reporting the line numbers of the patched result. Not what I'd expect from such an API. Given a list of the original line numbers removed it's trivial to compute the target line numbers.
Given the original source, and this diff, I would expect to be able to iterate the collection of removed line numbers and use that as indicies to access each removed line.
This new sequence only makes sense if you apply the patch line by line while iterating the array of removed line numbers.
Take for instance how GitHub display diffs;
If you create such a view you want the line numbers for the original source. You iterate those, mark the lines with a removed
class. Then for the other view you apply the patch, and then iterate the line numbers of the added lines and mark them added
.
This would become very difficult to handle if you collapse the numbers like you do in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So first of all, currently this simply reports line numbers incorrectly. It is reporting the start of the patch + number of lines removed with no regard to any changed or non changed lines. The line numbers don't match up at all.
In your pictured example above it would tell you that lines 1, 2, 3, and 4 were removed.
Secondly, the way it is setup is to track line numbers in the new file not the old file. Take for instance the below screenshot. From the fixture, I'm reporting that a line was removed which would otherwise reside at line 54. Previously it reported that it deleted a line at line 48 which is simply incorrect, and I'm not clear on what you want it to report. Do you want it to report that line 58 from the old file no longer exists? Then how do interpret it in terms of the new file? Or are you saying that if it was two lines deleted you want it to say two lines where deleted which would otherwise exist at lines 54 and 55 if they had never been deleted?
One what that might make it clearer might be to make this a contain blocks of deleted lines, start line from where in current file it would otherwise exist, and an array of lines, but I'd argue that is unneeded, and would break the existing api, I would consider it less than ideal.
FWIW my use case if checking the current line of a file and seeing what line it was in the previous commit (current line number + changed line numbers before this - deleted lines before it). This would not work if there was say a 20 line block deleted before my current line and each was reported as being at a line which doesn't exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or are you saying that if it was two lines deleted you want it to say two lines where deleted which would otherwise exist at lines 54 and 55 if they had never been deleted?
Yes - I would expect the array of removed lines to refer to the original data. Old + Patch = New
A patch is something you apply to Old
- in order to resolve what New
is.
One what that might make it clearer might be to make this a contain blocks of deleted lines, start line from where in current file it would otherwise exist
Yea, Ranges would have been an option to describe sequences of changes. But as you say:
and would break the existing api,
It's also a breaking change to change how a sequence of change is reported. Currently it's an incrementing set of indicies. But this PR would change that to define ranges as a set of idential numbers in sequence. That will break logic in existing users of this gem. Which is why I don't think it's appropriate.
Fixing incorrect numbering is done thing, but changing how a range of changes is reported is an unrelated change which I don't think should be conflated into the same PR.
If the PR only accounted fixing the line numbering I think the test should expect:
[14, 38, 39, 40, 41, 42, 58]
I'm taking the starting line number of each chunk and incrementing it until I hit upon removed lines - then reporting those removed lines.
FWIW my use case if checking the current line of a file and seeing what line it was in the previous commit (current line number + changed line numbers before this - deleted lines before it). This would not work if there was say a 20 line block deleted before my current line and each was reported as being at a line which doesn't exist.
That explains the rationale for the range reporting changes in this PR - but it's still a fundamental breaking API change.
I can see the usefulness of having some mapping of line numbers, going both way in a patch. But I think that should be added as new functionality that doesn't break existing usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current change is not breaking because the line numbers don't work at all right now.
I don't think the current edition works quite how you think it does, the patch numbers indicate a line number within the diff for that given file (although they all seem to have an off by one error since line numbers start at 1 and the index of a loop starts at 0). A patch is per file not per change, so counting line numbers for the whole diff can be not very useful.
The line numbers take the start of a given block of changes, takes the starting line for a block of changes from the RANGE_INFORMATION_LINE for the new file, and then adds to it for each removed line, but not any other lines so, to illustrate how this doesn't work:
For the current gem with this hypothetical block in a diff (lets assume there are no removed lines previous to this):
15 @@ -15,7 +22,7 @@ def stuff
16
17 method_call
18 + add_a_line
19 - delete_a_line
20 - and_another
21 + add_in_place
22 - one_last_remove
The results for removed lines patch numbers would be [18, 19, 21] and the line numbers would be [22, 23, 24]. You can see how this behaivior currently works if you look at the fixture file and compare it to the previous test.
I was attempting to be consistent it doing what it seemed the original was attempting to do, which is report the line numbers the deleted lines would exist at in the new file, since it is currently using the new file line numbers, just counts the line numbers incorrectly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated this so it will work the way you were expecting. I see where you are coming from where old + patch = new.
I do believe this is more inconsistent with what was originally done, but it is quite possibly more correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I have been using this gem for a project where I have some manual parsing done to resolve the line numbers correct. This recent version would allow me to rely on the gem instead of my own variation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing another take on this.
Now I guess we have to see of the repo owner agrees. :)
I have a branch ready which adds upon this to add the data I originally wanted (a memoized function which gives us the lines which those deleted lines would reside out in new file), plus solving my need (a function which given a line from the current file, tells you which line it resided at in the old file, if it existed). This work is dependent on this branch, I can either include it in this PR or wait for this one to go through and then open up the branch or fold it into this PR. I think they are seperate issues, but would like to see both go in, so waiting till this gets in for now. |
Thanks, I'm in a vacation so after that I'll check this.
…On Sat, Mar 3, 2018, 03:57 Jordan Oroshiba ***@***.***> wrote:
@sanemat <https://github.com/sanemat>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#183 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEmuJeqy4JlsalW_SBh0eJi_D_jqcvwks5taaQXgaJpZM4SC2N9>
.
|
@@ -93,6 +93,8 @@ def removed_lines | |||
) | |||
lines << line | |||
line_number += 1 | |||
when NOT_REMOVED_LINE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could be wrong, but I think we should only be incrementing line_number
on unchanged lines, i.e. when /^\s/
.
Here is a fix and a failing test: cjoudrey@a37ad98#diff-5167fa12a2c6d40aaa8884e40efa7ac4
Explanation:
Original file:
type Query {
a: String
b: String
c: String
}
Patch:
@@ -1,5 +1,5 @@
type Query {
+ d: String
a: String
b: String
- c: String
}
The number of the line being removed here is number 4
and patch position 5
.
With this current pull request, the number is 5
and patch position is 5
.
@@ -1,7 +1,7 @@ | |||
module GitDiffParser | |||
# Parsed patch | |||
class Patch | |||
RANGE_INFORMATION_LINE = /^@@ .+\+(?<line_number>\d+),/ | |||
RANGE_INFORMATION_LINE = /^@@ -(?<old_line_number>\d+),.+\+(?<line_number>\d+),/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can, tell ,\d+
should be optional.
For example:
@@ -1 +1 @@
-type Query { field: String! }
+type Query { field2: String! }
I believe we can change this line to:
RANGE_INFORMATION_LINE = /^@@ -(?<old_line_number>\d+)(,\d+)? \+(?<line_number>\d+)(,\d+)?/
I wrote a failing test and fix here: cjoudrey@858a945
What's the status of this PR? Anything missing from the PR author? Or are we waiting for review? |
@sandstrom in all honesty, this is 4 years old and I should have done something 4 years ago. Other priorities, I had forked and moved on. I think everyone can agree the behavior is better but I don't have the context really anymore on @cjoudrey to say whether or not those changes should be made, and I have no ability to merge. As far as I know the project I was working on which was pointing to my fork is still doing so, but I've moved on since then. |
@joroshiba Alright, just curious since we ran into this a few weeks ago 😄 Thanks for the update! |
Non removed lines should be counted towards the line number exclusively, test was simply incorrect previously. Manually checked against the diff to confirm this.