Skip to content

Initial lua support #1962

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Mar 19, 2024
Merged

Initial lua support #1962

merged 19 commits into from
Mar 19, 2024

Conversation

fidgetingbits
Copy link
Collaborator

@fidgetingbits fidgetingbits commented Oct 20, 2023

What

Adds support for the lua programming language

Checklist

  • Recorded tests for the new language
  • Used "change" / "clear" instead of "take" for selection tests to make recorded tests easier to read
  • Added a few specific tests that use "chuck" instead of "change" to test removal behaviour when it's interesting, especially:
    • "chuck arg" with single argument in list
    • "chuck arg" with multiple arguments in list
    • "chuck item" with single argument in list
    • "chuck item" with multiple arguments in list
  • Added @textFragment captures. Usually you want to put these on comment and string nodes. This enables "take round" to work within comments and strings.
  • Added a test for "change round" inside a string, eg "hello (there)"
  • [-] Supported "type" both for type annotations (eg foo: string) and declarations (eg interface Foo {}) (and added tests for this behaviour 😊)
  • Supported "item" both for map pairs and list entries (with tests of course)

Scope Support

Supported Tested Term Capture Definition Comment
list @list List type equivalent -
inside list @list.interior Inside of a list -
map @map Dictionary type equivalent -
inside map @map.interior Inside of a dictionary -
key @collectionKey Dictionary key equivalent -
funk @namedFunction A named function declaration -
inside funk @namedFunction.interior The inside of a lambda declaration -
funk name @functionName Name of declared function -
lambda @anonymousFunction A lambda declaration -
inside lambda @anonymousFunction.interior The inside of a lambda declaration -
name @name Variable name -
value @value Right-hand-side value of an assignment -
value @value Value returned from a function -
value @value Value of a key-value pair -
state @statement Any single coded statement -
if state @ifStatement An if conditional block -
condition @condition Condition of an if block -
condition @condition Condition of a while loop -
condition @condition Condition of a do while style loop -
- - condition @condition Condition of a for loop -
condition @condition Condition of a ternary expression -
branch @branch The resulting code associated with a conditional expression -
comment @comment Code comment -
string @string Single line strings -
string @string Multi-line strings #1962 (comment)
- @textFragment Used to capture string-type nodes (strings and comments) -
call @functionCall A function call (not a function definition) -
callee @functionCallee Name of the function being called -
arg @argumentOrParameter Arguments to functions and calls -
_ _ class @class Class or structure declaration -
_ _ inside class @class.interior The inside of a class declaration -
_ _ class name @className Name of class or structure declaration -
_ _ type @type Type declarations -

@fidgetingbits fidgetingbits marked this pull request as draft October 20, 2023 02:06
@fidgetingbits
Copy link
Collaborator Author

fidgetingbits commented Oct 25, 2023

I have a query like this for matching names (and a similar one for values):

(assignment_statement
    (variable_list
        name: (_) @name
        ;; Handle multiple variable declarators in one statement, eg
        ;;!! local b, c = "Hello", "World"
        ;;!  ------^--^-------------------
        (#allow-multiple! @name)
    ) @_.leading.end.startOf
) @_.domain
(variable_declaration
    (assignment_statement
        (variable_list
            name: (_) @name
            ;; Handle multiple variable declarators in one statement, eg
            ;;!! local b, c = "Hello", "World"
            ;;!  ------^--^-------------------
            (#allow-multiple! @name)
        ) @_.leading.end.startOf
    )
) @_.domain

Originally I wanted to only use an additional line (variable_declaration) @name.domain instead of repeating the whole longer assignment_statement part, but it seems that because name isn't actually part of the query for variable_declaration part itself it won't associate with names from a separate match inside of a (variable_declaration) node. My hope was that
because assignment_statement sometimes (not always) is an inner node of variable_declaration, then the name captures from that might get associated with the domain of variable_declaration. Is this possible somehow without repeating the duplication I've done above?

EDIT: Specific case above doesn't actually matter now, because due to new chuck behavior they have to be different queries anyways. But still curious about the general problem of applying a domain to a query that isn't actually capturing anything itself.

@fidgetingbits
Copy link
Collaborator Author

fidgetingbits commented Oct 25, 2023

I was starting to work on the ;;!! style documentation for this and realized something interesting.

Atm for lua if I have local a = 1 and chuck name, it will result in local = 1.

This behavior is consistent with:

  • rust let a = 1 becomes let = 1
  • cpp string name = "foo"; becomes string = "foo";

However, javascript and typescript are different: const a = 1; becomes 1;

Should we make sure all of these languages are actually consistent? It seems like a generic enough coding construct that it seems like it should be consistent across cursorless.

I think it's probably reasonable to just follow the way it works in js/ts, so I'm tempted to do that for lua as well, but maybe this discussion already took place when whoever implemented rust and cpp did it.

@AndreasArvidsson
Copy link
Member

Typescript is our reference language and ideally the other languages would mirror typescripts behavior went appropriate.

@pokey
Copy link
Member

pokey commented Oct 25, 2023

Yeah those other languages just haven't been updated to next gen scopes. It's harder to do that fancy removal stuff in our legacy scope defs

@fidgetingbits
Copy link
Collaborator Author

Seems there isn't a way to target the 'let / const' part in ts or 'global' scope part in python. I couldnt find any issues talking about this from a quick scan either, just curious what the thoughts are on why that doesnt exist or if there are plans. Lua uses 'local' for variable scoping and it seems that type of syntax might be a useful target too? Like 'scope' or something.

@fidgetingbits fidgetingbits marked this pull request as ready for review October 27, 2023 02:02
@fidgetingbits
Copy link
Collaborator Author

I think this is probably good for review now and has most of what's needed. Possibly some missing iterations and domains and weird edge cases I didn't think of.

Two small things:

First, I didn't bother trying to play with the private.fieldAccess stuff yet as I figure you folks can first settle how it will work first. Just noting it in case you want me to poke around with it or give feedback on how it might work in lua.

Second, even though lua doesn't support a ternary, there is an idiomatic equivalent which is of the form local max = x > y and x or y. This functions exactly like a ternary in practice, but I wasn't really sure if I should try to special case something so that it supports branches in the same way that a real ternary in another language would work. I just decided to leave it out for now. But if you think it should be added right away, let me know.

@fidgetingbits
Copy link
Collaborator Author

I just realized I forgot to add multi-line string support, which won't work with the existing cursorless string parsing, similar to what I ran into for nix. lua uses [[ abc ]] for multi-lines.

I did actually add support for language-specific overrides in the nix PR #1911 . I could break that logic out into a separate PR so that this could benefit from it early, or can wait to implement lua multi-line string support after the nix PR lands.

@pokey
Copy link
Member

pokey commented Nov 8, 2023

I did actually add support for language-specific overrides in the nix PR #1911 . I could break that logic out into a separate PR so that this could benefit from it early, or can wait to implement lua multi-line string support after the nix PR lands.

I would be tempted to do that, because a) we prefer atomic PRs to the extent practical, and b) that PR has lots of interesting discussion points 😅

@pokey
Copy link
Member

pokey commented Nov 8, 2023

First, I didn't bother trying to play with the private.fieldAccess stuff yet as I figure you folks can first settle how it will work first. Just noting it in case you want me to poke around with it or give feedback on how it might work in lua.

Up to you; if you think it would be useful, go for it

Second, even though lua doesn't support a ternary, there is an idiomatic equivalent which is of the form local max = x > y and x or y. This functions exactly like a ternary in practice, but I wasn't really sure if I should try to special case something so that it supports branches in the same way that a real ternary in another language would work. I just decided to leave it out for now. But if you think it should be added right away, let me know.

Hmm. Agreed it feels a bit special. I could go either way

@pokey
Copy link
Member

pokey commented Nov 8, 2023

Seems there isn't a way to target the 'let / const' part in ts or 'global' scope part in python. I couldnt find any issues talking about this from a quick scan either, just curious what the thoughts are on why that doesnt exist or if there are plans. Lua uses 'local' for variable scoping and it seems that type of syntax might be a useful target too? Like 'scope' or something.

Maybe? You can always get it using "first token state" or just a hat

@pokey pokey mentioned this pull request Nov 8, 2023
@pokey
Copy link
Member

pokey commented Nov 8, 2023

I love your support tables; filed #2010 to discuss how to formalize them

@fidgetingbits
Copy link
Collaborator Author

I did actually add support for language-specific overrides in the nix PR #1911 . I could break that logic out into a separate PR so that this could benefit from it early, or can wait to implement lua multi-line string support after the nix PR lands.

I would be tempted to do that, because a) we prefer atomic PRs to the extent practical, and b) that PR has lots of interesting discussion points 😅

Created PR #2012, though may not be quite as easy as planned.

@pokey
Copy link
Member

pokey commented Nov 15, 2023

"branch"

image

Not sure bout this one ⬆️🤔

image

I can see the logic of this one ⬆️, though, and it's consistent with the above. Idk @josharian @AndreasArvidsson wdyt?

@AndreasArvidsson
Copy link
Member

"branch"

image

Not sure bout this one ⬆️🤔
image

I can see the logic of this one ⬆️, though, and it's consistent with the above. Idk @josharian @AndreasArvidsson wdyt?

I can definitely see the logic, but I do wonder if we want to include end

@pokey
Copy link
Member

pokey commented Nov 15, 2023

"condition"

image

This ⬆️ looks sound, tho I have mixed feelings about the domain of the top condition including the second two branches. I don't feel strongly, though; I guess there's a logic to it.

image

These look wrong to me

@pokey
Copy link
Member

pokey commented Nov 15, 2023

"map"

image

Looks more like a list to me 🤔

@pokey
Copy link
Member

pokey commented Nov 15, 2023

"name"

image

In typescript and Python, I believe "name" would be success, result. But I know we have some open discussions on how to handle destructuring / multiple return values cc/ @AndreasArvidsson @josharian

image

Looks like the second line might have the scope doubled up? I guess that's prob because two targets are getting created, as above. But I wonder if "chuck" will choke on that

@pokey
Copy link
Member

pokey commented Nov 15, 2023

"state"

image

We don't usually consider individual branches to be statements

@pokey
Copy link
Member

pokey commented Nov 15, 2023

"value"

image

Some more scope doubling here; see how there are two overlapping domains with the same range

image

More scope doubling

image

We would usually exclude the {} from the value iteration scope, as the thinking is that in this example, those curly braces actually delimit a value themselves, ie the value of that assignment statement. I did check, though, and it looks like "every value" still works as expected, because the domain of the value doesn't start at those curly braces. But that's kind of an accident of how we iterate: if we iterated backwards instead of forwards for "every", you'd get just the return value

image

As above, I don't see this as a map, so I'm not sure it should be value iteration scope

image

The whole file should be an iteration scope for "value", so you can say "every value" at the top level and get the values of assigments

image

Similarly, function body should be iteration scope for value

@pokey
Copy link
Member

pokey commented Nov 15, 2023

"arg"

image

As above, we usually exclude the () from arg iteration

Copy link
Member

@pokey pokey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once again, fantastic first ante here. I left some comments using screenshots above. I haven't reviewed your .scm file in any detail as I figured we might want to discuss the behaviour mentioned in the screenshots first

@fidgetingbits
Copy link
Collaborator Author

"branch"

image Not sure bout this one ⬆️🤔 image I can see the logic of this one ⬆️, though, and it's consistent with the above. Idk @josharian @AndreasArvidsson wdyt?

I can definitely see the logic, but I do wonder if we want to include end

Finally able to take a look at your responses.

My reasoning was that if state itself includes end, where as it seemed to me the branch should be distinct. I'm open to being consistent with other languages of course, but my questions are:

  1. What's the issue with having the branch logic for if only? I will give an example why I think it's useful:

Say I have this:
image

I can say bring branch blue red to air and get something like this:

image

  1. I don't understand why you would want to include end, given if state would already handle that? Certainly you wouldn't want it for chuck in the else branch case, but I'll give an example of why I think it's useful to not include end for non-chuck case (similar to above). Say I have this:

image

I can say bring branch red sun after peck and get:

image

If I include end this wouldn't allow that.

@pokey
Copy link
Member

pokey commented Dec 12, 2023

@josharian @AndreasArvidsson see above re "branch". I know you guys have stronger opinions on branch ranges than I do 😅

@pokey
Copy link
Member

pokey commented Feb 1, 2024

Ok I factored out a couple semi-independent changes:

I kept the code here too, but presuming those PRs don't change much before merge, we should be able to cleanly merge as the changes will be identical

@pokey pokey mentioned this pull request Feb 2, 2024
13 tasks
@fidgetingbits
Copy link
Collaborator Author

fidgetingbits commented Mar 5, 2024

Ok nearly there! I pushed a bunch of changes. Worth having a look; if they make sense, would be great if you could apply similar tweaks to your other PRs if they apply

Thanks for that, it's a lot cleaner.

I should finally have time to start revisiting those other PRs soon, so will apply this style to them soon.

I migrated a few of the tests to facet tests where the facet test was a dramatic improvement. We'd love for all tests to be facet tests, but I don't want to block PRs on that if the PR was created before facet tests existed

I intend to start porting the old tests to the new facet tests pretty much right away, but as I know at least saidelike wants to start using the lua grammar asap, I feel like they can wait for a follow up PR if that's good with you.

The last issue here is that the condition tests seem to be failing for some reason. Guessing that broke when you tweaked condition based on my earlier feedback?

I just fixed those, so hopefully it's good to go as is. I guess I learned by lesson not to force push lol, as I have no idea where that code went... I have a habit to git rebase and not merge, but my workflow doesn't work so well for PRs it seems. I guess you usually git merge --squash upstream/main for PR branches instead?

Suppose I should note there is still #2012, but imo that isn't really blocking for lua atm as I've been using it for a lua project for a bit and is working well. I guess I should note I'm gonna have to learn typescript and setup proper cursorless extension debugging and what not to help work on cursorless-neovim, so it may be that I can fix #2012 myself too once that's working

@pokey
Copy link
Member

pokey commented Mar 11, 2024

looks like there's a bunch of test failures. I think maybe you need to add an entry to https://github.com/cursorless-dev/cursorless/blob/main/packages/common/src/extensionDependencies.ts so that CI installs a Lua extension

@fidgetingbits
Copy link
Collaborator Author

looks like there's a bunch of test failures. I think maybe you need to add an entry to https://github.com/cursorless-dev/cursorless/blob/main/packages/common/src/extensionDependencies.ts so that CI installs a Lua extension

That didn't work, so will have to revert it I guess. I noticed here that lua is a default known identifier...

Seems almost as if lua.scm isn't getting picked up, but I don't see why it wouldn't be detected like every other language, after looking through LanguageDefinitions.ts

@fidgetingbits
Copy link
Collaborator Author

I've got something setup so I can run the github workflows locally using act, so hopefully can try to jump into the docker instance and poke around a bit...

@fidgetingbits
Copy link
Collaborator Author

fidgetingbits commented Mar 13, 2024

I've got something setup so I can run the github workflows locally using act, so hopefully can try to jump into the docker instance and poke around a bit...

Running the workflows locally didn't end up helping so far unfortunately, as they fail for different reasons I haven't been able to fix yet.

One thing I realized though is that the nix lang support tests are passing... If there was any one of the pull requests that I would expect to be failing for the languageId not existing it would be that one, so I'm at a bit of a loss as to why this one isn't working.

@pokey
Copy link
Member

pokey commented Mar 14, 2024

I noticed here that lua is a default known identifier...

ah right in that case yeah I'd roll back your change. Nice find

@pokey
Copy link
Member

pokey commented Mar 14, 2024

One thing I realized though is that the nix lang support tests are passing

that's because it installs nix dependency https://github.com/cursorless-dev/cursorless/pull/1911/files#diff-22615f93d65e3ad864e07dbd196d6094287fc773d4daa41f6f275bf1ee029f7e

@pokey
Copy link
Member

pokey commented Mar 14, 2024

I've got something setup so I can run the github workflows locally using act, so hopefully can try to jump into the docker instance and poke around a bit...

what happens when you run tests locally without act? Ie according to instructions in contributor guide?

@fidgetingbits
Copy link
Collaborator Author

fidgetingbits commented Mar 15, 2024

what happens when you run tests locally without act? Ie according to instructions in contributor guide?

Running the lua subset tests, I get three test errors, and the rest pass. But the three errors are seemingly totally unrelated to the github workflow errors:

TypeError: injectIde is not a function

and

Error: Could not get pokey.cursorless extension

Not sure why yet.

@fidgetingbits
Copy link
Collaborator Author

Ya, so after fixing the other errors (had to do with direnv and vscode issues again), all the lua tests pass locally. I also don't have any special extensions loaded into the dev profile for vscode, command-server, parse tree, and nix.. so seems to be a github workflow only issue

@fidgetingbits
Copy link
Collaborator Author

fidgetingbits commented Mar 17, 2024

I was going through the failed test log again and noticed a lot earlier than the actual flagged red errors there was lots of complaints from lua.scm about @dummy being invalid identifier, and then noticed you folks used @_dummy not @dummy in your .scm files, so I just switched it to @_dummy and it resolved the languageId problem errors later on. but now other specific lua tests are failing.

maybe worth having a pre-commit checker or other check somewhere to prevent use of invalid identifiers like @dummy

fwiw I think the new test failures are all for the new style scope facet tests, which my subset test run locally wasn't using, so will investigate that next. forgot you had to run those tests separately.

@fidgetingbits
Copy link
Collaborator Author

I'm out of ideas again. The facet coverage tests do all pass on my system (and were running):

    ✔ lua facet coverage
extensionHostProcess.js:147
    ✔ scopes/lua/branch.if (50ms)
extensionHostProcess.js:147
    ✔ scopes/lua/functionCallee (63ms)
extensionHostProcess.js:147
    ✔ scopes/lua/map (68ms)
extensionHostProcess.js:147
    ✔ scopes/lua/name.assignment (68ms)
extensionHostProcess.js:147
    ✔ scopes/lua/name.variable (58ms)
extensionHostProcess.js:147
    ✔ scopes/lua/namedFunction (71ms)
extensionHostProcess.js:147
    ✔ scopes/lua/value.assignment (64ms)
extensionHostProcess.js:147
    ✔ scopes/lua/value.variable (66ms)
extensionHostProcess.js:

The error output from the failing test log that show the actual and expected look identical to me too, so not sure what it's running into

@pokey
Copy link
Member

pokey commented Mar 19, 2024

we changed fixture format so just run the update fixtures launch config locally

Copy link
Member

@pokey pokey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's ship it!

@pokey pokey enabled auto-merge March 19, 2024 17:22
@pokey pokey added this pull request to the merge queue Mar 19, 2024
Merged via the queue into cursorless-dev:main with commit aa76859 Mar 19, 2024
pokey added a commit that referenced this pull request Mar 19, 2024
@fidgetingbits fidgetingbits deleted the lua-lang branch March 20, 2024 00:18
github-merge-queue bot pushed a commit that referenced this pull request Mar 25, 2024
github-merge-queue bot pushed a commit that referenced this pull request Jun 13, 2024
This is me breaking out some functionality that was part of PR #1911
since that PR has lots of work to be done and it may still benefit other
languages that get in sooner. I've applied the suggested changes from
pokey in that PR's code review.

I've also added a placeholder map for lua which uses `[[ ]]` for
multiline support as noted in PR #1962, however I realized as I went to
add it that it's not as simple as the way I did it for nix because I
can't just reuse the existing `singleQuote` entry since lua actually
uses single quotes. So this may actually be a bit harder, and reminds me
of [this
discussion](#1992 (comment)).
So before I go randomly hacking stuff I'm curious what you think the
best approach here is.

These are the comments I had left in the Nix PR about these changes:

* I called returning delimiterToText as getSimpleDelimiterMap to kind of
mirror complexDelimiterMap.

* I'm not sure is if you still want a delimiterMap.ts standalone file,
and then a getSimpleDelimiterMap.ts only for that function. Now
delimiterToText isn't referenced anywhere else, so seemed maybe okay to
keep them together.

* I also thought about adding getComplexDelimiterMap(), but atm because
complexDelimiterMap only ever uses the keys from the delimiterToText it
won't change even if the language has different values, so seemed
unnecessary. These are all things I could guess at your preferences, but
may as well ask instead.

* I also noticed leftToRightMap in that file isn't actually used
anywhere so can be deleted I think, but also not sure about doing that
as part of a totally unrelated code change, to keep commits clean.


## Checklist

- [x] I have added
[tests](https://www.cursorless.org/docs/contributing/test-case-recorder/)
- [-] I have updated the
[docs](https://github.com/cursorless-dev/cursorless/tree/main/docs) and
[cheatsheet](https://github.com/cursorless-dev/cursorless/tree/main/cursorless-talon/src/cheatsheet)
- [x] I have not broken the cheatsheet

---------

Co-authored-by: fidgetingbits <[email protected]>
Co-authored-by: Pokey Rule <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants