yeast: Prepare for type checking of rules #22112

Open
tausbn wants to merge 5 commits into main from tausbn/yeast-reify-output-schema-as-ast-types

@tausbn tausbn commented Jul 2, 2026

A few changes and a bug fix in preparation for adding type checking of rewrite rules. Specifically:

  • the addition of a rules! macro, which allows specifying multiple rules at the same time, together with the input/output schemas to check those rules against,
  • an extension to the rule syntax whereby root-level Rust blocks (used in some of the more complicated rules) must be annotated with the type and multiplicity of the node(s) they produce, and
  • a fix to the matching of (_), which previously also matched unnamed nodes, making the final catch-all rewrite rule superfluous.

Should be reviewed commit-by-commit.

tausbn added 4 commits July 2, 2026 12:58
For type checking rules, we need to be able to load schemas (so we know
what to check against). However, since we can't have yeast-macros
depending on yeast (where the schema-handling code currently lives) as
this would introduce a circular dependency, we instead split the
schema-related code into its own yeast-schema crate.

This macro allows the easy addition of multiple rules at the same time.
In addition, it also accepts an input and output schema, which
eventually will be used to check the validity of the rewrite rules.

In order to facilitate static type checking of rules (and to make it
easier for human readers as well), rust blocks at the root level (i.e.
rules of the form `... => { ... }`) must now have a type annotation in
front.
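
As a rough illustration of what such an annotation carries, the sketch below parses the three forms listed in the review overview (`kind`, `kind?`, `kind*`) into a node kind plus a multiplicity. It is not the parser in yeast-macros, which works on proc-macro token streams; the names `Annotation`, `Multiplicity`, and `parse_annotation` are hypothetical, and reading `?` as "zero or one" and `*` as "zero or more" is an assumption based on the usual meaning of those suffixes.

```rust
/// How many nodes an annotated Rust block promises to produce (assumed meaning).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Multiplicity {
    One,      // `kind`
    Optional, // `kind?`
    Many,     // `kind*`
}

/// Hypothetical parsed form of an annotation such as `statement*`.
#[derive(Debug, PartialEq, Eq)]
struct Annotation {
    kind: String,
    multiplicity: Multiplicity,
}

/// Parse the text that sits between `=>` and `{` in an annotated rule.
/// Returns `None` if the kind name is empty or contains unexpected characters.
fn parse_annotation(text: &str) -> Option<Annotation> {
    let text = text.trim();
    let (kind, multiplicity) = if let Some(k) = text.strip_suffix('?') {
        (k, Multiplicity::Optional)
    } else if let Some(k) = text.strip_suffix('*') {
        (k, Multiplicity::Many)
    } else {
        (text, Multiplicity::One)
    };
    let valid = !kind.is_empty()
        && kind.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
    valid.then(|| Annotation { kind: kind.to_string(), multiplicity })
}

fn main() {
    // The kind names here are made up for the example.
    for src in ["call_expr", "type_annotation?", "statement*"] {
        println!("{src:>18} -> {:?}", parse_annotation(src));
    }
}
```

A checker could then compare the annotated kind and multiplicity against what the output schema allows at the position where the rule's result ends up.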

All other forms are unaffected: if the right hand side of a rule is a
tree, we can read the type of the root node directly. For interpolations
that happen inside of such a tree, we can recover the type by looking at
what field we're interpolating into, and consulting the output schema.
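
The two recovery strategies in this paragraph can be sketched with a toy model. Everything below is hypothetical: `Template`, `Schema`, the helper functions, and the kind and field names are invented for illustration and do not reflect yeast's actual data structures. The point is only that a tree-shaped right-hand side exposes its root kind directly, a nested interpolation gets its expected kind from the (parent kind, field) entry in the output schema, and a bare root-level block has neither, which is why it needs an annotation.

```rust
use std::collections::HashMap;

/// Right-hand side of a rule, drastically simplified.
enum Template {
    /// A tree template: a node kind with named fields.
    Tree { kind: &'static str, fields: Vec<(&'static str, Template)> },
    /// An interpolated Rust expression; it carries no type of its own.
    Interpolation,
}

/// Toy output schema: (parent kind, field name) -> expected child kind.
type Schema = HashMap<(&'static str, &'static str), &'static str>;

/// The root type can be read off a tree, but not off a bare interpolation.
fn root_type(t: &Template) -> Option<&'static str> {
    match t {
        Template::Tree { kind, .. } => Some(*kind),
        Template::Interpolation => None,
    }
}

/// Collect the expected kind of every interpolation nested inside a tree,
/// by looking up the field it is interpolated into.
fn interpolation_types(t: &Template, schema: &Schema, out: &mut Vec<Option<&'static str>>) {
    if let Template::Tree { kind, fields } = t {
        for (field, child) in fields {
            match child {
                Template::Interpolation => out.push(schema.get(&(*kind, *field)).copied()),
                tree => interpolation_types(tree, schema, out),
            }
        }
    }
}

fn main() {
    let schema: Schema = HashMap::from([
        (("call_expr", "callee"), "expr"),
        (("call_expr", "arguments"), "argument_list"),
    ]);
    let rhs = Template::Tree {
        kind: "call_expr",
        fields: vec![
            ("callee", Template::Interpolation),
            ("arguments", Template::Interpolation),
        ],
    };
    let mut found = Vec::new();
    interpolation_types(&rhs, &schema, &mut found);
    println!("root: {:?}, interpolations: {:?}", root_type(&rhs), found);
    println!("bare block: {:?}", root_type(&Template::Interpolation));
}
```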

All existing uses have been updated to have the appropriate type
annotations, though these are of course not checked yet (and so could be
wrong).

Finally, this commit also removes the final catch-all rule `_ @node =>
{node}`. Because of the preceding rule that matches `(_) @node`, this
rule would only ever match unnamed nodes, and I think in practice it did
not match at all (at least not in our current set of tests).

To give it a proper type we would have to add some notion of an "any"
type, which I would like to avoid. If it _does_ turn out to be needed,
we can easily add it back (ideally with a test-case that shows why it's
still needed).

Turns out, `(_)` would match both named and unnamed nodes, as we never
checked the value of the `match_unnamed` field. This is the real reason
why the final catch-all rule we removed in the last commit was
superfluous -- unnamed nodes were being caught by the penultimate rule
instead (and mapped to `unsupported_node`).

Having fixed the bug, we now (correctly) get errors due to unmatched
unnamed nodes in the input. To fix this, we change the catch-all rule to
match unnamed nodes as well. This restores the previous behaviour
exactly.
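
The bug and its fix can be modelled in a few lines. This is a simplified sketch, not the code in shared/yeast/src/query.rs: only the `match_unnamed` field name comes from the commit message, while the `Node` and `Wildcard` types, the method names, and the assumption that `(_)` corresponds to `match_unnamed == false` and `_` to `match_unnamed == true` are illustrative.

```rust
/// A syntax node, reduced to what matters for wildcard matching.
#[derive(Debug, Clone, Copy)]
struct Node {
    kind: &'static str,
    is_named: bool,
}

/// A wildcard pattern. Assumed mapping: `(_)` has `match_unnamed == false`,
/// bare `_` has `match_unnamed == true`.
struct Wildcard {
    match_unnamed: bool,
}

impl Wildcard {
    /// Behaviour before the fix: the flag is never consulted.
    fn matches_buggy(&self, _node: &Node) -> bool {
        true
    }

    /// Behaviour after the fix: unnamed nodes match only if the flag allows it.
    fn matches(&self, node: &Node) -> bool {
        node.is_named || self.match_unnamed
    }
}

/// Kinds of the nodes that a catch-all pattern would leave unmatched.
fn unmatched(pattern: &Wildcard, nodes: &[Node]) -> Vec<&'static str> {
    nodes.iter().filter(|n| !pattern.matches(n)).map(|n| n.kind).collect()
}

fn main() {
    let nodes = [
        Node { kind: "identifier", is_named: true },
        Node { kind: "+", is_named: false },
    ];
    let parenthesised = Wildcard { match_unnamed: false }; // models `(_)`
    let bare = Wildcard { match_unnamed: true }; // models `_`

    println!("buggy (_) matches '+': {}", parenthesised.matches_buggy(&nodes[1]));
    println!("unmatched under (_): {:?}", unmatched(&parenthesised, &nodes));
    println!("unmatched under _:   {:?}", unmatched(&bare, &nodes));
}
```

Under this model, a catch-all written with `(_)` leaves the unnamed `+` node unmatched once the flag is honoured, which corresponds to the errors described above; switching the catch-all to `_` covers it again.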

At some point, we should find a better way to handle unnamed nodes, as
it seems wasteful to map these to `unsupported_node` (since we in
practice only use them for their string content). Perhaps we should not
attempt to translate unnamed nodes at all?
@tausbn tausbn added the no-change-note-required This PR does not need a change note label Jul 2, 2026
@tausbn tausbn marked this pull request as ready for review July 2, 2026 13:36
@tausbn tausbn requested review from a team as code owners July 2, 2026 13:36
Copilot AI review requested due to automatic review settings July 2, 2026 13:36

Copilot AI left a comment

Pull request overview

This PR prepares YEAST for future compile-time type checking of rewrite rules by introducing a rules! macro (bundling rules with input/output schema paths), requiring explicit return-kind annotations for Rust-block rule bodies, and tightening wildcard query semantics to align with tree-sitter’s named/unnamed distinction.

Changes:

  • Add a new yeast-schema crate to share schema/YAML loader logic between runtime (yeast) and proc-macros without pulling tree-sitter into proc-macro builds.
  • Extend yeast-macros with rules! parsing and the new annotated Rust-body form (=> kind { ... }, => kind? { ... }, => kind* { ... }), and update docs/tests accordingly.
  • Fix wildcard matching so (_) matches only named nodes (tree-sitter semantics), then update Swift translation rules and fallbacks to use the intended _ vs (_) behavior.

Summary per file:

| File | Description |
| --- | --- |
| unified/extractor/tests/rules_macro_smoke.rs | Adds a compile-only smoke test for yeast::rules! against real Swift schemas. |
| unified/extractor/src/languages/swift/swift.rs | Updates Swift rules to the new annotation syntax and adjusts fallback wildcard usage. |
| shared/yeast/tests/test.rs | Adds rules! macro tests and annotation-form tests; updates existing tests to new syntax. |
| shared/yeast/tests/input-types.yml | Adds a small input schema used by new rules! tests. |
| shared/yeast/src/schema.rs | Refactors schema handling to re-export from yeast-schema and provides tree-sitter adapter. |
| shared/yeast/src/query.rs | Implements corrected named/unnamed wildcard semantics for (_) vs _. |
| shared/yeast/src/node_types_yaml.rs | Re-exports YAML/JSON helpers from yeast-schema and keeps tree-sitter adapter. |
| shared/yeast/src/lib.rs | Re-exports rules macro and switches schema ID types/CHILD_FIELD to yeast-schema. |
| shared/yeast/doc/yeast.md | Documents annotation-form rule bodies and the new rules! macro. |
| shared/yeast/Cargo.toml | Adds dependency on yeast-schema. |
| shared/yeast/BUILD.bazel | Adds Bazel dep on //shared/yeast-schema. |
| shared/yeast-schema/src/schema.rs | Introduces shared schema implementation (moved from yeast). |
| shared/yeast-schema/src/node_types_yaml.rs | Introduces shared YAML/JSON node-types conversion/loading (moved from yeast). |
| shared/yeast-schema/src/lib.rs | Defines yeast-schema crate public API and shared ID types/CHILD_FIELD. |
| shared/yeast-schema/Cargo.toml | New crate manifest for yeast-schema. |
| shared/yeast-schema/BUILD.bazel | New Bazel target for yeast-schema. |
| shared/yeast-macros/src/parse.rs | Adds annotation parsing for Rust-block transforms and implements rules! parsing/expansion. |
| shared/yeast-macros/src/lib.rs | Exposes the new rules! proc macro. |
| misc/bazel/3rdparty/tree_sitter_extractors_deps/defs.bzl | Wires shared/yeast-schema into Bazel dependency maps. |
| Cargo.toml | Adds shared/yeast-schema to the workspace. |
| Cargo.lock | Adds the yeast-schema package entry and dependency edge. |

Review details

  • Files reviewed: 20/21 changed files
  • Comments generated: 1
  • Review effort level: Low

Comment thread shared/yeast-macros/src/parse.rs Outdated
Happily, it turned out that there was already a library function for
handling this case.

Labels: documentation, no-change-note-required (This PR does not need a change note)

2 participants