Failure to match when a number is followed by a % #199

Open
EnderShadow opened this issue Feb 22, 2024 · 3 comments
Labels
compromise Indicates that the current behavior is an unfortunate technical compromise enhancement

Comments

@EnderShadow
EnderShadow commented Feb 22, 2024

My asm syntax allows input like the minimal reproducer below, but it fails to compile with a "no match found" error.

#subruledef a
{
    s{x: u4} => x
}

#subruledef b
{
    %r{x: u4} => x
}

#ruledef
{
    {x: a} {y: b} => x @ y
}

s0 %r0

Changing the ruledef and the instruction to the following, with a comma between the two operands, lets it compile successfully.

#ruledef
{
    {x: a}, {y: b} => x @ y
}

s0, %r0

The issue still occurs with --debug-no-optimize-matcher.

@EnderShadow
Author

Some additional information based on my testing: if I inline subrule b from above into the ruledef, it compiles properly.

@EnderShadow
Author

An even smaller POC, if it matters:

#subruledef b
{
    % => 0x0`0
}

#ruledef
{
    {x: u8} {y: b} => x @ y
}

0 %

@hlorenzi
Owner

Hmm, this might be unsolvable in the current architecture. The reason s0 %r0 fails is that the a subrule s{x: u4} starts parsing an expression after the s token, and the remaining 0 %r0 is a syntactically valid expression (the modulo operator % applied to the literal 0 and the variable r0). To avoid any parser complexity, expression parsing is greedy, so the expression consumes the whole of 0 %r0 and nothing is left for subrule b to match. The solution, as you mention, is to introduce a separator token that is not valid inside an expression, such as a comma.

There is an exception, however, in the case of specifying everything in a single rule as s{x: u4} %r{y: u4}, where the parser will look ahead, and expression parsing can stop early. This means that extracting parameters into their own named subrules isn't exactly orthogonal in behavior, and the most powerful option is usually to specify everything monolithically. This can be annoying and might be worth improving in the future.
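
As a rough sketch of the monolithic form described above, the original reproducer could be rewritten with both parameters in a single rule. This has not been verified here; the output expression x @ y is simply carried over from the reproducer, and whether the lookahead behaves the same in more complex rules is an assumption.

```asm
; Both parameters live in one rule, so the matcher can look ahead
; to the %r token and stop parsing the first expression early.
#ruledef
{
    s{x: u4} %r{y: u4} => x @ y
}

; The same instruction that failed with the a/b subrules.
s0 %r0
```

The trade-off is that the s and %r operand forms can no longer be reused as named subrules across several rules; each rule that needs them has to spell them out.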

@hlorenzi hlorenzi added enhancement compromise Indicates that the current behavior is an unfortunate technical compromise labels May 31, 2024