The syntax parsing time and performance did not meet my expectations #4683

Open
cmmjxh opened this issue Aug 26, 2024 · 2 comments

cmmjxh commented Aug 26, 2024

Hello, I have defined a custom grammar with ANTLR4 (4.13.2, Java target), but each parse takes about 50 ms. Our project requires parse times of about 5 ms or less. I am not sure whether the cause is a flaw in our grammar definition or whether this is simply the best performance ANTLR4 can deliver. Can you help answer my question?
```antlr
grammar DataFusion;

@header {
package org.example.code;
}

options {
    language = Java;
}

// DataFusion grammar definition (JQL)
jql: elements end ;

elements: element (',' element)* ; // Simplified to allow for easier parsing of multiple elements

element: ID ':' (strings | constant | function | expr) // Simplified and removed unnecessary alternatives
    | all
    | ignore
    ;

expr: term (op=('+'|'-') term)*
    | factor (op=('*'|'/') factor)*
    | number
    | strings
    | function
    | '(' expr ')'
    ;

term: factor (op=('+'|'-') factor)* ;
factor: number | strings | function | '(' expr ')' | ID ;

function:
      AGGOPER '(' argument ')'
    | IFNULL '(' number ',' strings ',' argument ')'
    | CONCAT '(' strings (',' strings)* ')'
    ;

argument: strings | number | expr | constant
    ;

number: INTEGER | FLOAT
    ;

strings: ID
    ;

end: ';' ;

constant: '\'' ID '\'' | '\'' '\'' | '\'' (INTEGER | FLOAT) '\'' ;

all: '*'
    | ID '*'
    ;

ignore: '-$' ID | '-$' all ;

AGGOPER: 'SUM' | 'sum' | 'AVG' | 'avg' | 'COUNT' | 'count' | 'MAX' | 'max' | 'MIN' | 'min'
    ;

IFNULL: 'IFNULL' | 'ifnull'
    ;

CONCAT: 'CONCAT' | 'concat' ;

FLOAT: '-'? DIGIT+ '.' DIGIT+ ;
fragment DIGIT: [0-9] ;
ID: [a-zA-Z0-9_$\u0080-\uffff.]+ ;
Grammar_EOF: ';' ;
WS: [ \t\r\n]+ -> skip ;
```
Parsing time: 57 ms
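
The report does not say how the 57 ms figure was measured. If it came from a single cold parse, it probably includes JVM class loading, JIT compilation, and ANTLR's lazily built prediction (DFA) cache, none of which recur on later parses. The following is a minimal sketch of a warm-up-then-measure harness, not a definitive benchmark. `ParseTimer`, `measure`, and `median` are hypothetical names, the input string is made up, and the workload is a stand-in so the example runs without the ANTLR runtime. The commented lines show where the `DataFusionLexer` and `DataFusionParser` classes generated from this grammar would go.

```java
import java.util.Arrays;

final class ParseTimer {

    /**
     * Runs the task `warmup` times without timing it, then times `runs`
     * further executions. Returns the durations in nanoseconds, sorted ascending.
     */
    static long[] measure(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos;
    }

    /** Middle element of an already sorted array. */
    static long median(long[] sorted) {
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        String input = "a:SUM(b),c:d;"; // hypothetical input, not taken from the report

        // Stand-in workload (assumption). With the ANTLR runtime on the
        // classpath, the body of the task would instead be:
        //   DataFusionLexer lexer = new DataFusionLexer(CharStreams.fromString(input));
        //   DataFusionParser parser = new DataFusionParser(new CommonTokenStream(lexer));
        //   parser.jql();
        Runnable parse = () -> input.chars().filter(ch -> ch == ',').count();

        long[] timings = measure(parse, 1_000, 101);
        System.out.printf("median: %.3f ms%n", median(timings) / 1e6);
        System.out.printf("slowest: %.3f ms%n", timings[timings.length - 1] / 1e6);
    }
}
```

Comparing the first untimed run against the median of the warmed-up runs would show how much of the reported time is one-off startup cost rather than steady-state parsing.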

Contributor

kaby76 commented Aug 26, 2024

The grammar you provide is not valid. It doesn't pass the Antlr Tool. Please use correct Markdown syntax for code blocks.

Collaborator

jimidle commented Aug 27, 2024 via email
