The grammar section is heavily based on EBNF notation, with some additional syntax to help with generating output. If you aren't familiar with EBNF, it would be good to start there before continuing.

Rules

The basic syntax of a Rule is a Rule Name enclosed in square brackets [], followed by a set of symbols (combinations of Terminals and Non Terminals). Multiple Production Rules can be declared for a Rule Name by separating them with a |. Depending on the parsing algorithm you choose, there may be additional constraints on the Production Rules.
Consider this example:

[Start] 
    |  HelloGoodbye __ Target

[HelloGoodbye]
    | "Hello"
    | "Goodbye"

[Target] 
    | "World" 
    | \r "[a-zA-Z]+"

[__] 
    <ws>

There are 4 Non Terminals: Start, HelloGoodbye, Target, and __.

There are 5 Terminals: "Hello", "Goodbye", "World", \r "[a-zA-Z]+", and <ws>.

You will also notice that we use an optional leading pipe | for the [Start] rule but omit it in the [__] rule. This is not a typo; it showcases a styling difference.
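
To make the example concrete, the TypeScript sketch below accepts the same inputs the grammar describes, using a single regular expression rather than Grammar Well itself. It assumes that <ws> is a lexer token tag matching one or more whitespace characters, which the grammar above does not state, and matchesStart is a hypothetical helper name.

```typescript
// Hand-written equivalent of the example grammar; this is not Grammar Well's API.
// Assumption: the <ws> token tag matches one or more whitespace characters.
const startPattern = /^(Hello|Goodbye)\s+(World|[a-zA-Z]+)$/;

// Hypothetical helper: true when the whole input can be derived from [Start].
function matchesStart(input: string): boolean {
  return startPattern.test(input);
}

console.log(matchesStart("Hello World"));   // true
console.log(matchesStart("Goodbye Alice")); // true
console.log(matchesStart("Hi World"));      // false
```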

Rule Name

The name of a rule must be a word.

Symbols

Non Terminals

Non Terminals, represented by a word, are references to another production rule in the grammar.

Terminals

A Terminal is a literal, regular expression, or token tag found on the right-hand side of a production rule. Terminals are what lexer tokens are evaluated against.

Literals

Literals are double-quoted strings that are matched case-sensitively against the lexer stream.

"Hello"

Optionally, a string can be made to match case-insensitively by prepending it with \i.

\i "Hello"
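
The difference between the two forms can be sketched in TypeScript. Here matchLiteral is a hypothetical helper, not part of Grammar Well, and it compares whole strings only; how the library's lexer actually performs the comparison is not described here.

```typescript
// Hypothetical helper illustrating literal matching; not Grammar Well's implementation.
function matchLiteral(literal: string, input: string, ignoreCase = false): boolean {
  return ignoreCase
    ? literal.toLowerCase() === input.toLowerCase()
    : literal === input;
}

console.log(matchLiteral("Hello", "hello"));       // false, like "Hello"
console.log(matchLiteral("Hello", "hello", true)); // true, like \i "Hello"
```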

Regular Expressions

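Grammar Well supports Regular Expressions. Because it is written in TypeScript, patterns are limited to the capabilities of JavaScript's regular expression engine. A regular expression is written as a double-quoted string prefixed with \r.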

\r "[a-zA-Z]+"
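
As a rough illustration of that limit, the TypeScript sketch below checks whether a pattern is valid JavaScript regex syntax. It assumes the pattern text of a \r terminal is handed to JavaScript's RegExp more or less as written, which this page does not confirm; isValidPattern is a hypothetical helper.

```typescript
// Assumption: the pattern text of a \r terminal ends up in JavaScript's RegExp.
// Hypothetical helper: reports whether JavaScript accepts the pattern.
function isValidPattern(source: string): boolean {
  try {
    new RegExp(source);
    return true;
  } catch {
    return false;
  }
}

console.log(isValidPattern("[a-zA-Z]+")); // true
console.log(isValidPattern("a++"));       // false: possessive quantifiers are not JavaScript syntax
console.log(new RegExp("[a-zA-Z]+").exec("World42")?.[0]); // World
```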

Token Tags

Token Tags are set in the lexer and are matched in the grammar by wrapping the token tag in angle brackets.

<token>

Sub-Expressions

Symbols can be grouped with (), separating each option with a |, to create an inline sub-expression.

The following are equivalent:

[RuleName]
    ("Hello" | "Goodbye") "World"

[RuleName]
    SubRule "World"
[SubRule]
    | "Hello"
    | "Goodbye"

Quantifiers

A Quantifier is one of ?, *, or +, which can be appended to a symbol. These characters match the behavior commonly found in Regular Expressions.

Symbol  Quantity   Example
?       0 or 1     "a"?
*       0 or more  "b"*
+       1 or more  "c"+
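
Because the quantifiers behave as they do in regular expressions, their meaning can be checked with plain JavaScript patterns. The TypeScript sketch below uses RegExp on single characters purely as an analogy; in a grammar the quantifiers apply to symbols, and the constant names are hypothetical.

```typescript
// Regex analogy for the three quantifiers; in a grammar they apply to symbols.
const optional = /^a?$/;   // "a"? : 0 or 1
const zeroOrMore = /^b*$/; // "b"* : 0 or more
const oneOrMore = /^c+$/;  // "c"+ : 1 or more

console.log(optional.test(""), optional.test("a"), optional.test("aa")); // true true false
console.log(zeroOrMore.test(""), zeroOrMore.test("bbb"));                // true true
console.log(oneOrMore.test(""), oneOrMore.test("ccc"));                  // false true
```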

Post Processors

In addition to the standard anatomy of EBNF, Grammar Well supports Post Processors. Post Processors are used to either evaluate or transform a matched production rule's values.

Types

Type           Example                                        Description
Array          => [ ...$0, $3.value ]                         JavaScript Array syntax
Expression     => ( JSON.parse($0.value) )                    JavaScript expression wrapped in parentheses
Function Body  => { return JSON.parse($0.value) }             JavaScript function body syntax
Interpolation  => ${ ({data}) => JSON.parse(data[0].value) }  Interpolates content into the parser; the content is expected to be invocable
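
One way to read the four forms is as different spellings of a function over the matched values. The TypeScript sketch below writes each example from the table as a plain function, assuming that $n stands for data[n] (see Ordinal References) and that a matched token is an object with a string value property. The Token and PostProcessor types and the sample data are hypothetical, and the sketch does not show what Grammar Well actually generates.

```typescript
// Hypothetical shapes, inferred from the `.value` accesses in the examples.
interface Token { value: string }
type PostProcessor<T> = (args: { data: any[] }) => T;

// Array:          => [ ...$0, $3.value ]
const asArray: PostProcessor<unknown[]> = ({ data }) => [...data[0], data[3].value];

// Expression:     => ( JSON.parse($0.value) )
const asExpression: PostProcessor<unknown> = ({ data }) => JSON.parse(data[0].value);

// Function Body:  => { return JSON.parse($0.value) }
const asFunctionBody: PostProcessor<unknown> = ({ data }) => {
  return JSON.parse(data[0].value);
};

// Interpolation:  => ${ ({data}) => JSON.parse(data[0].value) }
const asInterpolation: PostProcessor<unknown> = ({ data }) => JSON.parse(data[0].value);

// Made-up sample data for illustration only.
const numberToken: Token = { value: "42" };
console.log(asExpression({ data: [numberToken] })); // 42
console.log(JSON.stringify(asArray({ data: [[1, 2], null, null, { value: "3" }] }))); // [1,2,"3"]
```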

Positioning

A Post Processor can follow an individual production rule, or it can immediately follow a rule name, where it applies to each production rule as the default but overridable Post Processor.

[RuleName] => ($0.value)
    | "World"
    | "Goodbye"

[RuleName]
    | "World" => ($0.value)
    | "Goodbye" => ($0.value)

Ordinal References

The JavaScript Template version expects a function body and is provided a variable data. It will also do simple string replacements. For example, any $ followed by a number will be replaced with data[number], so $0 becomes data[0].
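
The string replacement described above can be sketched in TypeScript. Here expandOrdinals is a hypothetical helper that rewrites a $ followed by digits into data[number]; it illustrates the rule as stated and is not Grammar Well's actual implementation.

```typescript
// Hypothetical helper: rewrite `$<number>` as `data[<number>]`.
function expandOrdinals(template: string): string {
  return template.replace(/\$(\d+)/g, (_match, index: string) => `data[${index}]`);
}

console.log(expandOrdinals("$0.value"));            // data[0].value
console.log(expandOrdinals("[ ...$0, $3.value ]")); // [ ...data[0], data[3].value ]
```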

[RuleName]
    "Hello" => ( $0.value )

[RuleName]
    "Hello" => ${ ({data}) => data[0].value }

Aliased References

Keeping track of the ordinal index of your symbols in an expression can be tedious, so Grammar Well also provides aliasing. Any symbol in an expression can be suffixed with :word. That word can then be referenced in the template as $word.
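
Aliasing can be pictured as the same kind of replacement with a name lookup. In the TypeScript sketch below, expandAliases and the alias map are hypothetical: the map records the position of each aliased symbol, and $word is rewritten to data[position]. How Grammar Well resolves aliases internally is an assumption here.

```typescript
// Hypothetical helper: rewrite `$alias` as `data[<position of the aliased symbol>]`.
// Unknown names are left untouched.
function expandAliases(template: string, aliases: Map<string, number>): string {
  return template.replace(/\$([A-Za-z_]\w*)/g, (match, name: string) => {
    const index = aliases.get(name);
    return index === undefined ? match : `data[${index}]`;
  });
}

// An alias on the first symbol of an expression points at index 0.
const aliases = new Map([["hello", 0]]);
console.log(expandAliases("$hello.value", aliases)); // data[0].value
```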

[RuleName]
    "Hello":hello => ( $hello.value )

[RuleName]
    "Hello" => ${ ({data}) => data[0].value }