Lexer

Grammar Well includes a stateful lexer. It is optional but highly recommended, because it greatly simplifies writing the grammar's production rules. A lexer definition has two parts, configuration and states, and the configuration must come first.
Currently the only configuration option is `start`, which is optional and sets the initial lexer state.

A lexer state is a named collection of rules that define how to tokenize input when in that state.

lexer {
    [string]
        - import singleQuoteString, doubleQuoteString

    [singleQuoteString]
        - when r:{'} tag "squote" highlight "string" goto singleQuoteStringEnd

    [singleQuoteStringEnd]
        - when r:{\\[\\\/bnrft]} tag "escaped"
        - when r:{\\'} tag "quoteEscape"
        - when r:{\\u[A-Fa-f\d]{4}} tag "escaped"
        - when r:{\\.} tag "badEscape"
        - when r:{[^'\\]+} tag "string" highlight "string"
        - when "'" tag "squote" highlight "string" pop
    
    [doubleQuoteString] span {
        [start]
            - when "\"" tag "dquote" highlight "string"
        
        [span]
            - when r:{\\[\\\/bnrft]} tag "escaped" highlight "constant"
            - when r:{\\"} tag "quoteEscape"
            - when r:{\\u[A-Fa-f\d]{4}} tag "escaped" highlight "constant"
            - when r:{\\.} tag "badEscape"
            - when r:{[^"\\]+} tag "string" highlight "string"
        
        [stop]
            - when "\"" tag "dquote" highlight "string"
    }
}

In the example above, the first state is named `string`. Its rules follow as a list in which each entry begins with `-`. There are two types of rules: import rules and match rules.
Order matters: rules are evaluated from top to bottom.
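
To illustrate why order matters, here is a minimal Python sketch of top-to-bottom, first-match-wins evaluation. It is a conceptual model only, not Grammar Well's implementation; the `RULES` list and the `first_match` helper are hypothetical names introduced for this example, and the patterns are borrowed from the string states above.

```python
import re

# Hypothetical rule list for a single state: (pattern, tag), tried top to bottom.
RULES = [
    (re.compile(r"\\[\\/bnrft]"), "escaped"),    # known escape sequences
    (re.compile(r"\\."), "badEscape"),           # catch-all for any other escape
    (re.compile(r"[^'\\]+"), "string"),          # plain string content
]

def first_match(rules, text, pos=0):
    """Return (tag, lexeme) for the first rule that matches at pos, else None."""
    for pattern, tag in rules:
        m = pattern.match(text, pos)
        if m:
            return tag, m.group(0)
    return None

# The specific rule is listed first, so a backslash followed by "n" is tagged "escaped".
print(first_match(RULES, r"\n"))

# If the catch-all is moved to the top, it shadows the specific rule
# and the same input is tagged "badEscape".
print(first_match([RULES[1], RULES[0], RULES[2]], r"\n"))
```

This is why the `badEscape` rule in the example sits below the rules for valid escape sequences.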

Import

The import rule expects a comma-separated list of states whose rules are to be imported into this state. This is a convenient way of keeping your rules DRY.

- import singleQuoteString, doubleQuoteString
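
Conceptually, an import can be thought of as splicing the imported states' rules into the importing state at the position of the import rule. The Python sketch below models that idea under stated assumptions: the `STATES` table and the `resolve` helper are hypothetical, and raising an error on a circular import is a choice made for this sketch, not documented Grammar Well behavior.

```python
# Hypothetical state table. Each entry is either
# ("import", [state names]) or ("rule", description).
STATES = {
    "string": [("import", ["singleQuoteString", "doubleQuoteString"])],
    "singleQuoteString": [("rule", "squote")],
    "doubleQuoteString": [("rule", "dquote")],
}

def resolve(states, name, seen=()):
    """Expand import entries in place, preserving top-to-bottom order."""
    if name in seen:
        # Assumption of this sketch: circular imports are rejected.
        raise ValueError(f"circular import: {name}")
    rules = []
    for kind, value in states[name]:
        if kind == "import":
            for imported in value:
                rules.extend(resolve(states, imported, seen + (name,)))
        else:
            rules.append(value)
    return rules

print(resolve(STATES, "string"))
```

In this model the `string` state ends up with the rules of both quoted-string states, in the order they were listed in the import.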

Match

Match rules, as the name implies, declare what to match in the input stream.

- when r:{[^"\\]+} tag "string" highlight "string"
- when "\"" tag "dquote" highlight "string" pop

Directives

| Name | Arguments | Notes | Behavior |
| --- | --- | --- | --- |
| `when` | string \| regex | Required. Exclusive with `before`, `skip` | What to match in the input stream |
| `before` | string \| regex | Required. Exclusive with `when`, `skip` | What to match without consuming the input stream. Should be used together with `goto`, `pop`, or `inset` |
| `skip` | string \| regex | Required. Exclusive with `when`, `before` | What to match and discard. The matched text is ignored and is not passed to the grammar |
| `goto` | word | Must be a valid state. Exclusive with `pop`, `inset`, `stay` | Moves to the named state and pushes the current state onto the stack |
| `pop` | number \| none | Defaults to 1 if no argument is provided. Exclusive with `goto`, `inset`, `stay` | Pops one state, or the given number of states, off the stack |
| `inset` | number \| none | Defaults to 1 if no argument is provided. Exclusive with `goto`, `pop`, `stay` | Pushes the current state onto the stack once, or the given number of times |
| `stay` | none | Exclusive with `goto`, `pop`, `inset` | Prevents state switching when used in a span state |
| `tag` | string(s), comma separated | Example: `tag "tag1", "tag2", "tag3"` | Applies one or more tags to the matched token; these can be referenced in the grammar |
| `highlight` | string | | Not used by the lexer itself, but can be used to help generate syntax highlighting |
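
The state-switching directives can be read as operations on a stack of states. The following Python sketch models `goto`, `pop`, `inset`, and `stay` as described in the table. It is an illustration under assumptions, not Grammar Well's code: `apply_transition` is a hypothetical helper, and the sketch assumes that after `pop` the lexer resumes in the last state popped and that `inset` leaves the current state unchanged.

```python
def apply_transition(stack, current, directive, arg=None):
    """Return (new_stack, new_current) after a matched rule's directive."""
    stack = list(stack)  # work on a copy so the caller's stack is untouched
    if directive == "goto":
        stack.append(current)                 # remember where we came from
        current = arg                         # enter the named state
    elif directive == "pop":
        for _ in range(arg or 1):             # pop one state unless a count is given
            current = stack.pop()             # assumed: resume in the last state popped
    elif directive == "inset":
        stack.extend([current] * (arg or 1))  # push the current state, stay in it
    # "stay" (or no directive): no state change
    return stack, current

# Entering and leaving the single-quoted string from the example.
stack, state = [], "string"
stack, state = apply_transition(stack, state, "goto", "singleQuoteStringEnd")
print(stack, state)
stack, state = apply_transition(stack, state, "pop")
print(stack, state)
```

In this model, the opening quote's `goto` pushes `string` and enters `singleQuoteStringEnd`, and the closing quote's `pop` returns the lexer to `string`.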

Spans

A span is a lexer construct that defines a lexer state for a language fragment enclosed by a start delimiter and an end delimiter, with its own set of rules for the content in between.
For example, a quoted string is a span: the opening and closing quotes are the delimiters, and the content between them may include escape sequences.

[doubleQuoteString] span {
    [start]
        - when "\"" tag "dquote" highlight "string"
    
    [span]
        - when r:{\\[\\\/bnrft]} tag "escaped" highlight "constant"
        - when r:{\\"} tag "quoteEscape"
        - when r:{\\u[A-Fa-f\d]{4}} tag "escaped" highlight "constant"
        - when r:{\\.} tag "badEscape"
        - when r:{[^"\\]+} tag "string" highlight "string"
    
    [stop]
        - when "\"" tag "dquote" highlight "string"
}
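
To make the span's behavior concrete, the Python sketch below tokenizes one double-quoted string using the same patterns as the `doubleQuoteString` span. It is a simplified model, not Grammar Well's implementation: `lex_span` is a hypothetical helper, and the sketch assumes that inside the span the body rules are tried before the stop rule.

```python
import re

# Hypothetical, simplified model of the doubleQuoteString span.
START = (re.compile(r'"'), "dquote")
STOP = (re.compile(r'"'), "dquote")
SPAN = [
    (re.compile(r'\\[\\/bnrft]'), "escaped"),
    (re.compile(r'\\"'), "quoteEscape"),
    (re.compile(r'\\u[A-Fa-f\d]{4}'), "escaped"),
    (re.compile(r'\\.'), "badEscape"),
    (re.compile(r'[^"\\]+'), "string"),
]

def lex_span(text):
    """Tokenize one double-quoted string; return a list of (tag, lexeme)."""
    m = START[0].match(text)
    if not m:
        raise ValueError("input does not begin with the start delimiter")
    tokens, pos = [(START[1], m.group(0))], m.end()
    # Assumed order inside the span: body rules first, then the stop rule.
    rules = [(p, t, False) for p, t in SPAN] + [(STOP[0], STOP[1], True)]
    while pos < len(text):
        for pattern, tag, is_stop in rules:
            m = pattern.match(text, pos)
            if m:
                break
        else:
            raise ValueError(f"no rule matches at offset {pos}")
        tokens.append((tag, m.group(0)))
        pos = m.end()
        if is_stop:
            return tokens  # the stop delimiter leaves the span
    raise ValueError("unterminated span")

for token in lex_span(r'"a\n\q"'):
    print(token)
```

For this input, the model yields the opening quote, a `string` token, an `escaped` token, a `badEscape` token, and the closing quote, in that order.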
