Naive EBNF LL Autocompletion parser.
npm install naive-autocompletion-parserThis was written mainly for the UHK macro language. It may or may not be usable for other projects - depending on how much you are willing to content yourself with our ebnf notation or deep diving into the code.
Simplified example using an excerpt of the actual UHK grammar:
``
import { ParserBuilder } from 'naive-autocompletion-parser'
let grammarText =
BODY = COMMENT
BODY = [LABEL:] COMMAND [COMMENT]
COMMENT = //
COMMAND = [CONDITION|MODIFIER]* COMMAND
COMMAND = {exec|call|fork} MACRONAME
COMMAND = {pressKey|holdKey|tapKey|releaseKey} SHORTCUT
COMMAND = tapKeySeq [SHORTCUT]+
COMMAND = set module.MODULEID.navigationMode.LAYERID_BASIC NAVIGATION_MODE
COMMAND = set module.MODULEID.baseSpeed
COMMAND = set module.MODULEID.speed
COMMAND = set module.MODULEID.xceleration
CONDITION = {ifShift | ifAlt | ifCtrl | ifGui}
MODIFIER = final
MODIFIER = autoRepeat
MODIFIER = oneShot
LAYERID = {fn|mouse|mod|base|fn2|fn3|fn4|fn5|alt|shift|super|ctrl}|last|previous
LAYERID_BASIC = {fn|mouse|mod|base|fn2|fn3|fn4|fn5}
KEYMAPID =
VARIABLE_EXPANSION = $
EXPRESSION = (EXPRESSION) | INT | BOOL | FLOAT | VARIABLE_EXPANSION | EXPRESSION OPERATOR EXPRESSION | !EXPRESSION | min(EXPRESSION [, EXPRESSION]+) | max(EXPRESSION [, EXPRESSION]+)
PARENTHESSED_EXPRESSION = (EXPRESSION)
INT = PARENTHESSED_EXPRESSION | VARIABLE_EXPANSION | [0-9]+ | -[0-9]+
BOOL = PARENTHESSED_EXPRESSION | VARIABLE_EXPANSION | 0 | 1
FLOAT = PARENTHESSED_EXPRESSION | VARIABLE_EXPANSION | [0-9]*.[0-9]+ | -FLOAT
VALUE = INT | BOOL | FLOAT
STRING = "
IDENTIFIER = [a-zA-Z0-9_]+
LABEL =
let parser = new ParserBuilder(IO.dummy)
.addRule(grammarText)
.overrideRuleWithConstantString("CODE_BLOCK", "{")
.overrideRuleWithConstantString("COMMENT", "//
.overrideRuleWithRegex("COMMENT", "//.*")
.overrideRule("BOOL", "PARENTHESSED_EXPRESSION | VARIABLE_EXPANSION | 0 | 1 | true | false")
.overrideRule("FLOAT", "PARENTHESSED_EXPRESSION | VARIABLE_EXPANSION | /[-]?[0-9]*.[0-9]+/")
.overrideRuleWithConstantString("FLOAT", "<[-]?[0-9]*.[0-9]+>")
.overrideRule("INT", "PARENTHESSED_EXPRESSION | VARIABLE_EXPANSION | /[-]?[0-9]+/")
.overrideRuleWithConstantString("INT", "<[-]?[0-9]+>")
.overrideRuleWithConstantString("STRING", "\"
.overrideRuleWithConstantString("STRING", "'
.overrideRuleWithRegex("IDENTIFIER", '[a-zA-Z0-9_]*')
.overrideRuleWithConstantString("IDENTIFIER", "<[a-zA-Z0-9_]*>")
.build();
let res = parser.complete("set module.trackp", "BODY"); // [ { "suggestion": "trackpoint", "overlap": 6 } ]
console.log("Suggested completion: " + res[0].suggestion.substring(res[0].overlap) + ", overlapping with the original expression by " + res[0].overlap " chars.");
`
feature | syntax
--- | ---
rules / nonterminals | UPPERCASE_IDENTIFIERS{ ... }
obligatory group | [ ... ]
optional group | [ ... ]+
repeat once or more | [ ... ]*
repeat arbitrary | A \| B
choice | /regex/
terminal as regex | text
terminal as string |
human-readable hint |
human-readable hint, backed by a specific rule |
Furthermore, the project has a REPL environment that has some commands for debug of the grammar. E.g., a it provides a walker that allows you to expand the rule step by step, picking expansions one by one. It can be started by:
`
import { Cli } from 'naive-autocompletion-parser'
Cli.launch(parser);
`
We currently use the simple academic approach of converting the ebnf into normal forms. Specifically:
1) convert EBNF (Extended Backus Naur Form) into a standard BNF (i.e., replace iteration rules)
2) eliminate nullable rules
3) unify common prefixes
4) convert the non-nullable BNF into a GNF (Griebach Normal Form)
We leave out most CNF conversion steps though, as they don't have much practical sense in a non-academic setup. (Admittedly, we should probably convert the grammar into a binary form in order to make sure that nullable elimination doesn't go exponential.)
Generally, efficiency is relatively good. On our grammar, one consumed token takes around 30 stack operations, and tokens are looked up in a normal Map.
There is still a room for improvement, as we are not using a lexer... which means that regex rules are costly, as they are (naively) matched one by one at the moment.
We provide a vscode configuration.
From commandline, you can:
``
git clone https://github.com/kareltucek/naive-autocompletion-parser.git
cd naive-autocompletion-parser
tsc # compile the project
node dist/cli/launch.js # start the repl
help # see available commands
Or from your project you can:
``
npm install naive-autocompletion-parser
and then
```
import { ParserBuilder } from 'naive-autocompletion-parser'