naive-autocompletion-parser

$3

This is an LL parser that takes an EBNF (Extended Backus-Naur Form) grammar and a partial expression, and produces possible completions by running all possible expansion paths of the grammar. We use a strongly customized notation of the ebnf grammar, tailored for human readability. The Parser allows overriding selected rules, so it is possible to use the same (human-readability-adjusted, and partially informal) grammar as a documentation for the user, and then the same grammar (just with some overriden rules) for the actual autocompletion.

This was written mainly for the UHK macro language. It may or may not be usable for other projects - depending on how much you are willing to content yourself with our ebnf notation or deep diving into the code.

$3

Simplified example using an excerpt of the actual UHK grammar:

``import { ParserBuilder } from 'naive-autocompletion-parser'

let grammarText =
BODY = COMMENT
BODY = [LABEL:] COMMAND [COMMENT]
COMMENT = //
COMMAND = [CONDITION|MODIFIER]* COMMAND
COMMAND = {exec|call|fork} MACRONAME
COMMAND = {pressKey|holdKey|tapKey|releaseKey} SHORTCUT
COMMAND = tapKeySeq [SHORTCUT]+
COMMAND = set module.MODULEID.navigationMode.LAYERID_BASIC NAVIGATION_MODE
COMMAND = set module.MODULEID.baseSpeed
COMMAND = set module.MODULEID.speed
COMMAND = set module.MODULEID.xceleration
CONDITION = {ifShift | ifAlt | ifCtrl | ifGui}
MODIFIER = final
MODIFIER = autoRepeat
MODIFIER = oneShot
LAYERID = {fn|mouse|mod|base|fn2|fn3|fn4|fn5|alt|shift|super|ctrl}|last|previous
LAYERID_BASIC = {fn|mouse|mod|base|fn2|fn3|fn4|fn5}
KEYMAPID = |last
VARIABLE_EXPANSION = $ | $ | $currentAddress | $thisKeyId | $queuedKeyId. | $keyId.KEYID_ABBREV
EXPRESSION = (EXPRESSION) | INT | BOOL | FLOAT | VARIABLE_EXPANSION | EXPRESSION OPERATOR EXPRESSION | !EXPRESSION | min(EXPRESSION [, EXPRESSION]+) | max(EXPRESSION [, EXPRESSION]+)
PARENTHESSED_EXPRESSION = (EXPRESSION)
INT = PARENTHESSED_EXPRESSION | VARIABLE_EXPANSION | [0-9]+ | -[0-9]+
BOOL = PARENTHESSED_EXPRESSION | VARIABLE_EXPANSION | 0 | 1
FLOAT = PARENTHESSED_EXPRESSION | VARIABLE_EXPANSION | [0-9]*.[0-9]+ | -FLOAT
VALUE = INT | BOOL | FLOAT
STRING = "" | ''
IDENTIFIER = [a-zA-Z0-9_]+
LABEL =
NAVIGATION_MODE = cursor | scroll | caret | media | zoom | zoomPc | zoomMac | none
MODULEID = trackball | touchpad | trackpoint | keycluster
SCANCODE = CHAR | SCANCODE_ABBREV
MODMASK = [MODMASK]+ | [L|R]{S|C|A|G} | {p|r|h|t} | {s|i|o}
SHORTCUT = MODMASK- | MODMASK-SCANCODE | SCANCODE | MODMASK
CHAR =
SCANCODE_ABBREV = enter | escape | backspace | tab | space | minusAndUnderscore | equalAndPlus | openingBracketAndOpeningBrace | closingBracketAndClosingBrace
MACRONAME =
;

let parser = new ParserBuilder(IO.dummy) .addRule(grammarText) .overrideRuleWithConstantString("CODE_BLOCK", "{") .overrideRuleWithConstantString("COMMENT", "//") .overrideRuleWithRegex("COMMENT", "//.*") .overrideRule("BOOL", "PARENTHESSED_EXPRESSION | VARIABLE_EXPANSION | 0 | 1 | true | false") .overrideRule("FLOAT", "PARENTHESSED_EXPRESSION | VARIABLE_EXPANSION | /[-]?[0-9]*.[0-9]+/") .overrideRuleWithConstantString("FLOAT", "<[-]?[0-9]*.[0-9]+>") .overrideRule("INT", "PARENTHESSED_EXPRESSION | VARIABLE_EXPANSION | /[-]?[0-9]+/") .overrideRuleWithConstantString("INT", "<[-]?[0-9]+>") .overrideRuleWithConstantString("STRING", "\"\"") .overrideRuleWithConstantString("STRING", "''") .overrideRuleWithRegex("IDENTIFIER", '[a-zA-Z0-9_]*') .overrideRuleWithConstantString("IDENTIFIER", "<[a-zA-Z0-9_]*>") .build();

let res = parser.complete("set module.trackp", "BODY"); // [ { "suggestion": "trackpoint", "overlap": 6 } ] console.log("Suggested completion: " + res[0].suggestion.substring(res[0].overlap) + ", overlapping with the original expression by " + res[0].overlap " chars.");`

`$3`

feature | syntax --- | --- rules / nonterminals |UPPERCASE_IDENTIFIERSobligatory group |{ ... }optional group |[ ... ]repeat once or more |[ ... ]+repeat arbitrary |[ ... ]*choice |A \| Bterminal as regex |/regex/terminal as string |texthuman-readable hint |human-readable hint, backed by a specific rule |

`$3`

Furthermore, the project has a REPL environment that has some commands for debug of the grammar. E.g., a it provides a walker that allows you to expand the rule step by step, picking expansions one by one. It can be started by:

`import { Cli } from 'naive-autocompletion-parser'

Cli.launch(parser);`

`$3`

We currently use the simple academic approach of converting the ebnf into normal forms. Specifically:

1) convert EBNF (Extended Backus Naur Form) into a standard BNF (i.e., replace iteration rules) 2) eliminate nullable rules 3) unify common prefixes 4) convert the non-nullable BNF into a GNF (Griebach Normal Form)

We leave out most CNF conversion steps though, as they don't have much practical sense in a non-academic setup. (Admittedly, we should probably convert the grammar into a binary form in order to make sure that nullable elimination doesn't go exponential.)

Generally, efficiency is relatively good. On our grammar, one consumed token takes around 30 stack operations, and tokens are looked up in a normal Map.

There is still a room for improvement, as we are not using a lexer... which means that regex rules are costly, as they are (naively) matched one by one at the moment.

`$3`

We provide a vscode configuration.

From commandline, you can:

`git clone https://github.com/kareltucek/naive-autocompletion-parser.git cd naive-autocompletion-parser tsc # compile the project node dist/cli/launch.js # start the repl help # see available commands`

Or from your project you can:

`npm install naive-autocompletion-parser`

and then

`import { ParserBuilder } from 'naive-autocompletion-parser'``

$3

Simplified example using an excerpt of the actual UHK grammar:

``import { ParserBuilder } from 'naive-autocompletion-parser'

`$3`

`import { Cli } from 'naive-autocompletion-parser'

Cli.launch(parser);`

`$3`

We currently use the simple academic approach of converting the ebnf into normal forms. Specifically:

Generally, efficiency is relatively good. On our grammar, one consumed token takes around 30 stack operations, and tokens are looked up in a normal Map.

There is still a room for improvement, as we are not using a lexer... which means that regex rules are costly, as they are (naively) matched one by one at the moment.

`$3`

We provide a vscode configuration.

From commandline, you can:

Or from your project you can:

`npm install naive-autocompletion-parser`

and then

`import { ParserBuilder } from 'naive-autocompletion-parser'``