TLEX

A lexical analyzer (tokenizer) generator in TypeScript.

TLEX provides a flexible tokenizer with regex-based pattern matching, support for both JavaScript and Flex-style regex syntax, and stateful lexing. It works in Node.js and browsers with full TypeScript support.

Documentation

Full documentation and interactive examples: panyam.github.io/tlex

Try tokenizers in the browser: Playground

Installation

```bash
npm install tlex
```

Quick Example

```typescript
import { Tokenizer } from "tlex";

const tokenizer = new Tokenizer();

// Add token rules
tokenizer.add(/[0-9]+/, { tag: "NUMBER" });
tokenizer.add(/[a-zA-Z_][a-zA-Z0-9_]*/, { tag: "IDENTIFIER" });
tokenizer.add(/\+|\-|\*|\//, { tag: "OPERATOR" });
tokenizer.add(/\s+/, { tag: "WS", skip: true });

// Tokenize input
const tokens = tokenizer.tokenize("x + 42 * y");
// Returns: [IDENTIFIER "x", OPERATOR "+", NUMBER "42", OPERATOR "*", IDENTIFIER "y"]
```

Features

- Regex-based patterns - Use JavaScript regex or Flex-style patterns
- Rule priorities - Control which rule matches on conflicts
- Skip tokens - Automatically skip whitespace and comments
- Stateful lexing - Context-sensitive tokenization with lexer states
- Token callbacks - Custom handlers for token processing
- Lookahead support - TokenBuffer for parser lookahead
- TypeScript native - Full type definitions included
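
To make "rule priorities" and "skip tokens" concrete, here is a minimal, self-contained sketch of how a lexer can resolve conflicts between rules. It does not use TLEX and is not TLEX's implementation or API: the `Rule` and `Tok` types, the `tokenize` function, and the policy of "longest match first, then highest priority" are all assumptions made for illustration. See the Rule Configuration reference for how TLEX itself handles priorities.

```typescript
// Hypothetical mini-lexer for illustration only; NOT part of TLEX.
interface Rule {
  pattern: RegExp;   // tried at the current input position only
  tag: string;
  priority?: number; // assumed policy: higher wins when match lengths tie
  skip?: boolean;    // matched text is consumed but no token is emitted
}

interface Tok {
  tag: string;
  value: string;
  start: number;
}

function tokenize(input: string, rules: Rule[]): Tok[] {
  // Recompile each pattern with the sticky flag so it can only match
  // exactly at `lastIndex`, never further ahead in the input.
  const compiled = rules.map((r) => ({
    ...r,
    re: new RegExp(r.pattern.source, "y"),
  }));
  const out: Tok[] = [];
  let pos = 0;
  while (pos < input.length) {
    let best: { rule: Rule; text: string } | null = null;
    for (const c of compiled) {
      c.re.lastIndex = pos;
      const m = c.re.exec(input);
      if (m === null || m[0].length === 0) continue;
      // Prefer the longest match; break ties by priority (default 0).
      const better =
        best === null ||
        m[0].length > best.text.length ||
        (m[0].length === best.text.length &&
          (c.priority ?? 0) > (best.rule.priority ?? 0));
      if (better) best = { rule: c, text: m[0] };
    }
    if (best === null) {
      throw new Error(`Unexpected character at offset ${pos}`);
    }
    if (!best.rule.skip) {
      out.push({ tag: best.rule.tag, value: best.text, start: pos });
    }
    pos += best.text.length;
  }
  return out;
}

const rules: Rule[] = [
  { pattern: /[a-zA-Z_][a-zA-Z0-9_]*/, tag: "IDENTIFIER" },
  { pattern: /if|else/, tag: "KEYWORD", priority: 10 },
  { pattern: /[0-9]+/, tag: "NUMBER" },
  { pattern: /\s+/, tag: "WS", skip: true },
];

// "if" matches both IDENTIFIER and KEYWORD at equal length, so priority
// decides; "iffy" is a longer IDENTIFIER match, so length decides.
const demoTokens = tokenize("if iffy 42", rules);
```

In this sketch the whitespace rule consumes input without emitting tokens, which is the usual meaning of a skip rule. Without the priority on `KEYWORD`, the tie on `if` would go to whichever rule was listed first.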

Examples

The documentation includes several runnable examples:

- JSON Tokenizer - Complete JSON lexer
- Calculator - Expression tokenizer
- C Lexer - C-style language with states

Reference

- API Reference - Tokenizer, Token, Rule classes
- JS Regex Syntax - JavaScript regex support
- Flex Syntax - Flex-style patterns
- Rule Configuration - Priority, skip, states

License

MIT