# Process Tags Core

A robust parsing engine for identifying and extracting custom Process Tags from text content. Supports 7 distinct tag syntaxes with module names, input values, and default values.

## Features

- ✅ 7 Tag Syntaxes: Inline and block variants with different parameter combinations
- ✅ Regex-Based Parsing: Fast, efficient pattern matching with proper precedence
- ✅ Escape Sequences: Full support for `\"` and `\\` in string values
- ✅ TypeScript-First: Complete type definitions and IDE support
- ✅ Zero Dependencies: No runtime dependencies, minimal footprint
- ✅ Dual Module Formats: Both ESM and CommonJS support
- ✅ Lenient Parsing: Invalid tags are silently ignored rather than raising errors
- ✅ Position Tracking: Source mapping for each tag
- ✅ Parse Options: Content normalization and runtime limits
- ✅ Range-Based Parsing: Parse specific content regions with absolute positions
- ✅ Utility Functions: Tag walking, filtering, and overlap detection

## Installation

```bash
npm install @aiconnect/process-tags
```

Package published at: https://www.npmjs.com/package/@aiconnect/process-tags

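## Quick Start

### ESM / TypeScript

```typescript
import { parse } from '@aiconnect/process-tags';

// Example input containing two inline tags
const result = parse('Hello [|username|"World"|], welcome to [|app|]!');

console.log(result.tags);
// [
//   { type: 'inline', module: 'username', input: 'World', ... },
//   { type: 'inline', module: 'app', ... }
// ]
```

### CommonJS

```javascript
const { parse } = require('@aiconnect/process-tags');

// Example input containing two inline tags
const result = parse('Hello [|username|"World"|], welcome to [|app|]!');

console.log(result.tags);
// [
//   { type: 'inline', module: 'username', input: 'World', ... },
//   { type: 'inline', module: 'app', ... }
// ]
```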

## Tag Syntaxes

### Inline Tags

1. Basic: `[|module|]`
2. With Input: `[|module|"value"|]`
3. With Default: `[|module||"default"|]`
4. Full: `[|module|"input"|"default"|]`
5. Compact: `[|module§"value"|]` (value serves as both input and default)
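To make the five inline forms concrete, here is a minimal sketch of how they could be told apart with anchored regular expressions. The `classifyInline` helper, its return shape, and the patterns are assumptions made for illustration, not the package's actual implementation; string values are returned as written, without unescaping.

```typescript
// Hypothetical sketch: classify one inline tag string into its parts.
// None of these names come from @aiconnect/process-tags.
interface InlineParts {
  form: "basic" | "input" | "default" | "full" | "compact";
  module: string;
  input?: string;
  default?: string;
}

const NAME = "([a-zA-Z0-9_-]+)";
// A double-quoted string that may contain \" and \\ escapes.
const STR = '"((?:[^"\\\\]|\\\\.)*)"';

function classifyInline(tag: string): InlineParts | null {
  let m: RegExpExecArray | null;

  // 1. Basic: [|module|]
  if ((m = new RegExp(`^\\[\\|${NAME}\\|\\]$`).exec(tag))) {
    return { form: "basic", module: m[1] };
  }
  // 2. With input: [|module|"value"|]
  if ((m = new RegExp(`^\\[\\|${NAME}\\|${STR}\\|\\]$`).exec(tag))) {
    return { form: "input", module: m[1], input: m[2] };
  }
  // 3. With default: [|module||"default"|]
  if ((m = new RegExp(`^\\[\\|${NAME}\\|\\|${STR}\\|\\]$`).exec(tag))) {
    return { form: "default", module: m[1], default: m[2] };
  }
  // 4. Full: [|module|"input"|"default"|]
  if ((m = new RegExp(`^\\[\\|${NAME}\\|${STR}\\|${STR}\\|\\]$`).exec(tag))) {
    return { form: "full", module: m[1], input: m[2], default: m[3] };
  }
  // 5. Compact: [|module§"value"|] (value is both input and default)
  if ((m = new RegExp(`^\\[\\|${NAME}§${STR}\\|\\]$`).exec(tag))) {
    return { form: "compact", module: m[1], input: m[2], default: m[2] };
  }
  return null; // Not a recognized inline form
}

console.log(classifyInline('[|title|"My Page"|]'));
console.log(classifyInline('[|invalid tag|]'));
```

Trying the forms in a fixed order keeps each pattern simple; a production parser would more likely combine them into a single expression with explicit precedence.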

### Block Tags

6. Simple: `[|module]content[/module|]`
7. Compact: `[|module§]content[/module§|]` (content serves as both input and default)

### Inline Tags Inside Block Tags

Inline tags can appear within block tag content, and both will be detected as independent tags:

```typescript
const content = '[|description]This is a description with [|author|"John Doe"|] inline.[/description|]';

const result = parse(content);
// Returns 2 tags:
// 1. Block tag with full content (including the raw inline tag text)
// 2. Inline tag with its parsed values

console.log(result.tags);
// [
//   {
//     type: 'block',
//     module: 'description',
//     input: 'This is a description with [|author|"John Doe"|] inline.',
//     position: { start: 0, end: ... }
//   },
//   {
//     type: 'inline',
//     module: 'author',
//     input: 'John Doe',
//     position: { start: ..., end: ... }
//   }
// ]
```

Note: The parser uses a two-pass strategy to detect both block and inline tags independently. Block tags within block tags (hierarchical nesting) are not supported.
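The two-pass idea can be sketched as two independent regex scans over the same content, one for block tags and one for inline tags, with the results merged by start offset. The patterns below cover only the simple block form and the basic and with-input inline forms, and every name in the sketch (`FoundTag`, `scan`, `twoPassFind`) is hypothetical; the package's real patterns and data shapes may differ.

```typescript
// Hypothetical sketch of a two-pass scan; not the package's implementation.
interface FoundTag {
  type: "block" | "inline";
  module: string;
  start: number; // inclusive
  end: number;   // exclusive
}

// Simple block form: [|module]content[/module|]
const BLOCK_SOURCE = "\\[\\|([a-zA-Z0-9_-]+)\\]([\\s\\S]*?)\\[\\/\\1\\|\\]";
// Inline forms [|module|] and [|module|"value"|] only
const INLINE_SOURCE = '\\[\\|([a-zA-Z0-9_-]+)\\|(?:"((?:[^"\\\\]|\\\\.)*)"\\|)?\\]';

function scan(source: string, type: FoundTag["type"], content: string, out: FoundTag[]): void {
  const re = new RegExp(source, "g");
  let m: RegExpExecArray | null;
  while ((m = re.exec(content)) !== null) {
    out.push({ type, module: m[1], start: m.index, end: m.index + m[0].length });
  }
}

function twoPassFind(content: string): FoundTag[] {
  const found: FoundTag[] = [];
  scan(BLOCK_SOURCE, "block", content, found);   // pass 1: block tags
  scan(INLINE_SOURCE, "inline", content, found); // pass 2: inline tags, including those inside blocks
  return found.sort((a, b) => a.start - b.start);
}

const sample = '[|description]This is a description with [|author|"John Doe"|] inline.[/description|]';
for (const tag of twoPassFind(sample)) {
  console.log(tag.type, tag.module, sample.slice(tag.start, tag.end));
}
```

Because each pass runs over the full content, the inline tag is reported even though its text also sits inside the block's content. The lazy `[\s\S]*?` body stops at the first matching close tag, which is also why a block nested in another block of the same name would not pair up correctly in this sketch.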

## API

### parse

Main parsing function. Returns all tags found in the content.

```typescript
const result = parse('[|title|"My Page"|]');
// { tags: [...], original: "[|title|\"My Page\"|]" }
```

### findTags

Convenience function that returns just the tags array.

```typescript
const tags = findTags('[|title|"My Page"|]');
// [{ type: 'inline', module: 'title', input: 'My Page', ... }]
```

### isValidTag

Validates whether a string represents a valid Process Tag.

```typescript
isValidTag('[|valid|]');       // true
isValidTag('[|invalid tag|]'); // false
```

### extractModule

Extracts the module name from a tag string.

```typescript
extractModule('[|myModule|]'); // "myModule"
extractModule('invalid');      // null
```

## Advanced Features

### Parse Options

Configure parsing behavior with options for normalization and limits:

```typescript
import { parse, ParseOptions } from '@aiconnect/process-tags';

const options: ParseOptions = {
  normalizeBlockContent: true,  // Normalize block tag content
  maxTags: 100,                 // Stop after 100 tags
  maxBlockContentBytes: 10000,  // Skip blocks > 10KB
  maxBlockContentLines: 50      // Skip blocks > 50 lines
};

const result = parse(content, options);
```

#### Content Normalization

When `normalizeBlockContent: true` is set, block tags get an additional `normalizedInput` field with:

- Leading/trailing empty lines removed
- Common indentation stripped

```typescript
const content = `[|code]
    function hello() {
      console.log("Hi");
    }
[/code|]`;

const result = parse(content, { normalizeBlockContent: true });

console.log(result.tags[0].normalizedInput);
// "function hello() {\n  console.log(\"Hi\");\n}"
// (common indentation removed)

console.log(result.tags[0].input);
// Original content preserved (trimmed)
```

#### Parse-Time Limits

Protect against resource exhaustion in untrusted content:

```typescript
// Limit total tags
parse(content, { maxTags: 10 }); // Parse up to 10 tags only

// Skip large blocks
parse(content, {
  maxBlockContentBytes: 1000, // Skip blocks > 1KB
  maxBlockContentLines: 20    // Skip blocks > 20 lines
});
```

Note: Limits are applied to normalized content when `normalizeBlockContent: true` is set.
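As a rough illustration of what a block-size check might look like, the sketch below measures a block's content in UTF-8 bytes and in lines and reports whether either limit is exceeded. The helper name `exceedsBlockLimits`, the use of UTF-8 for the byte count, and the strict "greater than" comparison are assumptions based on the comments above, not documented behavior.

```typescript
// Hypothetical helper; the option names mirror ParseOptions, the logic is a guess.
interface BlockLimits {
  maxBlockContentBytes?: number;
  maxBlockContentLines?: number;
}

function exceedsBlockLimits(blockContent: string, limits: BlockLimits): boolean {
  // Byte size is measured as UTF-8 here (assumption).
  const bytes = new TextEncoder().encode(blockContent).length;
  const lines = blockContent.split("\n").length;

  if (limits.maxBlockContentBytes !== undefined && bytes > limits.maxBlockContentBytes) {
    return true;
  }
  if (limits.maxBlockContentLines !== undefined && lines > limits.maxBlockContentLines) {
    return true;
  }
  return false;
}

console.log(exceedsBlockLimits("a\nb\nc", { maxBlockContentLines: 2 })); // true
console.log(exceedsBlockLimits("hello", { maxBlockContentBytes: 5 }));   // false
```

Counting bytes rather than characters matters for non-ASCII content: `"héllo"` is five characters but six UTF-8 bytes.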

### Range-Based Parsing

Parse only a specific portion of content with absolute position tracking:

```typescript
import { parseInRange } from '@aiconnect/process-tags';

const content = "prefix [|tag1|] middle [|tag2|] suffix";
const range = { start: 7, end: 35 }; // Parse middle section only

const result = parseInRange(content, range);

// Returns tags with absolute positions (relative to original content)
console.log(result.tags[0].position);
// { start: 7, end: 15 } (absolute positions)
```

Use Cases:

- Re-parse inline tags within block content
- Process specific sections of large documents
- 10-100x faster than re-parsing the entire document
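The absolute positions come down to simple offset arithmetic: scan only the slice, then add the range's start to every match index. A minimal sketch, assuming a hypothetical `findBasicInlineInRange` helper that recognizes only the basic `[|module|]` form:

```typescript
// Hypothetical sketch of range-based scanning; not the package's parseInRange.
interface OffsetRange {
  start: number; // inclusive
  end: number;   // exclusive
}

function findBasicInlineInRange(content: string, range: OffsetRange): OffsetRange[] {
  const slice = content.slice(range.start, range.end);
  const re = /\[\|[a-zA-Z0-9_-]+\|\]/g;
  const positions: OffsetRange[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(slice)) !== null) {
    // m.index is relative to the slice; shift it back into the full content.
    const start = range.start + m.index;
    positions.push({ start, end: start + m[0].length });
  }
  return positions;
}

const text = "prefix [|tag1|] middle [|tag2|] suffix";
console.log(findBasicInlineInRange(text, { start: 7, end: 35 }));
// [ { start: 7, end: 15 }, { start: 23, end: 31 } ]
```

Only the slice is scanned, so the cost scales with the size of the range rather than the size of the document.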

### Utility Functions

#### walkTags(tags, callbacks)

Iterate over tags with type-specific callbacks:

```typescript
import { walkTags } from '@aiconnect/process-tags';

walkTags(result.tags, {
  onBlock: (tag, index) => {
    console.log(`Block: ${tag.module}`);
    return true; // continue iteration
  },
  onInline: (tag, index) => {
    console.log(`Inline: ${tag.module}`);
    if (tag.module === 'stop') return false; // stop iteration
  }
});
```
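The early-stop convention (returning `false` ends the walk, anything else continues) can be sketched as follows. `walkSketch`, its types, and its return value (the number of tags visited) are made up for this illustration; the documentation above does not say what `walkTags` itself returns.

```typescript
// Hypothetical sketch of type-dispatched iteration with early exit.
interface WalkTag {
  type: "inline" | "block";
  module: string;
}

interface WalkHandlers {
  onBlock?: (tag: WalkTag, index: number) => void | boolean;
  onInline?: (tag: WalkTag, index: number) => void | boolean;
}

function walkSketch(tags: WalkTag[], handlers: WalkHandlers): number {
  let visited = 0;
  for (let i = 0; i < tags.length; i++) {
    const tag = tags[i];
    const handler = tag.type === "block" ? handlers.onBlock : handlers.onInline;
    visited++;
    // Only an explicit `false` stops the walk; undefined or true continues.
    if (handler && handler(tag, i) === false) break;
  }
  return visited;
}

const demoTags: WalkTag[] = [
  { type: "inline", module: "a" },
  { type: "inline", module: "stop" },
  { type: "block", module: "b" },
];

const visitedCount = walkSketch(demoTags, {
  onInline: (tag) => {
    if (tag.module === "stop") return false;
  },
});
console.log(visitedCount); // 2
```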

#### filterNested(tags, parentRange)

Filter tags contained within a parent range:

```typescript
import { filterNested } from '@aiconnect/process-tags';

const blockTag = result.tags[0]; // A block tag
const nestedTags = filterNested(result.tags, blockTag.position);

// Returns only tags fully within the block's range
```

#### hasOverlap(range1, range2)

Check if two ranges overlap:

```typescript
import { hasOverlap } from '@aiconnect/process-tags';

hasOverlap(
  { start: 0, end: 10 },
  { start: 5, end: 15 }
); // true (overlapping)

hasOverlap(
  { start: 0, end: 10 },
  { start: 10, end: 20 }
); // false (adjacent, not overlapping)
```
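The adjacent case returns `false` because ranges are half-open (`start` inclusive, `end` exclusive). One common way to express that check, shown here as a standalone sketch with a hypothetical `rangesOverlap` function rather than the package's own code:

```typescript
// Hypothetical overlap test for half-open ranges [start, end).
interface HalfOpenRange {
  start: number; // inclusive
  end: number;   // exclusive
}

function rangesOverlap(a: HalfOpenRange, b: HalfOpenRange): boolean {
  // Two half-open ranges overlap when each one starts before the other ends.
  return a.start < b.end && b.start < a.end;
}

console.log(rangesOverlap({ start: 0, end: 10 }, { start: 5, end: 15 }));  // true
console.log(rangesOverlap({ start: 0, end: 10 }, { start: 10, end: 20 })); // false
```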

#### normalizeContent(content)

Normalize content (remove empty lines and common indentation):

```typescript
import { normalizeContent } from '@aiconnect/process-tags';

const input = "\n Line 1\n Line 2\n";
const normalized = normalizeContent(input);
// "Line 1\nLine 2"
```
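For readers curious how this kind of normalization typically works, here is a small sketch of the two steps described above: dropping leading and trailing empty lines, then stripping the indentation shared by all non-empty lines. `normalizeSketch` is a hypothetical stand-in, and details such as tab handling or treatment of whitespace-only lines may differ in the real `normalizeContent`.

```typescript
// Hypothetical re-implementation for illustration only.
function normalizeSketch(content: string): string {
  const lines = content.split("\n");

  // Step 1: remove leading and trailing empty (or whitespace-only) lines.
  while (lines.length > 0 && lines[0].trim() === "") lines.shift();
  while (lines.length > 0 && lines[lines.length - 1].trim() === "") lines.pop();

  // Step 2: find the smallest indentation among non-empty lines and strip it.
  const indents = lines
    .filter((line) => line.trim() !== "")
    .map((line) => line.match(/^[ \t]*/)![0].length);
  const common = indents.length > 0 ? Math.min(...indents) : 0;

  return lines.map((line) => line.slice(common)).join("\n");
}

console.log(JSON.stringify(normalizeSketch("\n  Line 1\n  Line 2\n")));
// "Line 1\nLine 2"
```

Relative indentation is preserved: a line indented two spaces deeper than its neighbors stays two spaces deeper after normalization.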

## TypeScript Types

```typescript
interface ProcessTag {
  type: 'inline' | 'block';
  module: string;
  input?: string;
  default?: string;
  raw: string;
  position: { start: number; end: number };
  normalizedInput?: string; // Present when normalizeBlockContent is enabled
}

interface ParseResult {
  tags: ProcessTag[];
  original: string;
}

interface ParseOptions {
  normalizeBlockContent?: boolean;
  maxTags?: number;
  maxBlockContentBytes?: number;
  maxBlockContentLines?: number;
}

interface Range {
  start: number; // Inclusive
  end: number;   // Exclusive
}

interface WalkCallbacks {
  onBlock?: (tag: ProcessTag, index: number) => void | boolean;
  onInline?: (tag: ProcessTag, index: number) => void | boolean;
}
```

## Module Name Rules

Module names must match `/^[a-zA-Z0-9_-]+$/`:

- ✅ Allowed: letters, digits, underscore, hyphen
- ❌ Not allowed: dots, spaces, special characters
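The rule can be checked directly with the pattern quoted above. The `isValidModuleName` helper below is a hypothetical convenience for illustration and is not listed among the package's exports.

```typescript
// Hypothetical helper built on the documented pattern.
const MODULE_NAME_PATTERN = /^[a-zA-Z0-9_-]+$/;

function isValidModuleName(name: string): boolean {
  return MODULE_NAME_PATTERN.test(name);
}

console.log(isValidModuleName("my-module_2")); // true
console.log(isValidModuleName("my.module"));   // false (dot)
console.log(isValidModuleName("invalid tag")); // false (space)
console.log(isValidModuleName(""));            // false (empty)
```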

## Escape Sequences

Strings support two escape sequences:

- `\"` → `"` (literal quote)
- `\\` → `\` (literal backslash)

Example:

```typescript
const result = parse('[|text|"He said \\"Hello\\""|]');
// result.tags[0].input === 'He said "Hello"'
```
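A single left-to-right replacement is enough to decode both sequences correctly, including a backslash escape followed directly by a quote escape. The `unescapeValue` function is a hypothetical sketch; it leaves any other backslash sequence untouched, which is an assumption about how unknown escapes are treated.

```typescript
// Hypothetical decoder for the two documented escape sequences.
function unescapeValue(raw: string): string {
  // Match a backslash followed by a quote or a backslash and keep the second character.
  return raw.replace(/\\(["\\])/g, "$1");
}

console.log(unescapeValue('He said \\"Hello\\"')); // He said "Hello"
console.log(unescapeValue("C:\\\\temp"));          // C:\temp
```

Doing this in one pass matters: decoding `\\` and `\"` in two separate passes could misread the output of the first pass as a new escape.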

## Development

```bash
# Install dependencies
npm install

# Run tests
npm test

# Build
npm run build

# Lint
npm run lint
```

## Performance

- < 5ms for typical documents (< 10KB)
- Handles 1000+ tags efficiently
- No blocking operations

## License

MIT
