s10n

> s10n stands for _"sanitization"_.
> Just like _l10n_ stands for _"localization"_.
> See also i18n, l10n et al

A library to make basic user input sanitization
and subsequent validation an easier job.

- Use cases
  - Example 1. Username
  - Example 2. Arbitrary text
- API
  - Modifiers
    - Treating line break characters
    - Line break character
  - Elementary transformers
    - Transform whitespaces
    - Handle line breaks
    - Keep/Remove/Replace
    - Other transformations
  - Compound transformers
  - Semantic sanitizers
  - Custom transformations
  - Getting sanitized value
  - Utility methods

Use cases

Sanitization is NOT validation, but
it can help make validation an easier job
and/or help to suggest to a user an input variation
that better matches input expectations or requirements.

As with validation, sanitization, if in place, should
be applied on both frontend and backend, since a user
can bypass sanitization and validation on the frontend and
send input directly to a backend endpoint.
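A minimal sketch of this idea, assuming a hypothetical comment field. The helper names below (`sanitizeComment`, `isValidComment`, `acceptComment`) and the 200-character limit are illustrative assumptions and not part of s10n. The point is only that one function can be shared by the frontend and the backend, and that the backend runs it on every request no matter what the frontend already did.

```javascript
// Hypothetical stand-ins for a real sanitizer and validator (not s10n API).
const sanitizeComment = (raw) => String(raw).trim();
const isValidComment = (value) => value.length > 0 && value.length <= 200;

// Sanitize first, then validate the sanitized value.
// `changed` tells the caller whether a cleaned-up variant can be suggested.
function acceptComment(raw) {
  const value = sanitizeComment(raw);
  return { value, valid: isValidComment(value), changed: value !== raw };
}

acceptComment("  hi  "); // { value: "hi", valid: true, changed: true }
acceptComment("   "); // { value: "", valid: false, changed: true }
```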

Example 1. Username

Let's assume the following scenario of a username input.

The rule is that valid input may contain only a-z, A-Z, digits,
underscore and dash.

A user submits the string `#UsEr #$%"' NaMe 5_6-9`.
The input is rejected, the rule is presented to the user,
and the user is expected to remove all invalid characters.
The input then becomes the valid string `UsErNaMe5_6-9`.
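The rule and the manual clean-up described above can be sketched in plain JavaScript. This is an illustration only: the names `USERNAME_RULE`, `isValidUsername` and `stripInvalid` are assumptions made for this example and are not part of s10n.

```javascript
// The username rule: one or more of a-z, A-Z, digits, underscore, dash.
const USERNAME_RULE = /^[A-Za-z0-9_-]+$/;

// Validation: does the whole input satisfy the rule?
const isValidUsername = (value) => USERNAME_RULE.test(value);

// What the user is asked to do by hand: drop every character outside the rule.
const stripInvalid = (value) => value.replace(/[^A-Za-z0-9_-]/g, "");

const submitted = `#UsEr #$%"' NaMe 5_6-9`;

isValidUsername(submitted); // false
stripInvalid(submitted); // "UsErNaMe5_6-9"
isValidUsername(stripInvalid(submitted)); // true
```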

Alternatively, an app might suggest (or enforce)
a valid input. The examples below demonstrate the
default and tuned behaviour of the relevant semantic sanitizers
(in the tuned case, spaces get replaced with underscores).

```javascript
let input = " UsEr #$%' NaMe 5_6-9 ";
s10n(input).keepUsername().value; // "UsErNaMe5_6-9"
s10n(input).keepUsernameLC().value; // "username5_6-9"
s10n(input).keepUsername("_").value; // "UsEr_NaMe_5_6-9"
s10n(input).keepUsernameLC("_").value; // "user_name_5_6-9"
```

The semantic sanitizers applied here combine elementary and compound transformers and take an optional parameter that replaces spaces (in this particular use case).
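To make the idea of combining small steps more concrete, here is a rough sketch in plain JavaScript that reproduces the four results shown above. It is not the library's implementation: the function name `keepUsernameSketch`, its parameters, and the order of the steps are assumptions chosen for this illustration.

```javascript
// Hypothetical re-creation of the behaviour shown above (not s10n source code).
function keepUsernameSketch(input, spaceReplacement = "", lowerCase = false) {
  const value = String(input)
    .replace(/[^A-Za-z0-9_\-\s]/g, "") // drop disallowed characters, keep whitespace for now
    .replace(/\s+/g, " ") // merge runs of whitespace into one space
    .trim() // strip leading and trailing space
    .replace(/ /g, spaceReplacement); // remove or replace the remaining spaces
  return lowerCase ? value.toLowerCase() : value;
}

const raw = " UsEr #$%' NaMe 5_6-9 ";

keepUsernameSketch(raw); // "UsErNaMe5_6-9"
keepUsernameSketch(raw, "", true); // "username5_6-9"
keepUsernameSketch(raw, "_"); // "UsEr_NaMe_5_6-9"
keepUsernameSketch(raw, "_", true); // "user_name_5_6-9"
```

Disallowed characters are removed before whitespace is merged. Doing it the other way round would leave two adjacent separators where `#$%'` used to be.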

Example 2. Arbitrary text

Let's assume the input received from a user is
`" \n\r\n \u200B\u200C\u200D\u2060 \t\uFEFF\xA0 Sensible text \n Line 2 \n\r\n\r\r "`

Here are issues worth attention and optimization:

- it contains problematic whitespaces
- it contains sequences of 2 or more whitespaces
- it contains leading and trailing whitespaces
- there is a variety of line break characters, potentially hazardous (CRLF injection)
- there are leading and trailing empty lines
- line break characters are invalid in a one line input

Any of the above can be considered unnecessary contamination of the data.

With all of these issues fixed, the above input would become:

- `"Sensible text\nLine 2"` for a multiline input
- `"Sensible text Line 2"` for a simple string input

```javascript
let input = " \n\r\n \u200B\u200C\u200D\u2060 \t\uFEFF\xA0 Sensible text \n Line 2 \n\r\n\r\r ";

s10n(input).minimizeWhitespaces().value; // "Sensible text Line 2"

s10n(input)
  .preserveLineBreaks() // modifier for subsequent methods to preserve line breaks
  .minimizeWhitespaces().value; // "Sensible text\nLine 2"
```

`minimizeWhitespaces` does the following:

- normalizes line break characters, i.e. CRLF (`\r\n`) and individual CR (`\r`) are converted into LF (`\n`) (default behaviour)
- normalizes whitespaces into the standard space character (`\x20`)
- merges continuous whitespaces into a single space character
- normalizes lines in a multiline input (strips leading and trailing spaces in each line of a multiline input)
- trims leading and trailing whitespaces
- trims leading and trailing line breaks
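The steps listed above can be approximated in plain JavaScript as follows. This is a sketch under stated assumptions, not the library's source: the function name `minimizeWhitespacesSketch` and its boolean parameter are hypothetical, and converting zero-width characters to spaces (rather than removing them) is a guess that happens to give the same result for this input.

```javascript
// Hypothetical approximation of the steps above (not s10n source code).
function minimizeWhitespacesSketch(input, preserveLineBreaks = false) {
  let value = String(input)
    .replace(/\r\n|\r/g, "\n") // CRLF and lone CR become LF
    .replace(/[\u200B\u200C\u200D\u2060]/g, " ") // zero-width characters (assumption: treated as spaces)
    .replace(/[^\S\n]/g, " "); // any other whitespace except LF becomes a plain space

  if (!preserveLineBreaks) {
    value = value.replace(/\n/g, " "); // one-line input: line breaks become spaces
  }

  return value
    .replace(/ {2,}/g, " ") // merge runs of spaces
    .split("\n")
    .map((line) => line.trim()) // strip spaces around each line
    .join("\n")
    .replace(/^\n+|\n+$/g, ""); // drop leading and trailing empty lines
}

const text = " \n\r\n \u200B\u200C\u200D\u2060 \t\uFEFF\xA0 Sensible text \n Line 2 \n\r\n\r\r ";

minimizeWhitespacesSketch(text); // "Sensible text Line 2"
minimizeWhitespacesSketch(text, true); // "Sensible text\nLine 2"
```

The zero-width characters are handled separately because JavaScript's `\s` class covers `\uFEFF` and `\xA0` but not `\u200B`, `\u200C`, `\u200D` or `\u2060`.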
