URLReference
============
**NB. This package has been obsoleted by [spec-url] which now
exposes the same URLReference API. Please use the [spec-url] package instead**.
[spec-url]: https://www.npmjs.com/package/spec-url
"URL or Relative Reference"
The URLReference class is designed to overcome shortcomings of the URL class.
#### Features
- Supports Relative and scheme-less URLs.
- Supports Nullable Components.
- Distinct Rebase, Normalize and Resolve methods.
- Resolve is Behaviourally Equivalent with the WHATWG URL Standard.
#### Examples
```javascript
new URLReference ('filename.txt#top', '//host') .href
// => '//host/filename.txt#top'
new URLReference ('?do=something', './path/to/resource?do=nothing') .href
// => './path/to/resource?do=something'
new URLReference ('take/action.html') .resolve ('http://🌲') .href
// => 'http://xn--vh8h/take/action.html'
```
Summary
-------
The module exports a single class `URLReference` with nullable properties (getters/setters):
- `scheme`, `username`, `password`, `hostname`, `port`, `pathname`, `pathroot`, `driveletter`, `filename`, `query`, `fragment`.
It has three key methods:
- `rebase`, `normalize` and `resolve`.
It can be converted to an ASCII, or to a Unicode string via:
- the `href` getter and the `toString` method.
The WHATWG URL standard uses the phrase "__special URL__" for URLs that have a _special scheme_.
A scheme is a _special scheme_ if it is equivalent to http, https, ws, wss, ftp or file.
The _path_ of a URL may either be hierarchical, or opaque:
A _hierarchical path_ is subdivided into path components, an _opaque path_ is not.
The path of a "_special_ URL" is always considered to be hierarchical.
The path of a non-special URL is opaque unless the URL has an authority or its path starts with a path-root `/`.
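The classification above can be sketched in a few lines. This is only an illustration: `isSpecialScheme` and `hasOpaquePath` are hypothetical helpers, not part of the URLReference API, and the sketch assumes that the URL has a scheme and that an absent authority is represented by `null`.

```javascript
// Hypothetical helpers, for illustration only; not part of the URLReference API.
const specialSchemes = new Set (['http', 'https', 'ws', 'wss', 'ftp', 'file'])

// Schemes are compared case-insensitively.
function isSpecialScheme (scheme) {
  return scheme != null && specialSchemes.has (scheme.toLowerCase ())
}

// Only meaningful for URLs that have a scheme.
// Components are passed as plain values; null means 'absent'.
function hasOpaquePath ({ scheme, authority = null, pathname = '' }) {
  if (isSpecialScheme (scheme)) return false // special URLs are always hierarchical
  if (authority != null) return false        // an authority implies a hierarchical path
  return !pathname.startsWith ('/')          // so does a path-root
}

hasOpaquePath ({ scheme: 'ofp', pathname: 'this/is/an/opaque-path/' }) // => true
hasOpaquePath ({ scheme: 'ofp', pathname: '/not/an/opaque-path/' })    // => false
hasOpaquePath ({ scheme: 'http', pathname: 'foo/bar' })                // => false
```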
URLReference API
----------------
### Constructor
- `new URLReference ()`
- `new URLReference (input)`
- `new URLReference (input, base)`
Constructs a new URLReference object. The result _may_ represent a relative URL. The _resolve_ method can be used to ensure that the result represents an absolute URL.
Arguments input and base are optional. Each may be a string to be parsed, or an existing URLReference object. If a base argument is supplied, then input is rebased onto base after parsing.
#### Parsing behaviour
The parsing behaviour adapts to the scheme of input or the scheme of base otherwise:
* The invalid `\` code-points before the host and in the path are converted to `/`
if the input has a special scheme or if it has no scheme at all.
* Windows drive letters are detected if the scheme is equivalent to `file` or if no scheme is present at all.
If no scheme is present and a Windows drive letter is detected then the scheme is implicitly set to `file`.
The hostname is always parsed as an opaque hostname string.
Parsing and validating a hostname as a domain is done by the resolve method instead.
Examples:
```javascript
const r1 = new URLReference ();
// r1.href == '' // The 'empty relative URL'
const r2 = new URLReference ('/big/trees/');
// r2.href == '/big/trees/'
const r3 = new URLReference ('index.html', '/big/trees/');
// r3.href == '/big/trees/index.html'
const r4 = new URLReference ('README.md', r3);
// r4.href == '/big/trees/README.md'
```
Parsing Behaviour Examples:
```javascript
const r1 = new URLReference ('\\foo\\bar', 'http:')
// r1.href == 'http:/foo/bar'
const r2 = new URLReference ('\\foo\\bar', 'ofp:/')
// r2.href == 'ofp:/\\foo\\bar'
const r3 = new URLReference ('/c:/path/to/file')
// r3.href == 'file:/c:/path/to/file'
// r3.hostname == null
// r3.driveletter == 'c:'
const r4 = new URLReference ('/c:/path/to/file', 'http:')
// r4.href == 'http:/c:/path/to/file'
// r4.hostname == null
// r4.driveletter == null
```
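The scheme-dependent parsing rules can be summarised as a small decision function. The sketch below is an illustration under stated assumptions, not the parser used by this package: `parsingMode` and `isDriveLetter` are hypothetical names, an absent scheme is passed as `null`, and the drive-letter test follows the WHATWG definition (an ASCII letter followed by `:` or `|`).

```javascript
// Hypothetical helpers, for illustration only; not part of the URLReference API.
const specialSchemes = new Set (['http', 'https', 'ws', 'wss', 'ftp', 'file'])

// Pass null for an absent scheme.
function parsingMode (inputScheme, baseScheme = null) {
  // The scheme of the input wins; the scheme of the base is the fallback.
  const scheme = (inputScheme ?? baseScheme)?.toLowerCase () ?? null
  return {
    convertBackslashes: scheme == null || specialSchemes.has (scheme),
    detectDriveLetters: scheme == null || scheme === 'file',
  }
}

// A Windows drive letter as defined by the WHATWG URL standard.
const isDriveLetter = segment => /^[A-Za-z][:|]$/.test (segment)

parsingMode (null, 'http') // => { convertBackslashes: true, detectDriveLetters: false }
parsingMode (null, 'ofp')  // => { convertBackslashes: false, detectDriveLetters: false }
parsingMode (null)         // => { convertBackslashes: true, detectDriveLetters: true }
```

These three calls correspond to `r1` and `r4`, `r2`, and `r3` in the parsing behaviour examples.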
### Rebase – `urlReference .rebase (base)`
The _base_ argument may be a string or a URLReference object.
Rebase returns a new URLReference instance.
It throws an error if the base argument represents a URL with an _opaque path_ (unless _urlReference_ consists of a fragment identifier only).
Rebase implements a _slight generalisation_ of [reference transformation][T] as defined in [RFC3986] (URI).
In our case the _base_ argument is allowed to be a relative reference, in addition to an absolute URL.
Rebase applies a __non-strict__ reference transformation to URLReferences that have a "_special scheme_"
and a __strict__ reference transformation in all other cases:
* The RFC3986 (URI) standard defines a strict and a non-strict variant of _reference transformation_.
The _non-strict_ variant ignores the scheme of the input if it is equivalent to the scheme of the base.
The WHATWG uses the _non-strict_ behaviour for "_special_" URLs and the _strict_ behaviour for other URLs.
[T]: https://www.rfc-editor.org/rfc/rfc3986#section-5.2.2
[RFC3986]: https://www.rfc-editor.org/rfc/rfc3986
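The difference between the two variants can be sketched on plain component records, following the pseudocode of RFC 3986 section 5.2.2. This is a simplified illustration and not the implementation used by this package: `transform`, `mergePaths` and `print` are hypothetical names, absent components are represented by `null`, and dot-segment removal (section 5.2.4) is left out to keep the sketch short.

```javascript
// Hypothetical sketch of RFC 3986 reference transformation; not the package's code.
const specialSchemes = new Set (['http', 'https', 'ws', 'wss', 'ftp', 'file'])
const same = (a, b) => a != null && b != null && a.toLowerCase () === b.toLowerCase ()

// RFC 3986 section 5.2.3: merge a relative path with the path of the base.
function mergePaths (base, path) {
  const basePath = base.path ?? ''
  if (base.authority != null && basePath === '') return '/' + path
  return basePath.slice (0, basePath.lastIndexOf ('/') + 1) + path
}

function transform (ref, base) {
  let { scheme = null, authority = null, path = '', query = null, fragment = null } = ref
  // Non-strict for special schemes, strict otherwise.
  const strict = scheme == null || !specialSchemes.has (scheme.toLowerCase ())
  // Non-strict: ignore the scheme of the input if it matches the scheme of the base.
  if (!strict && same (scheme, base.scheme)) scheme = null
  if (scheme == null) {
    scheme = base.scheme ?? null
    if (authority == null) {
      authority = base.authority ?? null
      if (path === '') {
        path = base.path ?? ''
        if (query == null) query = base.query ?? null
      }
      else if (!path.startsWith ('/')) path = mergePaths (base, path)
    }
  }
  return { scheme, authority, path, query, fragment }
}

// Serialise a component record back to a string.
function print ({ scheme, authority, path, query, fragment }) {
  return (scheme != null ? scheme + ':' : '')
    + (authority != null ? '//' + authority : '')
    + path
    + (query != null ? '?' + query : '')
    + (fragment != null ? '#' + fragment : '')
}

print (transform ({ scheme: 'http', query: 'do=something' }, { scheme: 'http', authority: 'host', path: '/dir/' }))
// => 'http://host/dir/?do=something'
print (transform ({ scheme: 'ofp', query: 'do=something' }, { scheme: 'ofp', authority: 'host', path: '/dir/' }))
// => 'ofp:?do=something'
```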
Note: The non-strict WHATWG behaviour has a surprising consequence.
A URLReference that has a special scheme may still "behave as a relative URL".
Example – non-strict behaviour:
```javascript
const base = new URLReference ('http://host/dir/')
const rel = new URLReference ('http:?do=something')
const rebased = rel.rebase (base)
// rebased.href == 'http://host/dir/?do=something'
```
Example – strict behaviour:
Rebase applies a "strict" reference transformation to non-special URLReferences. The strict variant does not remove the scheme from the input:
```javascript
const base = new URLReference ('ofp://host/dir/')
const abs = new URLReference ('ofp:?do=something')
const rebased = abs.rebase (base)
// rebased.href == 'ofp:?do=something'
```
Example – opaque path behaviour:
It is not possible to rebase a relative URLReference on a base that has an _opaque path_.
```javascript
const base = new URLReference ('ofp:this/is/an/opaque-path/')
const rel = new URLReference ('filename.txt')
// const rebased = rel.rebase (base) // throws:
// TypeError: Cannot rebase
const base2 = new URLReference ('ofp:/not/an/opaque-path/')
const rebased = rel.rebase (base2) // This works as expected
// rebased.href == 'ofp:/not/an/opaque-path/filename.txt'
```
### Normalize – `urlReference .normalize ()`
Normalize collapses dotted segments in the path, removes default ports and percent-encodes certain code-points. It behaves in the same way as the WHATWG URL constructor, except that it supports relative URLs. It does not interpret hostnames as a domain; this is done in the resolve method instead. Normalize always returns a new URLReference instance.
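Two of the steps named above, removing default ports and collapsing dotted segments, can be sketched independently of the package. `dropDefaultPort` and `collapseDots` are hypothetical helpers and not part of the URLReference API. The sketch assumes a path that starts with `/`, and it leaves out percent-encoding as well as the percent-encoded dot variants (such as `%2e`) that the WHATWG standard also recognises.

```javascript
// Hypothetical helpers, for illustration only; not the package's implementation.

// Default ports of the special schemes, as listed in the WHATWG URL standard.
const defaultPorts = new Map ([['http', 80], ['https', 443], ['ws', 80], ['wss', 443], ['ftp', 21]])

// Returns null if the port is the default port for the scheme, the port otherwise.
function dropDefaultPort (scheme, port) {
  if (scheme == null || port == null) return port
  return defaultPorts.get (scheme.toLowerCase ()) === Number (port) ? null : port
}

// Collapses '.' and '..' segments of a path that starts with '/'.
function collapseDots (pathname) {
  const output = []
  const segments = pathname.split ('/') .slice (1) // drop the empty string before the root
  segments.forEach ((segment, i) => {
    const isLast = i === segments.length - 1
    if (segment === '..') output.pop () // a no-op at the root
    if (segment === '.' || segment === '..') {
      if (isLast) output.push ('') // keep the trailing slash
    }
    else output.push (segment)
  })
  return '/' + output.join ('/')
}

dropDefaultPort ('http', 80)   // => null
dropDefaultPort ('http', 8080) // => 8080
collapseDots ('/a/b/../c')     // => '/a/c'
collapseDots ('/a/./b/..')     // => '/a/'
```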
### Resolve
- `urlReference .resolve ()`
- `urlReference .resolve (base)`
The optional base argument may be a string or an existing URLReference object.
Resolve returns a new URLReference that represents an absolute URL.
It throws an error if this is not possible.
Resolve does additional processing and checks on the authority:
- Asserts that file-URLs and web-URLs have an authority.
- Asserts that the authority of web-URLs is not empty.
- Asserts that file-URLs do not have a username, password or port.
- Parses opaque hostnames of file-URLs and web-URLs as a domain or an IPv4-address.
Resolve uses the same forceful error correcting behaviour as the WHATWG URL constructor.
Note: An unpleasant aspect of the WHATWG behaviour is that if the input is a non-file special URL, and the input has no authority, then the first non-empty path component will be coerced to an authority:
```javascript
const r1 = new URLReference ('http:/foo/bar')
// r1.hostname == null
// r1.pathname == '/foo/bar'
const r2 = r1.resolve ('http://host/')
// The scheme of r1 is ignored because it matches the base.
// Thus the hostname is taken from the base.
// r2.href == 'http://host/foo/bar'
const r3 = r1.resolve ()
// r1 does not have an authority, so the first non-empty path
// component `foo` is coerced into an authority for the result.
// r3.href == 'http://foo/bar'
```
### String – `urlReference .toString ()`
Converts the URLReference to a string. This _preserves_ Unicode characters in the URL, unlike the `href` getter, which ensures that the result consists of ASCII code-points only.
```javascript
new URLReference ('take/action.html') .resolve ('http://🌲') .toString ()
// => 'http://🌲/take/action.html'
new URLReference ('take/action.html') .resolve ('http://🌲') .href
// => 'http://xn--vh8h/take/action.html'
```
Access to the components of the URLReference goes through the following getters/setters.
All properties are nullable; however, some invariants are maintained.
- `scheme`
- `username`
- `password`
- `hostname`
- `port`
- `pathname`
  + `driveletter`
  + `pathroot`
  + `filename`
- `query`
- `fragment`
The properties `driveletter`, `pathroot` and `filename` do not use the idiomatic
camelCase style. This is done to remain consistent with existing property
names of the WHATWG URL class, such as `pathname` and `hostname`.
Licence
-------
MIT Licenced.