jsdom

v27.0.0-beta.3

A JavaScript implementation of many web standards

Keywords: dom, html, whatwg, w3c
Updated 6 days ago. License: MIT. Unpacked: 3.3 MB
Published by timothygu
npm install jsdom




jsdom

jsdom is a pure-JavaScript implementation of many web standards, notably the WHATWG DOM and HTML Standards, for use with Node.js. In general, the goal of the project is to emulate enough of a subset of a web browser to be useful for testing and scraping real-world web applications.

The latest versions of jsdom require newer Node.js versions; see the package.json "engines" field for details.

Basic usage

```js
const jsdom = require("jsdom");
const { JSDOM } = jsdom;
```

To use jsdom, you will primarily use the JSDOM constructor, which is a named export of the jsdom main module. Pass the constructor a string. You will get back a JSDOM object, which has a number of useful properties, notably window:

```js
const dom = new JSDOM(`<!DOCTYPE html><p>Hello world</p>`);
console.log(dom.window.document.querySelector("p").textContent); // "Hello world"
```

(Note that jsdom will parse the HTML you pass it just like a browser does, including implied `<html>`, `<head>`, and `<body>` tags.)

The resulting object is an instance of the JSDOM class, which contains a number of useful properties and methods besides window. In general, it can be used to act on the jsdom from the "outside," doing things that are not possible with the normal DOM APIs. For simple cases, where you don't need any of this functionality, we recommend a coding pattern like

```js
const { window } = new JSDOM(`...`);
// or even
const { document } = (new JSDOM(`...`)).window;
```

Full documentation on everything you can do with the JSDOM class is below, in the section "JSDOM Object API".

Customizing jsdom

The JSDOM constructor accepts a second parameter which can be used to customize your jsdom in the following ways.

Simple options

```js
const dom = new JSDOM(``, {
  url: "https://example.org/",
  referrer: "https://example.com/",
  contentType: "text/html",
  includeNodeLocations: true,
  storageQuota: 10000000
});
```

- url sets the value returned by window.location, document.URL, and document.documentURI, and affects things like resolution of relative URLs within the document and the same-origin restrictions and referrer used while fetching subresources. It defaults to "about:blank".
- referrer just affects the value read from document.referrer. It defaults to no referrer (which reflects as the empty string).
- contentType affects the value read from document.contentType, as well as how the document is parsed: as HTML or as XML. Values that are not an HTML MIME type or an XML MIME type will throw. It defaults to "text/html". If a charset parameter is present, it can affect binary data processing.
- includeNodeLocations preserves the location info produced by the HTML parser, allowing you to retrieve it with the nodeLocation() method (described below). It also ensures that line numbers reported in exception stack traces for code running inside `<script>` elements are correct.

Executing scripts

By default, scripts embedded in the page are not executed:

```js
const dom = new JSDOM(`<body>
  <div id="content"></div>
  <script>document.getElementById("content").append(document.createElement("hr"));</script>
</body>`);

// The script will not be executed, by default:
console.log(dom.window.document.getElementById("content").children.length); // 0
```

To enable executing scripts inside the page, you can use the runScripts: "dangerously" option:

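```js
const dom = new JSDOM(`<body>
  <div id="content"></div>
  <script>document.getElementById("content").append(document.createElement("hr"));</script>
</body>`, { runScripts: "dangerously" });

// The script will be executed and modify the DOM:
console.log(dom.window.document.getElementById("content").children.length); // 1
```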

Again we emphasize to only use this when feeding jsdom code you know is safe. If you use it on arbitrary user-supplied code, or code from the Internet, you are effectively running untrusted Node.js code, and your machine could be compromised.

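If you want to execute _external_ scripts, included via `<script src="">`, you will also need to ensure that they are loaded, which is configured through the resources option.

If you are simply trying to execute script from the outside, rather than letting script embedded in the page run, you can use the runScripts: "outside-only" option:

```js
const dom = new JSDOM(`<body>
  <div id="content"></div>
  <script>document.getElementById("content").append(document.createElement("hr"));</script>
</body>`, { runScripts: "outside-only" });

// run a script outside of JSDOM:
dom.window.eval('document.getElementById("content").append(document.createElement("p"));');

console.log(dom.window.document.getElementById("content").children.length); // 1
console.log(dom.window.document.getElementsByTagName("hr").length); // 0
console.log(dom.window.document.getElementsByTagName("p").length); // 1
```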

This is turned off by default for performance reasons, but is safe to enable.

Note that in the default configuration, without setting runScripts, the values of window.Array, window.eval, etc. will be the same as those provided by the outer Node.js environment. That is, window.eval === eval will hold, so window.eval will not run scripts in a useful way.

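We strongly advise against trying to "execute scripts" by mashing together the jsdom and Node global environments (e.g. by doing global.window = dom.window), and then executing scripts or test code inside the Node global environment. Instead, you should treat jsdom like you would a browser, and run all scripts and tests that need access to a DOM inside the jsdom environment, using window.eval or runScripts: "dangerously". This might require, for example, creating a browserify bundle to execute as a `<script>` element, just like you would in a browser.

Loading subresources

Resource loading can be customized by passing an object as the resources option, for example to set the user agent, route requests through a proxy, or intercept individual requests:

```js
// requestInterceptor is assumed here to be a named export of the jsdom main module.
const { JSDOM, requestInterceptor } = require("jsdom");
const { ProxyAgent } = require("undici");

const dom = new JSDOM(`<script src="https://example.com/some-specific-script.js"></script>`, {
  url: "https://example.com/",
  runScripts: "dangerously",
  resources: {
    userAgent: "Mellblomenator/9000",
    dispatcher: new ProxyAgent("http://127.0.0.1:9001"),
    interceptors: [
      requestInterceptor((request, context) => {
        // Override the contents of this script to do something unusual.
        if (request.url === "https://example.com/some-specific-script.js") {
          return new Response("window.someGlobal = 5;", {
            headers: { "Content-Type": "application/javascript" }
          });
        }
        // Return undefined to let the request proceed normally
      })
    ]
  }
});
```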

The context object passed to the interceptor includes element (the DOM element that initiated the request, or null for requests that are not from DOM elements). For example:

```js
requestInterceptor((request, { element }) => {
  if (element) {
    console.log(`Element ${element.localName} is requesting ${request.url}`);
  }
  // Return undefined to let the request proceed normally
})
```

To be clear on the flow: when something in your jsdom fetches resources, first the request is set up by jsdom, then it is passed through any interceptors in the order provided, then it reaches any provided dispatcher (defaulting to undici's global dispatcher). If you use jsdom's requestInterceptor(), returning a promise fulfilled with a Response will prevent any further interceptors from running, and the base dispatcher from being reached.
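The ordering described above can be modeled in a few lines of plain Node.js. This is only a toy sketch of the flow, not jsdom's implementation: `runRequest`, the two sample interceptors, and the fake dispatcher are hypothetical names introduced for illustration, and the global `Response` class is assumed to be available (Node.js 18 or later).

```js
// Toy model of "interceptors first, then the dispatcher". Not jsdom code.
async function runRequest(request, interceptors, dispatcher) {
  for (const interceptor of interceptors) {
    // An interceptor may answer the request itself by returning a Response
    // (or a promise for one); returning undefined lets the request continue.
    const response = await interceptor(request);
    if (response !== undefined) {
      return response; // later interceptors and the dispatcher are skipped
    }
  }
  // No interceptor answered, so the request reaches the base dispatcher.
  return dispatcher(request);
}

async function demo() {
  const calls = [];
  const interceptors = [
    async request => {
      calls.push("first");
      if (request.url === "https://example.com/some-specific-script.js") {
        return new Response("window.someGlobal = 5;", {
          headers: { "Content-Type": "application/javascript" }
        });
      }
      return undefined;
    },
    async () => {
      calls.push("second");
      return undefined;
    }
  ];
  // Stand-in for a real network dispatcher.
  const dispatcher = async () => {
    calls.push("dispatcher");
    return new Response("from the network");
  };

  const intercepted = await runRequest(
    { url: "https://example.com/some-specific-script.js" }, interceptors, dispatcher
  );
  const passedThrough = await runRequest(
    { url: "https://example.com/other.js" }, interceptors, dispatcher
  );

  return {
    intercepted: await intercepted.text(),
    passedThrough: await passedThrough.text(),
    calls
  };
}

demo().then(result => console.log(result));
```

In this model the first request never reaches the second interceptor or the dispatcher, while the second request visits all three in order.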

> [!WARNING]
> All resource loading customization is ignored when scripts inside the jsdom use synchronous XMLHttpRequest. This is a technical limitation as we cannot transfer dispatchers or interceptors across a process boundary.

Virtual consoles

Like web browsers, jsdom has the concept of a "console". This records both information directly sent from the page, via scripts executing inside the document using the window.console API, as well as information from the jsdom implementation itself. We call the user-controllable console a "virtual console", to distinguish it from the Node.js console API and from the inside-the-page window.console API.

By default, the JSDOM constructor will return an instance with a virtual console that forwards all its output to the Node.js console. This includes both jsdom output (such as not-implemented warnings or CSS parsing errors) and in-page window.console calls.

To create your own virtual console and pass it to jsdom, you can override this default by doing

```js
const virtualConsole = new jsdom.VirtualConsole();
const dom = new JSDOM(``, { virtualConsole });
```

Code like this will create a virtual console with no behavior. You can give it behavior by adding event listeners for all the possible console methods:

```js
virtualConsole.on("error", () => { ... });
virtualConsole.on("warn", () => { ... });
virtualConsole.on("info", () => { ... });
virtualConsole.on("dir", () => { ... });
// ... etc. See https://console.spec.whatwg.org/#logging
```

(Note that it is probably best to set up these event listeners _before_ calling new JSDOM(), since errors or console-invoking script might occur during parsing.)
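As a rough sketch of the listener pattern above, the helper below attaches one listener per console method and collects what is emitted. The helper name `recordConsole` and its default method list are hypothetical. It relies only on the on() method shown above, so a plain Node.js EventEmitter stands in for the virtual console and the sketch runs without jsdom installed.

```js
const { EventEmitter } = require("node:events");

// Hypothetical helper: records every message emitted for the given console methods.
function recordConsole(virtualConsole, methods = ["error", "warn", "info", "log"]) {
  const records = [];
  for (const method of methods) {
    virtualConsole.on(method, (...args) => {
      records.push({ method, args });
    });
  }
  return records;
}

// Stand-in for `new jsdom.VirtualConsole()`.
const virtualConsole = new EventEmitter();

// Attach the listeners first...
const records = recordConsole(virtualConsole);
// ...and only then, with jsdom, construct the instance:
//   const dom = new JSDOM(html, { virtualConsole });

// Simulate two in-page console calls.
virtualConsole.emit("warn", "deprecated API used");
virtualConsole.emit("log", "hello", 42);

console.log(records);
```

Collecting records into an array like this can be convenient in tests, where you may want to assert on console output instead of printing it.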

If you simply want to redirect the virtual console output to another console, like the default Node.js one, you can do

```js
virtualConsole.forwardTo(console);
```

There is also a special event, "jsdomError", which will fire with error objects to report errors from jsdom itself. This is similar to how error messages often show up in web browser consoles, even if they are not initiated by console.error.

As mentioned above, the default behavior for jsdom is to send these to the Node.js console. This is done via console.error(jsdomError.message), or in the case of "unhandled-exception"-type jsdom errors that occur from scripts running in the jsdom, via console.error(jsdomError.cause.stack). Using forwardTo() will give the same behavior. If you want a non-default behavior, you can customize it in the following ways:

```js
// Do not send any jsdom errors to the Node.js console:
virtualConsole.forwardTo(console, { jsdomErrors: "none" });

// Send only certain jsdom errors to the Node.js console, ignoring others:
virtualConsole.forwardTo(console, { jsdomErrors: ["unhandled-exception", "not-implemented"] });

// Customize the handling of all jsdom errors:
virtualConsole.forwardTo(console, { jsdomErrors: "none" });
virtualConsole.on("jsdomError", err => {
  switch (err.type) {
    case "unhandled-exception": {
      // ... process ...
      break;
    }
    case "css-parsing": {
      // ... process in some other way ...
      break;
    }
    // ... etc. ...
  }
});
```

The details for each type of jsdom error, listed by their type property, are:

- "css-parsing": an error parsing CSS stylesheets
  - cause property: the exception object from our CSS parser library, rrweb-cssom
  - sheetText property: the full text of the stylesheet that we attempted to parse
- "not-implemented": an error emitted when certain stub methods from unimplemented parts of the web platform are called
- "resource-loading": an error loading resources, e.g. due to a network error or a bad response code from the server
  - cause property: the exception object from the internal Node.js network calls jsdom made when retrieving the resource, or from the developer's custom resource loader
  - url property: the URL of the resource that was attempted to be fetched
- "unhandled-exception": a script execution error that was not handled by a Window "error" event listener
  - cause property: contains the original exception object
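One way to use these properties is a small formatter that turns each error into a single log line. The sketch below is illustrative only: `describeJsdomError` is a hypothetical helper, and it assumes the error objects carry exactly the type, message, cause, url, and sheetText properties described above.

```js
// Hypothetical helper: summarizes a jsdom error using only the documented properties.
function describeJsdomError(err) {
  switch (err.type) {
    case "css-parsing":
      return `CSS parsing failed (${err.sheetText.length} characters of stylesheet): ${err.cause.message}`;
    case "resource-loading":
      return `Failed to load ${err.url}: ${err.cause.message}`;
    case "unhandled-exception":
      return `Uncaught exception in page script: ${err.cause.message}`;
    case "not-implemented":
      return `Not implemented: ${err.message}`;
    default:
      // Unknown or future error types fall back to the plain message.
      return `jsdom error (${err.type}): ${err.message}`;
  }
}

// Plain objects stand in for real jsdom errors so the sketch runs without jsdom.
console.log(describeJsdomError({
  type: "resource-loading",
  message: "Could not load script",
  url: "https://example.com/app.js",
  cause: new Error("connection refused")
}));
// With jsdom, the helper could be wired up as:
//   virtualConsole.on("jsdomError", err => console.error(describeJsdomError(err)));
```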

Cookie jars

Like web browsers, jsdom has the concept of a cookie jar, storing HTTP cookies. Cookies that have a URL on the same domain as the document, and are not marked HTTP-only, are accessible via the document.cookie API. Additionally, all cookies in the cookie jar will impact the fetching of subresources.

By default, the JSDOM constructor will return an instance with an empty cookie jar. To create your own cookie jar and pass it to jsdom, you can override this default by doing

```js
const cookieJar = new jsdom.CookieJar(store, options);
const dom = new JSDOM(``, { cookieJar });
```

This is mostly useful if you want to share the same cookie jar among multiple jsdoms, or prime the cookie jar with certain values ahead of time.

Cookie jars are provided by the tough-cookie package. The jsdom.CookieJar constructor is a subclass of the tough-cookie cookie jar which by default sets the looseMode: true option, since that matches better how browsers behave. If you want to use tough-cookie's utilities and classes yourself, you can use the jsdom.toughCookie module export to get access to the tough-cookie module instance packaged with jsdom.

Intervening before parsing

jsdom allows you to intervene in the creation of a jsdom very early: after the Window and Document objects are created, but before any HTML is parsed to populate the document with nodes:

```js
const dom = new JSDOM(`<p>Hello</p>`, {
  beforeParse(window) {
    window.document.childNodes.length === 0;
    window.someCoolAPI = () => { /* ... */ };
  }
});
```

This is especially useful if you are wanting to modify the environment in some way, for example adding shims for web platform APIs jsdom does not support.
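As an illustration of the shim use case, the sketch below defines a function that could be passed as beforeParse. The names `installShims` and `fakeWindow` are hypothetical, and window.matchMedia is used purely as an example of an API assumed to be missing from the environment; the stub is deliberately minimal and always reports that the query does not match.

```js
// Hypothetical shim installer, intended to be used as:
//   new JSDOM(html, { beforeParse: installShims })
function installShims(window) {
  // Only add the stub when the environment does not already provide the API.
  if (typeof window.matchMedia !== "function") {
    window.matchMedia = query => ({
      matches: false, // this stub never matches any media query
      media: query,
      addEventListener() {},
      removeEventListener() {}
    });
  }
}

// A plain object stands in for the jsdom Window so the sketch runs without jsdom.
const fakeWindow = {};
installShims(fakeWindow);

console.log(fakeWindow.matchMedia("(min-width: 600px)").matches); // false
```

Guarding on the existing value keeps the shim from overwriting a real implementation if one becomes available later.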

JSDOM object API

Once you have constructed a JSDOM object, it will have the following useful capabilities:

Properties

The property window retrieves the Window object that was created for you.

The properties virtualConsole and cookieJar reflect the options you pass in, or the defaults created for you if nothing was passed in for those options.

Serializing the document with serialize()

The serialize() method will return the HTML serialization of the document, including the doctype:

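```js
const dom = new JSDOM(`<!DOCTYPE html>hello`);

dom.serialize() === "<!DOCTYPE html><html><head></head><body>hello</body></html>";

// Contrast with:
dom.window.document.documentElement.outerHTML === "<html><head></head><body>hello</body></html>";
```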

Getting the source location of a node with nodeLocation(node)

The nodeLocation() method will find where a DOM node is within the source document, returning the parse5 location info for the node:

```js
const dom = new JSDOM(
  `<p>Hello
    <img src="foo.jpg">
  </p>`,
  { includeNodeLocations: true }
);

const document = dom.window.document;
const bodyEl = document.body; // implicitly created
const pEl = document.querySelector("p");
const textNode = pEl.firstChild;
const imgEl = document.querySelector("img");

console.log(dom.nodeLocation(bodyEl));   // null; it's not in the source
console.log(dom.nodeLocation(pEl));      // { startOffset: 0, endOffset: 39, startTag: ..., endTag: ... }
console.log(dom.nodeLocation(textNode)); // { startOffset: 3, endOffset: 13 }
console.log(dom.nodeLocation(imgEl));    // { startOffset: 13, endOffset: 32 }
```

Note that this feature only works if you have set the includeNodeLocations option; node locations are off by default for performance reasons.
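The offsets index into the original source string, so they can be used to slice out the exact markup a node came from. The sketch below assumes a source string whose layout matches the offsets printed above; `sourceTextFor` is a hypothetical helper, and the location objects are written out by hand so the code runs without jsdom.

```js
// A source string laid out so that its offsets match the values shown above.
const source = `<p>Hello
    <img src="foo.jpg">
  </p>`;

// Hypothetical helper: returns the source text covered by a location object,
// or null for nodes that have no location (such as an implied <body>).
function sourceTextFor(sourceText, location) {
  if (location === null) {
    return null;
  }
  return sourceText.slice(location.startOffset, location.endOffset);
}

console.log(sourceTextFor(source, { startOffset: 13, endOffset: 32 })); // <img src="foo.jpg">
console.log(sourceTextFor(source, null)); // null
// With jsdom, the location would come from the instance instead:
//   sourceTextFor(source, dom.nodeLocation(imgEl))
```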

Interfacing with the Node.js vm module using getInternalVMContext()

The built-in vm module of Node.js is what underpins jsdom's script-running magic. Some advanced use cases, like pre-compiling a script and then running it multiple times, benefit from using the vm module directly with a jsdom-created Window.

To get access to the contextified global object, suitable for use with the vm APIs, you can use the getInternalVMContext() method:

```js
const { Script } = require("vm");

const dom = new JSDOM(``, { runScripts: "outside-only" });
const script = new Script(`
  if (!this.ran) {
    this.ran = 0;
  }

  ++this.ran;
`);

const vmContext = dom.getInternalVMContext();

script.runInContext(vmContext);
script.runInContext(vmContext);
script.runInContext(vmContext);

console.assert(dom.window.ran === 3);
```

This is somewhat-advanced functionality, and we advise sticking to normal DOM APIs (such as window.eval() or document.createElement("script")) unless you have very specific needs.

Note that this method will throw an exception if the JSDOM instance was created without runScripts set.

Reconfiguring the jsdom with reconfigure(settings)

The top property on window is marked [Unforgeable] in the spec, meaning it is a non-configurable own property and thus cannot be overridden or shadowed by normal code running inside the jsdom, even using Object.defineProperty.

Similarly, at present jsdom does not handle navigation (such as setting window.location.href = "https://example.com/"); doing so will cause the virtual console to emit a "jsdomError" explaining that this feature is not implemented, and nothing will change: there will be no new Window or Document object, and the existing window's location object will still have all the same property values.

However, if you're acting from outside the window, e.g. in some test framework that creates jsdoms, you can override one or both of these using the special reconfigure() method:

```js
const dom = new JSDOM();

dom.window.top === dom.window;
dom.window.location.href === "about:blank";

dom.reconfigure({ windowTop: myFakeTopForTesting, url: "https://example.com/" });

dom.window.top === myFakeTopForTesting;
dom.window.location.href === "https://example.com/";
```

Note that changing the jsdom's URL will impact all APIs that return the current document URL, such as window.location, document.URL, and document.documentURI, as well as the resolution of relative URLs within the document, and the same-origin checks and referrer used while fetching subresources. It will not, however, perform navigation to the contents of that URL; the contents of the DOM will remain unchanged, and no new instances of Window, Document, etc. will be created.
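To make the effect on relative URLs and same-origin checks concrete, the sketch below uses the URL class built into Node.js, which follows the same WHATWG URL Standard. It does not involve jsdom at all: `resolveAgainst` and `sameOrigin` are hypothetical helpers, and the URLs are made-up examples.

```js
// Resolves a relative reference the way a document with the given URL would.
function resolveAgainst(documentURL, relative) {
  return new URL(relative, documentURL).href;
}

// Compares the origins (scheme, host, and port) of two absolute URLs.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Before a hypothetical reconfigure(): the document lives on example.org.
console.log(resolveAgainst("https://example.org/docs/", "images/logo.png"));
// https://example.org/docs/images/logo.png

// After changing the document URL, the same relative reference points elsewhere.
console.log(resolveAgainst("https://example.com/app/index.html", "images/logo.png"));
// https://example.com/app/images/logo.png

console.log(sameOrigin("https://example.com/app/", "https://example.com/api/data.json")); // true
console.log(sameOrigin("https://example.org/", "https://example.com/")); // false
```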

Convenience APIs

fromURL()

In addition to the JSDOM constructor itself, jsdom provides a promise-returning factory method for constructing a jsdom from a URL:

```js
JSDOM.fromURL("https://example.com/", options).then(dom => {
  console.log(dom.serialize());
});
```

The returned promise will fulfill with a JSDOM instance if the URL is valid and the request is successful. Any redirects will be followed to their ultimate destination.

The options provided to fromURL() are similar to those provided to the JSDOM constructor, with the following additional restrictions and consequences:

- The url and contentType options cannot be provided.
- The referrer option is used as the HTTP Referer request header of the initial request.
- The resources option also affects the initial request; this is useful if you want to, for example, configure a proxy (see above).
- The resulting jsdom's URL, content type, and referrer are determined from the response.
- Any cookies set via HTTP Set-Cookie response headers are stored in the jsdom's cookie jar. Similarly, any cookies already in a supplied cookie jar are sent as HTTP Cookie request headers.

fromFile()

Similar to fromURL(), jsdom also provides a fromFile() factory method for constructing a jsdom from a filename:

```js
JSDOM.fromFile("stuff.html", options).then(dom => {
  console.log(dom.serialize());
});
```

The returned promise will fulfill with a JSDOM instance if the given file can be opened. As usual in Node.js APIs, the filename is given relative to the current working directory.

The options provided to fromFile() are similar to those provided to the JSDOM constructor, with the following additional defaults:

- The url option will default to a file URL corresponding to the given filename, instead of to "about:blank".
- The contentType option will default to "application/xhtml+xml" if the given filename ends in .xht, .xhtml, or .xml; otherwise it will continue to default to "text/html".
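The two defaults above can be expressed as a small function. This is a sketch of the documented rule, not jsdom's own code: `fromFileDefaults` is a hypothetical name, and the extension check is assumed to be case-sensitive, which may differ from what jsdom actually does.

```js
const path = require("node:path");
const { pathToFileURL } = require("node:url");

// Hypothetical helper mirroring the documented fromFile() defaults.
function fromFileDefaults(filename) {
  // Assumption: the extensions are matched exactly as written (lowercase only).
  const looksLikeXML = /\.(xht|xhtml|xml)$/.test(filename);
  return {
    // The filename is resolved against the current working directory.
    url: pathToFileURL(path.resolve(filename)).href,
    contentType: looksLikeXML ? "application/xhtml+xml" : "text/html"
  };
}

console.log(fromFileDefaults("stuff.html"));
console.log(fromFileDefaults("feed.xml"));
```

Options passed explicitly to fromFile() would take precedence over these values, since they are only defaults.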

fragment()

For the very simplest of cases, you might not need a whole JSDOM instance with all its associated power. You might not even need a Window or Document! Instead, you just need to parse some HTML, and get a DOM object you can manipulate. For that, we have fragment(), which creates a DocumentFragment from a given string:

```js
const frag = JSDOM.fragment(`<p>Hello</p><p><strong>Hi!</strong></p>`);

frag.childNodes.length === 2;
frag.querySelector("strong").textContent === "Hi!";
// etc.
```

Here frag is a DocumentFragment instance, whose contents are created by parsing the provided string. The parsing is done using a