# Embeddings.js

Simple text embeddings library

`npm install @themaximalist/embeddings.js`
Embeddings.js is a simple way to get text embeddings in Node.js. Embeddings are useful for text similarity search using a vector database.
```javascript
await embeddings("Hello World!"); // embedding array
```
- Easy to use
- Works with any vector database
- Supports multiple embedding models with the same simple interface
- Local with Xenova/all-MiniLM-L6-v2
- OpenAI with text-embedding-ada-002
- Mistral with mistral-embed
- Caches embeddings
- MIT license
## Install

```bash
npm install @themaximalist/embeddings.js
```
To use local embeddings, install the local model as well:

```bash
npm install @xenova/transformers
```
Embeddings.js works out of the box with local embeddings, but if you use the OpenAI or Mistral embeddings you'll need an API key in your environment.
```bash
export OPENAI_API_KEY=
export MISTRAL_API_KEY=
```
Using Embeddings.js is as simple as calling a function with any string.
```javascript
import embeddings from "@themaximalist/embeddings.js";

// defaults to local embeddings
const embedding = await embeddings("Hello World!");
// 384-dimension embedding array
```
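Once you have embedding arrays, text similarity reduces to vector math. Here is a minimal cosine-similarity sketch in plain JavaScript (a helper for illustration, not part of the Embeddings.js API):

```javascript
// Cosine similarity between two equal-length embedding arrays:
// dot(a, b) / (|a| * |b|), ranging from -1 to 1 (1 = most similar)
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Usage with any two embeddings of the same dimension:
// const a = await embeddings("Hello World!");
// const b = await embeddings("Hi there!");
// console.log(cosineSimilarity(a, b));
```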
Switching embedding models is easy:
```javascript
// OpenAI
const openaiEmbedding = await embeddings("Hello World", {
  service: "openai"
});
// 1536-dimension embedding array

// Mistral
const mistralEmbedding = await embeddings("Hello World", {
  service: "mistral"
});
// 1024-dimension embedding array
```
Embeddings.js caches embeddings by default, but you can disable the cache by passing `cache: false` as an option.

```javascript
// don't cache (caching is on by default)
const embedding = await embeddings("Hello World!", {
  cache: false
});
```

The cache file is written to `.embeddings.cache.json`. You can delete this file to reset the cache.

## API

The Embeddings.js API is a simple function you call with your text and an optional config object.
```javascript
await embeddings(
  input, // Text input to compute embeddings
  {
    service: "openai", // Embedding service
    model: "text-embedding-ada-002", // Embedding model
    cache: true, // Cache embeddings
  }
);
```

## Options
* `service`: Embedding service provider. Default is `transformers`, a local embedding provider.
* `model`: Embedding service model. Default is `Xenova/all-MiniLM-L6-v2`, a local embedding model. If no model is provided, the default for the selected service is used.
* `cache`: Cache embeddings. Default is `true`.

## Response
Embeddings.js returns a `float[]`, an array of floating-point numbers.

```javascript
[ -0.011776604689657688, 0.024298833683133125, 0.0012317118234932423, ... ]
```

The length of the array is the dimensions of the embedding. When performing text similarity search, you'll want to know the dimensions of your embeddings to use them in a vector database.

## Embedding Dimensions
* Local: 384
* OpenAI: 1536
* Mistral: 1024
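When configuring a vector index you often need these dimensions in code. A tiny lookup sketch based on the table above (the helper name is hypothetical, not part of the library):

```javascript
// Embedding dimensions per service, as documented above
const EMBEDDING_DIMENSIONS = {
  transformers: 384, // local Xenova/all-MiniLM-L6-v2
  openai: 1536,      // text-embedding-ada-002
  mistral: 1024,     // mistral-embed
};

// Returns the expected embedding dimension for a service,
// defaulting to the local transformers provider
function dimensionsFor(service = "transformers") {
  const dims = EMBEDDING_DIMENSIONS[service];
  if (!dims) throw new Error(`Unknown embedding service: ${service}`);
  return dims;
}
```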
The Embeddings.js API ensures you have a simple way to use embeddings from multiple providers.

## Debug

Embeddings.js uses the `debug` npm module with the `embeddings.js` namespace. View debug logs by setting the `DEBUG` environment variable.

```bash
> DEBUG=embeddings.js* node src/get_embeddings.js
# debug logs
```
## Vector Database

Embeddings can be used with any vector database, like Pinecone, Chroma, PG Vector, etc.

For a local vector database that runs in-memory and uses Embeddings.js internally, check out VectorDB.js.
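To see how embeddings drive similarity search without any external database, here is a brute-force in-memory sketch (hypothetical helper names for illustration; a real setup would use VectorDB.js or another vector database):

```javascript
// Dot product of two equal-length vectors
function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Cosine similarity: higher means more similar
function cosine(a, b) {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Brute-force nearest-neighbor search over stored embeddings.
// entries: [{ text, vector }], query: an embedding array
function nearest(entries, query, k = 1) {
  return entries
    .map(({ text, vector }) => ({ text, score: cosine(vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Usage sketch:
// const entries = [];
// for (const text of ["Hello World!", "Goodbye"]) {
//   entries.push({ text, vector: await embeddings(text) });
// }
// const query = await embeddings("Hi!");
// console.log(nearest(entries, query, 1));
```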
## Projects

Embeddings.js is currently used in the following projects:

- AI.js — simple AI library
- VectorDB.js — local text similarity search
- HyperType — knowledge graph toolkit
- HyperTyper — multidimensional mind mapping
## License

MIT

Created by The Maximalist, see our open-source projects.