# Embeddings.js

Simple text embeddings library

`npm install @themaximalist/embeddings.js`
Embeddings.js is a simple way to get text embeddings in Node.js. Embeddings are useful for text similarity search using a vector database.
```javascript
await embeddings("Hello World!"); // embedding array
```
- Easy to use
- Works with any vector database
- Supports multiple embedding models with the same simple interface
- Local with Xenova/all-MiniLM-L6-v2
- OpenAI with text-embedding-ada-002
- Mistral with mistral-embed
- Caches embeddings
- MIT license
## Install

```bash
npm install @themaximalist/embeddings.js
```
To use local embeddings, install the local model as well:

```bash
npm install @xenova/transformers
```
Embeddings.js works out of the box with local embeddings, but if you use the OpenAI or Mistral embeddings you'll need an API key in your environment.
```bash
export OPENAI_API_KEY=
export MISTRAL_API_KEY=
```
Using Embeddings.js is as simple as calling a function with any string.
```javascript
import embeddings from "@themaximalist/embeddings.js";

// defaults to local embeddings
const embedding = await embeddings("Hello World!");
// 384-dimension embedding array
```
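Once you have embedding arrays, text similarity reduces to vector math. Here is a minimal cosine-similarity sketch in plain JavaScript (a helper for illustration, not part of the Embeddings.js API):

```javascript
// Cosine similarity between two equal-length embedding arrays:
// dot(a, b) / (|a| * |b|), ranging from -1 to 1 (1 = most similar)
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Usage with any two embeddings of the same dimension:
// const a = await embeddings("Hello World!");
// const b = await embeddings("Hi there!");
// console.log(cosineSimilarity(a, b));
```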
Switching embedding models is easy:
```javascript
// OpenAI
const openaiEmbedding = await embeddings("Hello World", {
  service: "openai"
});
// 1536-dimension embedding array

// Mistral
const mistralEmbedding = await embeddings("Hello World", {
  service: "mistral"
});
// 1024-dimension embedding array
```
Embeddings.js caches embeddings by default, but you can disable the cache by passing `cache: false` as an option.

```javascript
// don't cache (caching is on by default)
const embedding = await embeddings("Hello World!", {
  cache: false
});
```

The cache file is written to `.embeddings.cache.json`. You can delete this file to reset the cache.

## API

The Embeddings.js API is a simple function you call with your text and an optional config object.
```javascript
await embeddings(
  input, // Text input to compute embeddings
  {
    service: "openai", // Embedding service
    model: "text-embedding-ada-002", // Embedding model
    cache: true, // Cache embeddings
  }
);
```

## Options
* `service`: Embedding service provider. Default is `transformers`, a local embedding provider.
* `model`: Embedding service model. Default is `Xenova/all-MiniLM-L6-v2`, a local embedding model. If no model is provided, the default for the selected service is used.
* `cache`: Cache embeddings. Default is `true`.

## Response
Embeddings.js returns a `float[]`, an array of floating-point numbers.

```javascript
[ -0.011776604689657688, 0.024298833683133125, 0.0012317118234932423, ... ]
```

The length of the array is the dimensions of the embedding. When performing text similarity search, you'll want to know the dimensions of your embeddings to use them in a vector database.

## Embedding Dimensions
* Local: 384
* OpenAI: 1536
* Mistral: 1024
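When configuring a vector index you often need these dimensions in code. A tiny lookup sketch based on the table above (the helper name is hypothetical, not part of the library):

```javascript
// Embedding dimensions per service, as documented above
const EMBEDDING_DIMENSIONS = {
  transformers: 384, // local Xenova/all-MiniLM-L6-v2
  openai: 1536,      // text-embedding-ada-002
  mistral: 1024,     // mistral-embed
};

// Returns the expected embedding dimension for a service,
// defaulting to the local transformers provider
function dimensionsFor(service = "transformers") {
  const dims = EMBEDDING_DIMENSIONS[service];
  if (!dims) throw new Error(`Unknown embedding service: ${service}`);
  return dims;
}
```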
The Embeddings.js API ensures you have a simple way to use embeddings from multiple providers.

## Debug

Embeddings.js uses the `debug` npm module with the `embeddings.js` namespace. View debug logs by setting the `DEBUG` environment variable.

```bash
> DEBUG=embeddings.js* node src/get_embeddings.js
# debug logs
```
## Vector Database

Embeddings can be used with any vector database, like Pinecone, Chroma, PG Vector, etc.

For a local vector database that runs in-memory and uses Embeddings.js internally, check out VectorDB.js.
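To see how embeddings drive similarity search without any external database, here is a brute-force in-memory sketch (hypothetical helper names for illustration; a real setup would use VectorDB.js or another vector database):

```javascript
// Dot product of two equal-length vectors
function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Cosine similarity: higher means more similar
function cosine(a, b) {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Brute-force nearest-neighbor search over stored embeddings.
// entries: [{ text, vector }], query: an embedding array
function nearest(entries, query, k = 1) {
  return entries
    .map(({ text, vector }) => ({ text, score: cosine(vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Usage sketch:
// const entries = [];
// for (const text of ["Hello World!", "Goodbye"]) {
//   entries.push({ text, vector: await embeddings(text) });
// }
// const query = await embeddings("Hi!");
// console.log(nearest(entries, query, 1));
```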
## Projects

Embeddings.js is currently used in the following projects:

- AI.js — simple AI library
- VectorDB.js — local text similarity search
- HyperType — knowledge graph toolkit
- HyperTyper — multidimensional mind mapping
## License

MIT

Created by The Maximalist, see our open-source projects.