The UBC GenAI Toolkit (TypeScript) is a modular library designed to simplify the integration of Generative AI capabilities into web applications at UBC. It provides standardized interfaces for common GenAI tasks, shielding applications from underlying implementations and ensuring API stability even as technologies evolve.
This toolkit follows the Facade pattern, offering simplified interfaces over potentially complex underlying libraries or services. This allows developers of applications that consume this toolkit to focus on application logic rather than GenAI infrastructure, and enables easier adoption of new technologies or providers in the future without requiring changes to consuming applications.
- Installation
- Core Concepts
- Modules
  - Core Module
  - LLM Module
  - Embeddings Module
  - Chunking Module
  - Document Parsing Module
  - RAG Module
- Example Applications
- Future Modules
- Contributing
- License
Note: This toolkit is currently under active development and is not yet published on npm. The following instructions are temporary.
Our goal is to publish this toolkit to npm as @ubc-genai-toolkit/PACKAGE_NAME (e.g., @ubc-genai-toolkit/llm) in the future.
For now, to use the toolkit in your project:
1. Clone this repository to your local machine:

   ```bash
   git clone https://github.com/ubc/ubc-genai-toolkit-ts.git
   ```

2. In your project's `package.json`, add the desired toolkit modules as dependencies using local `file:` paths pointing to the corresponding directories within your cloned toolkit repository. Add other modules in the same way as needed:

   ```json
   {
     "dependencies": {
       "@ubc-genai-toolkit/core": "file:/path/to/your/cloned/ubc-genai-toolkit-ts/modules/core",
       "@ubc-genai-toolkit/llm": "file:/path/to/your/cloned/ubc-genai-toolkit-ts/modules/llm",
       "@ubc-genai-toolkit/embeddings": "file:/path/to/your/cloned/ubc-genai-toolkit-ts/modules/embeddings"
     }
   }
   ```

   _Replace `/path/to/your/cloned/` with the actual path on your system._

3. Run `npm install` (or `yarn install`, `pnpm install`) in your project directory.
The toolkit is built upon several core design principles:
- Modular Design: Capabilities are encapsulated in distinct modules (core, llm, embeddings, etc.).
- Stable API: Public interfaces aim for stability, abstracting underlying changes.
- Implementation Agnostic: Core APIs are defined independently of specific technologies.
- Configurable: Modules accept configuration options at initialization.
- Multi-instance: Supports multiple simultaneous instances of modules.
- Observable: Follows consistent patterns for logging and error handling.
- Well-documented: Aims for comprehensive documentation and examples.
Refer to the modules/core/src directory for common patterns like error handling (error.ts), configuration (config.ts), and logging (logger.ts).
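The configuration, multi-instance, and error-handling principles above might look roughly like this in practice. This is only a minimal sketch: `Logger`, `ToolkitError`, `TruncatorConfig`, and `Truncator` are hypothetical names invented for illustration and are not taken from `modules/core/src`.

```typescript
// Hypothetical names throughout; the toolkit's real definitions live in modules/core/src.

interface Logger {
  info(message: string): void;
  error(message: string): void;
}

// Error type that carries a machine-readable code alongside the message.
class ToolkitError extends Error {
  readonly code: string;

  constructor(message: string, code: string) {
    super(message);
    this.name = "ToolkitError";
    this.code = code;
  }
}

interface TruncatorConfig {
  maxLength: number;
  logger?: Logger;
}

// Toy module: configuration is supplied once, at construction time.
class Truncator {
  private readonly maxLength: number;
  private readonly logger: Logger;

  constructor(config: TruncatorConfig) {
    if (!Number.isInteger(config.maxLength) || config.maxLength <= 0) {
      throw new ToolkitError("maxLength must be a positive integer", "INVALID_CONFIG");
    }
    this.maxLength = config.maxLength;
    // Fall back to a no-op logger so logging calls are always safe.
    this.logger = config.logger ?? { info: () => {}, error: () => {} };
  }

  truncate(text: string): string {
    this.logger.info(`truncating ${text.length} characters to at most ${this.maxLength}`);
    return text.slice(0, this.maxLength);
  }
}

// Two independently configured instances coexist without shared state.
const logLines: string[] = [];
const collectingLogger: Logger = {
  info: (message) => { logLines.push(`info: ${message}`); },
  error: (message) => { logLines.push(`error: ${message}`); },
};
const shortTruncator = new Truncator({ maxLength: 5, logger: collectingLogger });
const longTruncator = new Truncator({ maxLength: 50 });

console.log(shortTruncator.truncate("hello world")); // "hello"
console.log(longTruncator.truncate("hello world")); // "hello world"
```

Passing the logger in through the configuration object, rather than importing a global one, is one way to keep each instance independently observable.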
The toolkit consists of several modules, each providing specific functionality:
Location: modules/core
This module provides foundational interfaces and utilities used by other modules, including standardized error handling, configuration management, and logging interfaces. It establishes the common patterns that other modules adhere to.
Location: modules/llm
Provides a consistent interface for interacting with various Large Language Models (LLMs). It simplifies managing conversations and handling responses.
- Providers:
  - Anthropic (via `@anthropic-ai/sdk`)
  - OpenAI (via `openai`)
  - Ollama (via `ollama`)
  - UBC LLM Sandbox (via a custom Ollama/LiteLLM proxy)
- Example App: `example-apps/llm-conversation` demonstrates basic conversational interaction.
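To illustrate what a provider-neutral conversation interface can look like, here is a small sketch of the Facade idea applied to chat. Every name in it (`ChatMessage`, `ChatProvider`, `EchoProvider`, `Conversation`) is an assumption made for this example and does not describe the LLM module's actual API. The `EchoProvider` is a stand-in that needs no network access.

```typescript
// All names here are hypothetical; they are not the toolkit's actual API.

type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// The provider-neutral contract that application code depends on.
interface ChatProvider {
  complete(messages: ChatMessage[]): Promise<string>;
}

// Stand-in provider. A real adapter would translate `messages` into a call
// to one specific provider's SDK and return the model's reply.
class EchoProvider implements ChatProvider {
  async complete(messages: ChatMessage[]): Promise<string> {
    const last = messages[messages.length - 1];
    return `echo: ${last ? last.content : ""}`;
  }
}

// Facade that tracks conversation history and hides the provider behind it.
class Conversation {
  private readonly provider: ChatProvider;
  private readonly history: ChatMessage[] = [];

  constructor(provider: ChatProvider, systemPrompt?: string) {
    this.provider = provider;
    if (systemPrompt) {
      this.history.push({ role: "system", content: systemPrompt });
    }
  }

  async send(userText: string): Promise<string> {
    this.history.push({ role: "user", content: userText });
    // Pass a copy so the provider cannot mutate the stored history.
    const reply = await this.provider.complete([...this.history]);
    this.history.push({ role: "assistant", content: reply });
    return reply;
  }

  getHistory(): readonly ChatMessage[] {
    return this.history;
  }
}

const conversation = new Conversation(new EchoProvider(), "You are a helpful tutor.");
conversation.send("What is RAG?").then((reply) => console.log(reply));
```

Under this design, switching providers would mean passing a different `ChatProvider` implementation to the constructor, with no change to the code that calls `send`.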
Location: modules/embeddings
Handles the creation of text embeddings using different models. Embeddings are crucial for tasks like semantic search and Retrieval-Augmented Generation (RAG).
- Underlying Library: `fastembed`
- Example App: `example-apps/embedding-cli` shows how to generate embeddings for text.
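Semantic search typically compares embeddings with a similarity measure such as cosine similarity. The sketch below shows that comparison on its own. The vectors are tiny hand-written stand-ins, not output from `fastembed`, and `cosineSimilarity` is a helper written for this example rather than a function exported by the Embeddings module.

```typescript
// Cosine similarity: the dot product divided by the product of the vector lengths.
// Returns a value in [-1, 1]; higher means the vectors point in more similar directions.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero-length vector as dissimilar to everything instead of dividing by zero.
  return denominator === 0 ? 0 : dot / denominator;
}

// Hand-written toy vectors standing in for real embeddings.
const queryVector = [1, 0];
const sameDirection = [3, 0];
const orthogonal = [0, 2];

console.log(cosineSimilarity(queryVector, sameDirection)); // 1
console.log(cosineSimilarity(queryVector, orthogonal)); // 0
```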
Location: modules/chunking
Provides strategies for splitting large texts into smaller, manageable chunks, often a necessary preprocessing step for embedding or LLM processing.
- Underlying Library: `langchain` (specifically its text splitters)
- Example App: `example-apps/chunking-cli` demonstrates text chunking strategies.
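As a rough illustration of what a chunking strategy does, the following sketch splits text into fixed-size character windows that overlap, so that content near a boundary appears in two adjacent chunks. It is a deliberately simple stand-in, not the `langchain` splitter the module wraps, and `chunkText` with its `chunkSize` and `overlap` parameters is a hypothetical helper.

```typescript
// Split `text` into windows of at most `chunkSize` characters, where each
// window starts `chunkSize - overlap` characters after the previous one.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  if (!Number.isInteger(overlap) || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be an integer in [0, chunkSize)");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a window reaches the end, to avoid a trailing chunk
    // that only repeats the overlap.
    if (start + chunkSize >= text.length) {
      break;
    }
  }
  return chunks;
}

console.log(chunkText("abcdefghij", 4, 1)); // [ 'abcd', 'defg', 'ghij' ]
```

Real splitters usually prefer to break on paragraph or sentence boundaries; splitting on raw character counts keeps the example short.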
Location: modules/document-parsing
Extracts text content from various document formats.
- Supported Formats/Libraries:
  - PDF (via `@opendocsg/pdf2md`)
  - DOCX (via `mammoth`)
  - HTML (via `turndown`)
  - Markdown/Text (via `markdown-to-text`, `file-type`)
- Example App: `example-apps/document-parsing-cli` shows how to parse different file types.
Location: modules/rag
Facilitates building Retrieval-Augmented Generation systems. It integrates embedding generation and vector storage/retrieval to provide relevant context to LLMs.
- Vector Store Interaction: Currently supports Qdrant (via `@qdrant/js-client-rest`).
- Dependencies: Relies on the Embeddings Module.
- Example App: `example-apps/rag-app` demonstrates a basic RAG implementation.
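The retrieval step described above can be sketched without any external services. In the example below, an in-memory array stands in for the vector store (the module itself uses Qdrant), the vectors are hand-written unit-length toy values instead of real embeddings, and `StoredChunk`, `retrieve`, and `buildPrompt` are hypothetical names chosen for illustration.

```typescript
// Hypothetical types and helpers; an array stands in for a real vector store.
interface StoredChunk {
  text: string;
  vector: number[]; // assumed to be unit length, so the dot product equals cosine similarity
}

function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have equal length");
  }
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Return the k stored chunks most similar to the query vector.
function retrieve(queryVector: number[], store: StoredChunk[], k: number): StoredChunk[] {
  return store
    .map((chunk) => ({ chunk, score: dotProduct(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}

// Assemble the retrieved chunks into context for an LLM prompt.
function buildPrompt(question: string, context: StoredChunk[]): string {
  const contextBlock = context.map((chunk, i) => `[${i + 1}] ${chunk.text}`).join("\n");
  return `Answer using only the context below.\n\nContext:\n${contextBlock}\n\nQuestion: ${question}`;
}

// Toy data: made-up sentences with hand-written vectors.
const toyStore: StoredChunk[] = [
  { text: "Office hours are on Tuesdays.", vector: [1, 0] },
  { text: "The final exam is in December.", vector: [0, 1] },
  { text: "Assignments are due on Fridays.", vector: [0.6, 0.8] },
];

// Pretend this vector is the embedding of the question.
const questionVector = [0, 1];
const topChunks = retrieve(questionVector, toyStore, 2);
console.log(buildPrompt("When is the final exam?", topChunks));
```

In a full pipeline, the question would first be embedded with the same model used for the stored chunks, and the resulting prompt would then be sent to an LLM.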
The `example-apps/` directory contains simple applications demonstrating how to use each non-core module:
- `llm-conversation`: Basic chat interface using the LLM module.
- `embedding-cli`: Command-line tool for generating text embeddings.
- `chunking-cli`: Command-line tool for splitting text documents.
- `document-parsing-cli`: Command-line tool for extracting text from files.
- `rag-app`: Simple application showcasing RAG principles.
These examples serve as starting points for integrating the toolkit into your own applications.
We are actively working on expanding the toolkit with additional modules relevant to the UBC context, including:
- Authentication Module: Integration with UBC's Shibboleth/SAML2 infrastructure.
- LTI Module: Support for Learning Tools Interoperability (LTI) to connect with Learning Management Systems (LMS) like Canvas.
Contribution guidelines will be added soon. In the meantime, feel free to open issues or pull requests.
This project is licensed under the GNU General Public License v2.0. See the LICENSE file for details.