@ztimson/ai-utils

AI Utility Library - Unified interface for multiple AI providers

About

A TypeScript library that provides a unified interface for working with multiple AI providers, making it easy to integrate various AI capabilities into your applications.

Features

  • Multi-Provider LLM Support: Seamlessly work with OpenAI, Anthropic (Claude), and self-hosted (Ollama) models
  • Automatic Speech Recognition (ASR): Convert audio to text using Whisper models
  • Optical Character Recognition (OCR): Extract text from images using Tesseract
  • Semantic Similarity: Compare text similarity using tensor-based cosine similarity
  • Provider Abstraction: Switch between AI providers without changing your code
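
The semantic similarity feature is described as cosine similarity over embeddings. As a rough illustration of that metric only (not the library's own tensor-based implementation), a minimal sketch over plain number arrays might look like this; `cosineSimilarity` is a hypothetical helper name, not part of the package:

```typescript
// Cosine similarity between two equal-length embedding vectors.
// Returns a value in [-1, 1]; values near 1 mean the vectors point the same way.
function cosineSimilarity(a: number[], b: number[]): number {
    if (a.length !== b.length) throw new Error('Vectors must have the same length');
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    // A zero vector has no direction; treat similarity as 0 (assumption)
    if (normA === 0 || normB === 0) return 0;
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Parallel vectors score close to 1, orthogonal vectors score 0
console.log(cosineSimilarity([1, 2, 3], [2, 4, 6]).toFixed(3));
console.log(cosineSimilarity([1, 0], [0, 1]));
```

In practice the inputs would be embedding vectors produced by the configured embedder model rather than hand-written arrays.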

Built With

Anthropic llama OpenAI Pyannote TensorFlow Tesseract Transformers.js TypeScript Whisper

Setup

Production

Prerequisites

Instructions

  1. Install the package: npm i @ztimson/ai-utils
  2. For speaker diarization: pip install pyannote.audio

Development

Prerequisites

Instructions

  1. Install the dependencies: npm i
  2. For speaker diarization: pip install pyannote.audio
  3. Build library: npm run build
  4. Run unit tests: npm test

Documentation

Setup

const ai = new Ai({
    path: '/ai-models',
    
    // Setup audio
    whisper: '/path/to/binary', // Required for ASR
    hfToken: '...', // Required for diarization
    asr: 'ggml-base.en.bin', // Override default ASR model
    
    // Setup LLM
    embedder: 'bge-small-en-v1.5', // Override default embedder model
    llm: {
        system: 'You are a helpful assistant.',
        compress: {max: 90_000, min: 50_000}, // Compress chat history to min tokens when max is reached
        temperature: 0.8,
        max_tokens: 100_000,
        memoryModel: 'gpt-4o', // Cheap model for managing memories in background, defaults to current model
        models: {
            'claude-3-5-sonnet': {proto: 'anthropic', token: process.env.ANTHROPIC_TOKEN},
            'gpt-4o':            {proto: 'openai',    token: process.env.OPENAI_TOKEN},
            'llama3':            {proto: 'ollama',    host: 'http://localhost:11434'},
        },
        mcp: [
            {name: 'files', url: 'https://mcp.example.com', token: process.env.MCP_TOKEN}
        ],
        skills: [
            {name: 'Tone of voice', description: 'Brand writing guidelines', content: '# Tone of Voice\n\nAlways be concise and friendly...'}
        ],
        tools: [{
            name: 'Marco?',
            description: 'Where is marco polo?',
            args: {
                shout: {type: 'boolean', description: 'Shout into the void?', default: false, required: false}
            },
            fn: (args: any, stream: LLMRequest['stream'], ai: Ai) => {
                const {shout} = args;
                return shout ? 'Polo!' : 'Polo';
            }
        }],
    },
	
    // Setup Vision
    ocr: 'eng' // Override default OCR model
});
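
The `compress: {max, min}` option above trims chat history down to `min` tokens once `max` is reached. The library's actual strategy is not shown here (it may summarize rather than drop messages), so the following is only a sketch of the threshold logic under two loud assumptions: tokens are estimated at roughly 4 characters each, and the oldest messages are simply discarded. `Message`, `estimateTokens`, and `compressHistory` are hypothetical names.

```typescript
type Message = {role: 'user' | 'assistant' | 'system'; content: string};

// Rough heuristic (assumption): about 4 characters per token
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

const totalTokens = (msgs: Message[]): number =>
    msgs.reduce((sum, m) => sum + estimateTokens(m.content), 0);

// Once history exceeds `max` tokens, drop the oldest messages until it fits in `min`
function compressHistory(history: Message[], max: number, min: number): Message[] {
    if (totalTokens(history) <= max) return history; // Under the ceiling: nothing to do
    const kept = [...history];
    // Always keep at least the most recent message
    while (kept.length > 1 && totalTokens(kept) > min) kept.shift();
    return kept;
}

// Ten messages of 40 characters each, about 10 tokens apiece under the heuristic
const sample: Message[] = Array.from({length: 10}, (_, i): Message => ({
    role: 'user',
    content: `Message ${i}: `.padEnd(40, '.'),
}));
console.log(compressHistory(sample, 90, 50).length);
```

Using two thresholds instead of one avoids re-compressing on every new message: history is left alone until it crosses `max`, then cut well below it.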

Audio

// Create audio transcript
const transcript = await ai.audio.asr('./path/to/audio.mp3');
console.log(transcript);

// Break transcript into speakers
const bySpeaker = await ai.audio.asr('./path/to/audio.mp3', {diarization: true});
console.log(bySpeaker);

// Break transcript into named speakers
const byName = await ai.audio.asr('./path/to/audio.mp3', {diarization: 'llm'});
console.log(byName);
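
Diarization pairs transcribed text with speaker turns. How the library does this internally is not documented here, so the following is only a sketch of one common approach: label each transcript segment with the speaker whose turn overlaps it the most. The `Segment` and `Turn` shapes and the `labelSegments` helper are assumptions for illustration, not the package's types.

```typescript
type Segment = {start: number; end: number; text: string};  // Transcribed chunk, in seconds
type Turn = {start: number; end: number; speaker: string};  // Diarization turn, in seconds

// Length of the time overlap between two intervals, 0 when they are disjoint
const overlap = (a: {start: number; end: number}, b: {start: number; end: number}): number =>
    Math.max(0, Math.min(a.end, b.end) - Math.max(a.start, b.start));

// Assign each segment to the speaker whose turn overlaps it the longest
function labelSegments(segments: Segment[], turns: Turn[]): string[] {
    return segments.map(seg => {
        let best = 'unknown', bestOverlap = 0;
        for (const turn of turns) {
            const o = overlap(seg, turn);
            if (o > bestOverlap) { best = turn.speaker; bestOverlap = o; }
        }
        return `[${best}]: ${seg.text}`;
    });
}

const lines = labelSegments(
    [{start: 0, end: 2, text: 'Hello there.'}, {start: 2.1, end: 4, text: 'Hi!'}],
    [{start: 0, end: 2.05, speaker: 'Speaker 1'}, {start: 2.05, end: 4, speaker: 'Speaker 2'}],
);
console.log(lines.join('\n'));
```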

Language

const history = [], memory = [];

// Wait for entire response
const text = await ai.language.ask('My favorite color is blue, what\'s yours?', {history, memory});
console.log(text);

// Stream response
let chunks = ''; // Must be `let` because the stream callback appends to it
await ai.language.ask('Write me a poem', {
    history, memory,
    stream: chunk => chunks += chunk,
});
console.log(chunks);

// Manually compile history into memories at the end of a conversation
// Happens automatically when conversations are compressed
await ai.language.updateMemory(history, memory);

// Summarize text
const summary = await ai.language.summarize(longText, 200);

// Code response (code only, no conversational filler)
const code = await ai.language.code('Write a fibonacci function');

// Structured JSON response
const data = await ai.language.json('Extract the name and age', `{
    "name": "string",
    "age": "number"
}`, {system: 'Extract from user input'});
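
Model output is not guaranteed to follow the requested shape, so it can be worth checking the result before using it. The sketch below assumes the same flat `"key": "type"` template used in the call above; `Shape` and `matchesShape` are hypothetical helpers, not part of the library, and nested objects or arrays are out of scope.

```typescript
type Shape = Record<string, 'string' | 'number' | 'boolean'>;

// True when every key in the template exists on the value with the expected primitive type
function matchesShape(value: unknown, shape: Shape): boolean {
    if (typeof value !== 'object' || value === null) return false;
    const obj = value as Record<string, unknown>;
    return Object.entries(shape).every(([key, type]) => typeof obj[key] === type);
}

const shape: Shape = {name: 'string', age: 'number'};
const reply: unknown = JSON.parse('{"name": "Ada", "age": 36}'); // Stand-in for a model reply
console.log(matchesShape(reply, shape)); // true
```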

Premade LLM Tools:

  • cli: Run a shell command, returns its output
  • get_datetime: Returns local date/time
  • get_datetime_utc: Returns current UTC date/time
  • exec: Execute code in cli, node, or python
  • fetch: Make HTTP requests (GET/POST/PUT/DELETE)
  • exec_javascript: Execute CommonJS JavaScript
  • exec_python: Execute Python via python -c
  • read_webpage: Scrape & clean content from a URL, handles HTML, JSON, CSV, media, PDFs etc.
  • web_search: Anonymous DuckDuckGo search, returns a list of URLs
  • wikipedia_lookup: Fetch a Wikipedia article (intro or full)
  • wikipedia_search: Search Wikipedia and return matching articles
  • get_weather: Fetch current weather + forecast for a location (just built!)

Vision

// Extract text from image
const text = await ai.vision.ocr('./path/to/image.png');
console.log(text);

License

Copyright © 2023 Zakary Timson | Available under MIT Licensing

See the license for more information.
