Files
ai-utils/README.md
ztimson ca66e8e304
All checks were successful
Publish Library / Build NPM Project (push) Successful in 49s
Publish Library / Tag Version (push) Successful in 7s
Improved whisper + pyannote, sentence diarization
2026-02-21 14:16:20 -05:00

113 lines
4.8 KiB
Markdown

<!-- Header -->
<div id="top" align="center">
<br />
<!-- Logo -->
<img alt="Logo" width="200" height="200" src="https://git.zakscode.com/repo-avatars/a82d423674763e7a0c1c945bdbb07e249b2bb786d3c9beae76d5b196a10f5c0f">
<!-- Title -->
### @ztimson/ai-utils
<!-- Description -->
AI Utility Library - Unified interface for multiple AI providers
<!-- Repo badges -->
[![Version](https://img.shields.io/badge/dynamic/json.svg?label=Version&style=for-the-badge&url=https://git.zakscode.com/api/v1/repos/ztimson/ai-utils/tags&query=$[0].name)](https://git.zakscode.com/ztimson/ai-utils/tags)
[![Pull Requests](https://img.shields.io/badge/dynamic/json.svg?label=Pull%20Requests&style=for-the-badge&url=https://git.zakscode.com/api/v1/repos/ztimson/ai-utils&query=open_pr_counter)](https://git.zakscode.com/ztimson/ai-utils/pulls)
[![Issues](https://img.shields.io/badge/dynamic/json.svg?label=Issues&style=for-the-badge&url=https://git.zakscode.com/api/v1/repos/ztimson/ai-utils&query=open_issues_count)](https://git.zakscode.com/ztimson/ai-utils/issues)
<!-- Links -->
---
<div>
<a href="https://ai-utils.docs.zakscode.com" target="_blank">Documentation</a>
• <a href="https://git.zakscode.com/ztimson/ai-utils/releases" target="_blank">Release Notes</a>
• <a href="https://git.zakscode.com/ztimson/ai-utils/issues/new?template=.github%2fissue_template%2fbug.md" target="_blank">Report a Bug</a>
• <a href="https://git.zakscode.com/ztimson/ai-utils/issues/new?template=.github%2fissue_template%2fenhancement.md" target="_blank">Request a Feature</a>
</div>
---
</div>
## Table of Contents
- [@ztimson/ai-utils](#top)
- [About](#about)
- [Features](#features)
- [Built With](#built-with)
- [Setup](#setup)
- [Production](#production)
- [Development](#development)
- [Documentation](https://ai-utils.docs.zakscode.com/)
- [License](#license)
## About
A TypeScript library that provides a unified interface for working with multiple AI providers, making it easy to integrate various AI capabilities into your applications.
### Features
- **Multi-Provider LLM Support**: Seamlessly work with OpenAI, Anthropic (Claude), and Self-hosted (Ollama) models
- **Audio Speech Recognition (ASR)**: Convert audio to text using Whisper models
- **Optical Character Recognition (OCR)**: Extract text from images using Tesseract
- **Semantic Similarity**: Compare text similarity using tensor-based cosine similarity
- **Provider Abstraction**: Switch between AI providers without changing your code
### Built With
[![Anthropic](https://img.shields.io/badge/Anthropic-de7356?style=for-the-badge&logo=anthropic&logoColor=white)](https://anthropic.com/)
[![llama](https://img.shields.io/badge/llama.cpp-fff?style=for-the-badge&logo=ollama&logoColor=black)](https://github.com/ggml-org/llama.cpp)
[![OpenAI](https://img.shields.io/badge/OpenAI-000?style=for-the-badge&logo=openai-gym&logoColor=white)](https://openai.com/)
[![Pyannote](https://img.shields.io/badge/Pyannote-458864?style=for-the-badge&logo=python&logoColor=white)](https://github.com/pyannote)
[![TensorFlow](https://img.shields.io/badge/TensorFlow-fff?style=for-the-badge&logo=tensorflow&logoColor=ff6f00)](https://tensorflow.org/)
[![Tesseract](https://img.shields.io/badge/Tesseract-B874B2?style=for-the-badge&logo=hack-the-box&logoColor=white)](https://tesseract-ocr.github.io/)
[![Transformers.js](https://img.shields.io/badge/Transformers.js-000?style=for-the-badge&logo=hugging-face&logoColor=yellow)](https://huggingface.co/docs/transformers.js/en/index)
[![TypeScript](https://img.shields.io/badge/TypeScript-3178C6?style=for-the-badge&logo=typescript&logoColor=white)](https://typescriptlang.org/)
[![Whisper](https://img.shields.io/badge/Whisper.cpp-000?style=for-the-badge&logo=openai-gym&logoColor=white)](https://github.com/ggerganov/whisper.cpp)
## Setup
<details>
<summary>
<h3 id="production" style="display: inline">
Production
</h3>
</summary>
#### Prerequisites
- [Node.js](https://nodejs.org/en/download)
#### Instructions
1. Install the package: `npm i @ztimson/ai-utils`
2. For speaker diarization: `pip install pyannote.audio`
</details>
<details>
<summary>
<h3 id="development" style="display: inline">
Development
</h3>
</summary>
#### Prerequisites
- [Node.js](https://nodejs.org/en/download)
- _[Whisper.cpp](https://github.com/ggml-org/whisper.cpp/releases/tag) (ASR)_
- _[Pyannote](https://github.com/pyannote) (ASR Diarization):_ `pip install pyannote.audio`
#### Instructions
1. Install the dependencies: `npm i`
2. For speaker diarization: `pip install pyannote.audio`
3. Build library: `npm build`
4. Run unit tests: `npm test`
</details>
## Documentation
[Available Here](https://ai-utils.docs.zakscode.com/)
## License
Copyright © 2023 Zakary Timson | Available under MIT Licensing
See the [license](_media/LICENSE) for more information.