# Unify LLM

Unify LLM is a TypeScript SDK for building AI applications across OpenAI, Anthropic, Gemini, and Ollama with one consistent API. Use it when you need provider-agnostic text generation, tool calling, structured outputs, prompt caching, streaming, middleware, cost tracking, routing, and hallucination interception without rewriting your app for every model vendor.
Unify LLM is designed for developers who want a TypeScript-first LLM SDK with direct control over providers, middleware, and runtime behavior. It normalizes provider differences such as OpenAI `tool_calls`, Anthropic `tool_use`, and Gemini `functionCall`, while still leaving room for provider-specific options when you need them.
## Table of contents

- Why use Unify LLM?
- Core features
- Installation
- Quickstart
- Tool calling example
- Middleware, routing, and safety
- Supported providers
- Benchmarks and quality signals
- Use cases
- Unify LLM vs other TypeScript AI SDKs
- Plain-English naming guide
- Examples and repository guide
- FAQ
- Contributing
- License
## Why use Unify LLM?

Most multi-provider AI projects hit the same friction points:
- different request and response shapes for each provider
- inconsistent tool calling formats
- one-off streaming adapters scattered across the codebase
- duplicate cost tracking and retry logic
- growing need for routing, safety, and local-model support
Unify LLM gives you a single client and middleware pipeline so your application logic stays stable while you switch models, add failover, or experiment with routing.
Unify LLM is a good fit for:

- TypeScript AI apps that need OpenAI, Anthropic, Gemini, and Ollama behind one SDK
- AI agents that rely on tool calling and structured outputs
- LLM gateways that need retries, rate limiting, routing, or cost controls
- safety-aware systems that want response anomaly detection or stream interception
- teams comparing providers without rewriting business logic for each API
## Core features

- Unified multi-provider API for OpenAI, Anthropic, Gemini, Ollama, and related integrations
- Universal tool calling with a single schema shape across supported providers
- Structured outputs using JSON schema-style contracts
- Streaming support for incremental generation and stream middleware
- Prompt caching support where providers expose native caching controls
- Middleware pipeline for retry, caching, rate limiting, cost tracking, and safety
- Routing primitives for cost/latency/quality-aware or drift-aware model selection
- Hallucination interception for response anomaly detection and early stream aborts
- TypeScript-first developer experience with exported types, examples, and benchmark utilities
## Installation

```bash
npm install @atom8ai/unify-llm
```

Requires Node.js 20+.
If you want to run examples locally, configure the provider API keys you actually use. For local-only workflows with Ollama, point your runtime at http://localhost:11434.
## Quickstart

This is the fastest way to send one prompt through the unified TypeScript interface.
```typescript
import { UnifyClient, OpenAIProvider } from '@atom8ai/unify-llm';

const client = new UnifyClient()
  .registerProvider(new OpenAIProvider(process.env.OPENAI_API_KEY!));

const response = await client.generate('openai', {
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Explain prompt caching in one sentence.' },
  ],
});

console.log(response.content);
```

If you later switch from OpenAI to Anthropic or Gemini, your app keeps the same high-level flow; only provider registration and model selection change.
## Tool calling example

Unify LLM is especially useful when you want one tool definition that works across multiple LLM providers.
```typescript
import { UnifyClient, OpenAIProvider } from '@atom8ai/unify-llm';

const client = new UnifyClient()
  .registerProvider(new OpenAIProvider(process.env.OPENAI_API_KEY!));

const getWeatherTool = {
  name: 'getWeather',
  description: 'Get the current weather for a city.',
  schema: {
    type: 'object',
    properties: {
      city: { type: 'string' },
    },
    required: ['city'],
  },
  execute: async ({ city }: { city: string }) => {
    return { city, forecast: 'Rain', temperatureF: 52 };
  },
};

const result = await client.generate('openai', {
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Should I bring an umbrella in Seattle today?' },
  ],
  tools: [getWeatherTool],
  autoExecute: true,
});

console.log(result.content);
```

## Middleware, routing, and safety

The middleware layer is where Unify LLM becomes more than a thin API wrapper.
```typescript
import {
  CacheMiddleware,
  CostTrackerMiddleware,
  RetryMiddleware,
  UnifyClient,
  OpenAIProvider,
  createHallucinationGuard,
} from '@atom8ai/unify-llm';

const costTracker = new CostTrackerMiddleware();

const client = new UnifyClient()
  .registerProvider(new OpenAIProvider(process.env.OPENAI_API_KEY!))
  .use(new CacheMiddleware())
  .use(new RetryMiddleware({ maxRetries: 3, baseDelayMs: 1000 }))
  .use(costTracker)
  .use(createHallucinationGuard({ alpha: 1.2, tau: 2, chunkSize: 6 }));

const response = await client.generate('openai', {
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Summarize the benefits of JSON schema.' }],
});

console.log(response.content);
console.log('Total cost:', costTracker.getTotalCost());
```

Unify LLM also includes advanced and experimental routing primitives for teams exploring:

- cost/latency/quality tradeoffs with `CostLatencyQualityRouter`
- Gaussian-process utility routing with `BayesianUtilityRouter`
- topological drift monitoring with `TopologicalDriftRouter`
- complexity-threshold routing with `ComplexityThresholdRouter`
- failover-capable orchestration with `SelfHealingGateway`
These are useful when you want a single TypeScript SDK to act like a lightweight LLM gateway, multi-model router, or AI orchestration layer.
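To make the first of these tradeoffs concrete, here is a self-contained sketch of weighted-utility model selection, the kind of decision `CostLatencyQualityRouter` navigates. This is illustrative only and is not the SDK's actual router API; the weights and model profiles are made up.

```typescript
// Illustrative only: a weighted-utility model selector sketching the kind of
// tradeoff CostLatencyQualityRouter navigates. This is NOT the SDK's API.
interface ModelProfile {
  name: string;
  costPer1kTokens: number; // USD per 1k output tokens
  p50LatencyMs: number;    // median latency
  qualityScore: number;    // 0..1, e.g. from offline evals
}

function selectModel(
  models: ModelProfile[],
  weights = { cost: 0.3, latency: 0.2, quality: 0.5 },
): string {
  let best = models[0].name;
  let bestUtility = -Infinity;
  for (const m of models) {
    // Higher quality raises utility; cost and latency lower it.
    const utility =
      weights.quality * m.qualityScore -
      weights.cost * m.costPer1kTokens -
      weights.latency * (m.p50LatencyMs / 1000);
    if (utility > bestUtility) {
      bestUtility = utility;
      best = m.name;
    }
  }
  return best;
}

const pick = selectModel([
  { name: 'gpt-4o', costPer1kTokens: 0.005, p50LatencyMs: 800, qualityScore: 0.9 },
  { name: 'gpt-4o-mini', costPer1kTokens: 0.00015, p50LatencyMs: 400, qualityScore: 0.75 },
]);
console.log(pick); // with these weights, the cheaper model wins
```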
## Supported providers

| Provider | Typical models | Tool calling | Streaming | Vision | Prompt caching |
|---|---|---|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, o1 | ✅ | ✅ | ✅ | N/A |
| Anthropic | claude-3-7-sonnet, claude-3-opus | ✅ | ✅ | ✅ | ✅ |
| Google Gemini | gemini-2.0-flash, gemini-1.5-pro | ✅ | ✅ | ✅ | ✅ |
| Ollama | llama3.3, mistral, phi4 | ✅ | ✅ | ✅ | N/A |
## Benchmarks and quality signals

Unify LLM ships with a reproducible benchmark harness in `benchmarks/run.ts` and evaluation helpers in `evaluation/`. The harness covers:
- hallucination guard accuracy
- stream abort latency
- cost savings from model routing
- scaling behavior for micro-batched async workloads
From the current local benchmark artifact in `benchmarks/latest.json`:

- hallucination guard accuracy: 100%
- guardian p95 abort latency: 19.23 ms
- cost savings vs always-frontier baseline: 43.69%
- scaling smoke test: 10,000 micro-batched iterations completed
The benchmark assertion step currently enforces these regression guards:
- guardian accuracy ≥ 95%
- guardian p95 abort latency ≤ 50 ms
- pareto cost savings ≥ 20%
These are synthetic benchmark thresholds, not universal production guarantees. They are most useful for catching regressions in routing and safety logic over time.
The latest local test run completed with 33 passing test files, 230 passing tests, and 1 skipped file, which helps keep README claims grounded in code that is actually exercised.
## Use cases

Developers usually land on Unify LLM through one of these intents:
- **Multi-provider apps.** Use one TypeScript client to talk to OpenAI, Anthropic, Gemini, and Ollama while keeping your application code stable.
- **Cross-provider tool calling.** Define tools once, keep schemas predictable, and avoid provider-specific tool payload drift.
- **Lightweight LLM gateway.** Combine middleware, routing, and failover so your app can make model decisions without adopting a heavier agent framework.
- **Safety interception.** Use `createHallucinationGuard` to monitor semantic drift, annotate provider metadata, and stop unstable streams early.
- **Hybrid local/hosted setups.** Use Ollama for local experimentation and hosted providers for production paths or fallbacks.
## Unify LLM vs other TypeScript AI SDKs

Many developers arrive here searching for a Vercel AI SDK alternative, a LangChain.js alternative, or a more focused multi-provider LLM SDK for TypeScript.
| Tool | Best when you want | Tradeoff |
|---|---|---|
| Unify LLM | One API for multiple providers, middleware, routing, tool calling, and safety primitives | Smaller ecosystem than the largest framework players |
| Vercel AI SDK | Tight UI integration for web apps, especially React/Next.js streaming experiences | Less centered on experimental routing and safety middleware primitives |
| LangChain.js | Large ecosystem of chains, integrations, and agent abstractions | Heavier abstraction layer if you mainly want direct provider control |
Choose Unify LLM when:

- you want a TypeScript SDK for OpenAI, Anthropic, Gemini, and Ollama
- you care about middleware, tool calling, and provider normalization
- you want routing and safety controls without building them all from scratch
- you prefer direct programmatic control over a large framework stack
## Plain-English naming guide

Some modules still keep research-style or legacy names for backward compatibility. For new code, prefer the clearer aliases below.
| Internal name | Preferred public name | Meaning |
|---|---|---|
| `createSemanticMomentumGuardian` | `createHallucinationGuard` | Hallucination and drift guard |
| `HallucinationInterceptionAlgorithm` | `ResponseAnomalyDetector` | Response anomaly detector |
| `ParetoNavigatorRouter` | `CostLatencyQualityRouter` | Cost/latency/quality router |
| `PrimRouter` | `TopologicalDriftRouter` | Topological drift router |
| `VonNeumannRouter` | `BayesianUtilityRouter` | Bayesian utility router |
| `AstralDysonRouter` | `ComplexityThresholdRouter` | Prompt complexity router |
| `semanticFingerprintEngine.ts` | `semanticFingerprint.ts` | Semantic fingerprint helpers |
| `topologyPersistence.ts` | `topologyDrift.ts` | Topology drift helpers |
| `loopRiskEngine.ts` | `executionLoopRisk.ts` | Execution loop risk helpers |
## Examples and repository guide

Useful starting points in this repository:

- `examples/basic.ts` - base client with cache and cost tracking
- `examples/orchestration.ts` - retrieval, prompt templates, and structured parsing
- `examples/paretoNavigator.ts` - multi-objective routing example
- `examples/primRouter.ts` - topological drift routing example
- `examples/hallucinationGuard.ts` - non-streaming and streaming guard usage
- `benchmarks/run.ts` - local benchmark harness
- `CONTRIBUTING.md` - contributor setup and expectations
## FAQ

**What is Unify LLM?**

Unify LLM is a TypeScript SDK that lets you build AI applications across multiple LLM providers with one API for generation, tool calling, middleware, routing, and safety.
**Is Unify LLM a LangChain.js alternative?**

It can be, depending on your goals. If you want a lighter TypeScript abstraction with direct provider control, middleware, and routing primitives, Unify LLM is a strong option.
**Is Unify LLM a Vercel AI SDK alternative?**

Yes. If your priority is provider normalization, routing, and middleware rather than UI-focused web framework helpers, Unify LLM is a reasonable alternative.
**Does Unify LLM support local models?**

Yes. Unify LLM includes an `OllamaProvider`, which is useful for local inference, offline experiments, and hybrid local/hosted setups.
**Does Unify LLM support structured outputs?**

Yes. You can define JSON schema-style response shapes and use them for more predictable parsing and downstream automation.
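As a sketch, a schema-style contract might look like the block below. The schema object follows standard JSON Schema conventions, but the `responseSchema` request option shown in the commented usage is a hypothetical name; check the package docs for the real parameter.

```typescript
// A JSON schema-style contract describing the structured response we expect.
const sentimentSchema = {
  type: 'object',
  properties: {
    sentiment: { type: 'string', enum: ['positive', 'neutral', 'negative'] },
    confidence: { type: 'number' },
  },
  required: ['sentiment', 'confidence'],
};

// Hypothetical usage -- the exact request option name may differ in the SDK:
// const result = await client.generate('openai', {
//   model: 'gpt-4o-mini',
//   messages: [{ role: 'user', content: 'Classify: "I love this SDK!"' }],
//   responseSchema: sentimentSchema,
// });
```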
**How does Unify LLM handle hallucinations?**

It includes `createHallucinationGuard`, which monitors semantic drift and can annotate or abort unstable response streams. You should still add normal application-level validation and domain-specific safety checks in production.
## Contributing

Contributions are welcome.
- Open an issue for bugs, provider support requests, or documentation gaps
- Include tests when you change routing, middleware, or core request handling
- Include benchmark notes when your change affects performance, safety, or routing behavior
- Start with `CONTRIBUTING.md`
## License

Released under the MIT License.