Unify LLM: TypeScript SDK for Multi-Provider LLM Orchestration

Unify LLM is a TypeScript SDK for building AI applications across OpenAI, Anthropic, Gemini, and Ollama with one consistent API. Use it when you need provider-agnostic text generation, tool calling, structured outputs, prompt caching, streaming, middleware, cost tracking, routing, and hallucination interception without rewriting your app for every model vendor.

Unify LLM is designed for developers who want a TypeScript-first LLM SDK with direct control over providers, middleware, and runtime behavior. It normalizes provider differences such as OpenAI tool_calls, Anthropic tool_use, and Gemini functionCall, while still leaving room for provider-specific options when you need them.

Why use Unify LLM?

Most multi-provider AI projects hit the same friction points:

  • different request and response shapes for each provider
  • inconsistent tool calling formats
  • one-off streaming adapters scattered across the codebase
  • duplicate cost tracking and retry logic
  • growing need for routing, safety, and local-model support

Unify LLM gives you a single client and middleware pipeline so your application logic stays stable while you switch models, add failover, or experiment with routing.

Best fit for

  • TypeScript AI apps that need OpenAI, Anthropic, Gemini, and Ollama behind one SDK
  • AI agents that rely on tool calling and structured outputs
  • LLM gateways that need retries, rate limiting, routing, or cost controls
  • safety-aware systems that want response anomaly detection or stream interception
  • teams comparing providers without rewriting business logic for each API

Core features

  • Unified multi-provider API for OpenAI, Anthropic, Gemini, Ollama, and related integrations
  • Universal tool calling with a single schema shape across supported providers
  • Structured outputs using JSON schema-style contracts
  • Streaming support for incremental generation and stream middleware
  • Prompt caching support where providers expose native caching controls
  • Middleware pipeline for retry, caching, rate limiting, cost tracking, and safety
  • Routing primitives for cost/latency/quality-aware or drift-aware model selection
  • Hallucination interception for response anomaly detection and early stream aborts
  • TypeScript-first developer experience with exported types, examples, and benchmark utilities

Installation

npm install @atom8ai/unify-llm

Requires Node.js 20+.

If you want to run examples locally, configure the provider API keys you actually use. For local-only workflows with Ollama, point your runtime at http://localhost:11434.
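For example, a minimal local-only setup might look like this. The OllamaProvider export is confirmed by this README, but the constructor argument shown here is an assumption; check the exported types for the real signature.

import { UnifyClient, OllamaProvider } from '@atom8ai/unify-llm';

// Assumption: OllamaProvider accepts a base URL pointing at the local daemon.
const client = new UnifyClient()
  .registerProvider(new OllamaProvider('http://localhost:11434'));

const response = await client.generate('ollama', {
  model: 'llama3.3',
  messages: [{ role: 'user', content: 'Say hello from a local model.' }],
});

console.log(response.content);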

Quickstart

This is the fastest way to send one prompt through a unified TypeScript interface.

import { UnifyClient, OpenAIProvider } from '@atom8ai/unify-llm';

const client = new UnifyClient()
  .registerProvider(new OpenAIProvider(process.env.OPENAI_API_KEY!));

const response = await client.generate('openai', {
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Explain prompt caching in one sentence.' },
  ],
});

console.log(response.content);

If you later switch from OpenAI to Anthropic or Gemini, your app can keep the same high-level flow while only changing provider registration and model selection.
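As a sketch of that switch, assuming an AnthropicProvider export that follows the same constructor pattern as OpenAIProvider:

import { UnifyClient, AnthropicProvider } from '@atom8ai/unify-llm';

// Assumption: AnthropicProvider mirrors OpenAIProvider's constructor shape.
const client = new UnifyClient()
  .registerProvider(new AnthropicProvider(process.env.ANTHROPIC_API_KEY!));

const response = await client.generate('anthropic', {
  model: 'claude-3-7-sonnet',
  messages: [
    { role: 'user', content: 'Explain prompt caching in one sentence.' },
  ],
});

console.log(response.content);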

Tool calling example

Unify LLM is especially useful when you want one tool definition that works across multiple LLM providers.

import { UnifyClient, OpenAIProvider } from '@atom8ai/unify-llm';

const client = new UnifyClient()
  .registerProvider(new OpenAIProvider(process.env.OPENAI_API_KEY!));

const getWeatherTool = {
  name: 'getWeather',
  description: 'Get the current weather for a city.',
  schema: {
    type: 'object',
    properties: {
      city: { type: 'string' },
    },
    required: ['city'],
  },
  execute: async ({ city }: { city: string }) => {
    return { city, forecast: 'Rain', temperatureF: 52 };
  },
};

const result = await client.generate('openai', {
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Should I bring an umbrella in Seattle today?' },
  ],
  tools: [getWeatherTool],
  autoExecute: true,
});

console.log(result.content);
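Here autoExecute: true asks the client to run the matched tool's execute function and pass its result back to the model before returning, so result.content should already incorporate the weather data rather than a raw tool-call payload.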

Middleware, routing, and safety

The middleware layer is where Unify LLM becomes more than a thin API wrapper.

import {
  CacheMiddleware,
  CostTrackerMiddleware,
  RetryMiddleware,
  UnifyClient,
  OpenAIProvider,
  createHallucinationGuard,
} from '@atom8ai/unify-llm';

const costTracker = new CostTrackerMiddleware();

const client = new UnifyClient()
  .registerProvider(new OpenAIProvider(process.env.OPENAI_API_KEY!))
  .use(new CacheMiddleware())
  .use(new RetryMiddleware({ maxRetries: 3, baseDelayMs: 1000 }))
  .use(costTracker)
  .use(createHallucinationGuard({ alpha: 1.2, tau: 2, chunkSize: 6 }));

const response = await client.generate('openai', {
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Summarize the benefits of JSON schema.' }],
});

console.log(response.content);
console.log('Total cost:', costTracker.getTotalCost());

Routing and orchestration highlights

Unify LLM includes advanced and experimental routing primitives for teams exploring:

  • cost/latency/quality tradeoffs with CostLatencyQualityRouter
  • Gaussian-process utility routing with BayesianUtilityRouter
  • topological drift monitoring with TopologicalDriftRouter
  • complexity-threshold routing with ComplexityThresholdRouter
  • failover-capable orchestration with SelfHealingGateway

These are useful when you want a single TypeScript SDK to act like a lightweight LLM gateway, multi-model router, or AI orchestration layer.
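As an illustration, wiring a router into a client might look like the sketch below. Only the CostLatencyQualityRouter name comes from this README; the constructor options and the .use() integration are assumptions, not documented API.

import { UnifyClient, OpenAIProvider, CostLatencyQualityRouter } from '@atom8ai/unify-llm';

// Assumption: the router plugs into the middleware pipeline via .use(),
// and its options describe candidate models plus scoring weights.
const router = new CostLatencyQualityRouter({
  candidates: ['gpt-4o', 'gpt-4o-mini'],
  weights: { cost: 0.5, latency: 0.3, quality: 0.2 },
});

const client = new UnifyClient()
  .registerProvider(new OpenAIProvider(process.env.OPENAI_API_KEY!))
  .use(router);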

Supported providers

Provider       Typical models                     Tool calling  Streaming  Vision  Prompt caching
OpenAI         gpt-4o, gpt-4o-mini, o1            ✓             ✓          ✓       N/A
Anthropic      claude-3-7-sonnet, claude-3-opus   ✓             ✓          ✓       ✓
Google Gemini  gemini-2.0-flash, gemini-1.5-pro   ✓             ✓          ✓       ✓
Ollama         llama3.3, mistral, phi4            ✓             ✓          ✓       N/A

Benchmarks and quality signals

Unify LLM ships with a reproducible benchmark harness in benchmarks/run.ts and evaluation helpers in evaluation/.

What the benchmark covers

  • hallucination guard accuracy
  • stream abort latency
  • cost savings from model routing
  • scaling behavior for micro-batched async workloads

Current checked-in synthetic snapshot

From the current local benchmark artifact in benchmarks/latest.json:

  • hallucination guard accuracy: 100%
  • guardian p95 abort latency: 19.23 ms
  • cost savings vs always-frontier baseline: 43.69%
  • scaling smoke test: 10,000 micro-batched iterations completed

CI regression thresholds

The benchmark assertion step currently enforces these regression guards:

  • guardian accuracy ≥ 95%
  • guardian p95 abort latency ≤ 50 ms
  • Pareto cost savings ≥ 20%

These are synthetic benchmark thresholds, not universal production guarantees. They are most useful for catching regressions in routing and safety logic over time.
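For reference, the assertion step amounts to something like the sketch below. The threshold values come from this README, but the field names and units in benchmarks/latest.json are assumptions.

import { readFileSync } from 'node:fs';

// Assumptions: field names and units (percent, milliseconds) are illustrative,
// not a documented schema for benchmarks/latest.json.
const snapshot = JSON.parse(readFileSync('benchmarks/latest.json', 'utf8'));

const checks: Array<[string, boolean]> = [
  ['guardian accuracy >= 95%', snapshot.guardianAccuracyPct >= 95],
  ['guardian p95 abort latency <= 50 ms', snapshot.guardianP95AbortMs <= 50],
  ['Pareto cost savings >= 20%', snapshot.paretoCostSavingsPct >= 20],
];

for (const [label, ok] of checks) {
  if (!ok) {
    console.error(`Benchmark regression: ${label}`);
    process.exitCode = 1;
  }
}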

Test coverage signal

The latest local test run completed with 33 passing test files, 230 passing tests, and 1 skipped file, which helps keep README claims grounded in code that is actually exercised.

Use cases

Developers usually land on Unify LLM through one of these intents:

Build one app across multiple LLM providers

Use one TypeScript client to talk to OpenAI, Anthropic, Gemini, and Ollama while keeping your application code stable.

Add tool calling to AI agents

Define tools once, keep schemas predictable, and avoid provider-specific tool payload drift.

Create a lightweight TypeScript LLM gateway

Combine middleware, routing, and failover so your app can make model decisions without adopting a heavier agent framework.

Add safety checks to streaming AI output

Use createHallucinationGuard to monitor semantic drift, annotate provider metadata, and stop unstable streams early.
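A sketch of the flow, with guard options taken from the middleware example above; the generateStream method name and chunk shape are hypothetical, shown only to convey the idea:

import { UnifyClient, OpenAIProvider, createHallucinationGuard } from '@atom8ai/unify-llm';

const client = new UnifyClient()
  .registerProvider(new OpenAIProvider(process.env.OPENAI_API_KEY!))
  .use(createHallucinationGuard({ alpha: 1.2, tau: 2, chunkSize: 6 }));

// Hypothetical streaming API: if the guard detects drift past its threshold,
// the stream is expected to abort early instead of emitting unstable output.
for await (const chunk of client.generateStream('openai', {
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'List three facts about TypeScript.' }],
})) {
  process.stdout.write(chunk.content ?? '');
}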

Run local and hosted models side by side

Use Ollama for local experimentation and hosted providers for production paths or fallbacks.

Unify LLM vs other TypeScript AI SDKs

Many developers land here while searching for a Vercel AI SDK alternative, a LangChain.js alternative, or simply a more focused multi-provider LLM SDK for TypeScript. Here is how the options compare:

  • Unify LLM: one API for multiple providers, middleware, routing, tool calling, and safety primitives. Tradeoff: smaller ecosystem than the largest framework players.
  • Vercel AI SDK: tight UI integration for web apps, especially React/Next.js streaming experiences. Tradeoff: less centered on experimental routing and safety middleware primitives.
  • LangChain.js: large ecosystem of chains, integrations, and agent abstractions. Tradeoff: heavier abstraction layer if you mainly want direct provider control.

When Unify LLM is a strong choice

  • you want a TypeScript SDK for OpenAI, Anthropic, Gemini, and Ollama
  • you care about middleware, tool calling, and provider normalization
  • you want routing and safety controls without building them all from scratch
  • you prefer direct programmatic control over a large framework stack

Plain-English naming guide

Some modules still keep research-style or legacy names for backward compatibility. For new code, prefer the clearer aliases below.

Internal name                       Preferred public name      Meaning
createSemanticMomentumGuardian      createHallucinationGuard   Hallucination and drift guard
HallucinationInterceptionAlgorithm  ResponseAnomalyDetector    Response anomaly detector
ParetoNavigatorRouter               CostLatencyQualityRouter   Cost/latency/quality router
PrimRouter                          TopologicalDriftRouter     Topological drift router
VonNeumannRouter                    BayesianUtilityRouter      Bayesian utility router
AstralDysonRouter                   ComplexityThresholdRouter  Prompt complexity router
semanticFingerprintEngine.ts        semanticFingerprint.ts     Semantic fingerprint helpers
topologyPersistence.ts              topologyDrift.ts           Topology drift helpers
loopRiskEngine.ts                   executionLoopRisk.ts       Execution loop risk helpers

Examples and repository guide

Useful starting points in this repository:

  • benchmarks/run.ts: the reproducible benchmark harness
  • evaluation/: evaluation helpers
  • CONTRIBUTING.md: contribution guidelines

FAQ

What is Unify LLM in one sentence?

Unify LLM is a TypeScript SDK that lets you build AI applications across multiple LLM providers with one API for generation, tool calling, middleware, routing, and safety.

Is Unify LLM a LangChain.js replacement?

It can be, depending on your goals. If you want a lighter TypeScript abstraction with direct provider control, middleware, and routing primitives, Unify LLM is a strong option.

Is Unify LLM a Vercel AI SDK alternative?

Yes. If your priority is provider normalization, routing, and middleware rather than UI-focused web framework helpers, Unify LLM is a reasonable alternative.

Can I use local models?

Yes. Unify LLM includes an OllamaProvider, which is useful for local inference, offline experiments, and hybrid local/hosted setups.

Does Unify LLM support structured outputs?

Yes. You can define JSON schema-style response shapes and use them for more predictable parsing and downstream automation.
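For instance, a sketch of the idea, where the responseSchema field name is hypothetical and stands in for whatever request field the SDK actually uses:

import { UnifyClient, OpenAIProvider } from '@atom8ai/unify-llm';

const client = new UnifyClient()
  .registerProvider(new OpenAIProvider(process.env.OPENAI_API_KEY!));

// Hypothetical field name: 'responseSchema' is an assumption for illustration.
const response = await client.generate('openai', {
  model: 'gpt-4o-mini',
  messages: [
    { role: 'user', content: 'Extract the city and date from: "Meet me in Paris on June 3."' },
  ],
  responseSchema: {
    type: 'object',
    properties: {
      city: { type: 'string' },
      date: { type: 'string' },
    },
    required: ['city', 'date'],
  },
});

console.log(JSON.parse(response.content));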

Does Unify LLM include hallucination protection?

It includes createHallucinationGuard, which monitors semantic drift and can annotate or abort unstable response streams. You should still add normal application-level validation and domain-specific safety checks in production.

Contributing

Contributions are welcome.

  • Open an issue for bugs, provider support requests, or documentation gaps
  • Include tests when you change routing, middleware, or core request handling
  • Include benchmark notes when your change affects performance, safety, or routing behavior
  • Start with CONTRIBUTING.md

License

Released under the MIT License.
