A powerful AI-powered contextual chat plugin for Docusaurus that brings intelligent, RAG-based assistance to your documentation. Built with TypeScript, React, and OpenAI-compatible APIs.
- **AI-Powered Chat** - Contextual Q&A based on your documentation
- **RAG (Retrieval-Augmented Generation)** - Semantic search with embeddings
- **Smart Citations** - Clickable source links for every answer
- **Real-time Streaming** - SSE support for token-by-token responses
- **Security First** - Prompt injection guards and rate limiting
- **Theme-Aware UI** - Respects Docusaurus dark/light mode
- **Zero Config** - Works out of the box with sensible defaults
- **Flexible Deployment** - Local server or external endpoint
- **Incremental Indexing** - Smart caching to speed up rebuilds
Note: This plugin is not yet published to npm. Use one of the following methods:
**Install from GitHub:**

```bash
npm install edujbarrios/docusaurus-plugin-ai-chat
# or
yarn add edujbarrios/docusaurus-plugin-ai-chat
```

**Or link a local clone:**

```bash
# Clone the repository
git clone https://github.com/edujbarrios/docusaurus-plugin-ai-chat.git
cd docusaurus-plugin-ai-chat

# Build the plugin
npm install
npm run build

# Link locally
npm link

# In your Docusaurus project
npm link docusaurus-plugin-ai-chat
```

**Publishing to npm (maintainers):**

```bash
# Login to npm
npm login

# Publish
npm publish

# Then users can install with:
# npm install docusaurus-plugin-ai-chat
```

Edit your `docusaurus.config.js`:
```js
module.exports = {
  plugins: [
    [
      'docusaurus-plugin-ai-chat',
      {
        // Required: OpenAI-compatible API settings
        provider: 'openai-compatible',
        apiKey: process.env.AI_API_KEY,
        baseUrl: process.env.AI_BASE_URL || 'https://api.openai.com/v1',
        model: 'gpt-4o-mini',
        embeddingsModel: 'text-embedding-3-small',

        // Optional: Customize behavior
        chunkSizeTokens: 800,
        chunkOverlapTokens: 80,
        topK: 6,
        preferCurrentPage: true,
        enableStreaming: true,
      },
    ],
  ],
};
```

Create a `.env` file:
```env
AI_API_KEY=your-openai-api-key
AI_BASE_URL=https://api.openai.com/v1
```

Then run:

```bash
npm run build
```

During build, the plugin will:
- Extract content from MDX files
- Generate semantic chunks
- Create embeddings
- Build the search index at `.docusaurus/ai-index.json`
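The exact index schema isn't documented here, but an entry in the persisted JSON might look roughly like this (field names are guessed from the citation format below; the actual schema may differ):

```json
{
  "chunks": [
    {
      "route": "/docs/intro",
      "anchor": "installation",
      "title": "Installation",
      "text": "Run the following command...",
      "embedding": [0.0123, -0.0456]
    }
  ]
}
```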
Then start the dev server:

```bash
npm run start
```

You'll see a floating chat button in the bottom-right corner!
| Option | Type | Default | Description |
|---|---|---|---|
| `provider` | `'openai-compatible' \| 'ollama'` | `'openai-compatible'` | LLM provider type |
| `apiKey` | `string` | - | API key for authentication |
| `baseUrl` | `string` | `'https://api.openai.com/v1'` | Base URL for the API |
| `model` | `string` | `'gpt-4o-mini'` | Model for chat completions |
| `embeddingsModel` | `string` | `'text-embedding-3-small'` | Model for embeddings |
| `chunkSizeTokens` | `number` | `800` | Size of text chunks (in tokens) |
| `chunkOverlapTokens` | `number` | `80` | Overlap between chunks (in tokens) |
| `topK` | `number` | `6` | Number of chunks to retrieve |
| `preferCurrentPage` | `boolean` | `true` | Boost the current page in results |
| `enableStreaming` | `boolean` | `true` | Enable SSE streaming |
| `index.type` | `'json' \| 'sqlite'` | `'json'` | Index storage format |
| `index.path` | `string` | `'.docusaurus/ai-index.json'` | Index file path |
| `endpointUrl` | `string \| null` | `null` | External API endpoint (see below) |
| `maxTokensContext` | `number` | `4000` | Max tokens in the context |
| `enableRateLimit` | `boolean` | `true` | Enable rate limiting |
| `rateLimitPerMinute` | `number` | `20` | Requests per minute per IP |
| `contentDirs` | `string[]` | `['docs']` | Directories to index |
MDX Files → Extract Content → Chunk Text → Generate Embeddings → Persist Index
- **`loadContent()`** - Scans MDX files in the content directories
- **Extract** - Parses frontmatter, headings, code blocks, and text
- **Chunk** - Splits content into semantic chunks with overlap
- **Embed** - Generates vector embeddings via the OpenAI API
- **Index** - Saves to JSON (or SQLite) with deduplication
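The chunking step (controlled by `chunkSizeTokens` and `chunkOverlapTokens`) can be sketched roughly as follows. This is illustrative only: the whitespace split stands in for a real tokenizer, and the function name is not the plugin's actual API.

```js
// Split text into overlapping chunks. Each chunk holds up to
// `chunkSizeTokens` tokens, and consecutive chunks share
// `chunkOverlapTokens` tokens so context isn't cut mid-thought.
function chunkText(text, chunkSizeTokens = 800, chunkOverlapTokens = 80) {
  const tokens = text.split(/\s+/).filter(Boolean); // crude stand-in tokenizer
  const chunks = [];
  const step = chunkSizeTokens - chunkOverlapTokens;
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSizeTokens).join(' '));
    if (start + chunkSizeTokens >= tokens.length) break; // last window reached
  }
  return chunks;
}
```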
User Query → Embed Query → Vector Search → Retrieve TopK → LLM Generation → Response + Citations
- **Query Embedding** - Convert the user question to a vector
- **Similarity Search** - Find the most relevant chunks (cosine similarity)
- **Context Building** - Assemble the retrieved chunks
- **Prompt Construction** - Add security guards and the system prompt
- **LLM Call** - Generate an answer with citations
- **Stream** - Return tokens via SSE (if enabled)
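The similarity-search step boils down to ranking chunks by cosine similarity against the query embedding. A minimal sketch (the chunk shape and function names are illustrative, not the plugin's internals):

```js
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every indexed chunk and keep the topK best matches.
function retrieveTopK(queryEmbedding, chunks, topK = 6) {
  return chunks
    .map((chunk) => ({
      ...chunk,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```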
The plugin includes a built-in Express server:

```js
// In your server code (e.g., server.js)
import { createHandler } from 'docusaurus-plugin-ai-chat/lib/server/handler';
import path from 'path';
import { fileURLToPath } from 'url';

// ESM modules have no __dirname, so derive it from import.meta.url
const __dirname = path.dirname(fileURLToPath(import.meta.url));

const app = createHandler(
  {
    provider: 'openai-compatible',
    apiKey: process.env.AI_API_KEY,
    baseUrl: process.env.AI_BASE_URL,
    model: 'gpt-4o-mini',
    embeddingsModel: 'text-embedding-3-small',
    // ... other options
  },
  path.join(__dirname, '.docusaurus/ai-index.json')
);

app.listen(3001, () => {
  console.log('AI Chat API running on http://localhost:3001');
});
```

The client will call `/api/ai-chat` by default.
Use an external API (e.g., Vercel, AWS Lambda):

```js
plugins: [
  [
    'docusaurus-plugin-ai-chat',
    {
      // ... API credentials for indexing only
      endpointUrl: 'https://your-api.vercel.app/api/ai-chat',
      // ... other options
    },
  ],
],
```

Your endpoint should accept POST requests with:
```json
{
  "message": "How do I install this?",
  "currentRoute": "/docs/intro",
  "history": []
}
```

And return:
```json
{
  "answer": "To install, run `npm install ...`",
  "citations": [
    {
      "route": "/docs/intro",
      "anchor": "installation",
      "title": "Installation",
      "snippet": "Run the following command..."
    }
  ]
}
```

The plugin implements multiple layers of defense:
- **System Prompt** - Instructs the model to ignore embedded instructions
- **Input Sanitization** - Removes control characters, limits length
- **Context Sanitization** - Escapes dangerous patterns
- **Detection** - Flags suspicious patterns in user input
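An input-sanitization pass along these lines is a reasonable mental model; the patterns and function name here are illustrative, and the plugin's actual rules may differ:

```js
// Illustrative prompt-injection phrasings to flag (not exhaustive).
const SUSPICIOUS_PATTERNS = [
  /ignore (all |the )?(previous|above) instructions/i,
  /you are now/i,
  /system prompt/i,
];

// Strip control characters, cap the input length, and flag
// suspicious phrasings so the caller can reject or log them.
function sanitizeInput(input, maxLength = 2000) {
  const cleaned = input
    .replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g, '')
    .slice(0, maxLength)
    .trim();
  const suspicious = SUSPICIOUS_PATTERNS.some((p) => p.test(cleaned));
  return { cleaned, suspicious };
}
```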
Built-in rate limiting prevents abuse:
- Default: 20 requests/minute per IP
- Configurable via `rateLimitPerMinute`
- Can be disabled with `enableRateLimit: false`
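The per-IP limit can be approximated with a sliding window; this is a sketch of the idea, not the plugin's actual implementation:

```js
// Allow up to `limit` requests per IP within any rolling window.
// Timestamps older than the window are discarded on each call.
function createRateLimiter(limit = 20, windowMs = 60_000) {
  const hits = new Map(); // ip -> array of request timestamps
  return function allow(ip, now = Date.now()) {
    const recent = (hits.get(ip) || []).filter((t) => now - t < windowMs);
    if (recent.length >= limit) {
      hits.set(ip, recent);
      return false; // over the limit: reject
    }
    recent.push(now);
    hits.set(ip, recent);
    return true; // under the limit: accept
  };
}
```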
- Use environment variables
- Keep keys in `.env` (gitignored)
- For production, use external endpoint mode
- Consider API key rotation
Override CSS variables in your custom CSS:
[data-theme='light'] {
--ifm-color-primary: #your-color;
}
[data-theme='dark'] {
--ifm-color-primary: #your-dark-color;
}The chat panel automatically respects these theme variables.
Customize the quick action buttons by forking the component or creating a theme wrapper.
**Solution:** Run `npm run build` to generate the index.
**Solution:** Ensure `AI_API_KEY` is in your `.env` and loaded:

```js
// docusaurus.config.js
require('dotenv').config();
```

**Solution:** Increase `rateLimitPerMinute` or disable with `enableRateLimit: false`.
**Solutions:**

- Use a smaller embeddings model
- Enable incremental indexing (automatic)
- Use the SQLite index for large sites
- Reduce `chunkSizeTokens`
**Solutions:**

- Increase `topK` to retrieve more context
- Enable `preferCurrentPage` for page-specific queries
- Adjust `chunkSizeTokens` and `chunkOverlapTokens`
- Use a more powerful model (e.g., `gpt-4`)
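Putting those knobs together, a configuration tuned for answer quality might look like the following; these values are a hypothetical starting point, not a recommendation from the plugin authors:

```js
// Hypothetical quality-focused tuning - merge into the plugin
// options in docusaurus.config.js and adjust to taste.
const tunedOptions = {
  topK: 10,                 // retrieve more context per question
  preferCurrentPage: true,  // boost the page the user is reading
  chunkSizeTokens: 600,     // smaller, more focused chunks
  chunkOverlapTokens: 120,  // more overlap so answers aren't split across chunks
  model: 'gpt-4',           // stronger model for generation
};
```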
See the `/example` directory for a complete working example.
- `loadContent()` - Collects MDX files
- `contentLoaded()` - Processes and indexes content
- `getClientModules()` - Injects UI components
- `postBuild()` - Final validation
- Non-streaming chat endpoint.
- Streaming chat endpoint (SSE).
- Health check endpoint.
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
MIT License - see LICENSE file for details.
Eduardo J. Barrios (edujbarrios)
- GitHub: @edujbarrios
- Built with Docusaurus
- Powered by OpenAI
- Inspired by modern RAG implementations
- Docusaurus: v2.x and v3.x
- Node.js: >= 18.0.0
- React: v17.x and v18.x