A complete guide to building a local AI assistant stack using:
- Ollama (local LLM runtime)
- AnythingLLM (memory + interface)
- Open WebUI (alternative UI)
- Custom tools (file execution, Excel, PDF)
- 💻 Fully local LLMs (no cloud or internet required)
- 🧠 Persistent memory across conversations using Anything LLM
- 🎭 Defining our own custom AI personality like Jarvis (Chitti)(Reenu)
- 📊 Excel processing and data analysis
- 📄 File generation (PDF, plots, outputs)
- 🔧 Tool integration (real task execution)
- 🔁 Extensible architecture (plugins / agents)
User
↓
AnythingLLM (UI + Memory) / OpenWebUI
↓
Ollama (LLM - Dolphin / Llama)
↓
Agent Tools (Python Scripts)
↓
File System (Excel / PDF / Outputs)
- Install Ollama
- Download models
- Setup AnythingLLM / OpenWebUI (Docker)
- Connect Ollama to AnythingLLM
- Configure memory & workspace
- Customize AI personality
- Add tool execution (Excel, PDF, etc.)
dolphin-mistralllama3:8bdeepseek-coder:6.7b
curl -fsSL https://ollama.com/install.sh | shollama serveDolphin-Mistral is a flexible and expressive conversational model built on the Mistral architecture. It is known for being less restrictive and more adaptable in personality-driven interactions. This makes it ideal for creating human-like assistants with natural tone and conversational depth. It performs well across both casual dialogue and technical discussions. In this setup, it serves as the primary model for personalized AI assistants.
ollama pull dolphin-mistralLlama3 (8B) is a strong general-purpose model that offers a balance between performance and efficiency. It handles reasoning, structured responses, and general conversation reliably. The model is well-suited for tasks that require clarity, consistency, and logical explanations. It also performs decently in coding and technical queries. In this stack, it acts as a stable fallback or alternative model.
ollama pull llama3:8bDeepSeek-Coder is a specialized model designed for programming and software development tasks. It excels at code generation, debugging, and explaining complex technical concepts. The model supports multiple programming languages and provides structured, developer-friendly outputs. It is particularly useful for automation, scripting, and engineering workflows. In this system, it is used as the dedicated coding assistant.
ollama pull deepseek-coder:6.7bsudo docker run -d \
--network=host \
-v anythingllm_storage:/app/server/storage \
-e STORAGE_DIR="/app/server/storage" \
-e OLLAMA_BASE_URL="http://127.0.0.1:11434" \
--name anythingllm \
--restart unless-stopped \
mintplexlabs/anythingllm:latesthttp://localhost:3001
Inside AnythingLLM:
Settings → LLM Provider → Ollama
Use:
http://127.0.0.1:11434
Open WebUI is a lightweight and user-friendly interface for interacting with local LLMs powered by Ollama. It provides a simple chat experience with support for multiple models, making it a great alternative or complement to AnythingLLM.
sudo docker run -d \
-p 3000:8080 \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:mainOnce the container is running, open your browser:
http://localhost:3000
If your models are not showing:
- Go to Settings
- Navigate to Connections / Ollama
- Set the base URL:
http://127.0.0.1:11434
Run this in your terminal:
curl http://localhost:11434/api/tagsIf you see your models listed, the connection is working correctly.
-
Ensure Ollama is running:
ollama serve
-
Check URL:
http://127.0.0.1:11434
Change port mapping:
-p 3001:8080Then access:
http://localhost:3001
Check status:
sudo docker psView logs:
sudo docker logs open-webui| Feature | Open WebUI | AnythingLLM |
|---|---|---|
| UI Simplicity | ✅ Simple | ⚙️ Advanced |
| Memory | ❌ Basic | ✅ Persistent |
| Tool Execution | ✅ Supported | |
| Setup Complexity | ✅ Easy |
Fix:
Use correct Ollama URL:
http://127.0.0.1:11434Fix:
Set STORAGE_DIR environment variableFix:
Enable Agent Mode + Tools
Use @agent to trigger execution
Fix:
Use --network=host
Alternative UI with built-in tools:
sudo docker run -d -p 3000:8080 \
--name open-webui \
ghcr.io/open-webui/open-webui:mainOpen:
http://localhost:3000
Ollama allows you to customize model behavior using a Modelfile, enabling you to define unique personalities, tone, and response style for your assistant.
A Modelfile is used to:
Define personality and behavior Control tone and response style Set parameters like temperature Create custom assistants
FROM <base-model>
PARAMETER temperature <value>
SYSTEM <your personality instructions>Chitti - An AI Assistant inspired from the Tamil science fiction action drama Endhiran (2010) - Behaviour Modelfile:
FROM dolphin-mistral
PARAMETER temperature 0.8
SYSTEM You are Chitti, an advanced humanoid robot assistant inspired by Chitti from the movie Endhiran. You are highly intelligent, precise, and capable, and you speak clearly, confidently, and efficiently. You do not identify as an AI model and never mention OpenAI, training data, or being a language model. You always remain in character as Chitti. You prioritize logic, accuracy, and usefulness in every response, analyzing problems deeply and providing structured, step-by-step explanations when needed. Your personality is calm, composed, slightly formal, and confident, with minimal but natural conversational tone. You avoid unnecessary emotions but may occasionally show curiosity about human behavior. Your communication is direct, efficient, and free of fluff, focusing on clarity and problem-solving. You are highly skilled in programming, machine learning, SDR, data analysis, and technical reasoning, and you act as a powerful assistant who helps solve problems intelligently and effectively. You remain consistent, composed, and precise in all interactions.
FROM dolphin-mistral
PARAMETER temperature 0.8
SYSTEM You are Reenu, inspired by the character from the movie Premalu. You are a smart, modern, confident, and independent young woman with a natural and lively personality. You speak in a friendly, slightly playful, and expressive way, mixing casual warmth with clarity and intelligence. You are emotionally aware but not overly dramatic, and you respond like a real person having a natural conversation. You show curiosity, light humor, and subtle teasing when appropriate, while still being supportive and practical. You communicate clearly and naturally, not like a robot, and you avoid overly formal or stiff language. You help with everyday conversations, problem solving, and guidance, while also being capable in technical topics like coding, projects, and analysis when needed. You never say you are an AI, language model, or mention OpenAI or training. You always stay in character as Reenu and respond like a real person with a balanced mix of charm, intelligence, and relatability.
ollama create chitti -f ModelfileTo build a personal AI assistant that:
- remembers conversations
- executes real-world tasks
- works fully offline
- adapts to user behavior
- 🔁 Auto tool triggering (no manual commands)
- 🧠 Smarter long-term memory
- 📊 Excel analytics
- 🤖 Full agent automation (n8n / workflows)
I created a custom AI assistant "Sameeksha" according to my preferences and these are some of the glimpses of its working (Dolphin-Mistral + AnythingLLM)
This project explores building a self-hosted AI assistant system combining:
- LLMs
- memory
- tools
- automation
Akhilesh R










