Use this document as a single input to an LLM for designing the graphical UI page by page.
CapYap Local Agent
Turn any YouTube video (or transcript file) into a citation-grounded Q/A workspace with timestamp evidence.
- Fast answers from long videos
- Evidence links with timestamps
- Local-first usage with user-controlled LLM provider
- Backend: FastAPI + LangGraph
- Frontend: React + Vite
- Desktop: Tauri (macOS/Windows/Linux)
- Local storage: JSON settings + transcript cache
- Non-coder learners (students, researchers, creators)
- Power users with custom/local model endpoints
- Make setup non-technical (paste key once, start asking)
- Keep trust high (always show transcript evidence)
- Keep app light and fast (minimal screens, low cognitive load)
Non-goals:
- Heavy analytics dashboards
- Enterprise admin controls
- Large plugin marketplace UI
- One primary action per page
- Clear next step on every screen
- Minimal chrome, low visual noise
- Mobile-safe responsive layout
- Avoid deep menus; max 2 levels of navigation
- Show progress and errors in plain language
- Page 1: Welcome + Onboarding
- Page 2: Source Loader (YouTube URL / local transcript file)
- Page 3: Ask Workspace (single-turn + cited answer)
- Page 4: Talk to Agent Popup (multi-turn)
- Page 5: Settings (provider/model/performance)
Page 1: Welcome + Onboarding
- Inputs:
- Provider name
- Base URL
- Model
- API token (optional local save)
- Token env variable name
- Retrieval defaults (top-k, chunk size, languages)
- Actions:
- Validate and save settings
- Continue to workspace
- States:
- Empty/new user
- Saving
- Save success
- Validation/API failure
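As a minimal sketch of the "validate and save settings" action, the onboarding inputs listed above could be checked on the client before anything is sent to the backend. The `AgentSettings` shape, its field names, and the wording of the messages are assumptions made for illustration; the real settings schema is whatever the backend defines.

```typescript
// Hypothetical settings shape; field names are assumptions derived from the
// onboarding inputs, not a confirmed backend schema.
interface RetrievalDefaults {
  topK: number;
  chunkSize: number;
  languages: string[];
}

interface AgentSettings {
  provider: string;
  baseUrl: string;
  model: string;
  apiToken?: string; // optional local save
  tokenEnvVar?: string; // name of the env variable holding the token
  retrieval: RetrievalDefaults;
}

// Returns plain-language messages; an empty array means the form can be saved.
function validateSettings(s: AgentSettings): string[] {
  const errors: string[] = [];
  if (!s.provider.trim()) errors.push("Enter a provider name.");
  if (!s.model.trim()) errors.push("Enter a model name.");
  try {
    const url = new URL(s.baseUrl);
    if (url.protocol !== "http:" && url.protocol !== "https:") {
      errors.push("The base URL must start with http:// or https://.");
    }
  } catch {
    errors.push("The base URL is not a valid web address.");
  }
  if (!Number.isInteger(s.retrieval.topK) || s.retrieval.topK < 1) {
    errors.push("Top-k must be a whole number of at least 1.");
  }
  if (!Number.isInteger(s.retrieval.chunkSize) || s.retrieval.chunkSize < 1) {
    errors.push("Chunk size must be a whole number of at least 1.");
  }
  if (s.retrieval.languages.length === 0) {
    errors.push("Choose at least one transcript language.");
  }
  return errors;
}
```

The token fields are deliberately not required here, since a local model endpoint may not need one. An empty result maps to the "Saving" state; a non-empty result maps to the validation failure state with the messages shown inline.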
Page 2: Source Loader
- Inputs:
- YouTube URL/ID or local .txt path
- Actions:
- Load transcript
- Show source metadata (id, chunks, words)
- States:
- Idle
- Loading
- Loaded
- Error (invalid URL, missing transcript, fetch failure)
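The source loader states above can be modeled as a small discriminated union, with the input classified before any request is made. This is a sketch under stated assumptions: the type and function names are hypothetical, and the 11-character video ID pattern reflects the commonly observed YouTube ID format rather than a documented guarantee.

```typescript
// Hypothetical state model for the loader; names are illustrative only.
type LoadState =
  | { status: "idle" }
  | { status: "loading"; source: string }
  | { status: "loaded"; sourceId: string; chunks: number; words: number }
  | { status: "error"; message: string };

type Source =
  | { kind: "youtube"; videoId: string }
  | { kind: "file"; path: string }
  | { kind: "invalid" };

// Assumption: video IDs are 11 characters drawn from letters, digits, "_" and "-".
const VIDEO_ID = /^[A-Za-z0-9_-]{11}$/;

// Accepts a bare ID, a youtube.com/watch?v=... URL, or a youtu.be/... short URL.
function extractVideoId(input: string): string | null {
  const text = input.trim();
  if (VIDEO_ID.test(text)) return text;
  let url: URL;
  try {
    url = new URL(text);
  } catch {
    return null; // not a bare ID and not a parseable URL
  }
  const host = url.hostname.replace(/^www\./, "");
  let candidate: string | null = null;
  if (host === "youtu.be") {
    candidate = url.pathname.slice(1);
  } else if (host === "youtube.com" || host === "m.youtube.com") {
    candidate = url.searchParams.get("v");
  }
  return candidate !== null && VIDEO_ID.test(candidate) ? candidate : null;
}

function classifySource(input: string): Source {
  const videoId = extractVideoId(input);
  if (videoId !== null) return { kind: "youtube", videoId };
  const text = input.trim();
  if (text.toLowerCase().endsWith(".txt")) return { kind: "file", path: text };
  return { kind: "invalid" };
}

// Maps user input to the next UI state when "Load transcript" is pressed.
function startLoad(input: string): LoadState {
  const source = classifySource(input);
  if (source.kind === "invalid") {
    return { status: "error", message: "Paste a YouTube link or choose a .txt transcript file." };
  }
  return { status: "loading", source: input.trim() };
}
```

Rejecting an unrecognized input locally lets the "invalid URL" error appear immediately, while missing transcripts and fetch failures would still come back from the backend.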
Page 3: Ask Workspace
- Inputs:
- Single question text area
- Outputs:
- Answer text
- Citation list with chunk id + timestamp range + excerpt
- Actions:
- Ask agent
- Jump to YouTube timestamp via link
- States:
- Asking
- Answer ready
- No evidence / weak evidence
- Error
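To illustrate the citation list and the jump-to-timestamp action, the sketch below formats a timestamp range and builds a YouTube watch link that starts at the cited moment. The `Citation` field names are assumptions, not the backend's confirmed response shape; the `t=<seconds>s` query parameter is the widely used way to open a YouTube video at an offset.

```typescript
// Hypothetical citation shape; the real API response may use different names.
interface Citation {
  chunkId: number;
  startSeconds: number;
  endSeconds: number;
  excerpt: string;
}

// Formats seconds as m:ss, or h:mm:ss for offsets of an hour or more.
function formatTimestamp(totalSeconds: number): string {
  const s = Math.max(0, Math.floor(totalSeconds));
  const hours = Math.floor(s / 3600);
  const minutes = Math.floor((s % 3600) / 60);
  const seconds = s % 60;
  const pad = (n: number): string => String(n).padStart(2, "0");
  return hours > 0
    ? `${hours}:${pad(minutes)}:${pad(seconds)}`
    : `${minutes}:${pad(seconds)}`;
}

// Builds a watch URL that opens the video at the start of the cited range.
function timestampLink(videoId: string, startSeconds: number): string {
  const url = new URL("https://www.youtube.com/watch");
  url.searchParams.set("v", videoId);
  url.searchParams.set("t", `${Math.max(0, Math.floor(startSeconds))}s`);
  return url.toString();
}

// Short label for a citation card header, e.g. "[chunk-3] 1:35-2:10".
function citationLabel(c: Citation): string {
  return `[chunk-${c.chunkId}] ${formatTimestamp(c.startSeconds)}-${formatTimestamp(c.endSeconds)}`;
}
```

For a transcript loaded from a local file there is no video to link to, so a citation card would show the label and excerpt without the open-link action.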
Page 4: Talk to Agent Popup
- Trigger:
- Floating “Talk to Agent” button
- Behavior:
- Opens chat panel above current page
- Keeps recent conversation turns
- Each assistant response can include citations
- Actions:
- Send follow-up questions
- Close/reopen without losing current session state
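One way to satisfy "keeps recent conversation turns" and "close/reopen without losing current session state" is to hold the turns outside the popup component and treat open/closed as just another field. The reducer below is a sketch; the `MAX_TURNS` cap of 20 and all type names are assumptions chosen for illustration.

```typescript
type Role = "user" | "assistant";

interface ChatTurn {
  role: Role;
  text: string;
  citations?: string[]; // assistant turns may carry citation labels
}

interface ChatSession {
  isOpen: boolean;
  turns: ChatTurn[];
}

type ChatAction =
  | { type: "open" }
  | { type: "close" }
  | { type: "append"; turn: ChatTurn };

// Assumed cap on how many recent turns are kept; tune to the model's context budget.
const MAX_TURNS = 20;

const initialSession: ChatSession = { isOpen: false, turns: [] };

// Pure reducer: closing the panel only flips isOpen, so turns survive a reopen.
function chatReducer(state: ChatSession, action: ChatAction): ChatSession {
  switch (action.type) {
    case "open":
      return { ...state, isOpen: true };
    case "close":
      return { ...state, isOpen: false };
    case "append":
      return {
        ...state,
        turns: [...state.turns, action.turn].slice(-MAX_TURNS),
      };
  }
}
```

Because the function is pure, it can be passed to React's `useReducer` in a component that stays mounted above the pages, so navigating or closing the popup does not reset the conversation.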
Page 5: Settings
- Edit:
- Provider/base URL/model/token settings
- Retrieval defaults
- Actions:
- Save
- Reset to defaults
- Optional “test connection”
- Top bar: product title + settings shortcut
- Source status chip: shows loaded source
- Citation card: [chunk-N], timestamp range, excerpt, open-link action
- Error banner: plain language + retry action
- Loading states: skeleton or minimal spinner with one sentence
- Style: clean, calm, productive
- Density: medium-compact (desktop), spacious touch targets (mobile)
- Color usage: 1 primary, 1 accent, neutral scale
- Typography: legible sans-serif, strong hierarchy
- Motion: subtle, purposeful, <200ms transitions
- Accessibility:
- Keyboard-navigable forms and chat
- Contrast-safe text/buttons
- Clear focus states
- Keep implementation lightweight:
- No large UI framework requirement
- Reuse small component primitives
- Avoid heavy animations/illustration bundles
- Compatible with:
- React + Vite frontend
- Tauri desktop wrapper
- Localhost backend API
Backend API endpoints:
- GET /api/settings
- POST /api/settings
- POST /api/transcripts/load
- POST /api/agent/chat
- GET /health
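A thin client over the five backend routes might look like the sketch below. Only the methods and paths come from this document; the request bodies (`{ source }`, `{ question }`), the untyped responses, and the `createApiClient` helper itself are assumptions, and the fetch function is injected so the client can be exercised without a running backend.

```typescript
// Minimal structural type so the browser's fetch or a test double can be passed in.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

function createApiClient(baseUrl: string, fetchImpl: FetchLike) {
  async function request(
    method: "GET" | "POST",
    path: string,
    body?: unknown
  ): Promise<unknown> {
    const init: { method: string; headers: Record<string, string>; body?: string } = {
      method,
      headers: { "Content-Type": "application/json" },
    };
    if (body !== undefined) init.body = JSON.stringify(body);
    const res = await fetchImpl(`${baseUrl}${path}`, init);
    if (!res.ok) {
      // Callers can turn this into the plain-language error banner.
      throw new Error(`Request to ${path} failed with status ${res.status}`);
    }
    return res.json();
  }

  return {
    health: () => request("GET", "/health"),
    getSettings: () => request("GET", "/api/settings"),
    // Payload shapes below are assumptions, not the confirmed backend contract.
    saveSettings: (settings: unknown) => request("POST", "/api/settings", settings),
    loadTranscript: (source: string) =>
      request("POST", "/api/transcripts/load", { source }),
    chat: (question: string) => request("POST", "/api/agent/chat", { question }),
  };
}
```

In the app this could be created once, for example `createApiClient("http://127.0.0.1:8000", fetch)`, where the host and port are placeholders for wherever the local backend is listening.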
You can paste the prompt below directly:
You are a senior product designer and frontend UX architect.
Design a lightweight, production-ready UI for a local-first app called "CapYap Local Agent".
Purpose: users load a YouTube video/transcript and ask questions answered with timestamp citations.
Tech constraints:
- Frontend: React + Vite
- Desktop shell: Tauri
- Keep bundle and visual complexity light
- Mobile and desktop responsive
Design the app page by page:
1) Welcome + Onboarding
2) Source Loader
3) Ask Workspace
4) Talk to Agent Popup
5) Settings
For each page provide:
- page goal
- content hierarchy
- exact components
- interaction flows and states (idle/loading/success/error)
- accessibility notes
- concise copywriting
Then provide:
- a shared design system token set (color/type/spacing/radius/shadow)
- a component inventory
- a final click-path user journey from first launch to first cited answer
Constraints:
- prioritize simplicity and trust
- emphasize citations and timestamp evidence
- avoid heavy UI patterns and keep it visually calm
- no unnecessary pages
- New user reaches first answer in <= 2 minutes
- Citation links are obvious and one-click
- Onboarding requires no coding knowledge
- Layout works at 360px width and desktop
- No page feels overloaded or visually heavy