[chatbot-demo]: Cerebras llama3.1-8b + split-screen dual streaming UI #17
Open: jonah-berman wants to merge 53 commits into main from devin/1773098223-cerebras-inference
Commits (53), all authored by devin-ai-integration[bot]:

- 8b06a33 feat(chatbot-demo): switch inference from OpenRouter to Cerebras API
- 4be8cb3 feat(chatbot-demo): add Cerebras logo to UI header and assets
- 1912c15 feat(chatbot-demo): update UI with proper logos, model info, and Exa …
- 4257566 fix: add reasoning_format hidden to suppress reasoning token leakage
- 4f829f7 fix: add server-side post-processing to strip gpt-oss-120b reasoning …
- 0c34f77 fix: improve reasoning artifact cleaning to strip plain text reasonin…
- fc532e0 fix: improve reasoning artifact cleaning with better markdown detection
- 2de2281 fix: add maxDuration config and SSE heartbeats to prevent Vercel timeout
- dde8997 refactor: swap gpt-oss-120b to llama3.3-70b, remove all reasoning art…
- d477d1b fix(chatbot-demo): correct model name to llama3.1-8b (only available …
- 7bdef63 fix: use non-streaming initial call for llama3.1-8b tool detection
- a9a02fc fix: handle stringified nested JSON in llama3.1-8b tool call arguments
- 19ba2a3 fix: parse tool calls from content text when llama3.1-8b skips tool_c…
- e719555 refactor: reset to original code, minimal API swap to Cerebras llama3…
- 510ac15 fix: use non-streaming initial call to prevent tool call JSON leaking…
- 1a5f325 fix: detect tool calls using 'parameters' key in addition to 'arguments'
- 2351ad0 fix: add SSE heartbeats to keep connection alive during non-streaming…
- ce0d4a1 refactor: back to streaming with buffered content detection
- dba0675 fix: add res.flushHeaders() to prevent browser SSE hang
- 076f0f9 fix: add SSE heartbeat comments to prevent follow-up query connection…
- 38e93c4 fix: robust tool call parsing with regex fallback for malformed JSON
- 498ce30 fix: filter tool call JSON from final response content stream
- def8949 fix: retry once when model returns empty response (llama3.1-8b interm…
- f24134c Merge branch 'devin/1773098223-cerebras-inference' of https://git-man…
- fc17480 fix: truncate assistant history to 500 chars to prevent empty follow-…
- 4f6be22 fix: normalize string array searches from llama3.1-8b tool calls
- 9b5c4a0 fix: harden tool call detection with prefix-tolerant parsing, single-…
- a1dcebc chore: change Market Intelligence card to AI & Robotics Fundraises
- e8ea6fd chore: remove OpenAI logo and value proposition bullets from header
- 107fa5a feat: split-screen dual streaming with Exa mode toggle and latency tr…
- 2be76a7 feat: dropdown mode toggle, instant source display, server-side Exa l…
- aa27f62 feat: flash sources banner briefly then vanish, restore bottom source…
- da01b49 fix: match latency bar heights with fixed h-10 on both panes
- b6a3028 chore: change suggestion card to Super Bowl question
- 0cca2e9 fix: strip leaked followups/tool-call JSON from both panes
- eda2e14 fix: client-side cleanup of followups/tool-call JSON, validate chart …
- e5ed3f5 fix: guard against undefined/empty code blocks in CodeBlock renderer
- cd03cc0 fix: strip empty code fences and trailing unclosed fences from content
- 4f6120d feat: default to fast mode, add refresh search button next to mode dr…
- e2792bd fix: refresh button resets to starter screen instead of re-running query
- 0e93d4b chore: change Super Bowl question to 'Who won the Super Bowl?'
- 8eeb5fb feat: switch to gpt-oss-120b, add 429 retry with exponential backoff
- 6225e45 chore: default Exa mode to auto
- 9a2e1d7 perf: eliminate first Cerebras call on Exa path, search directly with…
- 29c03fc feat: use OpenRouter (Gemini Flash) for query generation, Cerebras fo…
- 1cdc2d0 chore: trigger redeploy with OPEN_ROUTER_KEY env var for preview
- c83fcab fix: improve Cerebras summarization - increase highlight text, use fo…
- 7344234 fix: cleaner summarize prompt - no raw URLs/source blocks, no preamble
- e6f1e29 fix: remove startPublishedDate filter from Exa calls - query year is …
- 6ae255d feat: default to Exa Instant, add mode toggle to home screen, reset m…
- 4c8809b feat: revert to Cerebras for query generation, add detailed latency b…
- 0b0ab3a style: add Cerebras logo next to Tool Call and Synthesis in latency bar
- 60397a9 fix: reduce people category usage in search prompts, require paired n…
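Several commits above (2de2281, 2351ad0, dba0675, 076f0f9) add SSE heartbeats and `res.flushHeaders()` to keep the streaming connection alive through Vercel's proxy. A minimal sketch of that pattern, assuming an Express-style Node response object (`startSSE` is a hypothetical helper name, not taken from the PR):

```javascript
// Sketch of the SSE keep-alive pattern referenced in the commits above.
// Lines beginning with ":" are SSE comments: valid protocol frames that
// EventSource clients silently ignore, so they work well as heartbeats.
function startSSE(res) {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  });
  // Flush headers immediately so the browser opens the stream without
  // waiting for the first data chunk (the "SSE hang" fixed in dba0675).
  res.flushHeaders?.();

  // Heartbeat: an SSE comment every 15s keeps intermediaries from
  // closing an idle connection during slow upstream calls.
  const heartbeat = setInterval(() => res.write(': heartbeat\n\n'), 15000);

  return {
    send(data) {
      res.write(`data: ${JSON.stringify(data)}\n\n`);
    },
    end() {
      clearInterval(heartbeat);
      res.end();
    },
  };
}
```

The heartbeat interval is an assumption; the actual value would depend on the platform's idle timeout.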
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔴 Hardcoded API key exposed in source code (api/chat.js)
The same hardcoded Cerebras API key csk-ctnvpnrpxw5t244c83c84pdecwk9tpfdp3jkvece9kve248xis is exposed in api/chat.js:6. This is the Vercel serverless function for the non-streaming chat endpoint. The key should be loaded exclusively from environment variables.
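A minimal sketch of the fix the review asks for, assuming the handler shape of a standard Vercel Node serverless function (the variable name CEREBRAS_API_KEY and the upstream endpoint are assumptions, not taken from the PR's actual code):

```javascript
// api/chat.js (sketch): read the Cerebras key from the environment at
// request time instead of hardcoding it in source. Set CEREBRAS_API_KEY
// in the Vercel project settings, never in version control.
export default async function handler(req, res) {
  const apiKey = process.env.CEREBRAS_API_KEY;

  // Fail fast with a server error if the key is missing, rather than
  // sending an unauthenticated request upstream.
  if (!apiKey) {
    res.status(500).json({ error: 'CEREBRAS_API_KEY is not configured' });
    return;
  }

  const upstream = await fetch('https://api.cerebras.ai/v1/chat/completions', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(req.body),
  });

  res.status(upstream.status).json(await upstream.json());
}
```

Since the key was committed, rotating it in the Cerebras dashboard would also be prudent; removing it from source does not revoke the exposed value.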