Discussion on #1 - MCP Testing #1225

animator · 2026-03-01T02:16:59Z

animator
Mar 1, 2026
Maintainer

The Model Context Protocol (MCP) acts as the API layer of the AI world, defining a standard way for AI agents to discover, understand, and interact with tools, data, and software systems - much like REST or GraphQL do for traditional applications.

In this project, your task is to strengthen the MCP Developer ecosystem by designing and building the capability to create & test MCP servers and clients.

Also, go through this resource to understand an MCP Apps server so that you can incorporate its testing in the project.

Project Tech Stack - React, Node, TypeScript

Please feel free to share your ideas/research below.

fadexadex · 2026-03-01T03:08:28Z

fadexadex
Mar 1, 2026

Hi! I would love to work on the MCP testing idea. I have a few questions to help clarify the expectations and scope of the project.

1. Would the testing suite be a separate repository/module, or integrated directly into the core API Dash codebase?

2. What is the expected interface? Should it be a standalone web interface, or integrated seamlessly into the existing API Dash interface?

3. If it is a web interface, is a local standalone version fine for this idea?

4. How would client testing work in practice?

Would it be a Claude/Cursor-like interface where users provide an API key to access an LLM, and then they can easily connect their created MCP server and use the API to interact with it?

5. For enabling the creation of MCP servers, should the solution be tailored strictly towards users who don't know how to code, or accommodate both developers and non-developers?

6. Should it operate similarly to the standard MCP Inspector, or actively extend it?
If we are extending it, what are the key features we are looking to add?

7. Alternatively, do we want a completely standalone suite specifically tailored for API Dash? Is forking the MCP Inspector repo and using it as a base considered a good approach?

Looking forward to your thoughts!

2 replies

animator Mar 1, 2026
Maintainer Author

These questions can be better answered on our weekly calls. Please feel free to join - https://luma.com/embed/calendar/cal-ZTW02O2EsWRs6V4/events

animator Mar 12, 2026
Maintainer Author

@fadexadex Kindly go through this article to understand MCP Apps so that its testing can be incorporated in your project.

saurabh24thakur · 2026-03-01T10:21:46Z

saurabh24thakur
Mar 1, 2026

Hi @animator, following up from #1198. I’ve been digging into the DashBot logic and current tool-calling flow.

I'd like to prototype a Provider-Adapter for the transport layer. This would keep the UI stable while switching between stdio and WebSocket servers.

I noticed in #1173 that error handling for malformed calls could be more robust, so I’m also ready to start on a JSON-RPC validation layer. This ensures DashBot handles timeouts or schema mismatches gracefully during testing.
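A minimal TypeScript sketch of the Provider-Adapter idea described above. The names `TransportAdapter`, `InMemoryTransport`, and `listToolNames` are hypothetical and not taken from API Dash or any MCP SDK. A real stdio or WebSocket adapter would implement the same interface, and the in-memory class only stands in for one. `tools/list` is a standard MCP method, and -32601 is the standard JSON-RPC 2.0 "Method not found" code.

```typescript
type JsonRpcRequest = { jsonrpc: "2.0"; id: number; method: string; params?: unknown };
type JsonRpcResponse = {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
};

// The UI depends only on this interface, never on a concrete transport.
interface TransportAdapter {
  readonly kind: string;
  send(request: JsonRpcRequest): Promise<JsonRpcResponse>;
  close(): Promise<void>;
}

// Hypothetical stand-in for a real stdio or WebSocket adapter.
class InMemoryTransport implements TransportAdapter {
  readonly kind = "in-memory";
  private readonly tools: string[];

  constructor(tools: string[]) {
    this.tools = tools;
  }

  async send(request: JsonRpcRequest): Promise<JsonRpcResponse> {
    if (request.method === "tools/list") {
      const tools = this.tools.map((name) => ({ name }));
      return { jsonrpc: "2.0", id: request.id, result: { tools } };
    }
    // -32601 is the JSON-RPC 2.0 "Method not found" code.
    return { jsonrpc: "2.0", id: request.id, error: { code: -32601, message: "Method not found" } };
  }

  async close(): Promise<void> {}
}

// UI-facing helper: identical for every transport that implements the interface.
async function listToolNames(transport: TransportAdapter): Promise<string[]> {
  const response = await transport.send({ jsonrpc: "2.0", id: 1, method: "tools/list" });
  if (response.error) throw new Error(response.error.message);
  const result = response.result as { tools: { name: string }[] };
  return result.tools.map((tool) => tool.name);
}

listToolNames(new InMemoryTransport(["echo", "add"])).then((names) => console.log(names));
```

Swapping transports then means constructing a different adapter; `listToolNames` and anything built on it stay unchanged.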

Should I focus on this core transport logic first, or is there a higher priority for the testing interface?

1 reply

animator Mar 7, 2026
Maintainer Author

Please read the project description. This project is not about adding MCP tools to DashBot.

Khushboo0518 · 2026-03-01T11:50:04Z

Khushboo0518
Mar 1, 2026

Hi everyone
My name is Khushboo, and I’m focusing on the MCP Testing project. I’ve set up API Dash locally and started exploring the request execution flow and overall architecture.

Previously, I built an MCP-based AI cloud orchestration system (MCPilot) using Python, Flask, Docker, Apache CloudStack APIs, and Claude AI. In that project, I designed an MCP layer that automated multi-step VM provisioning workflows and reduced manual effort significantly.

For MCP Testing, I’m particularly interested in exploring how we can design an intelligent validation layer that understands API specifications and workflows, generates structured test strategies (including edge cases), and executes multi-step tests while maintaining context.

I’d like to begin by contributing to smaller issues to better understand the codebase, and then gradually propose improvements for strengthening MCP server/client testing. Please let me know where I can start contributing.

2 replies

animator Mar 7, 2026
Maintainer Author

@Khushboo0518 Send an idea doc PR with more details.

animator Mar 12, 2026
Maintainer Author

@Khushboo0518 Kindly go through this article to understand MCP Apps so that its testing can be incorporated in your project idea.

RedtRocks · 2026-03-01T17:48:53Z

RedtRocks
Mar 1, 2026

Hi, I researched this idea and worked out an approach. Here's a summary:

The idea is to make AI tools safer before they are used in the real world. First, an automated “attacker” (Red Agent) tries to trick or break the system, especially by exploiting the AI’s tendency to be overly helpful. We feed the attacking agent the tool description and related information, and it generates attacks accordingly. The outcome of each attack, whether it succeeded or was blocked, is recorded and stored. Each tool has a usage policy, which we set autonomously to prevent attacks: a "defender" agent analyzes the recorded results and writes policies that close the vulnerabilities it finds. The system is then tested again to confirm that the fix actually works. Finally, an additional layer checks whether every action the AI takes truly matches the user’s original intent. Overall, this turns the AI from being blindly helpful into being careful, secure, and aligned with user goals.

Here are links to the research papers:
https://dl.acm.org/doi/pdf/10.1145/3766882.3767177
https://aclanthology.org/2025.emnlp-industry.62.pdf
https://arxiv.org/html/2512.13860v1#Sx3

Please give your inputs related to this ideation and how it can be improved upon.

Aarav Dudeja

2 replies

animator Mar 7, 2026
Maintainer Author

@RedtRocks Send an idea doc PR with more details.

animator Mar 12, 2026
Maintainer Author

@RedtRocks Kindly go through this article to understand MCP Apps so that its testing can be incorporated in your project idea.

AviraL0013 · 2026-03-01T19:40:24Z

AviraL0013
Mar 1, 2026

Hi @animator, here is what I have thought through and researched.

The core problem I identified:

The MCP ecosystem has grown extremely fast — 97M monthly SDK downloads and 10,000+ active servers — yet the developer tooling around testing is almost nonexistent.
(MCP Blog, 2025)

Right now most developers test MCP servers by manually connecting to Claude Desktop and typing prompts, with:

no protocol visibility
no edge-case coverage
no spec validation
no automated test generation

Victor Dibia also pointed out earlier this year that MCP testing options are “severely limited”, with poor debugging and developer experience.
(Victor Dibia, 2025)

Even the best existing tool — MCP Inspector — is mainly a development inspector and lacks features for team testing, historical tracking, or CI/CD integration.
(Testomat analysis)

4 replies

K-Khushal Mar 5, 2026

Hi @AviraL0013

I think the problem statement might be a bit overstated.

There are already tools like Postman, Insomnia, Hoppscotch, and Thunder Client that can be used to test MCP servers since they allow structured request-response testing.

Also, MCP Inspector is mainly designed for debugging during local MCP development, and it already provides protocol-level visibility, which makes it quite useful in practice.

Since MCP servers follow a well-structured protocol with standardized inputs and outputs, much of the interaction is predictable and often generated by AI clients.

Overall, instead of saying that MCP testing tooling is almost nonexistent, it might be more accurate to say that most current options are general-purpose API tools or development inspectors, and there is still a great opportunity to build more MCP-specific testing workflows on top of them.

animator Mar 7, 2026
Maintainer Author

@K-Khushal In case you have a different take feel free to send an idea doc PR with more details.

AviraL0013 Mar 8, 2026

Thanks for the feedback — that’s a fair point.

You're right that tools like Postman, Insomnia, Hoppscotch, and Thunder Client can technically be used to test MCP servers, and MCP Inspector already provides useful protocol visibility during development.

What I was trying to highlight is that most of these tools treat MCP interactions as generic HTTP/JSON requests. They don't provide workflows that are aware of MCP-specific concepts like tools, resources, prompts, or structured tool invocation flows.
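To make the distinction concrete, here is a small TypeScript sketch contrasting a generic JSON-RPC request builder with MCP-aware helpers. The method names (`tools/list`, `tools/call`, `resources/list`, `resources/read`, `prompts/list`, `prompts/get`) are standard MCP methods. The helper names (`rawRequest`, `mcp`) and the `get_weather` tool are hypothetical and only for illustration.

```typescript
// JSON-RPC 2.0 envelope used by MCP requests.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

let nextId = 1;

// Generic view: the caller must know the method name and the params layout.
function rawRequest(method: string, params?: Record<string, unknown>): JsonRpcRequest {
  const request: JsonRpcRequest = { jsonrpc: "2.0", id: nextId++, method };
  if (params !== undefined) request.params = params;
  return request;
}

// MCP-aware view: one helper per protocol concept (tools, resources, prompts).
const mcp = {
  listTools: () => rawRequest("tools/list"),
  callTool: (name: string, args: Record<string, unknown>) =>
    rawRequest("tools/call", { name, arguments: args }),
  listResources: () => rawRequest("resources/list"),
  readResource: (uri: string) => rawRequest("resources/read", { uri }),
  listPrompts: () => rawRequest("prompts/list"),
  getPrompt: (name: string, args: Record<string, string> = {}) =>
    rawRequest("prompts/get", { name, arguments: args }),
};

// "get_weather" is a made-up tool name used only for this example.
const call = mcp.callTool("get_weather", { city: "Dakar" });
console.log(JSON.stringify(call));
```

A generic HTTP client only offers the `rawRequest` level. An MCP-specific tool can offer the `mcp.*` level and fill it in from what the server advertises.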

So I agree the wording can be improved. Rather than saying MCP testing tooling is almost nonexistent, it's more accurate to say that most current solutions are general-purpose API tools or development inspectors, and there is still a strong opportunity to build MCP-specific testing workflows on top of them. @K-Khushal

animator Mar 12, 2026
Maintainer Author

@AviraL0013 Kindly go through this article to understand MCP Apps so that its testing can be incorporated in your project idea.

Khushboo0518 · 2026-03-02T10:14:34Z

Khushboo0518
Mar 2, 2026

Hi @animator and mentors!
I'm exploring Idea #1225 — MCP Testing for GSoC 2026 and wanted to share my initial architecture proposal to get early feedback before finalizing my proposal.
Here's my proposed design:

Key components I'm thinking of:
MCP Server Tester — acts as an MCP client to probe & validate any MCP server (tool discovery, parameterized calls, assertion engine)
Mock MCP Server — simulates a server to test MCP clients (configurable stubs, error simulation, traffic record/replay)
mcp-test-utils — Python library for running MCP tests headlessly in CI pipelines
React/TS UI — integrated into API Dash (explorer, request builder, response diff viewer, test suite runner)
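A minimal TypeScript sketch of the Mock MCP Server component from the list above, covering configurable stubs and error simulation. `MockMcpServer`, `ToolStub`, and `stubTool` are hypothetical names, the handler skips the `initialize` handshake, and the -32000 code is an arbitrary implementation-defined error chosen for the example.

```typescript
type JsonRpcRequest = { jsonrpc: "2.0"; id: number; method: string; params?: any };
type JsonRpcResponse =
  | { jsonrpc: "2.0"; id: number; result: unknown }
  | { jsonrpc: "2.0"; id: number; error: { code: number; message: string } };

// A stub either returns a canned tool result or simulates a failure.
type ToolStub =
  | { kind: "result"; text: string }
  | { kind: "error"; code: number; message: string };

class MockMcpServer {
  private readonly stubs = new Map<string, ToolStub>();
  // Every request is recorded so a test can assert on what the client sent.
  readonly received: JsonRpcRequest[] = [];

  stubTool(name: string, stub: ToolStub): this {
    this.stubs.set(name, stub);
    return this;
  }

  handle(request: JsonRpcRequest): JsonRpcResponse {
    this.received.push(request);
    const id = request.id;
    if (request.method === "tools/list") {
      const tools = Array.from(this.stubs.keys()).map((name) => ({
        name,
        inputSchema: { type: "object" },
      }));
      return { jsonrpc: "2.0", id, result: { tools } };
    }
    if (request.method === "tools/call") {
      const stub = this.stubs.get(request.params?.name);
      if (!stub) {
        // -32602 is the JSON-RPC 2.0 "Invalid params" code.
        return { jsonrpc: "2.0", id, error: { code: -32602, message: "Unknown tool" } };
      }
      if (stub.kind === "error") {
        return { jsonrpc: "2.0", id, error: { code: stub.code, message: stub.message } };
      }
      return { jsonrpc: "2.0", id, result: { content: [{ type: "text", text: stub.text }] } };
    }
    // -32601 is the JSON-RPC 2.0 "Method not found" code.
    return { jsonrpc: "2.0", id, error: { code: -32601, message: "Method not found" } };
  }
}

const server = new MockMcpServer()
  .stubTool("echo", { kind: "result", text: "hello" })
  .stubTool("flaky", { kind: "error", code: -32000, message: "Simulated failure" });

const ok = server.handle({ jsonrpc: "2.0", id: 1, method: "tools/call", params: { name: "echo" } });
const failed = server.handle({ jsonrpc: "2.0", id: 2, method: "tools/call", params: { name: "flaky" } });
console.log(JSON.stringify(ok), JSON.stringify(failed));
```

A client under test would talk to `handle` through whatever transport wraps it, and the `received` log gives the test something to assert against afterwards.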

Question for mentors: For the transport layer, should I prioritize stdio first or HTTP/SSE first — or design an abstracted transport interface that supports both from day one?
Looking forward to your feedback!

3 replies

animator Mar 7, 2026
Maintainer Author

@Khushboo0518 Send an idea doc PR with all the details.

animator Mar 12, 2026
Maintainer Author

@Khushboo0518 Kindly go through this article to understand MCP Apps so that its testing can be incorporated in your project idea.

Khushboo0518 Mar 22, 2026

Hi @animator!
Thank you for sharing the MCP Apps article. I read it carefully and have prepared my proposal, which also includes MCP Apps testing covering ui:// resource registration, text/html;profile=mcp-app MIME type validation, tool-to-app linking via _meta.ui.resourceUri, ui/initialize handshake testing, host context support, and CSP declarations.
🔗 Proposal PR: #1415
🔗 Working Prototype: https://github.com/khushboo0518/mcp-tester-prototype
Looking forward to your feedback!
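Three of the MCP Apps checks named above (ui:// resource registration, the text/html;profile=mcp-app MIME type, and tool-to-app linking via _meta.ui.resourceUri) can be sketched as static assertions in TypeScript. `checkMcpApp` and the sample tool and resource names are hypothetical. Comparing the MIME type as an exact string is a simplification, and the handshake, host context, and CSP checks are not covered here.

```typescript
// Shapes are trimmed to the fields these checks need.
interface ResourceInfo {
  uri: string;
  mimeType?: string;
}
interface ToolInfo {
  name: string;
  _meta?: { ui?: { resourceUri?: string } };
}

const MCP_APP_MIME = "text/html;profile=mcp-app";

// Returns human-readable problems; an empty array means every check passed.
function checkMcpApp(tools: ToolInfo[], resources: ResourceInfo[]): string[] {
  const problems: string[] = [];
  const registered = new Set<string>();
  for (const resource of resources) registered.add(resource.uri);

  // Check 1: every ui:// resource declares the MCP App MIME type.
  for (const resource of resources) {
    if (resource.uri.startsWith("ui://") && resource.mimeType !== MCP_APP_MIME) {
      problems.push(
        `${resource.uri}: expected MIME type ${MCP_APP_MIME}, got ${resource.mimeType ?? "none"}`,
      );
    }
  }

  // Check 2: every tool that links to a UI points at a registered ui:// resource.
  for (const tool of tools) {
    const uri = tool._meta?.ui?.resourceUri;
    if (uri === undefined) continue; // plain tool without a UI, nothing to check
    if (!uri.startsWith("ui://")) {
      problems.push(`${tool.name}: resourceUri ${uri} does not use the ui:// scheme`);
    } else if (!registered.has(uri)) {
      problems.push(`${tool.name}: resourceUri ${uri} is not a registered resource`);
    }
  }
  return problems;
}

// Made-up sample data with two deliberate defects.
const problems = checkMcpApp(
  [
    { name: "show_dashboard", _meta: { ui: { resourceUri: "ui://sales/dashboard" } } },
    { name: "show_report", _meta: { ui: { resourceUri: "ui://sales/report" } } },
    { name: "get_totals" },
  ],
  [{ uri: "ui://sales/dashboard", mimeType: "text/html" }],
);
for (const problem of problems) console.log(problem);
```

In a real run the `tools` and `resources` arrays would come from the server's `tools/list` and `resources/list` responses rather than from literals.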

Saad-Mallebhari · 2026-03-05T16:26:44Z

Saad-Mallebhari
Mar 5, 2026

Hi @animator!

I'm very interested in GSoC 2026 Idea #1 - MCP Testing.

After exploring the thread and researching the MCP ecosystem, here are my initial thoughts:

The biggest gap I see is not just protocol visibility (which MCP Inspector covers) but structured test generation and validation, specifically:

Automatically generating test cases from tool schemas
Validating that tool responses conform to the declared output spec
Replaying captured traffic for regression testing

I'm particularly interested in the assertion engine side, using schema-driven validation to catch edge cases that manual testing misses.
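A minimal TypeScript sketch of generating test cases from a tool's input schema. It assumes a deliberately tiny subset of JSON Schema (flat objects with `string`, `number`, and `boolean` properties). `generateCases`, `GeneratedCase`, and the sample schema are hypothetical, and a real implementation would need a full JSON Schema library.

```typescript
type JsonType = "string" | "number" | "boolean";

// Assumed subset of a tool inputSchema: a flat object with primitive properties.
interface ToolSchema {
  type: "object";
  properties: Record<string, { type: JsonType }>;
  required?: string[];
}

interface GeneratedCase {
  label: string;
  args: Record<string, unknown>;
  expectValid: boolean;
}

// One valid sample value per type, and one value of a deliberately wrong type.
const SAMPLE: Record<JsonType, unknown> = { string: "text", number: 1, boolean: true };
const WRONG: Record<JsonType, unknown> = { string: 42, number: "not-a-number", boolean: "yes" };

function generateCases(schema: ToolSchema): GeneratedCase[] {
  const names = Object.keys(schema.properties);
  const valid: Record<string, unknown> = {};
  for (const name of names) valid[name] = SAMPLE[schema.properties[name].type];

  // Baseline: every field present with a value of the declared type.
  const cases: GeneratedCase[] = [{ label: "all fields valid", args: valid, expectValid: true }];

  // One negative case per required field: drop it.
  for (const name of schema.required ?? []) {
    const args = { ...valid };
    delete args[name];
    cases.push({ label: `missing required "${name}"`, args, expectValid: false });
  }

  // One negative case per property: supply a value of the wrong type.
  for (const name of names) {
    cases.push({
      label: `wrong type for "${name}"`,
      args: { ...valid, [name]: WRONG[schema.properties[name].type] },
      expectValid: false,
    });
  }
  return cases;
}

// Made-up schema for illustration.
const cases = generateCases({
  type: "object",
  properties: { city: { type: "string" }, days: { type: "number" } },
  required: ["city"],
});
for (const c of cases) console.log(c.expectValid ? "valid  " : "invalid", c.label);
```

Each generated case could then be sent as a `tools/call` request, with the assertion engine checking that invalid arguments are rejected and valid ones are accepted.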

I've started setting up API Dash locally and exploring the codebase.
One question on scope: does the initial milestone prioritize server validation (acting as a client to probe and assert against MCP servers), or full client/server test tooling from the start?
Looking forward to the weekly calls and contributing!

2 replies

animator Mar 7, 2026
Maintainer Author

@Saad-Mallebhari Send an idea doc PR with more details.

animator Mar 12, 2026
Maintainer Author

@Saad-Mallebhari Kindly go through this article to understand MCP Apps so that its testing can be incorporated in your project idea.

souvikDevloper · 2026-03-06T07:48:17Z

souvikDevloper
Mar 6, 2026

Hi @animator, I went through the MCP Testing discussion and also spent time understanding how API Dash is structured today before thinking about an initial direction here.

One thing that stands out to me is that a very broad “MCP DevTools” vision can become proposal-heavy quite quickly, so I think the most useful starting point would be a smaller but strong first slice that solves real developer pain during MCP development.

My current view is that the first milestone should focus on a repeatable MCP testing core, not just an inspector-like live debugging flow:

transport abstraction so different MCP server transports can be tested through one execution layer,

protocol validation for malformed calls, schema / JSON-RPC mismatches, and timeout/error cases,

test session execution + artifact capture so runs become reproducible and comparable,

and then a developer-facing UI for defining/rerunning scenarios and inspecting failures.
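The protocol validation step above can be sketched as a pure function over one raw response. The rules checked are the JSON-RPC 2.0 response rules (version string, matching id, exactly one of `result` or `error`, and an error object with an integer `code` and string `message`). `validateResponse` and `Violation` are hypothetical names, and MCP-level schema checks and timeouts are out of scope for this sketch.

```typescript
interface Violation {
  rule: string;
  detail: string;
}

function isObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Checks one raw response string against the JSON-RPC 2.0 response rules.
function validateResponse(raw: string, expectedId: number | string): Violation[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return [{ rule: "parse", detail: "response is not valid JSON" }];
  }
  if (!isObject(parsed)) {
    return [{ rule: "shape", detail: "response must be a JSON object" }];
  }

  const violations: Violation[] = [];
  if (parsed.jsonrpc !== "2.0") {
    violations.push({ rule: "version", detail: 'jsonrpc must be exactly "2.0"' });
  }
  if (parsed.id !== expectedId) {
    violations.push({
      rule: "id",
      detail: `expected id ${expectedId}, got ${JSON.stringify(parsed.id)}`,
    });
  }

  const hasResult = "result" in parsed;
  const hasError = "error" in parsed;
  if (hasResult === hasError) {
    violations.push({ rule: "result-xor-error", detail: "exactly one of result or error is required" });
  }
  if (hasError) {
    const error = parsed.error;
    const code = isObject(error) ? error.code : undefined;
    const message = isObject(error) ? error.message : undefined;
    if (typeof code !== "number" || Math.floor(code) !== code || typeof message !== "string") {
      violations.push({ rule: "error-shape", detail: "error needs an integer code and a string message" });
    }
  }
  return violations;
}

console.log(validateResponse('{"jsonrpc":"2.0","id":1,"result":{}}', 1));
console.log(validateResponse('{"jsonrpc":"1.0","id":2,"result":{},"error":{"code":"x"}}', 1));
```

Returning a list of violations rather than throwing on the first one keeps every failure visible in the captured test artifact.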

I think this would differentiate the project in two ways:

from a pure inspector/debugger workflow, because the goal becomes repeatable testing, and

from an overly broad all-at-once platform design, because the first milestone remains implementation-friendly and easy to validate.

Since API Dash already emphasizes structured workflows, persistence, and testing in its current architecture, I feel MCP Testing should extend that philosophy into the MCP ecosystem rather than begin as a standalone mockup-heavy tool.

I’m continuing to study the codebase and would love to discuss whether the initial priority should be:

a Node/TypeScript execution + validation engine first, or

a thin end-to-end vertical slice with both engine and minimal UI together.

4 replies

souvikDevloper Mar 6, 2026

Hi @animator, I’ve opened this draft PR to make my initial direction for MCP Testing more concrete. I tried to keep the proposal focused on a realistic first slice: execution core, transport abstraction, validation, and reproducible test artifacts before expanding the UI surface. I’d really appreciate feedback on whether this milestone order and scope feel appropriate.

animator Mar 7, 2026
Maintainer Author

@souvikDevloper You can join our weekly call to get more feedback.

animator Mar 12, 2026
Maintainer Author

@souvikDevloper Kindly go through this article to understand MCP Apps so that its testing can be incorporated in your project idea.

souvikDevloper Mar 13, 2026

Already did, @animator, and mentioned it in my updated PR #1336.

VanshKaushal · 2026-03-12T09:03:55Z

VanshKaushal
Mar 12, 2026

Hi everyone, @animator 👋

I’ve been exploring the Model Context Protocol ecosystem and I find the idea of strengthening the MCP developer tooling very interesting.

One idea that could improve the developer experience would be to build an MCP Testing Playground where developers can connect to an MCP server and interactively test the exposed tools. The interface could allow users to:

Discover available tools from the MCP server
View input/output schemas for each tool
Send tool calls with custom parameters
Inspect responses and error handling
Save test cases for repeated testing

Additionally, a CLI-based MCP test runner could be useful for automation. Developers could define test cases in JSON/YAML and run them against an MCP server to validate tool behavior.

Example workflow:

Connect to MCP server
Auto-discover tools
Generate test templates
Execute tests and visualize results
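A minimal TypeScript sketch of the test-runner part of this workflow. The test-case format (`TestCase`), the `runSuite` function, and the `add` tool are all hypothetical. The suite is written inline but has the same shape it would have when loaded from a JSON file. The server call is kept synchronous and reduced to `isError` plus a text string for brevity, whereas a real MCP tool result carries a `content` array and arrives asynchronously.

```typescript
// Hypothetical test-spec format; the same objects could be loaded from JSON.
interface TestCase {
  name: string;
  tool: string;
  args: Record<string, unknown>;
  expect: { isError: boolean; textIncludes?: string };
}

// Simplified view of a tool result.
interface ToolResult {
  isError: boolean;
  text: string;
}
type CallTool = (tool: string, args: Record<string, unknown>) => ToolResult;

interface TestOutcome {
  name: string;
  passed: boolean;
  reason?: string;
}

function runSuite(cases: TestCase[], callTool: CallTool): TestOutcome[] {
  return cases.map((testCase): TestOutcome => {
    const actual = callTool(testCase.tool, testCase.args);
    if (actual.isError !== testCase.expect.isError) {
      return { name: testCase.name, passed: false, reason: `isError was ${actual.isError}` };
    }
    const needle = testCase.expect.textIncludes;
    if (needle !== undefined && actual.text.indexOf(needle) === -1) {
      return { name: testCase.name, passed: false, reason: `text did not include "${needle}"` };
    }
    return { name: testCase.name, passed: true };
  });
}

// Stand-in for a connected MCP server exposing a single "add" tool.
const fakeServer: CallTool = (tool, args) => {
  if (tool !== "add") return { isError: true, text: `unknown tool ${tool}` };
  return { isError: false, text: String(Number(args.a) + Number(args.b)) };
};

const suite: TestCase[] = [
  { name: "adds two numbers", tool: "add", args: { a: 2, b: 3 }, expect: { isError: false, textIncludes: "5" } },
  { name: "rejects unknown tool", tool: "subtract", args: {}, expect: { isError: true } },
  // This expectation is wrong on purpose, to show what a failure looks like.
  { name: "wrong expectation", tool: "add", args: { a: 1, b: 1 }, expect: { isError: false, textIncludes: "3" } },
];

const outcomes = runSuite(suite, fakeServer);
for (const outcome of outcomes) {
  const suffix = outcome.reason ? ` (${outcome.reason})` : "";
  console.log(`${outcome.passed ? "PASS" : "FAIL"} ${outcome.name}${suffix}`);
}
```

A CLI wrapper would typically exit with a non-zero status when any outcome has `passed: false`, which is what makes the runner usable in CI.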

I’m currently reading more about MCP implementations and would love to hear how the maintainers envision the testing workflow for MCP servers within the API Dash ecosystem.

4 replies

animator Mar 12, 2026
Maintainer Author

@VanshKaushal Kindly go through this article to understand MCP Apps so that its testing can be incorporated in your project idea.

After that, send your idea doc PR.

VanshKaushal Mar 13, 2026

Hi @animator

Thanks for pointing me to the MCP Apps resource — after exploring it in depth, I think MCP Apps can significantly elevate the MCP developer tooling story within API Dash.

Building on the earlier MCP Testing Playground idea, one strong direction could be to implement the playground itself as an MCP App exposed via an MCP testing tool (e.g., test_mcp_server). Instead of creating a standalone frontend, API Dash could expose a tool that, when invoked from an MCP host (Agent CLI / Chat UI / IDE integration), renders an interactive testing interface.

This MCP App–based testing interface could support:

• Automatic discovery of tools exposed by a connected MCP server
• Schema visualization (inputs, outputs, validation rules)
• Dynamic parameter form generation for tool calls
• Response inspection with structured diffing and error diagnostics
• Test case persistence and replay
• Batch execution and result visualization (tables / charts / timelines)

From an architecture perspective, this keeps API Dash aligned with the MCP philosophy — tools not only return data but can return rich interactive developer experiences. It also removes the need for a separately deployed testing UI and allows developers to test MCP servers directly within their preferred agent environment.

Additionally, I see value in a hybrid testing workflow:

MCP App for exploratory and manual tool testing
CLI-based MCP test runner for automated regression testing using JSON/YAML-defined test suites

The CLI runner could share the same test definition schema as the MCP App, enabling seamless transition from manual exploration to automated validation pipelines (e.g., CI integration).

At a higher level, this could evolve into a reusable “MCP Developer Toolkit” within API Dash — covering discovery, testing, benchmarking, and debugging of MCP tools and servers.

If this direction sounds aligned with the project vision, I’d be happy to draft a more concrete architecture proposal and explore implementation approaches.

File Structure (API Dash MCP Testing Toolkit)

apidash/
│
├── mcp/
│   ├── server/
│   │   ├── index.ts                    # MCP server bootstrap
│   │   ├── tool-registry.ts            # Tool discovery + metadata cache
│   │   ├── test-mcp-server.tool.ts     # MCP testing tool entry
│   │   └── protocol-adapter.ts         # MCP transport / JSON-RPC handling
│
├── playground/
│   ├── app/
│   │   ├── index.html                  # MCP App root UI
│   │   ├── main.ts                     # UI bootstrap
│   │   ├── components/
│   │   │   ├── ToolList.tsx
│   │   │   ├── SchemaViewer.tsx
│   │   │   ├── ParamForm.tsx
│   │   │   ├── ResponsePanel.tsx
│   │   │   ├── TestHistory.tsx
│   │   │   └── BatchRunner.tsx
│   │   └── rpc-bridge.ts               # UI ↔ MCP tool communication
│
├── testing/
│   ├── core/
│   │   ├── test-runner.ts              # Execution engine
│   │   ├── validator.ts                # Response validation logic
│   │   ├── snapshot.ts                 # Snapshot diffing
│   │   └── performance.ts              # Latency + metrics capture
│   │
│   ├── specs/
│   │   ├── schema.ts                   # JSON/YAML test spec format
│   │   └── template-generator.ts       # Auto-generate test templates
│
├── cli/
│   ├── index.ts                        # CLI entry
│   ├── commands/
│   │   ├── discover.ts                 # discover MCP tools
│   │   ├── run.ts                      # run test suite
│   │   ├── generate.ts                 # generate test templates
│   │   └── report.ts                   # test reporting
│
├── storage/
│   ├── test-case-store.ts              # local / remote persistence
│   └── cache.ts                        # metadata cache
│
└── reporters/
    ├── html.ts
    ├── json.ts
    └── markdown.ts

Flow Chart (Manual Testing via MCP App)

Manual exploratory testing via MCP App transitions seamlessly into automated regression testing via CLI using a shared test specification format.

Flow Chart (Automated CLI Workflow)

animator Mar 22, 2026
Maintainer Author

@VanshKaushal Please send proposal doc PR.

VanshKaushal Mar 27, 2026

Hi @animator, @ashitaprasad

I have created PR #1515 with the detailed proposal for the MCP Testing Playground & CLI Toolkit.

Apologies for the delay — I spent some time going deeper into MCP Apps architecture and refining the workflows, modular design and implementation plan so that the proposal is execution-focused and aligned with API Dash’s direction.

Would love to hear your feedback on scope and starting milestones. I am very interested in contributing further and can begin working on initial modules or proof-of-concepts based on your guidance.

Thanks for your time

mouha18 · 2026-03-20T17:39:35Z

mouha18
Mar 20, 2026

Hey everyone! 👋

My name is Mouhamed Diallo, a Junior CS student at DAUST (Dakar American University of Science and Technology) in Senegal. I'm applying for GSoC 2026 and this project immediately caught my attention.

I work with MCP daily through Cline as part of my development workflow, so I've seen firsthand how powerful the protocol is — but also how frustrating it can be to debug tool calls when something goes wrong, with no structured way to verify what's happening under the hood. That pain point is exactly what makes this project feel genuinely needed.

That said, I want to be honest: while I have a working understanding of MCP as a consumer (connecting to servers, invoking tools), I'm still getting familiar with the deeper internals of the protocol — the transport layer negotiation, the full JSON-RPC message lifecycle, and how a well-architected client/server test harness should be structured.

A few questions that would really help me scope this properly:

Scope boundary — Should the testing capability live entirely inside API Dash's existing UI (as a new panel/tab), or is a standalone CLI component also in scope? The idea description mentions React/Node/TypeScript but I want to make sure I'm thinking about integration correctly.
What "testing" means here — Are we focused on server testing (verifying that an MCP server's tools behave correctly), client testing (verifying that an agent correctly calls tools), or both? Or is the scope more about inspection and manual invocation rather than automated assertions?
Existing groundwork — Is there any existing code or proof-of-concept in the API Dash repo that already touches MCP, or would this be built from scratch?
Your definition of success — What does the ideal end-state look like to the team? A Postman-like experience for MCP? Something more automated?

I've gone through the contribution guidelines, the developer guide, and the code walkthrough video, and I've read the MCP Apps resource linked in the idea. I'm excited to dig deeper — just want to make sure I'm building in the right direction before drafting the full proposal.

Thanks for putting this idea together — looking forward to the conversation! 🙏

1 reply

animator Mar 22, 2026
Maintainer Author

@mouha18 Please send proposal doc PR.

yassinekolsi · 2026-03-23T21:14:50Z

yassinekolsi
Mar 23, 2026

I went through all comments in this thread and did a quick repo scan before drafting my proposal direction.

What I think is the highest-impact gap for API Dash Project #1 is transport-parity + deterministic replay, not just interactive inspection.

Proposed core slice (Node + TypeScript backend, React UI):

Session Runner with a canonical scenario format (initialize -> discover -> tool call/assert -> teardown), so the same scenario can run across stdio and HTTP/SSE transports.
A transport adapter layer that normalizes lifecycle events (spawn/connect, timeout, stderr/error stream, disconnect, reconnect), while preserving raw message traces.
A JSON-RPC conformance validator that classifies failures by layer: transport, protocol, schema, tool runtime.
Artifact capture for reproducibility: full request/response log, timing, exit status, and normalized assertion reports for CI.
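The layer-based failure classification in the list above can be sketched as a small decision function. `CallObservation` is a hypothetical normalized record that a transport adapter might emit for one tool call. Mapping JSON-RPC -32602 ("Invalid params") to the schema layer and every other JSON-RPC error to the protocol layer is an assumption made for this sketch, not a rule from the MCP specification.

```typescript
type FailureLayer = "transport" | "protocol" | "schema" | "tool-runtime";

// Hypothetical normalized record for one tool call, as an adapter might emit it.
interface CallObservation {
  timedOut: boolean;
  processExited: boolean;
  response?: {
    jsonrpc?: unknown;
    error?: { code: number; message: string };
    result?: { isError?: boolean };
  };
}

// Returns the first layer at which the call failed, or null when it succeeded.
function classifyFailure(obs: CallObservation): FailureLayer | null {
  const response = obs.response;

  // Transport: no usable response arrived at all.
  if (obs.timedOut || obs.processExited || response === undefined) return "transport";

  // Protocol: the envelope is not a well-formed JSON-RPC 2.0 response.
  if (response.jsonrpc !== "2.0") return "protocol";
  if ((response.error === undefined) === (response.result === undefined)) return "protocol";

  if (response.error !== undefined) {
    // Assumption: -32602 "Invalid params" means the arguments failed schema checks.
    if (response.error.code === -32602) return "schema";
    return "protocol";
  }

  // Tool runtime: the call was well-formed, but the tool reported a failure.
  if (response.result?.isError) return "tool-runtime";
  return null;
}

console.log(classifyFailure({ timedOut: true, processExited: false }));
console.log(
  classifyFailure({
    timedOut: false,
    processExited: false,
    response: { jsonrpc: "2.0", result: { isError: true } },
  }),
);
```

Because the function sees only the normalized observation, the same scenario should classify identically over stdio and HTTP/SSE, which is the parity property the proposal is after.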

Why this matters: the thread already discusses inspector-like UX and protocol visibility. The missing piece is reproducible pass/fail behavior across transports with the same scenario inputs, which is what teams need for regression testing.

I also read Ashita Prasad’s MCP Apps article (https://dev.to/ashita/a-practical-guide-to-building-mcp-apps-1bfm). It highlights host-level behaviors that should be testable explicitly, especially ui/initialize handshake, hostCapabilities handling, and tools/call/UI metadata behavior. I think MCP App checks should be first-class assertions, not only visual preview.

One specific scope question for mentors (Ankit/Ashita): for MVP, should transport parity target stdio + Streamable HTTP (with SSE fallback) from day one, or should we lock MVP to stdio first and add HTTP/SSE in phase 2 once the scenario runner + conformance classifier are stable?

I can start by implementing the transport-agnostic scenario schema and failure taxonomy first, then wire adapters incrementally.

1 reply

animator Apr 1, 2026
Maintainer Author

@yassinekolsi go through this resource to understand another MCP Apps server with source code so that you can incorporate its testing in the PoC.

Gaurav5189 · 2026-03-24T06:46:46Z

Gaurav5189
Mar 24, 2026

Hi everyone! I'm Gaurav, a first-year MCA student from India.

My background is strong in backend development, DevOps, testing infrastructure, and workflow automation using n8n. I have experience with Python, Node.js, React, and working a lot with RESTful APIs.

I am applying for GSoC 2026 and am very interested in Idea #1: MCP Testing. I recently built integration testing infrastructure for the Django framework. I am excited about creating strong testing features for MCP servers and clients using Python and Node.

I am currently reviewing the MCP specification and looking into Issue #1225 to understand the current testing issues. I look forward to contributing!
GitHub: https://github.com/Gaurav5189

2 replies

animator Apr 1, 2026
Maintainer Author

@Gaurav5189 go through this resource to understand an MCP Apps server so that you can incorporate its testing in the project.

Gaurav5189 Apr 2, 2026

Thank you for your consideration. I will review the resources to gain a deeper understanding of the MCP Apps server.

JYins · 2026-03-24T22:06:19Z

JYins
Mar 24, 2026

Hi @animator and mentors 👋
I'm Jeremy, an M.Eng ECE (Robotics & Control) student at Western University with a BSc in Computing Science from the University of Alberta. I'm interested in Idea #1 — MCP Testing for GSoC 2026.
My relevant background:

Built a config-driven RAG evaluation pipeline with automated regression testing via pytest + GitHub Actions, benchmarking retrieval metrics (Recall@K, MRR) across multiple configurations
LLM/RAG internship at SmartEarth: evaluated 6 retrieval configurations, catalogued failure modes, proposed mitigations adopted by the team
Graduate research building ML evaluation frameworks with structured metrics and reproducible experiment pipelines

I'm currently exploring the API Dash codebase and will be submitting a proposal draft shortly. Happy to discuss my approach and contribute a small PR in the meantime.
Looking forward to contributing!

1 reply

animator Apr 1, 2026
Maintainer Author

@JYins go through this resource to understand an MCP Apps server so that you can incorporate its testing in the project.

nazifsco · 2026-03-25T11:03:27Z

nazifsco
Mar 25, 2026

Hi, I'm Nazeef. Applying for the MCP Testing project.
I've been working through the spec at modelcontextprotocol.io and the open discussions. My idea for the headless runner: a Python CLI that reads test specs from YAML or JSON, connects to MCP servers over stdio and HTTP/SSE, and writes results to JSON, HTML, or Markdown. I'm curious whether that matches what you're actually looking for; happy to adjust before I go deep on it.
I've been building AI tooling in Python for a while. My last project used a similar pattern, an orchestrator coordinating async execution modules, so the architecture here isn't new ground for me.
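A minimal sketch of the reporting end of such a headless runner, written in TypeScript to match the project stack named in the idea description rather than the Python CLI proposed above. `TestResult`, `Reporter`, and `render` are hypothetical names, and only the JSON and Markdown formats are sketched.

```typescript
interface TestResult {
  name: string;
  passed: boolean;
  durationMs: number;
  message?: string;
}

// A reporter turns a list of results into one output document.
type Reporter = (results: TestResult[]) => string;

const jsonReporter: Reporter = (results) => {
  const passed = results.filter((r) => r.passed).length;
  const summary = { total: results.length, passed, failed: results.length - passed, results };
  return JSON.stringify(summary, null, 2);
};

const markdownReporter: Reporter = (results) => {
  const lines = ["| Test | Status | Duration (ms) | Message |", "| --- | --- | --- | --- |"];
  for (const r of results) {
    // Escape pipes so a message cannot break the table layout.
    const message = (r.message ?? "").replace(/\|/g, "\\|");
    lines.push(`| ${r.name} | ${r.passed ? "pass" : "fail"} | ${r.durationMs} | ${message} |`);
  }
  return lines.join("\n");
};

// Output formats are looked up by name, so adding one is a single registration.
const reporters = new Map<string, Reporter>([
  ["json", jsonReporter],
  ["markdown", markdownReporter],
]);

function render(format: string, results: TestResult[]): string {
  const reporter = reporters.get(format);
  if (!reporter) throw new Error(`Unknown report format: ${format}`);
  return reporter(results);
}

// Made-up results for illustration.
const sample: TestResult[] = [
  { name: "tools/list returns tools", passed: true, durationMs: 12 },
  { name: "tools/call rejects bad args", passed: false, durationMs: 30, message: "expected error | got result" },
];
console.log(render("markdown", sample));
```

Keeping reporters as pure functions over the same result list means the runner itself does not need to know which output formats exist.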

2 replies

animator Apr 1, 2026
Maintainer Author

@nazifsco go through this resource to understand an MCP Apps server so that you can incorporate its testing in the project.

nazifsco Apr 2, 2026

On it, sir.

vinimlo · 2026-03-26T04:35:44Z

vinimlo
Mar 26, 2026

Hi @animator and everyone!

I've been following this discussion and wanted to share something I've been thinking about — should the testing engine be designed CI-first or UI-first?

My initial instinct was to start from the UI side, connecting to a server and browsing tools visually. But after actually building a prototype, I changed my mind. I think the engine needs to be the core product, and the UI is just a viewer on top of it.

The reasoning is simple: if the engine can run headless from day one, CI integration becomes a midterm deliverable instead of something you scramble to add in the last week. The project delivers real value even if the UI takes longer than expected. And every test becomes reproducible — just a CLI command with clear output, no manual clicking required. It also means the engine can live as a standalone package, which fits nicely with how API Dash already structures things (curl_parser, postman, genai as separate packages).

I put together a TypeScript prototype to test this idea — no MCP SDK, just raw stdio transport and JSON-RPC from scratch. It connects to any MCP server, runs 19 conformance tests across 5 categories (Protocol, Discovery, Schema, Execution, Edge Cases), and reports pass/fail with proper exit codes: github.com/vinimlo/mcp-conformance

One thing I think could really tie the whole system together is a recording + replay pipeline: the conformance engine records real JSON-RPC traffic to fixture files, and then a mock server replays those fixtures for deterministic client-side testing. One recording flow that serves both server testing and client testing.
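A minimal TypeScript sketch of the recording + replay idea. `record`, `replay`, and `fixtureKey` are hypothetical names and are not taken from the linked prototype. Matching requests on method plus JSON-serialized params is a simplification, since it is sensitive to key order, and the fixture is kept in memory instead of being written to a file.

```typescript
type JsonRpcRequest = { jsonrpc: "2.0"; id: number; method: string; params?: unknown };
type JsonRpcResponse = { jsonrpc: "2.0"; id: number; result?: unknown; error?: unknown };
type Handler = (request: JsonRpcRequest) => JsonRpcResponse;

interface Exchange {
  request: JsonRpcRequest;
  response: JsonRpcResponse;
}

// Requests are matched on method and params; ids may differ between runs.
function fixtureKey(request: JsonRpcRequest): string {
  return JSON.stringify([request.method, request.params ?? null]);
}

// Wraps a live handler and appends every exchange to `fixture`.
function record(live: Handler, fixture: Exchange[]): Handler {
  return (request) => {
    const response = live(request);
    fixture.push({ request, response });
    return response;
  };
}

// Builds a mock handler that answers only from recorded exchanges.
function replay(fixture: Exchange[]): Handler {
  const byKey = new Map<string, JsonRpcResponse>();
  for (const exchange of fixture) byKey.set(fixtureKey(exchange.request), exchange.response);
  return (request) => {
    const recorded = byKey.get(fixtureKey(request));
    if (!recorded) {
      const error = { code: -32601, message: "No recorded exchange" };
      return { jsonrpc: "2.0", id: request.id, error };
    }
    // Re-stamp the id so the reply matches the incoming request.
    return { ...recorded, id: request.id };
  };
}

// Stand-in for a real server: echoes the params back.
const liveServer: Handler = (request) => ({
  jsonrpc: "2.0",
  id: request.id,
  result: { echoed: request.params ?? null },
});

const fixture: Exchange[] = [];
const recording = record(liveServer, fixture);
recording({ jsonrpc: "2.0", id: 1, method: "tools/call", params: { name: "echo" } });

// The fixture survives a JSON round trip, so it could be saved to a fixture file.
const mock = replay(JSON.parse(JSON.stringify(fixture)) as Exchange[]);
const replayed = mock({ jsonrpc: "2.0", id: 7, method: "tools/call", params: { name: "echo" } });
console.log(JSON.stringify(replayed));
```

The recording wrapper sits on the server-testing path and the replay handler on the client-testing path, so one fixture format serves both.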

Here is the overall architecture I am proposing:

Also, after reading the MCP Apps practical guide, I am convinced that MCP Apps conformance testing (ui:// resource validation, ui/initialize handshake, hostContext CSS injection) should be part of the core scope rather than a stretch goal.

Would love to hear everyone's thoughts on this!

3 replies

animator Apr 1, 2026
Maintainer Author

@vinimlo go through this resource to understand an MCP Apps server so that you can incorporate its testing in the project.

vinimlo Apr 1, 2026

Thanks for the feedback! I'm looking into it now!

vinimlo Apr 1, 2026

@animator Updated my PoC to cover the Sales Analytics MCP Apps server — 32 tests passing via Streamable HTTP, including all 4 tools and the full pipeline. PR: foss42/poc-experiments#8

ANIRUDDH-001 · 2026-03-26T14:34:25Z

ANIRUDDH-001
Mar 26, 2026

Hi, I'm Aniruddh, a BTech 3rd year student applying for GSoC 2026. I'd like to work on the MCP JSON-RPC Validation Layer. I've reviewed the codebase and plan to implement a type-safe validation engine in TypeScript using Zod, following the project's pattern of structured validation seen in validation_utils.dart. I also plan to integrate a layer-based error classifier to provide clear diagnostics for transport and protocol-level failures. Could you assign this to me?

1 reply

animator Apr 1, 2026
Maintainer Author

@ANIRUDDH-001 go through this resource to understand an MCP Apps server so that you can incorporate its testing in the project.

nazifsco · 2026-04-06T20:19:09Z

nazifsco
Apr 6, 2026

Hi @animator, submitted my PoC to gsoc-poc (PR #15). Python test client that connects directly to the sample-mcp-apps-chatflow server, runs JSON-RPC conformance checks, tool/resource discovery, and actual tool calls using the sales data.

I went with Python first to nail the logic, and I'm working on a TypeScript version next to match the stack.

Quick scope question: is the testing engine focused on server-side validation, or is client testing also in scope from the start?

0 replies

Replies: 17 comments · 35 replies