ci(framework): Add e2e agentapp test #7467
base: main
Changes from all commits
d0579d5
9149c55
ecd4ee0
11cee77
5fcb3f7
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| name: Framework AgentApp E2E | ||
|
|
||
| on: | ||
|   workflow_dispatch: | ||
|
panh99 marked this conversation as resolved.
|
||
|
|
||
| env: | ||
|   FLWR_TELEMETRY_ENABLED: 0 | ||
|
|
||
| jobs: | ||
|   agentapp: | ||
|     runs-on: ubuntu-22.04 | ||
|     timeout-minutes: 10 | ||
|     name: AgentApp E2E | ||
|     defaults: | ||
|       run: | ||
|         working-directory: framework/e2e/e2e-agentapp | ||
|     steps: | ||
|       - uses: actions/checkout@v5 | ||
|       - name: Bootstrap | ||
|         uses: ./.github/actions/bootstrap | ||
|       - name: Install dependencies | ||
|         run: python -m pip install --upgrade . | ||
|       - name: Run AgentApp E2E | ||
|
panh99 marked this conversation as resolved.
|
||
|         env: | ||
|           BRAVE_API_KEY: ${{ secrets.BRAVE_API_KEY }} | ||
|           EXA_API_KEY: ${{ secrets.EXA_API_KEY }} | ||
|           FLWR_MODEL_API_KEY: ${{ secrets.FLWR_MODEL_API_KEY }} | ||
|           TAVILY_API_KEY: ${{ secrets.TAVILY_API_KEY }} | ||
|         run: ./../test_superlink.sh e2e-agentapp | ||
|
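The workflow above forwards four secrets into the test step. GitHub Actions expands a secret that is not defined to an empty string, so a missing key tends to surface late, inside the run. A preflight check along the following lines could fail faster. This is only a sketch: `missing_secrets` is a hypothetical helper that is not part of this PR, and the rule it encodes (the model key is always needed, and any one search key is enough) is an assumption drawn from the README added in this PR.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Variable names copied from the workflow's `env` block.
MODEL_KEY = "FLWR_MODEL_API_KEY"
SEARCH_KEYS = ("BRAVE_API_KEY", "TAVILY_API_KEY", "EXA_API_KEY")


def missing_secrets(env: Mapping[str, str], *, web_search: bool = True) -> list[str]:
    """Return a list of problems; an empty list means the environment looks usable.

    Assumption: the model key is always required and one search key suffices.
    """
    problems: list[str] = []
    # Blank values count as missing, since undefined secrets expand to "".
    if not env.get(MODEL_KEY, "").strip():
        problems.append(f"{MODEL_KEY} is not set")
    if web_search and not any(env.get(key, "").strip() for key in SEARCH_KEYS):
        problems.append("none of " + ", ".join(SEARCH_KEYS) + " is set")
    return problems


if __name__ == "__main__":
    for problem in missing_secrets(os.environ):
        print(f"preflight: {problem}")
```

Passing a plain mapping instead of reading `os.environ` directly keeps the check easy to exercise in a unit test.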
Member
The README files for E2E tests are usually quite short, since we don’t typically expect people to run them manually. Is there a specific reason this README needs to be this detailed, for example, because the AgentApp setup is different from the other E2E tests? If not, could we keep this README closer to the usual E2E test README style?
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,66 @@ | ||
| # Custom AgentApp FAB example | ||
|
|
||
| This app demonstrates how to define a local `AgentApp`, bundle it into a FAB, | ||
| submit it with `flwr run`, and inspect streamed run events through the Control API. | ||
|
|
||
| ## Run | ||
|
|
||
| Start a SuperLink: | ||
|
|
||
|
|
||
| ```bash | ||
| export FLWR_MODEL_API_KEY="YOUR API KEY" | ||
| uv run --no-sync --python=3.11.14 flower-superlink \ | ||
| --insecure \ | ||
| --control-api-address 127.0.0.1:39093 \ | ||
| --serverappio-api-address 127.0.0.1:39094 \ | ||
| --database /tmp/flwr-agent-run.db \ | ||
| --log-file /tmp/flwr-agent-superlink.log | ||
| ``` | ||
|
|
||
| Start a ServerApp SuperExec in another terminal: | ||
|
|
||
| ```bash | ||
| export FLWR_MODEL_API_KEY="YOUR API KEY" | ||
| export BRAVE_API_KEY="YOUR BRAVE API KEY" | ||
| uv run --no-sync --python=3.11.14 flower-superexec \ | ||
| --insecure \ | ||
| --appio-api-address 127.0.0.1:39094 \ | ||
| --plugin-type serverapp | ||
| ``` | ||
|
|
||
| For web search, set one of `BRAVE_API_KEY`, `TAVILY_API_KEY`, or `EXA_API_KEY` | ||
| in the SuperExec terminal. | ||
|
|
||
| Submit the local AgentApp. `flwr run` builds the FAB from this directory and | ||
| submits it to the SuperLink: | ||
|
|
||
| ```bash | ||
| uv run --no-sync --python=3.11.14 flwr run e2e/e2e-agentapp \ | ||
| --run-config 'agent.input="What is the Flower federated learning framework? Answer in one sentence."' | ||
| ``` | ||
|
|
||
|
|
||
| Disable web search for a run with: | ||
|
|
||
| ```bash | ||
| uv run --no-sync --python=3.11.14 flwr run e2e/e2e-agentapp \ | ||
| --run-config 'agent.input="Say hello in one short sentence." agent.web-search=false' | ||
| ``` | ||
|
|
||
|
|
||
| You can also write the FAB file explicitly: | ||
|
|
||
| ```bash | ||
| uv run --no-sync --python=3.11.14 flwr build --app e2e/e2e-agentapp | ||
| ``` | ||
|
|
||
| After `flwr run` prints the run ID, stream task events: | ||
|
|
||
| ```bash | ||
| grpcurl -plaintext \ | ||
| -import-path proto \ | ||
| -proto flwr/proto/control.proto \ | ||
| -d '{"run_id": 1}' \ | ||
| 127.0.0.1:39093 \ | ||
| flwr.proto.Control/StreamRunEvents | ||
| ``` | ||
|
|
||
| Replace `1` with the run ID returned by `flwr run`. | ||
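The `--run-config` values in the README commands are hand-quoted, which is easy to get wrong once the prompt contains quotes. The sketch below builds the same override string programmatically. `format_override` and `run_config_arg` are hypothetical helpers, not part of Flower or this PR, and the assumption that `flwr run` accepts space-separated `key=value` pairs with TOML-style literals is taken only from the command shapes shown in the README.

```python
from __future__ import annotations

import json
import shlex


def format_override(key: str, value: object) -> str:
    """Render one `key=value` override using a TOML-style literal (assumed format)."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        literal = "true" if value else "false"
    elif isinstance(value, (int, float)):
        literal = str(value)
    elif isinstance(value, str):
        # JSON string escaping matches TOML basic strings for ordinary text.
        literal = json.dumps(value, ensure_ascii=False)
    else:
        raise TypeError(f"unsupported override type: {type(value).__name__}")
    return f"{key}={literal}"


def run_config_arg(overrides: dict[str, object]) -> str:
    """Join overrides into the single string passed to `--run-config`."""
    return " ".join(format_override(key, value) for key, value in overrides.items())


if __name__ == "__main__":
    arg = run_config_arg(
        {"agent.input": "Say hello in one short sentence.", "agent.web-search": False}
    )
    # shlex.join adds the shell quoting that the README writes by hand.
    print(shlex.join(["flwr", "run", "e2e/e2e-agentapp", "--run-config", arg]))
```

For the inputs shown, `run_config_arg` reproduces the override string used in the README's "disable web search" command.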
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| # Copyright 2026 Flower Labs GmbH. All Rights Reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # ============================================================================== | ||
| """Minimal custom Flower AgentApp example.""" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| # Copyright 2026 Flower Labs GmbH. All Rights Reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # ============================================================================== | ||
| """Custom AgentApp used to demonstrate local FAB bundling and execution.""" | ||
|
|
||
|
|
||
| from __future__ import annotations | ||
|
|
||
| from typing import Any | ||
|
|
||
| from flwr.agentapp import AgentApp, AgentSession | ||
| from flwr.app import Context | ||
|
|
||
| _INPUT_KEY = "agent.input" | ||
| _INSTRUCTIONS_KEY = "agent.instructions" | ||
| _MAX_OUTPUT_TOKENS_KEY = "agent.max-output-tokens" | ||
| _MODEL_KEY = "agent.model" | ||
| _WEB_SEARCH_KEY = "agent.web-search" | ||
|
|
||
| app = AgentApp() | ||
|
|
||
|
|
||
| @app.main() | ||
| def main(agent: AgentSession, context: Context) -> None: | ||
|     """Run a custom single-turn AgentApp.""" | ||
|     run_config = context.run_config | ||
|
|
||
|     agent_input = run_config.get(_INPUT_KEY) | ||
|     if not isinstance(agent_input, str) or not agent_input.strip(): | ||
|         raise ValueError(f"`{_INPUT_KEY}` must be a non-empty string.") | ||
|
|
||
|     model = run_config.get(_MODEL_KEY) | ||
|     if not isinstance(model, str) or not model.strip(): | ||
|         raise ValueError(f"`{_MODEL_KEY}` must be a non-empty string.") | ||
|
|
||
|     instructions = run_config.get(_INSTRUCTIONS_KEY) | ||
|     if not isinstance(instructions, str): | ||
|         raise ValueError(f"`{_INSTRUCTIONS_KEY}` must be a string.") | ||
|
|
||
|     max_output_tokens = run_config.get(_MAX_OUTPUT_TOKENS_KEY) | ||
|     if not isinstance(max_output_tokens, int) or max_output_tokens <= 0: | ||
|         raise ValueError(f"`{_MAX_OUTPUT_TOKENS_KEY}` must be a positive integer.") | ||
|
|
||
|     use_web_search = run_config.get(_WEB_SEARCH_KEY) | ||
|     if not isinstance(use_web_search, bool): | ||
|         raise ValueError(f"`{_WEB_SEARCH_KEY}` must be a boolean.") | ||
|
|
||
|     request: dict[str, Any] = { | ||
|         "model": model, | ||
|         "input": agent_input, | ||
|         "stream": True, | ||
|         "max_output_tokens": max_output_tokens, | ||
|     } | ||
|     if instructions: | ||
|         request["instructions"] = instructions | ||
|     if use_web_search: | ||
|         request["tools"] = ["web_search"] | ||
|         request["tool_choice"] = "required" | ||
|
|
||
|     agent.responses.create(request) |
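The validation in `main` repeats the same get-check-raise pattern five times and can only be exercised through a live run. One possible refactor, sketched below, moves request construction into a pure function that can be unit-tested without a SuperLink. `build_request` and `_require_str` are hypothetical names, not part of this PR. The keys, request fields, and error messages are copied from the diff above. One deliberate deviation: the sketch rejects `True`/`False` for `agent.max-output-tokens`, because `bool` is a subclass of `int` in Python and would otherwise pass the integer check.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any


def _require_str(config: Mapping[str, Any], key: str, *, allow_empty: bool) -> str:
    """Return config[key] if it is a string (and non-blank unless allow_empty)."""
    value = config.get(key)
    if not (isinstance(value, str) and (allow_empty or value.strip())):
        kind = "a string" if allow_empty else "a non-empty string"
        raise ValueError(f"`{key}` must be {kind}.")
    return value


def build_request(config: Mapping[str, Any]) -> dict[str, Any]:
    """Validate the run config and build the request dict (hypothetical helper)."""
    agent_input = _require_str(config, "agent.input", allow_empty=False)
    model = _require_str(config, "agent.model", allow_empty=False)
    instructions = _require_str(config, "agent.instructions", allow_empty=True)

    max_tokens = config.get("agent.max-output-tokens")
    # Exclude bool explicitly: isinstance(True, int) is True in Python.
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens <= 0:
        raise ValueError("`agent.max-output-tokens` must be a positive integer.")

    use_web_search = config.get("agent.web-search")
    if not isinstance(use_web_search, bool):
        raise ValueError("`agent.web-search` must be a boolean.")

    request: dict[str, Any] = {
        "model": model,
        "input": agent_input,
        "stream": True,
        "max_output_tokens": max_tokens,
    }
    if instructions:
        request["instructions"] = instructions
    if use_web_search:
        request["tools"] = ["web_search"]
        request["tool_choice"] = "required"
    return request
```

With this split, `main` would reduce to `agent.responses.create(build_request(context.run_config))`, assuming `context.run_config` behaves like a mapping, as its `.get` usage in the diff suggests.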
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| [build-system] | ||
| requires = ["hatchling"] | ||
| build-backend = "hatchling.build" | ||
|
|
||
| [project] | ||
| name = "e2e-agentapp" | ||
| version = "0.1.0" | ||
| description = "Minimal custom Flower AgentApp example" | ||
| license = "Apache-2.0" | ||
| dependencies = ["flwr @ {root:parent:parent:uri}"] | ||
|
|
||
| [tool.hatch.build.targets.wheel] | ||
| packages = ["e2e_agentapp"] | ||
|
|
||
| [tool.hatch.metadata] | ||
| allow-direct-references = true | ||
|
|
||
| [tool.flwr.app] | ||
| publisher = "flwrlabs" | ||
| fab-include = ["e2e_agentapp/**/*.py"] | ||
|
|
||
| [tool.flwr.app.components] | ||
| agentapp = "e2e_agentapp.agent:app" | ||
|
|
||
| [tool.flwr.app.config.agent] | ||
| input = "" | ||
| instructions = "Use web search before answering, then answer in one short sentence." | ||
| model = "openai/gpt-5.5" | ||
| max-output-tokens = 96 | ||
| web-search = true | ||
|
Comment on lines
+18
to
+30
|
||
|
Member
I’m a bit concerned about the amount of duplication introduced here. Do we need to duplicate the main body of the test code in this file, or could we refactor it so the shared logic is reused? Another option might be to move the shell script for the AgentApp E2E test into a dedicated test file. Right now,