29 changes: 29 additions & 0 deletions .github/workflows/framework-agentapp-e2e.yml
@@ -0,0 +1,29 @@
name: Framework AgentApp E2E

on:
  workflow_dispatch:

env:
  FLWR_TELEMETRY_ENABLED: 0

jobs:
  agentapp:
    runs-on: ubuntu-22.04
    timeout-minutes: 10
    name: AgentApp E2E
    defaults:
      run:
        working-directory: framework/e2e/e2e-agentapp
    steps:
      - uses: actions/checkout@v5
      - name: Bootstrap
        uses: ./.github/actions/bootstrap
      - name: Install dependencies
        run: python -m pip install --upgrade .
      - name: Run AgentApp E2E
        env:
          BRAVE_API_KEY: ${{ secrets.BRAVE_API_KEY }}
          EXA_API_KEY: ${{ secrets.EXA_API_KEY }}
          FLWR_MODEL_API_KEY: ${{ secrets.FLWR_MODEL_API_KEY }}
          TAVILY_API_KEY: ${{ secrets.TAVILY_API_KEY }}
        run: ./../test_superlink.sh e2e-agentapp
66 changes: 66 additions & 0 deletions framework/e2e/e2e-agentapp/README.md

Review comment (Member):

The README files for E2E tests are usually quite short, since we don’t typically expect people to run them manually. Is there a specific reason this README needs to be this detailed, for example, because the AgentApp setup is different from the other E2E tests?

If not, could we keep this README closer to the usual E2E test README style?

@@ -0,0 +1,66 @@
# Custom AgentApp FAB example

This app demonstrates how to define a local `AgentApp`, bundle it into a FAB,
submit it with `flwr run`, and inspect streamed run events through the Control API.

## Run

Start a SuperLink:

```bash
export FLWR_MODEL_API_KEY="YOUR API KEY"
uv run --no-sync --python=3.11.14 flower-superlink \
--insecure \
--control-api-address 127.0.0.1:39093 \
--serverappio-api-address 127.0.0.1:39094 \
--database /tmp/flwr-agent-run.db \
--log-file /tmp/flwr-agent-superlink.log
```

Start a ServerApp SuperExec in another terminal:

```bash
export FLWR_MODEL_API_KEY="YOUR API KEY"
export BRAVE_API_KEY="YOUR BRAVE API KEY"
uv run --no-sync --python=3.11.14 flower-superexec \
--insecure \
--appio-api-address 127.0.0.1:39094 \
--plugin-type serverapp
```

For web search, set one of `BRAVE_API_KEY`, `TAVILY_API_KEY`, or `EXA_API_KEY`
in the SuperExec terminal.

Submit the local AgentApp. `flwr run` builds the FAB from this directory and
submits it to the SuperLink:

```bash
uv run --no-sync --python=3.11.14 flwr run e2e/e2e-agentapp \
--run-config 'agent.input="What is the Flower federated learning framework? Answer in one sentence."'
```

Disable web search for a run with:

```bash
uv run --no-sync --python=3.11.14 flwr run e2e/e2e-agentapp \
--run-config 'agent.input="Say hello in one short sentence." agent.web-search=false'
```
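
The `--run-config` values in the two commands above are space-separated `key=value` pairs, with strings double-quoted and booleans written as `true`/`false`. As a minimal sketch of that format, assuming only the value types that appear in this README, the hypothetical helper `format_run_config` below (not part of Flower) builds such a string from a Python dict:

```python
import json
from typing import Union

ConfigValue = Union[str, bool, int]


def format_run_config(overrides: dict[str, ConfigValue]) -> str:
    """Render overrides as a space-separated `key=value` string (hypothetical helper)."""
    parts = []
    for key, value in overrides.items():
        if isinstance(value, bool):
            # Check bool before int, because bool is a subclass of int.
            rendered = "true" if value else "false"
        elif isinstance(value, str):
            # Double-quote and escape the string.
            rendered = json.dumps(value, ensure_ascii=False)
        elif isinstance(value, int):
            rendered = str(value)
        else:
            raise TypeError(f"Unsupported value type for `{key}`: {type(value).__name__}")
        parts.append(f"{key}={rendered}")
    return " ".join(parts)


print(
    format_run_config(
        {
            "agent.input": "Say hello in one short sentence.",
            "agent.web-search": False,
        }
    )
)
# prints: agent.input="Say hello in one short sentence." agent.web-search=false
```

The printed string is what goes inside the single quotes after `--run-config`.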

You can also write the FAB file explicitly:

```bash
uv run --no-sync --python=3.11.14 flwr build --app e2e/e2e-agentapp
```

After `flwr run` prints the run ID, stream task events:

```bash
grpcurl -plaintext \
-import-path proto \
-proto flwr/proto/control.proto \
-d '{"run_id": 1}' \
127.0.0.1:39093 \
flwr.proto.Control/StreamRunEvents
```

Replace `1` with the run ID returned by `flwr run`.
15 changes: 15 additions & 0 deletions framework/e2e/e2e-agentapp/e2e_agentapp/__init__.py
@@ -0,0 +1,15 @@
# Copyright 2026 Flower Labs GmbH. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Minimal custom Flower AgentApp example."""
71 changes: 71 additions & 0 deletions framework/e2e/e2e-agentapp/e2e_agentapp/agent.py
@@ -0,0 +1,71 @@
# Copyright 2026 Flower Labs GmbH. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Custom AgentApp used to demonstrate local FAB bundling and execution."""


from __future__ import annotations

from typing import Any

from flwr.agentapp import AgentApp, AgentSession
from flwr.app import Context

_INPUT_KEY = "agent.input"
_INSTRUCTIONS_KEY = "agent.instructions"
_MAX_OUTPUT_TOKENS_KEY = "agent.max-output-tokens"
_MODEL_KEY = "agent.model"
_WEB_SEARCH_KEY = "agent.web-search"

app = AgentApp()


@app.main()
def main(agent: AgentSession, context: Context) -> None:
    """Run a custom single-turn AgentApp."""
    run_config = context.run_config

    agent_input = run_config.get(_INPUT_KEY)
    if not isinstance(agent_input, str) or not agent_input.strip():
        raise ValueError(f"`{_INPUT_KEY}` must be a non-empty string.")

    model = run_config.get(_MODEL_KEY)
    if not isinstance(model, str) or not model.strip():
        raise ValueError(f"`{_MODEL_KEY}` must be a non-empty string.")

    instructions = run_config.get(_INSTRUCTIONS_KEY)
    if not isinstance(instructions, str):
        raise ValueError(f"`{_INSTRUCTIONS_KEY}` must be a string.")

    max_output_tokens = run_config.get(_MAX_OUTPUT_TOKENS_KEY)
    if not isinstance(max_output_tokens, int) or max_output_tokens <= 0:
        raise ValueError(f"`{_MAX_OUTPUT_TOKENS_KEY}` must be a positive integer.")

    use_web_search = run_config.get(_WEB_SEARCH_KEY)
    if not isinstance(use_web_search, bool):
        raise ValueError(f"`{_WEB_SEARCH_KEY}` must be a boolean.")

    request: dict[str, Any] = {
        "model": model,
        "input": agent_input,
        "stream": True,
        "max_output_tokens": max_output_tokens,
    }
    if instructions:
        request["instructions"] = instructions
    if use_web_search:
        request["tools"] = ["web_search"]
        request["tool_choice"] = "required"

    agent.responses.create(request)
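
The validation and request-building steps in `main` can be checked without a running SuperLink if they are pulled into a pure function. The sketch below is a hypothetical refactor and not part of this PR: `build_request` is an illustrative name, and it takes a plain `dict` in place of `context.run_config`. The checks and the request shape are copied from the diff above.

```python
from typing import Any


def build_request(run_config: dict[str, Any]) -> dict[str, Any]:
    """Validate run config values and build the request dict (hypothetical helper)."""
    agent_input = run_config.get("agent.input")
    if not isinstance(agent_input, str) or not agent_input.strip():
        raise ValueError("`agent.input` must be a non-empty string.")

    model = run_config.get("agent.model")
    if not isinstance(model, str) or not model.strip():
        raise ValueError("`agent.model` must be a non-empty string.")

    instructions = run_config.get("agent.instructions")
    if not isinstance(instructions, str):
        raise ValueError("`agent.instructions` must be a string.")

    max_output_tokens = run_config.get("agent.max-output-tokens")
    if not isinstance(max_output_tokens, int) or max_output_tokens <= 0:
        raise ValueError("`agent.max-output-tokens` must be a positive integer.")

    use_web_search = run_config.get("agent.web-search")
    if not isinstance(use_web_search, bool):
        raise ValueError("`agent.web-search` must be a boolean.")

    request: dict[str, Any] = {
        "model": model,
        "input": agent_input,
        "stream": True,
        "max_output_tokens": max_output_tokens,
    }
    if instructions:  # An empty string means "no instructions".
        request["instructions"] = instructions
    if use_web_search:  # Web search adds the tool and forces its use.
        request["tools"] = ["web_search"]
        request["tool_choice"] = "required"
    return request


# Example with the defaults from pyproject.toml, web search disabled.
example = build_request(
    {
        "agent.input": "Say hello in one short sentence.",
        "agent.instructions": "",
        "agent.model": "openai/gpt-5.5",
        "agent.max-output-tokens": 96,
        "agent.web-search": False,
    }
)
print(sorted(example))
# prints: ['input', 'max_output_tokens', 'model', 'stream']
```

With a helper like this, `main` would reduce to `agent.responses.create(build_request(context.run_config))`, assuming `context.run_config` supports `.get` as it does in the diff.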
30 changes: 30 additions & 0 deletions framework/e2e/e2e-agentapp/pyproject.toml
@@ -0,0 +1,30 @@
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "e2e-agentapp"
version = "0.1.0"
description = "Minimal custom Flower AgentApp example"
license = "Apache-2.0"
dependencies = ["flwr @ {root:parent:parent:uri}"]

[tool.hatch.build.targets.wheel]
packages = ["e2e_agentapp"]

[tool.hatch.metadata]
allow-direct-references = true

[tool.flwr.app]
publisher = "flwrlabs"
fab-include = ["e2e_agentapp/**/*.py"]

[tool.flwr.app.components]
agentapp = "e2e_agentapp.agent:app"

[tool.flwr.app.config.agent]
input = ""
instructions = "Use web search before answering, then answer in one short sentence."
model = "openai/gpt-5.5"
max-output-tokens = 96
web-search = true
67 changes: 67 additions & 0 deletions framework/e2e/test_superlink.sh

Review comment (Member):

I’m a bit concerned about the amount of duplication introduced here. Do we need to duplicate the main body of the test code in this file, or could we refactor it so the shared logic is reused?

Another option might be to move the shell script for the AgentApp E2E test into a dedicated test file. Right now, test_superlink.sh feels like it is serving two quite different purposes: it adds a large AgentApp E2E-specific block and then exits early. To me, that makes the structure a little surprising and harder to follow.
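
To make the shared-logic idea concrete: the part of the AgentApp block most likely to be reusable is the status polling loop. The sketch below restates that loop as a standalone function, in Python rather than the script's bash, purely for illustration. The names `latest_run_status` and `wait_for_status` are hypothetical, and the JSON shape is assumed from the script's `jq -r '.runs[0].status'` expression.

```python
import json
import time
from typing import Callable


def latest_run_status(ls_output: str) -> str:
    """Return the status of the first run in `flwr ls --format=json` output."""
    return json.loads(ls_output)["runs"][0]["status"]


def wait_for_status(
    fetch_output: Callable[[], str],
    target: str = "finished:completed",
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the first run reaches `target` or `timeout_s` has elapsed."""
    elapsed = 0.0
    while elapsed < timeout_s:
        if latest_run_status(fetch_output()) == target:
            return True
        sleep(interval_s)
        # Like the shell loop, count only the sleep time toward the timeout.
        elapsed += interval_s
    return False


# Usage with canned outputs; "running" is a placeholder status, not a confirmed value.
outputs = iter(
    [
        '{"runs": [{"status": "running"}]}',
        '{"runs": [{"status": "finished:completed"}]}',
    ]
)
print(wait_for_status(lambda: next(outputs), sleep=lambda _: None))
# prints: True
```

In a real test, `fetch_output` would wrap the `flwr ls e2e --format=json` call that the script already makes. The same structure maps onto a bash function taking the target status and timeout as arguments.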

@@ -1,6 +1,73 @@
#!/bin/bash
set -e

if [ "$1" = "e2e-agentapp" ]; then
  server_arg="--insecure"
  server_app_address="127.0.0.1:9091"
  db_arg="--database :flwr-in-memory:"
  runtime_dependency_install_arg="--disable-runtime-dependency-installation"

  # Install Flower app
  pip install -e . --no-deps

  # Remove any duplicates
  sed -i '/^\[tool\.flwr\.federations\.e2e\]/,/^$/d' pyproject.toml

  echo -e $"\n[tool.flwr.federations.e2e]\naddress = \"127.0.0.1:9093\"\ninsecure = true" >> pyproject.toml

  timeout 5m flower-superlink \
    $server_arg $db_arg $runtime_dependency_install_arg \
    --control-api-address 127.0.0.1:9093 \
    --serverappio-api-address "$server_app_address" &
  sl_pid=$!
  sleep 3

  timeout 5m flower-superexec \
    $server_arg \
    --appio-api-address "$server_app_address" \
    --plugin-type serverapp &
  sx_pid=$!
  sleep 3

  # Trigger migration
  flwr ls "." e2e || true

  timeout 1m flwr run "." e2e \
    --run-config 'agent.input="What is the Flower federated learning framework? Answer in one sentence."'

  found_success=false
  timeout=120
  elapsed=0

  cleanup_and_exit() {
    kill $sx_pid;
    sleep 1; kill $sl_pid;
    exit $1
  }

  while [ "$found_success" = false ] && [ $elapsed -lt $timeout ]; do
    output=$(flwr ls e2e --format=json)
    status=$(echo "$output" | jq -r '.runs[0].status')

    echo "Current status: $status"

    if [ "$status" == "finished:completed" ]; then
      found_success=true
      echo "AgentApp worked correctly!"
      cleanup_and_exit 0
    else
      echo "⏳ Not completed yet, retrying in 2s..."
      sleep 2
      elapsed=$((elapsed + 2))
    fi
  done

  if [ "$found_success" = false ]; then
    echo "AgentApp had an issue and timed out."
    cleanup_and_exit 1
  fi
fi

case "$1" in
e2e-bare-https | e2e-bare-auth)
./generate.sh