ci(framework): Add e2e agentapp test#7467

Open
danielnugraha wants to merge 5 commits into main from add-e2e-agentapp

@danielnugraha

Member

Issue

Description

Related issues/PRs

Proposal

Explanation

Checklist

  • Implement proposed change
  • Write tests
  • Update documentation
  • Address LLM-reviewer comments, if applicable (e.g., GitHub Copilot)
  • Make CI checks pass
  • Ping maintainers on Slack (channel #contributions)

Any other comments?

Copilot AI review requested due to automatic review settings June 23, 2026 08:53

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 11cee779b6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .github/workflows/framework-agentapp-e2e.yml

Copilot AI left a comment

Contributor

Pull request overview

Adds a new end-to-end (E2E) “AgentApp” example app under framework/e2e/ and wires it into CI via a dedicated workflow plus a new test_superlink.sh branch to exercise SuperLink/SuperExec + flwr run execution.

Changes:

  • Add a new E2E app framework/e2e/e2e-agentapp (AgentApp FAB example + docs/config).
  • Extend framework/e2e/test_superlink.sh with an e2e-agentapp execution path.
  • Add a GitHub Actions workflow to run the AgentApp E2E job.

Critical issues

  • framework/e2e/test_superlink.sh (AgentApp branch): PID handling/cleanup is unreliable ($! captures timeout, kill under set -e can abort cleanup, no trap), which can leak background processes and/or fail cleanup (comments 001–004).
  • framework/e2e/e2e-agentapp/README.md: flwr run examples omit the required positional SuperLink connection name, so the commands likely won’t target the SuperLink started in the guide (comments 008–009).

Simplicity/readability suggestions

  • N/A beyond the concrete fixes suggested in comments.

Consistency concerns

  • framework/e2e/e2e-agentapp/pyproject.toml lacks [tool.flwr.federations]/e2e definitions, unlike other E2E apps (e.g. framework/e2e/e2e-serverapp-heartbeat/pyproject.toml:27-33), which forces runtime patching and makes local usage harder (comment 006).

Whether the PR should be split

No.

Brief overall verdict

Changes are directionally fine, but the E2E harness needs fixes to be reliable and the README examples need to be corrected before this is safe to rely on.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| framework/e2e/test_superlink.sh | Adds an e2e-agentapp run path; needs more robust process management/cleanup. |
| framework/e2e/e2e-agentapp/README.md | Documents local execution; currently missing the required flwr run SuperLink argument and working-directory clarity. |
| framework/e2e/e2e-agentapp/pyproject.toml | Defines the new E2E AgentApp; should include tool.flwr.federations like other E2E apps. |
| framework/e2e/e2e-agentapp/e2e_agentapp/agent.py | Implements the AgentApp main entrypoint using AgentSession.responses.create. |
| framework/e2e/e2e-agentapp/e2e_agentapp/__init__.py | Package marker/docstring. |
| .github/workflows/framework-agentapp-e2e.yml | New workflow to run the AgentApp E2E job. |

Comment on lines +18 to +23
timeout 5m flower-superlink \
  $server_arg $db_arg $runtime_dependency_install_arg \
  --control-api-address 127.0.0.1:9093 \
  --serverappio-api-address "$server_app_address" &
sl_pid=$!
sleep 3
Comment on lines +25 to +30
timeout 5m flower-superexec \
  $server_arg \
  --appio-api-address "$server_app_address" \
  --plugin-type serverapp &
sx_pid=$!
sleep 3
Comment on lines +42 to +47
cleanup_and_exit() {
  kill $sx_pid;
  sleep 1; kill $sl_pid;
  exit $1
}
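
The cleanup concern raised for this snippet (a failing `kill` aborting cleanup under `set -e`, and no `trap`) can be illustrated with a minimal sketch. This is not the PR's code and not a confirmed fix: `sleep` stands in for the long-running services, and the names `pids` and `cleanup` are hypothetical.

```shell
# Sketch only: `sleep 30` is a placeholder for the background services.
# `pids` and `cleanup` are illustrative names, not taken from the PR.

pids=""

cleanup() {
  # Signal every recorded background process. `|| true` keeps a failed
  # kill (process already gone) from aborting cleanup under `set -e`.
  for pid in $pids; do
    kill "$pid" 2>/dev/null || true
  done
  # Reap the children so none are left behind.
  for pid in $pids; do
    wait "$pid" 2>/dev/null || true
  done
}

# Run cleanup whenever the script exits, whether it succeeded or failed.
trap cleanup EXIT

sleep 30 &
pids="$pids $!"
sleep 30 &
pids="$pids $!"
```

With the handler registered through `trap`, the success and failure paths only need to call `exit` with the right code. Whether `$!` refers to the service or to a wrapper such as `timeout` still depends on how the process is launched, which this sketch does not address.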

Comment on lines +54 to +62
if [ "$status" == "finished:completed" ]; then
  found_success=true
  echo "AgentApp worked correctly!"
  cleanup_and_exit 0
else
  echo "⏳ Not completed yet, retrying in 2s..."
  sleep 2
  elapsed=$((elapsed + 2))
fi
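
For reference, the retry logic in this snippet can be expressed as a small bounded polling helper. This is a sketch under assumptions: `wait_for_status` is a hypothetical name, and the command that reports the run status is passed in by the caller because the snippet does not show how `$status` is obtained.

```shell
# wait_for_status EXPECTED TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND every 2 seconds until it prints EXPECTED or the
# timeout elapses. Returns 0 on a match and 1 on timeout.
# Hypothetical helper, not part of the PR.
wait_for_status() {
  expected=$1
  timeout_s=$2
  shift 2
  elapsed=0
  while [ "$elapsed" -lt "$timeout_s" ]; do
    # A failing status command counts as "not ready yet".
    status=$("$@" 2>/dev/null || true)
    # `=` is the portable string comparison inside `[ ]`.
    if [ "$status" = "$expected" ]; then
      return 0
    fi
    sleep 2
    elapsed=$((elapsed + 2))
  done
  return 1
}
```

A caller could then write `wait_for_status "finished:completed" 120 get_run_status`, where `get_run_status` is a placeholder for whatever command the script uses to query the run, and decide the exit code in one place.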
Comment on lines +10 to +16
# Install Flower app
pip install -e . --no-deps

# Remove any duplicates
sed -i '/^\[tool\.flwr\.federations\.e2e\]/,/^$/d' pyproject.toml

echo -e $"\n[tool.flwr.federations.e2e]\naddress = \"127.0.0.1:9093\"\ninsecure = true" >> pyproject.toml
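
The remove-then-append patching above could also be written as a guarded append, which avoids `sed -i` (its argument handling differs between GNU and BSD sed). The sketch below is an assumption-labeled alternative, not the PR's approach: `ensure_e2e_federation` is a hypothetical name, and the table contents are copied from the `echo` line in the snippet.

```shell
# ensure_e2e_federation FILE
# Appends the [tool.flwr.federations.e2e] table to FILE only when it is
# not already present, so repeated runs do not create duplicates.
# Hypothetical helper; values are taken from the snippet under review.
ensure_e2e_federation() {
  file=$1
  if ! grep -q '^\[tool\.flwr\.federations\.e2e\]' "$file"; then
    printf '\n[tool.flwr.federations.e2e]\naddress = "127.0.0.1:9093"\ninsecure = true\n' >> "$file"
  fi
}
```

Unlike the original, this keeps an existing `e2e` table untouched instead of overwriting it, which is only appropriate if a pre-existing definition can be trusted. Committing the table in pyproject.toml, as the review suggests, would remove the need for runtime patching altogether.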
Comment on lines +18 to +30
[tool.flwr.app]
publisher = "flwrlabs"
fab-include = ["e2e_agentapp/**/*.py"]

[tool.flwr.app.components]
agentapp = "e2e_agentapp.agent:app"

[tool.flwr.app.config.agent]
input = ""
instructions = "Use web search before answering, then answer in one short sentence."
model = "openai/gpt-5.5"
max-output-tokens = 96
web-search = true
Comment on lines +6 to +9
## Run

Start a SuperLink:

Comment on lines +37 to +40
```bash
uv run --no-sync --python=3.11.14 flwr run e2e/e2e-agentapp \
--run-config 'agent.input="What is the Flower federated learning framework? Answer in one sentence."'
```
Comment on lines +44 to +47
```bash
uv run --no-sync --python=3.11.14 flwr run e2e/e2e-agentapp \
--run-config 'agent.input="Say hello in one short sentence." agent.web-search=false'
```
@github-actions github-actions Bot added the Maintainer label Jun 23, 2026
Comment thread .github/workflows/framework-agentapp-e2e.yml
Comment thread .github/workflows/framework-agentapp-e2e.yml

Member

I’m a bit concerned about the amount of duplication introduced here. Do we need to duplicate the main body of the test code in this file, or could we refactor it so the shared logic is reused?

Another option might be to move the shell script for the AgentApp E2E test into a dedicated test file. Right now, test_superlink.sh feels like it is serving two quite different purposes: it adds a large AgentApp E2E-specific block and then exits early. To me, that makes the structure a little surprising and harder to follow.

Member

The README files for E2E tests are usually quite short, since we don’t typically expect people to run them manually. Is there a specific reason this README needs to be this detailed, for example, because the AgentApp setup is different from the other E2E tests?

If not, could we keep this README closer to the usual E2E test README style?

Labels

Maintainer Used to determine what PRs (mainly) come from Flower maintainers.

3 participants