Skip to content

fix(antigravity): read AGY_TOKEN from bind-mounted target path#282

Open
ptone wants to merge 24 commits into
mainfrom
scion/antigravity-auth-fix
Open

fix(antigravity): read AGY_TOKEN from bind-mounted target path#282
ptone wants to merge 24 commits into
mainfrom
scion/antigravity-auth-fix

Conversation

@ptone

@ptone ptone commented Jun 22, 2026

Copy link
Copy Markdown
Owner

Summary

  • File-type secrets in scion are bind-mounted directly to their target path (~/.gemini/antigravity-cli/antigravity-oauth-token), not staged to the env secret directory (~/.scion/harness/secrets/AGY_TOKEN). The provisioner failed to find the token because it only checked the env secret path.
  • Adds fallback reads to the bind-mounted target path in _select_auth_method (token detection), _provision (token validation), and the generated agy-wrapper.sh (runtime keyring injection).

Test plan

  • Python syntax validates (py_compile)
  • go build ./... passes (no Go changes)
  • Deploy with a file-type AGY_TOKEN secret and verify provisioner reads from ~/.gemini/antigravity-cli/antigravity-oauth-token
  • Verify env-type AGY_TOKEN secrets still work (staging path checked first)
  • Verify agy-wrapper.sh logs correct source ("from target path" vs "from staging file")

ptone and others added 24 commits June 18, 2026 23:17
GoogleCloudPlatform#443)

Per-project import handlers now support NDJSON streaming when the client
sends Accept: application/x-ndjson, matching the unified endpoint. This
fixes the progress display (per-resource events) and the post-import
summary (the NDJSON done event uses the correct "imported" field name).

Also passes discovered names to executeImport even for the count===1
case, ensuring the import is scoped to the exact resource that was
discovered.

Co-authored-by: Scion Agent (template-import-selector-lead) <agent@scion.dev>
…orm#445)

* Warn when server binary is built without web assets

Add a prominent startup warning when web assets are not embedded and
no --web-assets-dir is provided. Serve the self-contained noAssetsPage
HTML from the static asset handler instead of a plain text 404, so
direct requests to /assets/* also show a helpful message.

* Use slog.Warn for missing web assets startup warning

Switch from log.Printf to slog.Warn so the warning formats correctly
in structured logging environments.

---------

Co-authored-by: Scion Agent (web-assets-warning-lead) <agent@scion.dev>
…Platform#444)

* Sync built image reference back to Hub after scion build

After building a harness config image, the new image tag was written to the
on-disk config.yaml but never synced to Hub. At agent start, Hub hydrated
from its stale stored copy (still pointing at the old upstream image), so
the locally-built image was never used.

Two fixes:

1. cmd/build.go: After updating on-disk config.yaml, auto-sync to Hub via
   syncHarnessConfigToHub(). If Hub is unavailable, print a warning
   suggesting the user run 'scion harness-config push'.

2. pkg/hub/maintenance_executors.go: After a successful build in
   BuildHarnessConfigImageExecutor, update the HarnessConfig record's
   Config.Image field in the DB and re-upload the modified config.yaml
   to Hub storage.

* Fix syncBuiltImage to update file manifest and content hash

After uploading the updated config.yaml to storage, recalculate the
hc.Files manifest entry (size + SHA-256 hash) and the overall
hc.ContentHash before writing the DB record. Without this, the Runtime
Broker's cache-invalidation check sees a stale ContentHash and never
pulls the updated config, and subsequent downloads fail manifest
validation against stale size/hash values.

Also validate that config.yaml's root node is a YAML mapping before
processing, returning an error instead of silently skipping malformed
files.

---------

Co-authored-by: Scion Agent (image-build-fix) <agent@scion.dev>
…atform#448)

Only set channel="web" when the authenticated user's ClientType is
"web", not for CLI or API callers hitting the same endpoint. Also
update stale doc-comment on the stream handler.

Co-authored-by: Scion Agent (msg-cleanups-lead) <agent@scion.dev>
…CloudPlatform#442)

* fix(image-build): use default builder for local docker builds

Using a custom docker-container builder (scion-builder) prevents BuildKit from resolving intermediate images built in previous steps when PUSH is false, because the container driver doesn't share the host's daemon image cache. Switching to the default builder for local builds resolves this.

* fix(image-build): use BUILDX_BUILDER env var to avoid mutating global config

Address PR GoogleCloudPlatform#442 reviewer feedback:
- Use the BUILDX_BUILDER environment variable instead of 'docker buildx use' to prevent mutating the user's global Docker CLI configuration.
- For push builds, export BUILDX_BUILDER="${BUILDX_INSTANCE}" and remove the --use flag from 'docker buildx create'.

---------

Co-authored-by: Preston Holmes <ptone@google.com>
…dPlatform#438)

Bumps [astro](https://github.com/withastro/astro/tree/HEAD/packages/astro) from 6.3.2 to 6.4.8.
- [Release notes](https://github.com/withastro/astro/releases)
- [Changelog](https://github.com/withastro/astro/blob/astro@6.4.8/packages/astro/CHANGELOG.md)
- [Commits](https://github.com/withastro/astro/commits/astro@6.4.8/packages/astro)

---
updated-dependencies:
- dependency-name: astro
  dependency-version: 6.4.8
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…Platform#439)

Bumps [dompurify](https://github.com/cure53/DOMPurify) from 3.4.9 to 3.4.11.
- [Release notes](https://github.com/cure53/DOMPurify/releases)
- [Commits](cure53/DOMPurify@3.4.9...3.4.11)

---
updated-dependencies:
- dependency-name: dompurify
  dependency-version: 3.4.11
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…oudPlatform#440)

Bumps [js-yaml](https://github.com/nodeca/js-yaml) from 4.1.1 to 4.2.0.
- [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md)
- [Commits](https://github.com/nodeca/js-yaml/commits)

---
updated-dependencies:
- dependency-name: js-yaml
  dependency-version: 4.2.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…gleCloudPlatform#441)

Bumps [github.com/rclone/rclone](https://github.com/rclone/rclone) from 1.73.5 to 1.74.3.
- [Release notes](https://github.com/rclone/rclone/releases)
- [Changelog](https://github.com/rclone/rclone/blob/master/RELEASE.md)
- [Commits](rclone/rclone@v1.73.5...v1.74.3)

---
updated-dependencies:
- dependency-name: github.com/rclone/rclone
  dependency-version: 1.74.3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* feat: add source_url field to HarnessConfig and Template Ent schemas

Add optional source_url field to track the import origin URL for
harness-configs and templates, enabling reimport/update flows.

* feat: thread source_url through store models, persistence, and import pipeline

Add SourceURL field to store.HarnessConfig, store.Template,
ResourceRecord, and the ent adapter CRUD operations. Update
Bootstrap() to accept sourceURL parameter and persist it on
create/update. Thread sourceURL from resourceDir through the
import worker loop.

* feat: add reimport endpoint, CLI update command, and source_url in show

- POST /api/v1/harness-configs/{id}/reimport endpoint that re-imports
  from stored source_url or an override URL
- CLI: scion harness-config update <name> [--url] [--all]
- Hub client: Reimport() method on HarnessConfigService
- Show command now displays source URL when present

* feat: add Refresh from Source button to harness-config detail page

- Display source URL as clickable link in metadata row
- Show Refresh from Source button when sourceUrl is present
- Button calls POST /api/v1/harness-configs/{id}/reimport
- Reload config details after successful reimport

* fix: add Reimport method to mockHarnessConfigService in tests

The mock was missing the new Reimport method added to the
HarnessConfigService interface, causing CI lint failures.

* fix: remove destructive migration flags from AutoMigrate

Remove WithDropColumn and WithDropIndex from Ent auto-migration.
These flags cause SQLite to silently drop data during table recreation
when schema changes occur between versions, which was the root cause
of URL-imported harness configs disappearing after server restarts.

* fix: add nameFilter param to importFromRemote call after upstream change

The importFromRemote signature gained a nameFilter []string parameter
upstream. Pass nil for the reimport endpoint since it reimports all
resources from the source URL.

* fix(harness): add no_auth config to antigravity harness

Without a no_auth section, selecting "no authentication" for the
antigravity harness causes the container to start with the full
agy-wrapper.sh command (dbus + gnome-keyring + AGY) instead of
dropping to a shell. This fails because AGY has no credentials.

Add no_auth behavior matching claude/gemini/codex/opencode harnesses.

* fix: address code review findings for reimport handler and CLI

C-1: Add user-scope authorization check in reimport handler. Without
this, any authenticated user could reimport another user's harness
config. Also add default-deny for unknown scopes.

M-1: Properly handle malformed JSON in reimport request body instead
of silently discarding parse errors.

H-1/H-2: Gate human-readable printf calls behind !isJSONOutput() in
the CLI update command so --format json produces clean structured
output.

* fix: address Gemini PR review feedback on reimport handler and CLI

- Use r.Body != nil && r.Body != http.NoBody instead of r.ContentLength
  for chunked/HTTP2 compatibility in reimport handler
- Add nil check for harness config after GetHarnessConfig lookup
- Use cmd.Context() instead of context.Background() in update command
- Add nil checks for resp after Hub List calls in both update functions

* fix: nil-check reimport result and return projectPath resolution errors

- Add nil check for result after Reimport() call to prevent panic on
  (nil, nil) return
- Return error when user-provided projectPath fails to resolve instead
  of silently ignoring it

---------

Co-authored-by: Scion Agent (harness-journey-p1-dev) <agent@scion.dev>
…CloudPlatform#450)

* fix: pass Hub-hydrated harness-config path to harness.Resolve

In broker mode, harness.Resolve() used FindHarnessConfigDir which only
searches local disk. Hub-managed harness configs hydrated into temp
directories were invisible to the resolver, causing it to fall back to
a Generic{} harness with no command config. This produced an empty
shell command (sh -c "") and the agent exited immediately.

Add ConfigDirPath to ResolveOptions so the broker can pass the
Hub-hydrated path directly. This ensures the harness config (including
no_auth, command, provisioner) is loaded correctly in broker mode.

* fix: pass Hub-hydrated path to harness.Resolve in ProvisionAgent

Same gap as run.go — provision.go:735 called harness.Resolve without
ConfigDirPath, so container-script harnesses were invisible in broker
mode during the provisioning step.

---------

Co-authored-by: Scion Agent (harness-journey-inv) <agent@scion.dev>
…s fallback (GoogleCloudPlatform#452)

* feat(web): harness-config detail delete, image section, and logs fallback

- Add delete button with confirmation dialog to harness-config detail
  page, matching the existing delete UX from the list view
- Add image status section below file list showing image path (local vs
  remote), last update time, and the Build button (moved from header)
- Add HarnessConfigData type to surface config.image from the API
- Always show the Logs tab on agent detail regardless of cloudLogging
  flag; fall back to broker-based /api/v1/agents/{id}/logs endpoint
  when Cloud Logging is not configured

* fix(hub): add user-scope authz to deleteHarnessConfig and default deleteFiles to false

Add missing user-scope ownership check in the delete handler — previously
user-scoped configs fell through with no authorization. Also deny unknown
scopes explicitly. Default the deleteFiles checkbox to unchecked so users
must opt in to file deletion.

* fix: address Gemini review feedback on harness-config detail

- Use explicit locale options for date formatting consistency
- Simplify delete response check (!response.ok is sufficient)

---------

Co-authored-by: Scion Agent (developer) <agent@scion.dev>
…oogleCloudPlatform#451)

When publishing a skill version, if the draft creation (step 1) succeeds
but upload/finalize (steps 2-3) fail or are interrupted, retrying from
step 1 would hit a UNIQUE constraint and return 409 Conflict, leaving
the user stuck.

Now when a version with the same number already exists as a draft, the
endpoint returns the existing draft with fresh upload URLs instead of
rejecting. Published/deprecated/archived versions still return 409.

Co-authored-by: Scion Agent (skill-bank-publish-fix) <agent@scion.dev>
…udPlatform#456)

* feat(config): add name field to harness-config config.yaml

When importing a harness-config from a repo where the subfolder name
doesn't match the desired config name, the import derives the wrong
name. Add a `name` field to HarnessConfigEntry so config authors can
declare the intended name in config.yaml.

Name resolution priority: CLI --name flag > config.yaml name field >
harness field > URL/directory-derived name.

* fix: validate name field against path traversal and read it in parent import

Sanitize the config.yaml name field in LoadHarnessConfigDir to reject
values containing path separators or traversal sequences (e.g. "../foo").
Also apply the same name-override logic to the parent-directory case in
discoverResourceDirs so each child's config.yaml name is honored during
bulk import, matching the existing leaf-case behavior.

* fix: add path-traversal test and schema pattern for config name field

Add unit test verifying that names like '../evil', 'sub/dir', and '..'
are rejected by LoadHarnessConfigDir. Add a regex pattern constraint
(^[a-zA-Z0-9][a-zA-Z0-9_-]*$) to the name field in the JSON schema
so invalid names are caught at validation time.

* fix: use strings.ContainsAny for platform-independent path traversal validation

Replace filepath.Base-based validation in LoadHarnessConfigDir with
strings.ContainsAny to catch backslash separators on all platforms.
Strengthen install name validation to also reject ".." and paths with
separators.

---------

Co-authored-by: Scion Agent (harness-name-dev) <agent@scion.dev>
…GoogleCloudPlatform#457)

Replace uname -m with Docker's TARGETARCH build arg, which is correct
under cross-compilation and emulation. Also restore explicit tar member
name to only extract the expected binary.

Co-authored-by: Scion Agent (antigravity-fix-dev) <agent@scion.dev>
* feat(skills): combined create + publish flow on skill creation page

Rewrite the skill creation page with Phase 1 of the create-ux-rework:

- SKILL.md content textarea with monospace font and frontmatter template
  placeholder. Supports paste and file upload.
- Client-side YAML frontmatter parsing (js-yaml) with debounced
  auto-populate of name, description, and tags fields.
- Manual edit tracking (editedFields Set) prevents auto-populate from
  overwriting user changes. "from SKILL.md" badges indicate
  auto-populated fields. "Reset from SKILL.md" clears overrides.
- Additional files drop zone with drag-and-drop support. Section header
  changes contextually based on whether SKILL.md content exists.
- Combined create + publish flow: inline progress view with multi-step
  state machine (form → creating → uploading → finalizing → done).
- File uploads use signed-URL pattern with SHA-256 hashing and
  concurrency-4 worker pool, matching the existing publish dialog.
- Per-file upload status indicators and retry support for failed uploads.
- Error recovery: "Retry Failed" for upload errors, "Go to Skill" link
  when create succeeds but publish fails.
- File validation: SKILL.md required for publish, max 50 files, 10 MB
  per file, 50 MB total, 512 KB paste limit.
- Conditional submit buttons: "Create & Publish v{version}" + "Create
  Only" when files present, "Create Skill" when no files.
- Version input defaults to 1.0.0, shown only when files are present.

* fix(skills): address code review feedback on create+publish flow

M1: Fix retryPublish() when version creation (step 2) fails. Track
versionCreated flag; when false, retry re-attempts from step 2 (create
version + upload) instead of falling through to finalize with no draft.

m1: Store redirect setTimeout IDs in redirectTimer property and clear
in disconnectedCallback to prevent stale history pushes on early unmount.

m2: Improve semver regex to allow hyphens in pre-release and optional
+build metadata segments.

m3: Add cleanVersion() helper that strips v-prefix before sending
version to API, not just for validation.

m4: Add @types/js-yaml dev dependency, remove @ts-ignore suppression.

* fix(skills): memoize allFiles, handle network errors, optimize failed-file lookup

- Cache allFiles getter result, invalidate only when skillMdContent or
  additionalFiles change to avoid recreating Blob/File on every render
- Wrap fetch retry paths in try/catch to handle network-level errors
  (DNS, CORS, connection refused) with user-friendly messages
- Use Map for uploadUrl lookup and Set for failed indices to avoid
  O(K*N) iteration in retry paths

* fix(skills): address 4 review findings in skill-create

- Clear skillMdContent and revert textarea on oversized paste
- Check validationError in validateForPublish and validateForCreate
- Clean up uploadFiles retry logic with consistent doUpload helper

---------

Co-authored-by: Scion Agent (p1-web-dev) <agent@scion.dev>
* Add diagnostic logging when pipeline receives data without cloud exporter

handleMetrics(), handleSpans(), and handleLogs() previously returned
silently when the cloud exporter was nil, causing complete data loss with
no diagnostic signal. Add a sync.Once warning on first invocation so
operators can see that telemetry data is being received but dropped.

* Resolve GCP project ID from metadata server as fallback

LoadConfig() now queries the GCE metadata server when no explicit
SCION_GCP_PROJECT_ID or credentials file provides a project ID.
The compute/metadata package respects GCE_METADATA_HOST, so this
transparently reaches the scion metadata emulator (localhost:18380)
in agent containers. A 2-second timeout prevents blocking startup
when no metadata server is reachable.

This fixes the primary blocker where the pipeline cloud exporter
failed with "GCP project ID is required" on instances like
scion-integration2 that have a working metadata server but no
explicit credentials file or env var.

* Route hook metrics through pipeline to prevent sampling rate violations

Each sciontool hook invocation previously created its own Cloud Monitoring
exporter, causing ~50% of metric writes to be rejected when hooks fired
within Cloud Monitoring's minimum sampling period.

Two changes fix this:
1. In GCP mode, short-lived processes (hooks, batch=false) now send
   metrics via OTLP to the local pipeline receiver instead of directly
   to Cloud Monitoring. This mirrors how logs already work.
2. The pipeline buffers incoming metrics and flushes to Cloud Monitoring
   every 15 seconds, consolidating rapid writes into safe batches.

Long-lived processes (init, batch=true) continue exporting directly
to Cloud Monitoring since they maintain stable cumulative counters.

* Register ModelResponse hook to enable token metrics from Claude Code

Claude Code's ModelResponse hook event carries per-turn token counts
(input_tokens, output_tokens, cached_tokens) but was not registered
in the embedded settings.json. The dialect parser and metric recording
code already handle this event type — only the registration was missing.

This enables gen_ai.tokens.input, gen_ai.tokens.output, and
gen_ai.tokens.cached metrics to be recorded from each model turn.
Also updates MaxModelCalls capability to SupportYes since model-end
events are now emitted.

* Skip metadata server query in tests to avoid unnecessary network calls

Move the testing.Testing() check before the metadata query instead of
after, so tests never make a network call to the metadata server only
to have the result discarded by the defense-in-depth guard.

* Update capability test to match ModelResponse hook registration

The Claude Code harness now supports MaxModelCalls (SupportYes) since
ModelResponse hooks are registered. Update the test expectation.

* Deduplicate buffered metrics before export to Cloud Monitoring

When multiple hook processes report the same cumulative counter (e.g.
agent.tool.calls with tool=Read) within the 15-second buffer window,
their data points conflict in Cloud Monitoring's sampling rate check.

The pipeline now deduplicates by (metric name, attribute set) before
exporting, keeping only the latest data point per combination. This
eliminates the remaining sampling-rate violations that the buffering
alone didn't solve.

* Remove metadata server fallback for GCP project ID resolution

The GCP project ID for metrics should be derived from the injected
telemetry service account credentials file, not the metadata server.
The metadata server returns the infrastructure project (where the VM
runs), which may differ from the intended metrics target project
specified in the SA key JSON.

* Enforce minimum interval between metric buffer flushes

Track the last successful export time and skip flushes that occur within
metricFlushInterval (15s) of the previous export. This prevents the
shutdown drain in Pipeline.Stop() from racing with a recent periodic
flush, which caused Cloud Monitoring to reject cumulative data points
written less than 10 seconds apart.

* Add metrics dashboard to hub admin UI

Backend: MetricsDashboardService queries Cloud Monitoring for Scion
telemetry metrics (sessions, API calls, tokens, active agents).
Supports summary, sessions, model-calls, and tokens views with
daily aggregation and 5-minute result caching.

Frontend: Lit page component with Chart.js charts, tabs for each
view, and period selector (7/14/30 days). Wired into admin nav
and route system.

* Address code review: fix error handling and label extraction

- Return errors on partial Cloud Monitoring query failures instead of
  silently caching empty results. Only cache fully successful responses.
  Partial data is still returned with an X-Metrics-Warning header.
- Extract the specific requested label key from groupByLabel instead of
  iterating over arbitrary map entries. Prevents mislabeling when series
  contain multiple metric labels.

* Fix ALIGN_SUM → ALIGN_DELTA for cumulative metrics

Cloud Monitoring requires ALIGN_DELTA (not ALIGN_SUM) for CUMULATIVE
counter metrics. ALIGN_SUM is only valid for GAUGE metrics.

* feat: move metrics dashboard from admin to main nav with relaxed auth

Phase 1 of metrics dashboard refactoring:
- Rename admin-metrics.ts → metrics-dashboard.ts, update element tag
  from scion-page-admin-metrics to scion-page-metrics
- Add /metrics route to main ROUTES, keep /admin/metrics as backward-compat redirect
- Move "Metrics" nav item from Admin section to Management section in sidebar
- Relax backend authorization from admin-only to all authenticated users
- Register new /api/v1/metrics/ endpoint, keep legacy /api/v1/admin/metrics-dashboard
- Update app-shell PAGE_TITLES
- Remove scion-page-admin-metrics from ADMIN_ROUTES set

* feat: add per-project metrics views with parameterized queries

Phase 2 of metrics dashboard refactoring:

Backend:
- Add QueryOption/WithProjectID functional options pattern for query parameterization
- Extend all query methods (QuerySummary, QuerySessions, QueryModelCalls, QueryTokens)
  to accept ...QueryOption and thread project_id filter through low-level query functions
- Add extraFilter parameter to queryGroupedTimeSeries, queryUniqueLabels,
  queryDailyUniqueCount for project-scoped Cloud Monitoring queries
- Add handleProjectMetricsDashboard handler with project view authorization
- Register /projects/:id/metrics sub-route in handleProjectRoutes
- Filter on metric.labels.grove_id (canonical label from GCP exporter)

Frontend:
- Add projectId property to metrics-dashboard.ts component
- Parse projectId from URL on connectedCallback (/projects/:id/metrics pattern)
- Route API calls to project-scoped endpoint when projectId is set
- Update header title to show "Project Metrics" when project-scoped
- Add /projects/:id/metrics route in main.ts (before catch-all project route)
- Add "Project Metrics" page title pattern in app-shell.ts
- Add "Metrics" button in project-detail.ts header-actions

* feat: add metrics summary row to project detail page

Phase 3 of metrics dashboard refactoring:

Backend:
- Add ProjectMetricsSummary struct with 24h scalar counters
  (sessions, API calls, tokens, active agents)
- Add QueryProjectSummary method with lightweight scalar-aggregate queries
- Add handleProjectMetricsSummary handler with project view authorization
- Register /projects/:id/metrics-summary sub-route (before /metrics to avoid
  prefix-match collision)
- Return {available: false} when metrics service is not configured

Frontend:
- Add metricsSummary state variable to project-detail.ts
- Fetch summary in loadData() as non-blocking parallel call
- Render metrics stats row below existing stats row when data available
- Include "View Details" link to /projects/:id/metrics
- Gracefully hide metrics row when unavailable (no error states)
- Add formatTokenCount helper for human-readable token numbers

* fix: address code review findings for metrics dashboard

- Fix H-1: Remove dead /dashboard suffix from frontend API path — use
  /api/v1/metrics/ directly to match backend registration
- Fix H-2: Add explicit code comment on handleAdminMetricsDashboard
  documenting the intentional auth relaxation on the legacy endpoint
- Fix M-1: Mark unused subPath parameter with _ in handleProjectMetricsDashboard
- Fix M-2: Use cacheKeySuffix() helper to avoid trailing colon in cache keys
  when ProjectID is empty (global queries)

* Add graph-up icon to Shoelace icon allowlist

The metrics dashboard nav item uses the graph-up Bootstrap Icon but it
was missing from the USED_ICONS array in the copy script, so it was
never copied to the public assets directory.

* Address PR GoogleCloudPlatform#458 review findings: shutdown race, dedup stability, sort order

Pipeline (pipeline.go):
- Add sync.WaitGroup to coordinate metric flush goroutine shutdown,
  preventing race between final flush and ticker-driven loop
- Add force parameter to flushMetricBuffer to bypass interval check
  during shutdown, preventing loss of buffered metrics
- Sort OTel attributes by key in attrSetKey to ensure stable
  deduplication keys regardless of protobuf attribute ordering

Backend (metrics_dashboard.go):
- Sort queryGroupedTimeSeries results by label for deterministic API
  responses and stable chart legend ordering
- Sort queryDailyUniqueCount results by timestamp for chronological
  ordering

Frontend (metrics-dashboard.ts):
- Move chart.destroy() inside requestAnimationFrame callback to prevent
  overlapping Chart.js instances during rapid tab switching

* Fix CI failures: gofmt formatting and errcheck in metrics dashboard

Run gofmt on metrics_dashboard.go (const block alignment) and
server.go (struct field alignment). Handle the error return from
metricsDashboard.Close() during shutdown to satisfy errcheck.

* Use project_id label instead of grove_id in metrics filter

The compat-literals CI check prohibits new grove terminology. The
telemetry exporter emits both grove_id and project_id labels, so
filtering on project_id is correct and follows project vocabulary.

---------

Co-authored-by: Scion Agent (metrics-validation-lead) <agent@scion.dev>
…latform#459)

Add a "Capture Auth" button visible only for no-auth running agents with
a resolved harness. The button calls POST /api/v1/agents/{id}/exec to
run capture_auth.py inside the container, handling exit codes 0 (success),
2 (not authenticated yet), and others (error).

Co-authored-by: Scion Agent (capture-auth-ui-dev) <agent@scion.dev>
…uth (GoogleCloudPlatform#460)

Replace gnome-keyring/DBUS token injection with direct file-based OAuth
token placement. The token is now written to
~/.gemini/antigravity-cli/antigravity-oauth-token by provision.py,
which AGY reads directly — no keyring daemons needed.

Changes:
- Dockerfile: remove dbus-x11, gnome-keyring, libsecret-1-0, libsecret-tools
- config.yaml: rename AGY_KEYRING_TOKEN→AGY_TOKEN (file secret), require
  GOOGLE_CLOUD_PROJECT + GOOGLE_CLOUD_LOCATION for vertex-ai, update
  capabilities to reflect file-based auth
- provision.py: remove _parse_env_output, keyring/DBUS wrapper logic;
  write token file directly; update _select_auth_method signature to
  accept secret_files
- capture_auth.py: new file-based capture script (standard pattern)
  replacing keyring extraction

Co-authored-by: Scion Agent (antigravity-nokey-dev) <agent@scion.dev>
…udPlatform#461)

AGY requires gnome-keyring to be initialized before it will write
~/.gemini/antigravity-cli/antigravity-oauth-token. The previous commit
removed keyring code, breaking both the no-auth capture flow and normal
auth flow. Restore keyring packages, DBUS/keyring initialization in the
wrapper script, and token injection via secret-tool — but keep the
AGY_TOKEN rename (not reverting to AGY_KEYRING_TOKEN).

Co-authored-by: Scion Agent (antigravity-keyring-restore) <agent@scion.dev>
…h GCP env (GoogleCloudPlatform#462)

The wrapper script's GCP gate only triggers on the enterprise marker
file or AGY_USE_GCP env var. For oauth-token auth with GOOGLE_CLOUD_PROJECT
injected into the container, neither condition is met so the gcp block
is never written to settings.json.

Add an elif to also set _use_gcp=true when GOOGLE_CLOUD_PROJECT is
present. The inner block already guards against empty project values.

Co-authored-by: Scion Agent (antigravity-gcp-dev) <agent@scion.dev>
File-type secrets in scion are bind-mounted directly to their target
path (~/.gemini/antigravity-cli/antigravity-oauth-token), not staged
to the env secret directory (~/.scion/harness/secrets/AGY_TOKEN).

Add fallback reads to the bind-mounted target path in three places:
- _select_auth_method: detect token presence for method selection
- _provision: read token value for JSON validation
- agy-wrapper.sh: inject token into keyring at runtime

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a Google Cloud Monitoring metrics dashboard for Scion telemetry, enabling both global and project-scoped metrics visualization. It also adds a reimport feature to refresh harness configurations from their source URLs, updates the Antigravity harness to use file-based OAuth tokens with a new capture script, and enhances the skill creation page with a combined create-and-publish flow. On the feedback side, several improvements are recommended: the telemetry pipeline's attrSetKey should be updated to handle all OTLP scalar types to prevent incorrect metrics deduplication; file operations in capture_auth.py should be moved inside the try block to avoid unhandled exceptions; the harness-config update command should explicitly reject positional arguments when --all is specified; and duration conversions in the metrics dashboard should use the idiomatic Truncate(time.Second) method.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +533 to +539
if kv.Value != nil {
if sv := kv.Value.GetStringValue(); sv != "" {
b.WriteString(sv)
} else if iv := kv.Value.GetIntValue(); iv != 0 {
fmt.Fprintf(&b, "%d", iv)
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of attrSetKey only checks for string and non-zero integer values. If an attribute is a boolean (e.g., true or false), a double, or an integer with a value of 0, it will be ignored or mapped to the same empty/zero representation. This can cause different attribute sets (such as debug=true and debug=false) to produce the same cache key, leading to incorrect metrics deduplication and potential data loss.

Using a type switch over kv.Value.Value allows robust and correct handling of all OTLP scalar types.

Suggested change
if kv.Value != nil {
if sv := kv.Value.GetStringValue(); sv != "" {
b.WriteString(sv)
} else if iv := kv.Value.GetIntValue(); iv != 0 {
fmt.Fprintf(&b, "%d", iv)
}
}
if kv.Value != nil {
switch v := kv.Value.Value.(type) {
case *commonpb.AnyValue_StringValue:
b.WriteString(v.StringValue)
case *commonpb.AnyValue_IntValue:
fmt.Fprintf(&b, "%d", v.IntValue)
case *commonpb.AnyValue_BoolValue:
fmt.Fprintf(&b, "%t", v.BoolValue)
case *commonpb.AnyValue_DoubleValue:
fmt.Fprintf(&b, "%g", v.DoubleValue)
case *commonpb.AnyValue_BytesValue:
fmt.Fprintf(&b, "%x", v.BytesValue)
}
}

Comment on lines +60 to +64
with open(config_path, "r", encoding="utf-8") as f:
try:
data = json.load(f)
except (json.JSONDecodeError, OSError):
return []

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The with open statement is placed outside the try-except block. If the file cannot be opened (for example, due to a PermissionError or other OSError), an unhandled exception will be raised, crashing the script. Moving the with open statement inside the try block ensures all file-related errors are gracefully handled.

Suggested change
with open(config_path, "r", encoding="utf-8") as f:
try:
data = json.load(f)
except (json.JSONDecodeError, OSError):
return []
try:
with open(config_path, "r", encoding="utf-8") as f:
data = json.load(f)
except (json.JSONDecodeError, OSError):
return []

Comment on lines +51 to +53
if all && urlOverride != "" {
return fmt.Errorf("--all and --url cannot be used together")
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When the --all flag is specified, any positional arguments (harness-config names) are silently ignored. To prevent user confusion and ensure clear command usage, the command should return an error if both --all and positional arguments are provided.

	if len(args) > 0 && all {
		return fmt.Errorf("cannot specify a harness-config name when using --all")
	}
	if all && urlOverride != "" {

EndTime: timestamppb.New(end),
},
Aggregation: &monitoringpb.Aggregation{
AlignmentPeriod: durationpb.New(time.Duration(end.Sub(start).Seconds()) * time.Second),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Converting the duration to seconds as a float, casting it to time.Duration (which represents nanoseconds), and then multiplying by time.Second is highly non-idiomatic and hard to read. Using Truncate(time.Second) is much cleaner and idiomatic.

Suggested change
AlignmentPeriod: durationpb.New(time.Duration(end.Sub(start).Seconds()) * time.Second),
AlignmentPeriod: durationpb.New(end.Sub(start).Truncate(time.Second)),

EndTime: timestamppb.New(end),
},
Aggregation: &monitoringpb.Aggregation{
AlignmentPeriod: durationpb.New(time.Duration(end.Sub(start).Seconds()) * time.Second),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Converting the duration to seconds as a float, casting it to time.Duration (which represents nanoseconds), and then multiplying by time.Second is highly non-idiomatic and hard to read. Using Truncate(time.Second) is much cleaner and idiomatic.

Suggested change
AlignmentPeriod: durationpb.New(time.Duration(end.Sub(start).Seconds()) * time.Second),
AlignmentPeriod: durationpb.New(end.Sub(start).Truncate(time.Second)),

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants