docs(MADR): add MADR 101 for status syncing rules in multizone #16083
# Resource status ownership and syncing rules in multizone

* Status: accepted

Technical Story: none

## Context and Problem Statement

Across multiple MADRs (039, 044, 051, 056, 096) we have established conventions
for how resource `status` is handled in multizone deployments. These conventions
are scattered and implicit. This MADR consolidates the rules into a single
reference and strengthens them with an explicit prohibition on computing status
on Global CP.

The core tension is:

1. Resources are authored on one control plane (zone or global) and synced to
   others via KDS.
2. Status contains information that is inherently local to the zone where the
   resource is consumed (VIPs, hostnames, proxy counts, availability).
3. Syncing status cross-zone would overwrite locally-computed values, causing
   traffic interruptions and inconsistencies.
4. Computing status on global would be costly, scaling as O(number of entities
   globally), which we try to avoid.

Previous MADRs addressed this:

| MADR | Decision |
|------|----------|
| 039 (MeshService API) | Status holds VIPs, hostnames, proxy counts, availability. Managed by CP. |
| 051 (MeshService multizone) | Status is NOT synced cross-zone. Each zone computes its own. |
| 056 (Identity sync) | Identity placed in `spec` (not `status`) to avoid partial status syncing. Motivated the single-writer model for spec vs status. |
| 044 (Zone-to-global policy sync) | Zone-originated policies sync to global for visibility only. |
| 096 (Ingress address sync) | `MeshZoneAddress` is status-less; address is in `spec`. |

## Design

### Rule 1: Status MUST be computed on zone CPs only

Status fields reflect zone-local state: VIPs allocated from the zone's CIDR
range, hostnames generated by the zone's `HostnameGenerator`, proxy counts from
dataplanes running in that zone, availability derived from local dataplane
health.

**Global CP MUST NOT compute or populate status fields on any resource.** Global
CP does not have the zone-local context required to produce correct values.

This applies to all resource types, including:

- `MeshService`
- `MeshMultiZoneService`
- `MeshExternalService`
- Any future resource with a `status` sub-resource

### Rule 2: Status MUST NOT be synced cross-zone

When a resource is synced from zone A to global and then to zone B, the status
from zone A MUST be stripped. Zone B computes its own status.

Status **does** flow from zone to global and is stored there for visibility
purposes (e.g., the GUI can display per-zone service status). This is analogous
to how zone-originated policies sync to global for visibility (MADR 044). The
status stored on global is always scoped to the originating zone; it is never
merged across zones.

However, when global syncs that resource onward to other zones, the status is
stripped by the `RemoveStatus()` mapper (`pkg/kds/context/context.go`), which is
applied as a blanket transformation to all resources with `HasStatus: true` sent
from global to zones. On the receiving zone, the `IgnoreStatusChange` sync option
(`pkg/kds/v2/store/sync.go`) ensures that locally-computed status is preserved
and not overwritten by the empty status arriving from global.
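
The stripping step can be illustrated with a minimal sketch. The `Resource`
type and `stripStatus` function below are simplified stand-ins: the real
mapper is `RemoveStatus()` in `pkg/kds/context/context.go` and operates on
Kuma's resource interfaces, not on this toy struct.

```go
package main

import "fmt"

// Hypothetical, simplified resource model for illustration only.
type Resource struct {
	Name      string
	Spec      map[string]string
	Status    map[string]string
	HasStatus bool
}

// stripStatus blanks the status on any resource that has one, so the
// receiving zone computes its own status (Rules 1 and 2).
func stripStatus(r Resource) Resource {
	if r.HasStatus {
		r.Status = nil
	}
	return r
}

func main() {
	r := Resource{
		Name:      "backend",
		Spec:      map[string]string{"port": "8080"},
		Status:    map[string]string{"vip": "240.0.0.1"}, // zone-local value
		HasStatus: true,
	}
	out := stripStatus(r)
	fmt.Println(out.Status == nil) // true: status never crosses to other zones
	fmt.Println(out.Spec["port"])  // 8080: spec is preserved
}
```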

**Implication: global-originated resources have no status on global.** Since
status is only computed on zones (Rule 1) and only zone-originated resources are
synced from zone to global with their status (as described above),
global-originated resources (e.g., `MeshMultiZoneService`) will not have status
on global; there is simply no one computing it there. Zone-originated resources
(e.g., `MeshService`) do have status on global because it arrives with the
resource during zone→global sync. To access the status of global-originated
resources, use the computed API approaches described in Rule 5.

### Rule 3: Single-writer model for spec and status

Each resource follows a single-control-plane-writer model: for any given
resource instance, `spec` and `status` are each written by exactly one CP.

| Field | Writer | Example |
|-------|--------|---------|
| `spec` | The originating control plane (zone or global) | Zone creates MeshService spec; global creates MeshMultiZoneService spec |
| `status` | The consuming zone CP | Each zone computes VIPs, hostnames, proxy stats independently |

"Single writer" here means a single CP deployment, not a single component.
Within a zone CP, multiple components may write to different parts of status
(e.g., the VIP allocator writes `status.VIPs`, the hostname generator writes
`status.Addresses`, the status updater writes `status.TLS` and
`status.DataplaneProxies`). These writes are coordinated via optimistic
concurrency (conflict retries). The key invariant is that no two CP deployments
write to the same resource instance's status.
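
A minimal sketch of the conflict-retry pattern, assuming a toy versioned store:
the `store`, `writeField`, and field names below are illustrative, not Kuma's
actual types.

```go
package main

import "fmt"

// Toy versioned record standing in for a stored resource's status.
type stored struct {
	version int
	status  map[string]string
}

type store struct{ r stored }

func (s *store) get() stored { return s.r }

// update succeeds only if the caller saw the latest version,
// mirroring optimistic concurrency in the resource store.
func (s *store) update(base stored, key, val string) bool {
	if base.version != s.r.version {
		return false // conflict: someone wrote in between
	}
	next := map[string]string{}
	for k, v := range s.r.status {
		next[k] = v
	}
	next[key] = val
	s.r = stored{version: s.r.version + 1, status: next}
	return true
}

// writeField re-reads and retries on conflict, as each zone CP
// component does for the status fields it owns.
func writeField(s *store, key, val string) {
	for {
		base := s.get()
		if s.update(base, key, val) {
			return
		}
	}
}

func main() {
	s := &store{r: stored{version: 1, status: map[string]string{}}}
	writeField(s, "VIPs", "240.0.0.1")         // e.g. the VIP allocator
	writeField(s, "Addresses", "backend.mesh") // e.g. the hostname generator
	fmt.Println(s.r.status["VIPs"], s.r.status["Addresses"])
}
```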

Note that computed spec fields like `Spec.Identities` and `Spec.State` (see
Rule 4) are also written by the local zone's status updater, not by the user.
This is consistent with the single-CP-writer model: for a local MeshService, the
originating zone writes both user-authored and computed spec fields.

This separation eliminates race conditions between spec updates from the origin
and status updates on the destination (the scenario described in MADR 056).

### Rule 4: Data that must cross zone boundaries belongs in spec

If a piece of information needs to be available on zones other than where it was
produced, it MUST be placed in `spec`, not `status`.

This means some `spec` fields are not user-authored but computed by the local
zone CP. These are sometimes called "computed spec fields." They live in `spec`
because they need to cross zone boundaries via normal KDS sync, but they are
written by the zone's status updater rather than by the user. The
single-CP-writer model (Rule 3) still holds: these fields are written by the
same CP that originates the resource.

Precedents:

- **Identity** (MADR 056): `MeshService.Spec.Identities` rather than
  `MeshService.Status.Identities`, because other zones need this for mTLS SAN
  verification. Computed by the zone status updater from matched Dataplane
  proxies and MeshIdentity resources.
- **State**: `MeshService.Spec.State` carries availability information computed
  by the zone status updater so that other zones and `MeshMultiZoneService` can
  use it for routing decisions (e.g., excluding zones with no healthy endpoints).
- **Ingress address** (MADR 096): `MeshZoneAddress` carries the address in
  `spec` (the resource is status-less).
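
As a minimal sketch of such a computed spec field: the `dataplane` type and
`computeState` helper below are hypothetical, and the real state values and
matching logic live in the zone status updater.

```go
package main

import "fmt"

// Hypothetical zone-local proxy record; only health matters here.
type dataplane struct{ healthy bool }

// computeState derives a spec-level availability value from zone-local
// proxy health, so other zones can use it for routing decisions after
// the spec is synced via KDS.
func computeState(proxies []dataplane) string {
	for _, p := range proxies {
		if p.healthy {
			return "Available"
		}
	}
	return "Unavailable"
}

func main() {
	fmt.Println(computeState([]dataplane{{healthy: false}, {healthy: true}}))
	fmt.Println(computeState(nil))
}
```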

### Rule 5: Global-level visibility of zone status requires a dedicated computed API

If there is a need to view or aggregate zone-specific status at the global
level (e.g., for a GUI dashboard showing cross-zone health), this MUST be
served by a dedicated read-only computed API endpoint that recomputes the
information on demand rather than storing it.

Note: zone-originated resources arrive on global with their status intact (see
Rule 2). Passively storing this zone-synced status is expected; it enables
visibility via the global API. The prohibitions below concern global CP
**independently** computing or aggregating status.

Global CP MUST NOT:

- Independently compute or populate status fields on resources
- Merge status from multiple zones into a single resource's status
- Compute a "global status" by aggregating synced data into stored resources

There are two approaches for exposing zone status at the global level, both
valid depending on the use case:

#### Approach A: Recompute on global from synced resources

When global CP already has enough synced data (specs, labels, zone-originated
resources) to derive the answer, the endpoint recomputes the result at request
time.

- Follow the MADR 072 convention: computed endpoints are prefixed with `_`
- Clearly indicate which zone each entry originates from
- Treat the result as an eventually-consistent view

**Existing example: the `_hostnames` endpoint**

`GET /meshes/{mesh}/{serviceType}/{name}/_hostnames` (where `serviceType` is
`meshservices`, `meshexternalservices`, or `meshmultizoneservices`) computes
hostnames on demand. It fetches all `HostnameGenerator` resources, evaluates
their Go templates against the requested service's metadata and labels, and
returns the generated hostnames with their zone associations. On Global CP it
tests both zone and global origin perspectives to capture all possible matches.
The endpoint does not store computed hostnames on global; results are
recomputed per request.
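
The template-evaluation core of this pattern can be sketched as follows. The
`service` struct, template shape, and `hostnames` helper are illustrative
assumptions, not the actual endpoint implementation, which also evaluates
generator selectors and both origin perspectives.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// Illustrative service metadata a generator template might reference.
type service struct {
	Name string
	Zone string
	Mesh string
}

// hostnames evaluates each generator template at request time and
// stores nothing, matching the computed-endpoint contract.
func hostnames(templates []string, svc service) []string {
	var out []string
	for _, t := range templates {
		tmpl, err := template.New("hostname").Parse(t)
		if err != nil {
			continue // skip malformed generators
		}
		var buf bytes.Buffer
		if err := tmpl.Execute(&buf, svc); err != nil {
			continue
		}
		out = append(out, buf.String())
	}
	return out
}

func main() {
	svc := service{Name: "backend", Zone: "east", Mesh: "default"}
	fmt.Println(hostnames([]string{"{{.Name}}.{{.Zone}}.mesh.local"}, svc))
	// prints [backend.east.mesh.local]
}
```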

#### Approach B: Forward the request to the zone CP via KDS RPC

When the data can only be produced by the zone CP itself (e.g., it requires
access to zone-local state that is not synced to global), the global CP
forwards the request to the appropriate zone over the existing KDS bi-directional
stream using the reverse unary RPC mechanism (see MADR 014).

The flow is:

1. User sends a REST request to global CP
2. Global CP looks up `ZoneInsight` to find which global CP instance holds
   the KDS stream for the target zone
3. If the stream is on the local instance, global CP sends a request message
   over the KDS stream and waits for the response (matched by request ID)
4. If the stream is on another global CP instance, the request is forwarded
   via the inter-CP gRPC service (`InterCPEnvoyAdminForwardService`), which
   then sends it over KDS
5. The zone CP processes the request locally and sends the response back
   through the stream
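
The request-ID matching in step 3 can be sketched as follows. The `pending`
map and `dispatch` function are hypothetical simplifications of the
bookkeeping in `pkg/util/grpc/reverse_unary_rpcs.go`.

```go
package main

import "fmt"

// pending maps in-flight request IDs to the channels their callers
// wait on; callers register before sending over the shared stream.
type pending map[string]chan string

// dispatch routes a response arriving on the shared stream to the
// caller waiting on its request ID; unknown IDs are dropped.
func dispatch(p pending, requestID, payload string) {
	if ch, ok := p[requestID]; ok {
		ch <- payload
	}
}

func main() {
	p := pending{}
	ch := make(chan string, 1)
	p["req-1"] = ch // global CP registers before sending over KDS

	// The zone CP's response comes back on the same stream, tagged
	// with the original request ID.
	dispatch(p, "req-1", "config-dump")
	fmt.Println(<-ch) // prints config-dump
}
```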

**Existing example: Envoy admin data (XDS config dump, stats, clusters)**

The Envoy admin inspection endpoints forward requests to the zone CP that owns
the dataplane. The zone CP connects to the local `kuma-dp` proxy, retrieves the
Envoy admin data, and returns it through the KDS stream. This is implemented via
`GlobalKDSService.StreamXDSConfigs` / `StreamStats` / `StreamClusters` (defined
in `kds.proto`) with the reverse unary RPC pattern in
`pkg/util/grpc/reverse_unary_rpcs.go`.

#### Choosing between approaches

| | Approach A (recompute on global) | Approach B (KDS RPC to zone) |
|---|---|---|
| **Use when** | Global has sufficient synced data to derive the answer | Data is only available on the zone CP |
| **Latency** | Lower; no cross-CP round trip | Higher; requires a KDS stream round trip |
| **Availability** | Works even if the zone is disconnected (stale but available) | Fails if the zone is disconnected |
| **Complexity** | Lower; standard API handler | Higher; requires stream management and inter-CP forwarding |

Both approaches keep the global store free of zone-local state and avoid the
consistency issues that come with trying to keep aggregated status up to date.

#### Pre-existing exception: Insight resources

`MeshInsight` and `ServiceInsight` are legacy resources that predate these rules.
They are computed and stored by the resyncer (`pkg/insights/resyncer.go`) on
global CP and non-federated zone CPs, aggregating dataplane statistics across
the mesh. These resources have `HasStatus: false` (they are spec-only and
read-only) and are not status sub-resources in the sense of this MADR.

New features MUST NOT follow this pattern. The Insight resources are acknowledged
as a pre-existing exception, not a precedent.

## Security implications and review

No new security implications. These rules reduce the risk of status data from
a compromised zone propagating to other zones.

## Reliability implications

Enforcing zone-local status computation improves reliability:

- No cross-zone status overwrites that could cause VIP/hostname loss
- No dependency on global CP for status computation
- A zone can operate autonomously for status even when disconnected from global

## Implications for Kong Mesh

None. These rules apply uniformly to all deployments.

## Decision

Status is always zone-specific and MUST only be computed on zone CPs.
Global CP MUST NOT compute status. Cross-zone status visibility requires a
dedicated computed API. Data that needs to travel across zones belongs in `spec`.

These rules consolidate and strengthen existing conventions from MADRs 014, 039,
044, 051, 056, 072, and 096.