Errors of Gateway API Canary Deployment in A/B testing #1871

@pluniov99

Describe the bug

Flagger reports errors when starting a Gateway API canary deployment in A/B testing mode.

To Reproduce

Create a Gateway API canary deployment for A/B testing, following the documentation, with these manifests:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test-gateway
  namespace: default
spec:
  gatewayClassName: istio
  listeners:
  - name: http
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: Same
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-service
  namespace: default
  labels:
    app: test-service
spec:
  minReadySeconds: 5
  revisionHistoryLimit: 5
  progressDeadlineSeconds: 60
  strategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
  selector:
    matchLabels:
      app: test-service
  template:
    metadata:
      namespace: default
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9797"
      labels:
        app: test-service
    spec:
      containers:
      - name: podinfod
        image: ghcr.io/stefanprodan/podinfo:6.0.0
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 9898
          protocol: TCP
        - name: http-metrics
          containerPort: 9797
          protocol: TCP
        - name: grpc
          containerPort: 9999
          protocol: TCP
        command:
        - ./podinfo
        - --port=9898
        - --port-metrics=9797
        - --grpc-port=9999
        - --grpc-service-name=podinfo
        - --level=info
        - --random-delay=false
        - --random-error=false
        env:
        - name: PODINFO_UI_COLOR
          value: "#34577c"
        livenessProbe:
          exec:
            command:
            - podcli
            - check
            - http
            - localhost:9898/healthz
          initialDelaySeconds: 5
          timeoutSeconds: 5
        readinessProbe:
          exec:
            command:
            - podcli
            - check
            - http
            - localhost:9898/readyz
          initialDelaySeconds: 5
          timeoutSeconds: 5
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: test-service
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-service
  minReplicas: 1
  maxReplicas: 1
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          # scale up if usage is above
          # averageUtilization% of the requested CPU
          averageUtilization: 99
---
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: test-service
  namespace: default
spec:
  provider: gatewayapi:v1
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-service
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rolled back (default 600s)
  progressDeadlineSeconds: 60
  # HPA reference (optional)
  autoscalerRef:
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    name: test-service
  service:
    # service port number
    port: 80
    # container port number or name (optional)
    targetPort: 9898
    # Reference to the Gateway that the generated HTTPRoute would attach to.
    gatewayRefs:
      - name: test-gateway
        namespace: default
  analysis:
    # schedule interval (default 60s)
    interval: 1m
    # total number of iterations
    iterations: 10
    # max number of failed iterations before rollback
    threshold: 2
    # canary match condition
    match:
      - headers:
          test: 
            exact: "test-service"

As a result, we see all the resources generated by the Canary:

# generated 
deployment.apps/test-service-primary
horizontalpodautoscaler.autoscaling/test-service-primary
service/test-service
service/test-service-canary
service/test-service-primary
httproutes.gateway.networking.k8s.io/test-service

The Canary has the Initialized status and all resources are ready. The HTTPRoute has the following spec:

spec:
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: test-gateway
    namespace: default
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: test-service-primary
      port: 80
      weight: 100
    - group: ""
      kind: Service
      name: test-service-canary
      port: 80
      weight: 0
    matches:
    - headers:
      - name: test
        type: Exact
        value: test-service
      path:
        type: PathPrefix
        value: /
  - backendRefs:
    - group: ""
      kind: Service
      name: test-service-primary
      port: 80
      weight: 100
    matches:
    - path:
        type: PathPrefix
        value: /

After that, when we try to start the deployment of a new version, we see the following error in the Flagger logs:

HTTPRoute test-service.default update error: HTTPRoute.gateway.networking.k8s.io "test-service" is invalid: spec.rules[1].timeouts.request: Invalid value: "": spec.rules[1].timeouts.request in body should match '^([0-9]{1,5}(h|m|s|ms)){1,4}$' while setting weights
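
The pattern quoted in this error requires at least one number-plus-unit group, so the empty string reported as the invalid value can never pass validation. A minimal Python sketch of that check, assuming the pattern is copied verbatim from the log line above; the helper name `is_valid_duration` is hypothetical and is not part of Flagger or the Gateway API:

```python
import re

# Pattern copied from the validation error in the Flagger log.
DURATION_PATTERN = re.compile(r"^([0-9]{1,5}(h|m|s|ms)){1,4}$")


def is_valid_duration(value: str) -> bool:
    """Return True if value matches the duration pattern from the error.

    Hypothetical helper for illustration only.
    """
    return DURATION_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    # The empty string is the value rejected in the error message.
    for candidate in ["", "30s", "1h30m", "500ms", "10"]:
        print(repr(candidate), is_valid_duration(candidate))
```

This suggests the update sets `spec.rules[1].timeouts.request` to an empty string rather than leaving the field unset, although the sketch only reproduces the pattern check and not Flagger's behavior.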

The canary then stays stuck in the Progressing status.

Expected behavior

Successful canary deployment

Additional context

  • Flagger version: 1.42.0
  • Kubernetes version: 1.34
  • Service Mesh provider: gatewayapi:v1
