Skip to content

KubernetesJobOperator task stuck in Running state when parallelism > completions #64867

@JoshuaPostel

Description

@JoshuaPostel

Under which category would you file this issue?

Providers

Apache Airflow version

3.1.6

What happened and how to reproduce it?

Using a airflow.providers.cncf.kubernetes.operators.job.KubernetesJobOperator with parallelism > completions causes the airflow task to get stuck in the "Running" state.  The specified job will be launched, but regardless of success, failure, or deletion, the airflow UI says the task state is "Running".

The task logs do not print anything after:

INFO - Building job my-job-name

namely, it never prints the log:

INFO - Found matching pod xyz with labels abc

so, I'm guessing parallelism > completions causes issues identifying/finding the job or pods.

What you think should happen instead?

KubernetesJobOperator should be able to identify/find the appropriate job and pods even if parallelism > completions, as it is a valid kubernetes configuration.  At the very least a warning or error should be raised so that the user is made aware why the task is stuck in a "Running" state.

Operating System

No response

Deployment

Official Apache Airflow Helm Chart

Apache Airflow Provider(s)

cncf-kubernetes

Versions of Apache Airflow Providers

apache-airflow-providers-cncf-kubernetes==10.12.0

Official Helm Chart version

1.19.0

Kubernetes Version

Not Applicable

Helm Chart configuration

Not Applicable

Docker Image customizations

Not Applicable

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions