Hi,
We are deploying stargz on a cluster hosting GitLab runners. The cluster runs user workloads, which means we have short-lived pods (from a couple of minutes to a couple of hours), and the images used are always different.
The images can be quite large, ranging from 70 GB to 120 GB compressed, with 20-50 layers.
The behaviour we are experiencing:
- When we run the first pod on a fresh node, stargz memory utilization grows to around 0.6 GB for one of the user images.
- Once the pod finishes running, memory consumption stays the same; I suspect this is because the mounts are preserved.
- When a new workload is scheduled on the same node, memory consumption grows again, usually to around 1-1.3 GB depending on the new image.
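For reference, this is roughly how I sampled the snapshotter's resident memory between workloads (a minimal sketch; `containerd-stargz-grpc` is the daemon name on our nodes, adjust if yours differs):

```shell
#!/bin/sh
# Print a process's resident set size (VmRSS, in KiB) from /proc.
rss_kib() {
  awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

# The snapshotter daemon on our nodes; adjust the name if yours differs.
pid=$(pidof containerd-stargz-grpc 2>/dev/null | awk '{print $1}')
if [ -n "$pid" ]; then
  rss_kib "$pid"
else
  echo "containerd-stargz-grpc not running"
fi
```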
Some of the memory does seem to be released eventually, but when we leave the cluster running for a couple of days with many back-to-back workloads, stargz consumes more and more memory until it exhausts the node.
Is there a way to let all the user pods run without the node being exhausted after a while? I would like mounts to be preserved only while pods are running, not for pods that have already completed, because otherwise we lose the nodes. I tried using fuse-manager, but the behaviour is the same; the memory is just consumed by the fuse-manager process instead of stargz.
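To check whether mounts from completed pods are what is being kept around, I count the mounts whose target sits under the snapshotter's state directory (a sketch; `/var/lib/containerd-stargz-grpc` is the default root path here, adjust if you configured a different one):

```shell
#!/bin/sh
# Count mounts whose mount point lives under a given directory, via /proc/mounts.
count_mounts_under() {
  awk -v root="$1" '$2 ~ "^"root {n++} END {print n+0}' /proc/mounts
}

# Default state directory of the stargz snapshotter; adjust if configured differently.
count_mounts_under /var/lib/containerd-stargz-grpc
```

On our nodes this count keeps growing as workloads come and go, even after their pods complete.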
Restarting the stargz process doesn't release the memory either.
Can you advise?
Best regards,
Diana