OSDC: Add lf-prod-aws-ue1 and lf-prod-aws-ue2 clusters #700
Just saving my progress. This is not deployed yet and I have not run any validation steps yet. It is a work in progress.
tofu plan for arc-cbr-production: ✅ Plan succeeded
tofu plan for arc-cbr-production-uw1: ✅ Plan succeeded
tofu plan for meta-prod-aws-ue1: ✅ Plan succeeded
7cb1ce6 to 5b9b1b1
5b9b1b1 to 13dfc7b
We might want full duplication across two regions, don't we?
Yeah, we could definitely do that for redundancy. Maybe we can do it as a follow-up once this cluster is actively working?
13dfc7b to f3c658c
runner_name_prefix: "lf-"
runner_group: lf-prod-aws-ue1
pypi_cache:
  replicas: 2
This looks low; other prod clusters use 10 here. We had an incident before where a burst of traffic killed 2 replicas.
I don't expect the LF clusters to handle as much traffic as Meta's since we have a limited budget, so I set it to 6 replicas for now. We can tune it further once we have more data on the real load on the system.
@jeanschmidt Did we do any analysis on the number of replicas there or did we just pick a reasonably high number and yolo go with it?
No, I started with 2 and increased it every time we had issues :)
So I'd recommend sticking with those numbers.
huydhn left a comment
LGTM! Plz resolve the remaining comments before merging
aa8b950 to ca42ffb
@jeanschmidt wanted me to also tackle adding a 2nd cluster in this PR, so I updated this PR to handle that. The 2nd cluster is not yet fully deployed, pending an AWS quota increase request, so hold off on merging this one until I can confirm deployment.
I'm also reducing the buildkit replicas to 8 for the LF clusters.
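For readers following the replica discussion, here is a rough sketch of what the LF overrides might look like at this point in the thread. Only `pypi_cache.replicas` appears in the reviewed diff; the top-level cluster key, the nesting, and the `buildkit` key name are assumptions made for illustration and may not match the actual OSDC config layout.

```yaml
# Hypothetical layout: the cluster key and the `buildkit` key are assumed,
# not taken from the diff. Only the replica values come from this thread.
lf-prod-aws-ue1:
  pypi_cache:
    replicas: 6   # raised from 2 after review; other prod clusters use 10
  buildkit:
    replicas: 8   # reduced for the LF clusters
```

The same values would presumably apply to lf-prod-aws-ue2, but the thread does not confirm that.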
This adds a new lf-prod-aws-ue1 cluster to OSDC.
Signed-off-by: Thanh Ha <thanh.ha@linuxfoundation.org>
Co-authored-by: Huy Do <huydhn@gmail.com>
Signed-off-by: Thanh Ha <thanh.ha@linuxfoundation.org>
This is being removed in #703, so we'll proactively remove it.
Signed-off-by: Thanh Ha <thanh.ha@linuxfoundation.org>
Signed-off-by: Thanh Ha <thanh.ha@linuxfoundation.org>
0581585 to b012ad1
Signed-off-by: Thanh Ha <thanh.ha@linuxfoundation.org>
b012ad1 to ef6a68b
Note that @huydhn is working on a change to make buildkit dynamically scalable and available in all regions. So it may be best to just keep the standard config and leverage those changes when we deploy them.
I believe we're missing CI changes for this. @zxiiro, are you planning to add them in a separate PR?
Yes, I plan to add CI in a follow-up PR once this is live. Also, I'm still waiting on AWS to increase our quota in us-east-2. It's been taking a while to get a response, so I escalated it with our account team.
This adds 2 new clusters to OSDC: lf-prod-aws-ue1 and lf-prod-aws-ue2