Add flock to prevent concurrent clustercheck runs using up connections#11
tomgidden wants to merge 1 commit into olafz:master
Thanks for the pull request. I have a question, though: did you see many clustercheck processes? There is already a 10-second timeout on the mysql command, after which it exits. If the problem is the ps list filling up, this change won't solve it: with or without it, no clustercheck process should ever run for more than 10 seconds. Instead of waiting on the mysql command, each process will now wait for a file lock. Wouldn't the ps list still grow?
As this was a production cluster, I was in a bit of a rush and didn't stop to investigate this incidental flaw, but there were ten to twenty clustercheck processes in To be clear, the Suffice to say, it was a fairly screwed-up situation that shouldn't have happened, but the numerous Anyway, the
When one of our nodes got a bit tied up due to a disk space issue, clustercheck started filling up the `ps` list, waiting on mysql queries.

This change wraps the whole routine in this advice from `flock(1)`:

    (
      flock -n 9 || exit 1
      # ... commands executed under lock ...
    ) 9>/var/lock/mylockfile

As a result of this extra nesting, the majority of the file has been indented.
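For illustration, here is a minimal sketch of how the `flock -n` pattern behaves when two runs overlap. It is not the code from this pull request: the function name `run_check`, the lock path, and the `sleep` standing in for the real health check are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical lock path, chosen for this demo only.
LOCKFILE="${LOCKFILE:-/tmp/clustercheck-demo.lock}"

# run_check SECONDS: pretend to do a health check that takes SECONDS.
# Returns non-zero at once if another run already holds the lock.
run_check() {
    (
        # -n makes flock non-blocking: fail immediately instead of queueing.
        flock -n 9 || exit 1
        # Stand-in for the real work (e.g. the mysql query).
        sleep "${1:-0}"
        echo "check ran"
    ) 9>"$LOCKFILE"
}
```

With this shape, a second `run_check` started while the first is still sleeping returns status 1 straight away rather than piling up behind it, and the lock is released automatically when the subshell exits and file descriptor 9 is closed.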
I've also pulled the HTTP responses out into functions to avoid repetition. The `Content-Length` calculations might be slightly off, as I'm not sure whether all the `\r\n`s are counted, so it just uses a string length check.
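On the `Content-Length` question: in HTTP the header gives the size of the message body in bytes, so any `\r\n` inside the body counts, while the `\r\n`s that end the header lines and the blank separator line do not. A rough sketch of a response helper along those lines follows. The name `http_response`, the header set, and the sample message are assumptions for illustration, not the functions in this pull request.

```shell
#!/bin/sh
# http_response STATUS TEXT
# Hypothetical helper: prints a complete HTTP response whose body is
# TEXT followed by CRLF.
http_response() {
    status=$1
    text=$2
    # Count the bytes of the body exactly as it will be sent,
    # including its trailing \r\n. Arithmetic expansion also strips
    # any padding some wc implementations add.
    length=$(( $(printf '%s\r\n' "$text" | wc -c) ))
    printf 'HTTP/1.1 %s\r\n' "$status"
    printf 'Content-Type: text/plain\r\n'
    printf 'Connection: close\r\n'
    printf 'Content-Length: %s\r\n' "$length"
    printf '\r\n'
    printf '%s\r\n' "$text"
}
```

For example, `http_response "200 OK" "Node is synced."` reports a length of 17: fifteen characters of text plus the two bytes of the closing `\r\n`. Counting with `wc -c` rather than a string length also keeps the value correct if the body ever contains multi-byte characters.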