Commit 4529b62

SRE-3755 ci: Add NFS mount retry to handle transient server readiness (#18118) (#18244)
The NFS mount in test_main_prep_node.sh can fail with "access denied" when the NFS server on FIRST_NODE hasn't fully registered its exports before client nodes attempt to mount. This is a race between setup_nfs.sh completing on the server and clush launching test_main_prep_node.sh on all nodes simultaneously.

Add a retry loop (3 attempts, 5s apart) around the mount call, and on final failure print showmount/getent diagnostics to aid debugging. Also tighten the /proc/mounts grep to avoid false substring matches.

Signed-off-by: Ryon Jensen <ryon.jensen@hpe.com>
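
The "false substring matches" mentioned in the commit message can be illustrated with a small sketch. The sample mount line and the helper names loose_match and tight_match are hypothetical, introduced only for this illustration; the two grep patterns are the ones from the old and new versions of the script. In the new version, -q suppresses normal output and -s suppresses error messages about missing or unreadable files.

```shell
#!/bin/bash
# Hypothetical /proc/mounts content: /mnt/share2 is mounted, /mnt/share is not.
sample_mounts='server1:/export/other /mnt/share2 nfs4 rw,relatime 0 0'

# Old pattern: matches any line that merely contains the substring.
loose_match() {
    printf '%s\n' "$1" | grep -q /mnt/share
}

# New pattern: the surrounding spaces require the whole mount-point
# field to be exactly /mnt/share.
tight_match() {
    printf '%s\n' "$1" | grep -q ' /mnt/share '
}

if loose_match "$sample_mounts"; then echo "loose: matched"; else echo "loose: no match"; fi
if tight_match "$sample_mounts"; then echo "tight: matched"; else echo "tight: no match"; fi
```

With the loose pattern, the script would conclude that /mnt/share is already mounted and skip the mount entirely; the tight pattern does not.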
1 parent f0e08e5 commit 4529b62

1 file changed

Lines changed: 21 additions & 2 deletions

ci/functional/test_main_prep_node.sh

@@ -339,9 +339,28 @@ if [ "$result" -ne 0 ]; then
 fi
 
 set -x
-if [ -n "$FIRST_NODE" ] && ! grep /mnt/share /proc/mounts; then
+if [ -n "$FIRST_NODE" ] && ! grep -qs ' /mnt/share ' /proc/mounts; then
     mkdir -p /mnt/share
-    mount "$FIRST_NODE":/export/share /mnt/share
+    # Retry the NFS mount to handle the case where the NFS server on
+    # FIRST_NODE has not fully registered its exports yet.
+    nfs_mounted=false
+    for attempt in $(seq 1 3); do
+        if mount "$FIRST_NODE":/export/share /mnt/share; then
+            nfs_mounted=true
+            break
+        fi
+        echo "NFS mount attempt $attempt failed, retrying in 5s..."
+        sleep 5
+    done
+    if ! "$nfs_mounted"; then
+        echo "ERROR: NFS mount failed after $attempt attempts:"
+        echo " $FIRST_NODE:/export/share -> /mnt/share"
+        echo "Exports advertised by $FIRST_NODE:"
+        showmount -e "$FIRST_NODE" || true
+        echo "DNS/hosts resolution for $FIRST_NODE:"
+        getent hosts "$FIRST_NODE" || true
+        exit 32
+    fi
 fi
 
 # The package name defaults to "(root)" unless there is a dot in the
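
The retry loop in this change could also be factored into a reusable helper. The following is only a sketch and not part of the commit; the function name retry and its argument order are assumptions made for illustration. Unlike the loop in the diff, which prints "retrying in 5s..." and sleeps even after the third failed attempt, this variant skips the message and the sleep once no attempts remain.

```shell
#!/bin/bash
# retry MAX DELAY CMD [ARGS...]   (hypothetical helper, not in the commit)
# Runs CMD up to MAX times, sleeping DELAY seconds between attempts.
# Returns 0 as soon as CMD succeeds, 1 if every attempt fails.
retry() {
    local max=$1 delay=$2 attempt
    shift 2
    for ((attempt = 1; attempt <= max; attempt++)); do
        if "$@"; then
            return 0
        fi
        # Only announce and wait when another attempt will follow.
        if ((attempt < max)); then
            echo "attempt $attempt of $max failed, retrying in ${delay}s..." >&2
            sleep "$delay"
        fi
    done
    echo "giving up after $max attempts" >&2
    return 1
}

# Possible use in the script above (commented out: needs root and an NFS server):
# retry 3 5 mount "$FIRST_NODE":/export/share /mnt/share || exit 32
```

Keeping the diagnostics (showmount, getent) at the call site, as the commit does, leaves the helper generic enough for other commands.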