
Containerize standard benchmark runs #38

Merged: kislaykishore merged 3 commits into main from npi on Aug 13, 2025

@kislaykishore

No description provided.

@gemini-code-assist (bot) left a comment

Summary of Changes

Hello @kislaykishore, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a comprehensive framework for containerized benchmarking of GCSFuse performance using the Flexible I/O Tester (FIO). The changes provide a set of Python scripts and Dockerfiles that automate the entire benchmarking process, from setting up the necessary environment and GCSFuse, to executing FIO tests against mounted Google Cloud Storage buckets, and collecting detailed results. This new system supports both individual FIO runs and advanced matrix-based benchmarking, allowing for systematic performance evaluation across various read and write workloads.

Highlights

  • Automated GCSFuse Benchmarking: New Python scripts (run_fio_benchmark.py, fio_benchmark_runner.py) have been introduced to automate the entire GCSFuse FIO benchmarking workflow, covering dependency installation, GCSFuse setup, GCS bucket management, mounting, FIO execution, and resource cleanup.
  • Containerized Execution: The pull request includes several Dockerfiles (fio.dockerfile, read.dockerfile, write.dockerfile, gcsfuse-perf-base.docker) to encapsulate the benchmarking environment, ensuring consistent and reproducible test runs across different systems.
  • Matrix Benchmarking Capability: A new script, run_fio_matrix.py, along with associated CSV configuration files (read_matrix.csv, write_matrix.csv) and FIO templates (read.fio, write.fio), enables running multiple FIO test configurations in an automated matrix fashion, allowing for comprehensive performance analysis across various scenarios.
  • Detailed FIO Output and Summary: The fio_benchmark_runner.py script now includes robust logic to parse FIO's JSON output and present a summarized table of key performance metrics such as bandwidth, IOPS, and latency, making it easier to interpret benchmark results.

@kislaykishore merged commit 47015e7 into main on Aug 13, 2025 (2 checks passed)
@kislaykishore deleted the npi branch on August 13, 2025 15:24

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a containerized environment for running FIO benchmarks against GCSFuse, which is a great step towards standardized and reproducible performance testing. The changes include Dockerfiles for building a base image and running various benchmark scenarios (read, write, matrix), along with the necessary Python scripts and FIO configurations.

My review focuses on improving the robustness and correctness of the benchmark runner scripts and ensuring the documentation is accurate. I've found a critical bug in the FIO output parser, some areas where robustness can be improved (like handling mount operations), and inconsistencies in logging and code style. Most importantly, the provided README.md is outdated and does not reflect the new containerized workflow, which could lead to confusion. Addressing these points will make the benchmarking tool more reliable and easier to use.

stats = job[op]
# Bandwidth is in KiB/s, convert to MiB/s
bw_mibps = stats.get("bw", 0) / 1024.0
if bw == 0:

critical

There is a NameError here because the variable bw is not defined. You likely intended to use bw_mibps, which was defined on the preceding line. This bug will cause the script to crash when parsing FIO results.

Suggested change
- if bw == 0:
+ if bw_mibps == 0:
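
A minimal sketch of the corrected check in context, assuming FIO's JSON layout: a top-level `jobs` list whose entries carry `read` and `write` sections with `bw` in KiB/s, `iops`, and `lat_ns.mean`. The function name `summarize_fio_jobs` and the row format are hypothetical and are not taken from `fio_benchmark_runner.py`.

```python
from typing import Any, Dict, List


def summarize_fio_jobs(fio_output: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Builds one summary row per (job, operation) that moved data."""
    rows = []
    for job in fio_output.get("jobs", []):
        for op in ("read", "write"):
            stats = job.get(op)
            if not stats:
                continue
            # FIO reports "bw" in KiB/s; convert to MiB/s.
            bw_mibps = stats.get("bw", 0) / 1024.0
            if bw_mibps == 0:
                # Skip operations this job did not perform.
                continue
            rows.append({
                "job": job.get("jobname", ""),
                "op": op,
                "bw_mibps": bw_mibps,
                "iops": stats.get("iops", 0.0),
                # Mean latency, converted from nanoseconds to milliseconds.
                "lat_ms": stats.get("lat_ns", {}).get("mean", 0.0) / 1e6,
            })
    return rows
```

Testing the converted value (`bw_mibps`) rather than an undefined name keeps the zero-bandwidth filter and the reported number tied to the same variable.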

Comment thread npi/fio/README.md
Comment on lines +1 to +90
# GCSFuse FIO Benchmark Runner

## Overview

This Python script automates the process of benchmarking GCSFuse performance using the Flexible I/O Tester (FIO). It handles the entire workflow from setting up dependencies to running the benchmarks and cleaning up resources.

The script performs the following actions:
1. **Dependency Installation**: Installs `git`, `fio`, and `fuse` if they are not present (supports Debian/Ubuntu and RHEL/CentOS based systems).
2. **GCSFuse Setup**: Clones the GCSFuse GitHub repository, checks out a specific version (branch, tag, or commit), and builds the binary.
3. **GCS Bucket Management**: Creates a temporary GCS bucket for the test and deletes it upon completion.
4. **Mounting**: Mounts the GCS bucket using the built GCSFuse binary and specified flags.
5. **Benchmarking**: Runs FIO tests against the mounted directory based on a provided FIO configuration file for a specified number of iterations.
6. **Cleanup**: Unmounts the GCSFuse directory and deletes the GCS bucket, ensuring a clean state.

## Prerequisites

Before running the script, ensure you have the following installed and configured:

- **Python 3.8+**
- **Go (1.21 or newer)**: The script requires Go to build GCSFuse from source.
- **Google Cloud SDK (`gcloud`)**:
  - Authenticated: `gcloud auth login`
  - Project configured: `gcloud config set project <YOUR_PROJECT_ID>`
- **Sudo privileges**: The script requires `sudo` to install packages and clear system caches.

## Usage

The script is invoked from the command line with several arguments to control the benchmark run.

```bash
python3 run_fio_benchmark.py [OPTIONS]
```

### Arguments

- `--gcsfuse-version`: (Required) The GCSFuse version to test (e.g., `v1.2.0`, `master`, or a commit hash).
- `--project-id`: (Required) Your Google Cloud Project ID.
- `--location`: (Required) The GCP location (region or zone) for the GCS bucket (e.g., `us-central1`).
- `--fio-config`: (Required) Path to the FIO configuration file.
- `--gcsfuse-flags`: (Optional) Flags for GCSFuse, enclosed in quotes (e.g., `"--implicit-dirs --max-conns-per-host 100"`). Default is empty.
- `--iterations`: (Optional) Number of FIO test iterations. Default is `1`.
- `--work-dir`: (Optional) A temporary directory for builds and mounts. Default is `/tmp/gcsfuse_benchmark`.
- `--output-dir`: (Optional) Directory to save FIO JSON output files. Default is `./fio_results`.
- `--skip-deps-install`: (Optional) Skip the automatic dependency installation check.

### Example

1. **Create an FIO config file (`sample.fio`):**

```ini
[global]
ioengine=libaio
direct=1
runtime=30
time_based
group_reporting
filename=testfile

[random-read-4k]
bs=4k
rw=randread
size=1G

[random-write-1m]
bs=1m
rw=randwrite
size=1G
```

2. **Run the benchmark script:**

```bash
python3 run_fio_benchmark.py \
--gcsfuse-version master \
--project-id your-gcp-project-id \
--location us-central1 \
--fio-config ./sample.fio \
--gcsfuse-flags "--implicit-dirs" \
--iterations 3
```

## Output

The script will create an output directory (e.g., `./fio_results/`) containing the FIO results in JSON format, with one file per iteration.

- `fio_results_iter_1.json`
- `fio_results_iter_2.json`
- `fio_results_iter_3.json`

These files can be parsed for detailed performance analysis.

high

The documentation in this README is significantly outdated and does not align with the new containerized approach introduced in this pull request. For example:

  • It describes a standalone script that handles dependency installation and GCSFuse compilation, which is now handled by the Dockerfiles.
  • The command-line arguments listed (e.g., --gcsfuse-version, --project-id, --skip-deps-install) do not match the arguments in the new scripts (run_fio_benchmark.py, run_fio_matrix.py).
  • The setup and execution flow described is for a non-containerized environment.

This discrepancy will cause confusion for users. Please update this README to accurately describe the new container-based workflow, including how to build the Docker images and run the benchmarks using docker run, and document the correct command-line arguments for the new scripts.

logging.info(f"STDOUT: {result.stdout.strip()}")
if result.stderr:
    # Use warning for stderr as some tools write info there
    logging.info(f"STDERR: {result.stderr.strip()}")

medium

The comment on the preceding line suggests using logging.warning for stderr, as some tools may write non-error information to it. However, the implementation uses logging.info. To align with the intention described in the comment and to better classify these messages, please change this to logging.warning.

Suggested change
- logging.info(f"STDERR: {result.stderr.strip()}")
+ logging.warning(f"STDERR: {result.stderr.strip()}")
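
For context, a command helper with this logging split might look like the sketch below. It assumes the helper takes an argument list and raises on a non-zero exit, which matches how `run_command` is called elsewhere in this review; the body itself is illustrative and not the script's actual implementation.

```python
import logging
import subprocess
from typing import List


def run_command(cmd: List[str]) -> subprocess.CompletedProcess:
    """Runs cmd, logging stdout at INFO and stderr at WARNING."""
    # check=True raises subprocess.CalledProcessError on a non-zero exit code.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    if result.stdout:
        logging.info("STDOUT: %s", result.stdout.strip())
    if result.stderr:
        # Use warning for stderr as some tools write info there.
        logging.warning("STDERR: %s", result.stderr.strip())
    return result
```

Passing the text as a logging argument instead of an f-string defers formatting until a handler actually emits the record.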

Comment on lines +56 to +59
time.sleep(2) # Give a moment for the mount to register
if not os.path.ismount(mount_point):
    logging.error("Mounting failed. Check GCSFuse logs (e.g., in /var/log/syslog).")
    sys.exit(1)

medium

Using a fixed time.sleep(2) to wait for the mount to register is not fully reliable. On a heavily loaded system, mounting could take longer, leading to a race condition where the script incorrectly reports a failure. A more robust approach is to poll for the mount status with a timeout.

Suggested change
- time.sleep(2) # Give a moment for the mount to register
- if not os.path.ismount(mount_point):
-     logging.error("Mounting failed. Check GCSFuse logs (e.g., in /var/log/syslog).")
-     sys.exit(1)
+ # Give up to 10s for the mount to register, polling every second.
+ for _ in range(10):
+     if os.path.ismount(mount_point):
+         break
+     time.sleep(1)
+ else:
+     logging.error("Mounting failed. Check GCSFuse logs (e.g., in /var/log/syslog).")
+     sys.exit(1)
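
The same polling idea can be factored into a reusable helper with an explicit deadline. This is a sketch only: `wait_for_mount` and its parameters are hypothetical names, and the `is_mounted` hook exists purely so the loop can be exercised without a real mount.

```python
import os
import time
from typing import Callable


def wait_for_mount(
    mount_point: str,
    timeout_s: float = 10.0,
    interval_s: float = 1.0,
    is_mounted: Callable[[str], bool] = os.path.ismount,
) -> bool:
    """Polls until mount_point is mounted or timeout_s elapses."""
    # A monotonic clock is unaffected by wall-clock adjustments.
    deadline = time.monotonic() + timeout_s
    while True:
        if is_mounted(mount_point):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A caller would log the error and exit when the function returns `False`, which keeps the wait policy separate from the failure handling.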

try:
    run_command(["fusermount", "-u", mount_point])
except (FileNotFoundError, subprocess.CalledProcessError):
    logging.warning("`fusermount -u` failed. Retrying with `sudo umount`.")

medium

The log message "Retrying with sudo umount" is misleading, as the subsequent command does not use sudo. Since these scripts are intended to run as root within a Docker container, sudo is not necessary. Please update the message to accurately reflect the command being executed to avoid confusion.

Suggested change
- logging.warning("`fusermount -u` failed. Retrying with `sudo umount`.")
+ logging.warning("`fusermount -u` failed. Retrying with `umount`.")
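
One way to keep the log text from drifting away from the command is to build the message from the command itself. The sketch below assumes the fallback order shown in the snippet (`fusermount -u`, then plain `umount`); the `unmount` function and its injectable `run` parameter are hypothetical.

```python
import logging
import subprocess
from typing import Callable, List


def _run_checked(cmd: List[str]) -> None:
    """Default runner: raises CalledProcessError on a non-zero exit."""
    subprocess.run(cmd, check=True)


def unmount(mount_point: str, run: Callable[[List[str]], None] = _run_checked) -> str:
    """Tries `fusermount -u`, then `umount`; returns the tool that worked."""
    commands = (["fusermount", "-u", mount_point], ["umount", mount_point])
    last_error = None
    for cmd in commands:
        try:
            run(cmd)
            return cmd[0]
        except (FileNotFoundError, subprocess.CalledProcessError) as e:
            # The message is derived from cmd, so it always names what ran.
            logging.warning("`%s` failed: %s", " ".join(cmd), e)
            last_error = e
    raise RuntimeError(f"Could not unmount {mount_point}") from last_error
```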

Comment thread npi/fio/run_fio_matrix.py
Comment on lines +32 to +121
def main():
  """Parses args and runs the benchmark matrix."""
  parser = argparse.ArgumentParser(
      description="Run a matrix of GCSFuse FIO benchmarks from a config file."
  )
  parser.add_argument(
      "--gcsfuse-flags",
      default="",
      help="Flags for GCSFuse, as a single quoted string.",
  )
  parser.add_argument(
      "--bucket-name", required=True, help="Name of the GCS bucket."
  )
  parser.add_argument(
      "--iterations",
      type=int,
      default=1,
      help="Number of FIO test iterations per configuration.",
  )
  parser.add_argument(
      "--fio-template",
      required=True,
      help="Path to the FIO config template file.",
  )
  parser.add_argument(
      "--matrix-config",
      required=True,
      help="Path to the CSV file with benchmark parameters.",
  )
  parser.add_argument(
      "--work-dir",
      default="/tmp/gcsfuse_benchmark",
      help="Working directory for clones and builds.",
  )
  parser.add_argument(
      "--output-dir",
      default="./fio_results_matrix",
      help="Directory to save FIO JSON results.",
  )
  args = parser.parse_args()

  try:
    with open(args.matrix_config, "r", newline="") as f:
      reader = csv.DictReader(f)
      configs = list(reader)
  except FileNotFoundError:
    logging.error("Matrix config file not found: %s", args.matrix_config)
    sys.exit(1)
  except Exception as e:
    logging.error("Error reading matrix config file: %s", e)
    sys.exit(1)

  logging.info(
      "Found %d configurations to run from %s", len(configs), args.matrix_config
  )

  for i, config in enumerate(configs):
    # Create a string representation of the configuration for logging.
    config_str = ", ".join([f"{k}={v}" for k, v in sorted(config.items())])

    logging.info("--- Starting Matrix Run %d/%d ---", i + 1, len(configs))
    logging.info("Configuration: %s", config_str)

    # All columns from the CSV are passed as environment variables to FIO.
    fio_env = config

    # Create a unique subdirectory for this configuration's results.
    # The name is generated from the config parameters to be unique and
    # descriptive.
    config_name_parts = [
        f"{k}_{v}" for k, v in sorted(config.items())
    ]
    config_name = "_".join(config_name_parts)
    config_output_dir = os.path.join(args.output_dir, config_name)

    try:
      fio_benchmark_runner.run_benchmark(
          gcsfuse_flags=args.gcsfuse_flags, bucket_name=args.bucket_name,
          iterations=args.iterations, fio_config=args.fio_template,
          work_dir=args.work_dir, output_dir=config_output_dir, fio_env=fio_env)
    except Exception as e:
      logging.error("Benchmark run failed for configuration %s: %s", config, e)
      # Continue to the next configuration
      continue

  logging.info("--- All benchmark matrix runs complete. ---")


if __name__ == "__main__":
  main()

medium

This file uses 2-space indentation, which is inconsistent with the other Python files in this pull request (e.g., fio_benchmark_runner.py, run_fio_benchmark.py) that use 4-space indentation. To improve code readability and maintain consistency across the project, please update the indentation in this file to 4 spaces, in line with the PEP 8 style guide.
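
As a side note on what the matrix loop in this file does with each CSV row, the sketch below (written with the 4-space indentation requested above) shows how one row becomes both the FIO environment and the name of its results directory. The function `plan_matrix_runs` and the column names `BLOCK_SIZE` and `NUM_JOBS` are hypothetical; the actual columns in `read_matrix.csv` and `write_matrix.csv` are not shown in this conversation.

```python
import csv
import io
import os
from typing import Dict, List, Tuple


def plan_matrix_runs(
    csv_text: str, output_dir: str
) -> List[Tuple[Dict[str, str], str]]:
    """Returns (fio_env, result_dir) for each row of the matrix CSV."""
    plans = []
    for row in csv.DictReader(io.StringIO(csv_text, newline="")):
        # Sort keys so the directory name does not depend on column order.
        name = "_".join(f"{k}_{v}" for k, v in sorted(row.items()))
        plans.append((dict(row), os.path.join(output_dir, name)))
    return plans


if __name__ == "__main__":
    matrix = "BLOCK_SIZE,NUM_JOBS\n4k,1\n1m,8\n"
    for env, result_dir in plan_matrix_runs(matrix, "fio_results_matrix"):
        print(env, result_dir)
```

FIO expands `${VAR}` references in a job file from the process environment, so a template could reference a column as, for example, `bs=${BLOCK_SIZE}`, assuming the templates in this pull request follow that convention.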
