- Introduction
- Hardware and Software Configuration
- Structure
- Dependency and Installation
- Workflow
- Source of Graph Datasets
- Previous Work
FaaSBoard is a scalable, efficient, and cost-effective graph processing framework powered by serverless computing and deployed on AWS serverless infrastructure.
FaaSBoard packages compiled graph processing code and partitioned graph data into images to construct AWS Lambda functions, where multiple Lambda functions collaborate to complete a graph processing task. The image-based graph loading approach significantly accelerates data loading during warm starts. Since serverless functions cannot communicate directly, we introduce a proxy server to handle communication logic, implementing computation-communication interleaving and message consolidation. Meanwhile, we propose a triple-mode vector format to balance different communication vector formats and optimize overall performance. Furthermore, the 2D balanced partitioning strategy achieves better load balancing while maintaining
FaaSBoard runs on AWS serverless infrastructure and also supports local single-host execution for testing. We especially recommend testing on AWS EC2.
- Local execution: We recommend a host machine with at least 40 CPU cores, 120 GB memory, and 150 GB SSD storage. The reference operating system is Ubuntu 22.04.
- AWS deployment: The proxy server runs on Ubuntu 22.04, and Lambda functions are built from the
public.ecr.aws/lambda/provided:al2023base image. Each Lambda function is configured with 2 vCPUs and 3538 MB memory.
build: Compiled test programs and applications.data: Graph datasets and graph processing results.include: Core configurations and implementations of the graph processing system.app: Applications that run on the system.communication: Communication setup for workers.compute: Graph data structures and compute kernels.preprocess: Functions for preprocessing, storage, partitioning, and simulation.util: Common classes, macros, and utility functions.mpmc: Multi-producer, multi-consumer concurrent queues.spsc: Single-producer, single-consumer concurrent queues.
script: Scripts for environment setup, dataset preparation, utilities, CPU tests, code upload.analyze: Scripts for log analysis
test: System test programs.others: Basic tests for correctness, CPU behavior, weights, and related functionality.preprocess: Programs for preprocessing graph datasets and storing them in memory.
To configure the runtime environment and install dependencies, clone the repository, enter the project directory, and run:
cd script
./prepare_machine.shThis installs Docker, AWS SDK, glog, and other required dependencies.
To compile the test programs and applications:
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=~/lambda-install
makeTo build AWS Lambda code packages:
make aws-lambda-package-lambda_bfs
make aws-lambda-package-lambda_pr
make aws-lambda-package-lambda_sssp
make aws-lambda-package-lambda_ccDownload the dataset and extract the raw text file:
gzip -d <DATA_TXT_GZ_PATH>Move the dataset into the repository data directory:
mv <DATA_TXT> FaaSBoard/data/You can also enter FaaSBoard/script, run this command to download download datasets, unzip that and move the datasets to FaaSBoard/data:
./prepare_dataset.shAfter datasets are ready, enter the build directory and preprocess the raw graph before running an application. Different applications require different preprocessing parameters; see FaaSBoard/script/preprocess_graph.sh for details. Run:
# Enter the build directory
cd build
# Start preprocessing
./preprocess_and_save <GRAPH_FILE> <GRAPH_ROOT_DIR> <VERTICES> <EDGES> <PARTITIONS>
-- GRAPH_FILE: original graph dataset file
-- GRAPH_ROOT_DIR: root directory of the graph dataset in CSR binary format
-- VERTICES: num of vertices
-- EDGES: number of edges
-- PARTITIONS: number of partitions
# example
sudo ./preprocess_and_save -graph_file ../data/soc-LiveJournal1.txt -graph_root_dir ../data/livejournal/unweighted -vertices 4847571 -edges 68993773 -partitions 4 --v 1See Run Locally for the complete local execution workflow.
The following example runs a basic single-threaded algorithm to validate the correctness of the concurrent implementation in a single-host environment.
- Enter the build directory:
cd build- Run the correctness test:
./correctness_test <GRAPH_FILE> <GRAPH_ROOT_DIR> <VERTICES> <EDGES> <APPLICATION> <BFS_ROOT>
-- GRAPH_FILE: original graph dataset file
-- GRAPH_ROOT_DIR: root directory of the graph dataset in CSR binary format
-- VERTICES: number of vertices
-- EDGES: number of edges
-- APPLICATION: graph application (for example, bfs)
# example
./correctness_test -graph_file ../original/soc-LiveJournal1.txt -graph_root_dir ../data/livejournal -vertices 4847571 -edges 68993773 -application bfs -bfs_root 0 --v 1The program issues a graph processing query and prints the result after execution completes.
See Run on AWS for the complete AWS deployment and execution workflow.
FaaSBoard/script/run.sh provides a compact entry point for both local execution and AWS Lambda invocation. It selects the proper dataset layout for each application automatically:
bfs,pr->data/<graph>/unweightedsssp->data/<graph>/weightedcc->data/<graph>/undirected
Basic usage:
cd script/
./run.sh -mode <local|aws> -app <bfs|pr|sssp|cc> -graph <livejournal|twitter|friendster|rmat27> [options]However, you need to complete the preparations for executing the graph processing task locally/on AWS, as instructed previously, before you can call FaaSBoard/script/run.sh.
Special thanks to SNAP for providing the datasets used in our experiments.
livejournal: https://snap.stanford.edu/data/soc-LiveJournal1.htmltwitter: https://snap.stanford.edu/data/twitter-2010.htmlfriendster: https://snap.stanford.edu/data/com-Friendster.htmlrmat27:cd script/ && python gen_rmat27.py
- concurrentqueue: vendored under
include/util/mpmc/, used as a blocking multi-producer/multi-consumer queue to pipeline computation and communication. - readerwriterqueue: vendored under
include/util/spsc/, used as a blocking single-producer/single-consumer ring buffer in the proxy server. The epoll loop enqueues ready client file descriptors into per-core queues, and worker threads dequeue them to handle network I/O and segment reduction. - METIS: used to implement our metis-based partition strategy for benchmarking during preprocessing (METIS partitioning and vertex reordering by parts).
- Gorder: used to generate graphs with optimized vertex orderings to study FaaSBoard's robustness under different graph orderings.
We thank the authors and maintainers of these open-source projects for making this work possible.
@inproceedings{liu2024faasgraph,
title={FaaSGraph: Enabling Scalable, Efficient, and Cost-Effective Graph Processing with Serverless Computing},
author={Liu, Yushi and Sun, Shixuan and Li, Zijun and Chen, Quan and Gao, Sen and He, Bingsheng and Li, Chao and Guo, Minyi},
booktitle={Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2},
pages={385--400},
year={2024}
}