Hardware Mapping Limitations of Quantized ViT-B/32  #198

@billangel

Description

The model selected for FPGA deployment is ViT-B/32 (Vision Transformer Base), adapted for classification on the 200 classes of the Tiny-ImageNet dataset. Quantization was performed with Xilinx's Brevitas library, using Quantization-Aware Training (QAT) with 8-bit weights and 8-bit activations (a8w8) and per-tensor symmetric quantization. The quantized model was exported in QONNX (Quantized ONNX) format using Brevitas's export_qonnx function. For deployment on the FPGA, the Xilinx FINN framework was used in combination with finn-plus.
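For reference, the per-tensor symmetric scheme named above can be sketched in plain Python. This is an illustrative toy, not Brevitas's actual implementation; the function names are made up for the example:

```python
def quantize_per_tensor_symmetric(x, num_bits=8):
    """Toy per-tensor symmetric quantizer (illustrative, not Brevitas code).

    One scale covers the whole tensor; the zero-point is fixed at 0,
    so the integer range is symmetric around zero.
    """
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits
    scale = max(abs(v) for v in x) / qmax   # single per-tensor scale
    q = [max(-qmax, min(qmax, round(v / scale))) for v in x]
    return q, scale

def dequantize(q, scale):
    """Map integer codes back to the real axis."""
    return [v * scale for v in q]

# Example: the largest magnitude (1.27) pins the scale at 1.27 / 127.
weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_per_tensor_symmetric(weights)
```

Because the scale is shared across the tensor, values much smaller than the maximum get few effective levels; that trade-off is what distinguishes per-tensor from per-channel quantization.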
The results of the completed build are shown below; the screenshots are taken from the final step of the build process. As can be seen, many nodes could not be absorbed into hardware operators: colored nodes represent hardware operators, while uncolored nodes represent software nodes executing on the CPU. Out of a total of 959 nodes, only 51 were successfully mapped to hardware operators; the remaining 908 remain as software nodes executing on the CPU.
As a first proposed solution, re-exporting the model from Brevitas with ONNX opset version 11 is recommended. This should address the root cause of the hardware mapping limitation, as opset 11 does not support the fused LayerNormalization node introduced in opset 17, so the export would emit the equivalent chain of primitive ops instead.
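For context, a fused LayerNormalization node computes the same math that an opset-11 export expresses as a chain of primitive ONNX ops (ReduceMean, Sub, Pow, ReduceMean, Add, Sqrt, Div, Mul, Add), which the compiler can then handle node by node. A minimal illustrative sketch of that decomposition in plain Python (not FINN or ONNX code):

```python
import math

def layernorm_decomposed(x, gamma, beta, eps=1e-5):
    """LayerNorm over a single vector, written as the primitive steps
    an opset-11 ONNX export emits instead of one fused node.
    Comments name the corresponding ONNX ops (illustrative only).
    """
    n = len(x)
    mean = sum(x) / n                           # ReduceMean
    centered = [v - mean for v in x]            # Sub
    var = sum(v * v for v in centered) / n      # Pow + ReduceMean
    denom = math.sqrt(var + eps)                # Add + Sqrt
    normed = [v / denom for v in centered]      # Div
    return [g * v + b                           # Mul + Add (affine params)
            for g, v, b in zip(gamma, normed, beta)]

# Example: normalize one 4-element vector with identity affine params.
out = layernorm_decomposed([1.0, 2.0, 3.0, 4.0], [1.0] * 4, [0.0] * 4)
```

Each step above is an op that existed well before opset 17, which is why an opset-11 export sidesteps the fused node entirely.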

[Screenshots: dataflow graph from the final build step, showing colored (hardware) and uncolored (CPU) nodes]

build_finn_nohw.py
@fpjentzsch

Metadata


Assignees: none
Labels: bug (Something isn't working)
Type: none
Projects: none
Milestone: none
Relationships: none
Development: no branches or pull requests