1. Follow the [tutorial](https://pytorch.org/executorch/main/getting-started-setup) to set up ExecuTorch.
128
126
2. Follow the [tutorial](https://pytorch.org/executorch/stable/build-run-qualcomm-ai-engine-direct-backend.html) to build Qualcomm AI Engine Direct Backend.
129
127
130
-
### 2. Enable Flag
128
+
## Instructions
131
129
132
-
When executing the script, please add the flag `--dump_intermediate_outputs`. This tells QNN to dump all intermediate tensors during execution.
130
+
### 1. Initialize debugger and build binary
131
+
132
+
Create a `QNNIntermediateDebugger` with a sample input and pass it to `build_executorch_binary`. The `--dump_intermediate_outputs` flag tells QNN to dump all intermediate tensors during execution.
133
133
134
-
### 3. Add debugger to the example script
135
-
Initialize a `QNNIntermediateDebugger`. Please pass initialized `QNNIntermediateDebugger` and the `args.dump_intermediate_outputs` to `build_executorch_binary` method as well.
136
-
#### Example:
137
134
```python
138
135
from executorch.backends.qualcomm.export_utils import build_executorch_binary
139
-
from executorch.backends.qualcomm.debugger.qnn_intermediate_debugger import QNNIntermediateDebugger
136
+
from executorch.backends.qualcomm.debugger.qnn_intermediate_debugger import (
Users may pass as many datasets as they like to `build_executorch_binary`, which helps achieve better quantization results. After `build_executorch_binary` is called, however, only one inference may be performed during execution. Make sure that CPU and QNN use the same input during execution; otherwise, the debugging results might not be accurate.
151
+
After `build_executorch_binary()`, the debugger holds:
152
+
- `edge_ep`: edge `ExportedProgram` for CPU golden inference.
153
+
- `etrecord_file_path`: path to the generated ET record.
154
+
155
+
### 2. Execute on device
156
+
157
+
Ensure `dump_intermediate_outputs` is enabled in your `QnnConfig` (or pass `--dump_intermediate_outputs` via CLI). Only run **one inference** for debugging — multiple executions are not supported.
153
158
154
-
### 5: Pull and process the results.
155
-
After QNN execution with the runner, if the previous steps are done correctly, we should be able to get two files: `etdump.etdp` and `debug_output.bin`.
156
-
The following example pulls the files back and calls a callback function to process the results. In this callback, we create the `Inspector` and then run CPU inference to get the CPU intermediate results. With both QNN and CPU intermediate results available, we can generate results that compare accuracy. The following example should produce `debug_graph.svg` in the current directory.
157
-
#### Example:
158
159
```python
159
-
from executorch.backends.qualcomm.debugger.qnn_intermediate_debugger import OutputFormat
160
+
from executorch.examples.qualcomm.utils import SimpleADB
After execution, pull `etdump.etdp` and `debug_output.bin` from the device. Use `setup_inspector()` to create the `Inspector`, then create comparators and generate results.
174
+
175
+
Before comparing per-layer outputs, it is highly recommended to verify that the edge program's final output aligns with the original `nn.Module`. The debugger uses the edge program as the CPU golden reference, so if the edge graph itself has diverged (e.g., due to weights quantization or pass transformations), per-layer comparisons against it may be misleading.
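A minimal sketch of such a sanity check, assuming both final outputs have already been flattened to plain Python lists of floats. The helper name `cosine_similarity`, the sample values, and the 0.99 threshold are illustrative assumptions and are not part of the debugger API.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same number of elements")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical flattened final outputs of the original nn.Module and the edge program.
module_output = [0.10, 0.70, 0.20]
edge_output = [0.11, 0.69, 0.20]

score = cosine_similarity(module_output, edge_output)
edge_matches_module = score >= 0.99  # illustrative threshold, tune per model
```

If `edge_matches_module` is false, the divergence likely sits upstream of QNN, so it is worth investigating the edge graph before reading per-layer results.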
176
+
177
+
```python
178
+
from executorch.backends.qualcomm.debugger.qcom_numerical_comparator_sample import (
The example above sets the output format to SVG and uses cosine similarity as the evaluation metric. Depending on their needs, users can choose other output formats, as listed in the `OutputFormat` class under [qnn_intermediate_debugger](./qnn_intermediate_debugger.py).
213
+
## Comparators
214
+
215
+
Create comparators via the `create_comparator()` factory, which automatically injects the `edge_ep`. A couple of sample comparators are provided under [qcom_numerical_comparator_sample.py](./qcom_numerical_comparator_sample.py):
216
+
181
217
```python
182
-
class OutputFormat(IntEnum):
183
-
    SVG_GRAPHS = 0
184
-
    CSV_FILES = 1
185
-
    DUMP_RAW = 2
218
+
cos = qnn_intermediate_debugger.create_comparator(QcomCosineSimilarityComparator, threshold=0.9)
For evaluation metrics, if users would like to implement their own metrics, we have provided the option to implement [MetricEvaluatorBase](./metrics_evaluator.py). The following shows how to define custom metrics.
222
+
### Custom comparators
223
+
224
+
Users can also define their own comparator by implementing a derived class from [QcomNumericalComparatorBase](./qcom_numerical_comparator_base.py). Inside the derived class, users will need to implement `metric_name()`, `is_valid_score()`, and `element_compare()`. The base class handles QNN-specific preprocessing (dequantization, layout conversion) internally — `preprocessing` cannot be overridden.
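The shape of such a derived class can be sketched as follows. This is a self-contained illustration only: `ComparatorBaseStandIn` is a hypothetical stand-in for `QcomNumericalComparatorBase`, and the method signatures, the `threshold` argument, and the max-absolute-error metric are assumptions; check the real base class for the actual signatures and tensor types.

```python
from abc import ABC, abstractmethod
from typing import Sequence


class ComparatorBaseStandIn(ABC):
    """Hypothetical stand-in for QcomNumericalComparatorBase; real signatures may differ."""

    @abstractmethod
    def metric_name(self) -> str:
        """Short name used to label the metric in reports."""

    @abstractmethod
    def is_valid_score(self, score: float) -> bool:
        """Return True when the score counts as a pass."""

    @abstractmethod
    def element_compare(self, a: Sequence[float], b: Sequence[float]) -> float:
        """Compare one pair of intermediate outputs and return a score."""


class MaxAbsErrorComparator(ComparatorBaseStandIn):
    """Scores a pair of outputs by their largest element-wise absolute difference."""

    def __init__(self, threshold: float = 1e-2) -> None:
        self.threshold = threshold  # assumed pass/fail cutoff

    def metric_name(self) -> str:
        return "max_abs_error"

    def is_valid_score(self, score: float) -> bool:
        # Lower is better for an error metric.
        return score <= self.threshold

    def element_compare(self, a: Sequence[float], b: Sequence[float]) -> float:
        if len(a) != len(b):
            raise ValueError("outputs must have the same number of elements")
        return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


comparator = MaxAbsErrorComparator(threshold=0.05)
score = comparator.element_compare([1.0, 2.0], [1.0, 2.25])
passed = comparator.is_valid_score(score)
```

Note that for an error metric a lower score is better, whereas a similarity metric such as cosine similarity treats a higher score as better; `is_valid_score()` is where that direction is encoded.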
We have provided an inception_v3 demo script to help users better understand how to apply the debugger to their scripts. Please refer to [qnn_intermediate_debugger_demo.py](../../../examples/qualcomm/util_scripts/qnn_intermediate_debugger_demo.py) for the example script.
An Inception_V3 demo script is provided at [qnn_intermediate_debugger_demo.py](../../../examples/qualcomm/util_scripts/qnn_intermediate_debugger_demo.py).
216
256
217
-
Before running the example script, please ensure that dataset is downloaded. Example dataset can be retrieved [here](https://www.kaggle.com/datasets/ifigotin/imagenetmini-1000).
257
+
Before running, ensure the dataset is downloaded. An example dataset can be retrieved [here](https://www.kaggle.com/datasets/ifigotin/imagenetmini-1000).
1. The current debugger only supports performing one execution. Multiple executions may cause unknown behavior and are not recommended.
226
-
2. Please ignore this if you are using `qnn_executor_runner`. If you have decided to write your own runner, please follow the [tutorial](https://pytorch.org/executorch/stable/etdump.html) on how to implement etdump into your own runner.
227
-
3. The current debugger does not support graph with partitions. (WIP)
228
-
4. The current debugger does not support LLM models. (WIP)
263
+
## Limitations
264
+
1. Only one execution per debug session — multiple executions may cause unknown behavior.
265
+
2. If you have decided to write your own runner (instead of `qnn_executor_runner`), follow the [tutorial](https://pytorch.org/executorch/stable/etdump.html) on how to implement etdump.
266
+
3. Does not support graphs with partitions (partial delegation).