If you are training your own AI models, you need a way to quantify how accurate they are. The Validation Module compares human-annotated Ground Truth masks against AI-generated Prediction masks and reports quantitative performance metrics (IoU, Precision, and Recall) for each image pair.
Open the validation window by clicking Validation > Open Validation... in the top menu bar.
Before running validation, you need two folders of images:
- Ground Truth Masks: These are binary (black and white) images representing true defects. You can generate these from your AnnoMate workspace by going to Data > Export Binary Masks.
- Prediction Masks: These are binary masks generated by your AI pipeline (e.g., Anomalib's testing output).
Note: The system uses smart file-name matching (e.g., 118_images_003.png will match eval_118_003.png). The file names in the two folders do not need to be identical, provided the core identifiers match.
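The exact matching rule is not documented here, so the following is only a minimal sketch of one way such matching could work: it assumes the "core identifiers" are the numeric parts of the file name, compared in order. The function names core_key and match_pairs are hypothetical and are not part of AnnoMate.

```python
import re
from pathlib import Path


def core_key(filename):
    """Return the numeric tokens of a file name, in order, as a matching key.

    Assumption: the numbers in the name are the "core identifiers".
    """
    stem = Path(filename).stem  # drop the extension, e.g. ".png"
    return tuple(re.findall(r"\d+", stem))


def match_pairs(gt_names, pred_names):
    """Pair each Ground Truth file with the prediction sharing the same key."""
    pred_by_key = {core_key(name): name for name in pred_names}
    pairs = []
    for gt_name in gt_names:
        key = core_key(gt_name)
        # Skip names without any digits; they have no identifier to match on.
        if key and key in pred_by_key:
            pairs.append((gt_name, pred_by_key[key]))
    return pairs
```

Under this assumption, 118_images_003.png and eval_118_003.png both reduce to the key ("118", "003") and are paired. If your files do not match as expected, renaming them so that both folders share the same numeric identifiers is the safest option.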
- In the Validation window, click Select GT Masks and choose your Ground Truth folder.
- Click Select Predictions and choose your AI output folder.
- Click Run Comparison.
As the worker processes the images, it will generate a live feed of results. For every matched image pair, you will receive a visual card containing:
- IoU (Intersection over Union): The primary accuracy score. It is the area where the GT and Prediction overlap divided by the area of their union, that is, the total area covered by either mask (100% is a perfect match).
- Precision: The percentage of the AI's prediction that was actually correct (helps identify False Positives).
- Recall: The percentage of the actual defect that the AI successfully found (helps identify False Negatives).
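The three metrics above can be sketched with pixel counts. This is an illustration of the standard definitions, not the module's actual code: the function name mask_metrics is hypothetical, masks are represented as nested lists of 0/1 values, and returning 0.0 when a denominator is zero is an assumption (the tool's convention for empty masks is not documented here).

```python
def mask_metrics(gt, pred):
    """Return (iou, precision, recall) for two same-sized binary masks.

    Any non-zero pixel value counts as "defect".
    """
    tp = fp = fn = 0
    for gt_row, pred_row in zip(gt, pred):
        for g, p in zip(gt_row, pred_row):
            g, p = bool(g), bool(p)
            if g and p:
                tp += 1  # both masks mark the pixel: True Positive
            elif p:
                fp += 1  # only the prediction marks it: False Positive
            elif g:
                fn += 1  # only the Ground Truth marks it: False Negative

    def safe_div(num, den):
        # Assumption: report 0.0 when the denominator is empty.
        return num / den if den else 0.0

    iou = safe_div(tp, tp + fp + fn)   # intersection / union
    precision = safe_div(tp, tp + fp)  # correct share of the prediction
    recall = safe_div(tp, tp + fn)     # found share of the real defect
    return iou, precision, recall


gt_mask = [[1, 1, 0],
           [1, 1, 0]]
pred_mask = [[0, 1, 1],
             [0, 1, 1]]
iou, precision, recall = mask_metrics(gt_mask, pred_mask)
```

In this example the masks share 2 pixels, the prediction adds 2 extra pixels, and 2 Ground Truth pixels are missed, so IoU is 2/6 (about 33.3%) while Precision and Recall are both 50%.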
The system generates a color-coded image to help you quickly diagnose AI behavior:
- White/Green Fill: True Positives (The AI and Human agree).
- Red Fill: False Positives (The AI hallucinated a defect) or False Negatives (The AI missed a defect).
- Blue Outline: The exact contour of the human Ground Truth.
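The color scheme above can be sketched as a per-pixel classification. This is a simplified illustration under stated assumptions: the function name build_overlay, the exact RGB values, the black background, and the 4-neighbour contour rule are all hypothetical choices, and the real overlay may be rendered differently.

```python
GREEN = (0, 255, 0)   # True Positive fill
RED = (255, 0, 0)     # False Positive or False Negative fill
BLUE = (0, 0, 255)    # Ground Truth contour
BLACK = (0, 0, 0)     # background (assumed)

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def build_overlay(gt, pred):
    """Return a grid of RGB tuples for two same-sized binary masks."""
    height, width = len(gt), len(gt[0])

    def in_gt(y, x):
        # Pixels outside the image count as "not Ground Truth".
        return 0 <= y < height and 0 <= x < width and bool(gt[y][x])

    overlay = []
    for y in range(height):
        row = []
        for x in range(width):
            g, p = bool(gt[y][x]), bool(pred[y][x])
            if g and p:
                color = GREEN
            elif g or p:
                color = RED
            else:
                color = BLACK
            # Contour: a GT pixel with at least one neighbour outside the GT.
            if g and not all(in_gt(y + dy, x + dx) for dy, dx in NEIGHBOURS):
                color = BLUE
            row.append(color)
        overlay.append(row)
    return overlay
```

Because False Positives and False Negatives share the red fill, the blue outline is what tells them apart: red inside the outline is a missed defect, and red outside it is a spurious prediction.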
All metrics and overlay images are automatically saved to an evaluation_results folder in your current working directory. The folder contains:
- The generated overlay .png files.
- A detailed evaluation_log.txt containing the mathematical breakdown (areas, centroids, Euclidean distances) for every image evaluated.
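To help interpret the logged quantities, here is a minimal sketch of how an area, a centroid, and a Euclidean distance can be derived from binary masks. It assumes that the logged distance is measured between the Ground Truth centroid and the Prediction centroid; the log's actual format and definitions are not specified here, and the function names are hypothetical.

```python
import math


def area_and_centroid(mask):
    """Return (area in pixels, (row, col) centroid), or (0, None) if empty."""
    area = row_sum = col_sum = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                area += 1
                row_sum += y
                col_sum += x
    if area == 0:
        return 0, None
    return area, (row_sum / area, col_sum / area)


def centroid_distance(gt, pred):
    """Euclidean distance between the two mask centroids, in pixels.

    Returns None when either mask is empty (assumed convention).
    """
    _, gt_centroid = area_and_centroid(gt)
    _, pred_centroid = area_and_centroid(pred)
    if gt_centroid is None or pred_centroid is None:
        return None
    return math.dist(gt_centroid, pred_centroid)
```

A small centroid distance combined with a low IoU usually suggests the prediction is in the right place but has the wrong size or shape, while a large distance suggests it is in the wrong location.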