Advanced Evaluation for Model Performance

Advanced Evaluation

Next to the Metrics Tab, you can now select a new feature that shows sample inferences made by the model at each evaluation checkpoint, presented as a direct comparison between the Ground Truth and the Checkpoint Prediction. On the Controls Panel on the right, you can hide or show the labels, display the tag name together with a confidence metric indicating how confident the model is in each prediction, and use a sliding bar to compare inference on the same image at each evaluation checkpoint, so you can analyse how the model is progressing in learning to perform your chosen tasks.

The image below displays an example of the advanced evaluation page.


Advanced Evaluation Example (Click image to enlarge)

Confusion Matrix

Nexus’ Confusion Matrix can be used as another form of evaluation during training, alongside other features like our real-time training dashboard and Advanced Evaluation. During training, the Confusion Matrix tab can be found at the top of the training dashboard, in the Run page, as shown below.

Our confusion matrix is computed exactly as described here, with ground truth classes represented as columns and prediction classes represented as rows. To aid interpretability, it uses color gradients to highlight differences in distributions, where lighter colors represent lower proportions and darker colors represent higher proportions.
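For reference, the sketch below builds a confusion matrix with this orientation from paired labels. It is an illustrative NumPy implementation rather than Nexus’ own code, and the class labels are made up for the example.

```python
import numpy as np

def confusion_matrix(ground_truth, predictions, num_classes):
    """Build a confusion matrix with prediction classes as rows
    and ground truth classes as columns, matching the orientation
    shown on the Confusion Matrix tab."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for gt, pred in zip(ground_truth, predictions):
        matrix[pred, gt] += 1  # row = predicted class, column = ground truth class
    return matrix

# Illustrative labels for a three-class problem (classes 0, 1, 2).
gt_labels   = [0, 0, 1, 1, 2, 2, 2]
pred_labels = [0, 1, 1, 1, 2, 0, 2]
print(confusion_matrix(gt_labels, pred_labels, num_classes=3))
```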

Background Class

To improve the user experience, we’ve also added a few options that improve interpretability. In the default view, entries for the background class are omitted, but they can be toggled with the ‘Yes’ or ‘No’ buttons under Background Class. The background class represents any area in which there is no object of interest, and how that area is interpreted depends on the computer vision task involved. For classification, the background class can be assigned to a whole image. For semantic segmentation, it is assigned on a per-pixel basis. For instance segmentation, object detection, and keypoint detection, it is assigned to spatial regions that contain no class instances: if an annotation is assigned the background class as the ground truth, there was no annotation at all, and if a predicted annotation has the background class, no prediction was made.

The background class entries can be toggled with ‘Yes’ or ‘No’ buttons under Background Class.

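To make this bookkeeping concrete for a detection-style task, here is a hypothetical sketch (the matching of predictions to ground truth boxes, for example by IoU, is assumed to have been done already, and the function name is ours, not Nexus’): unmatched ground truth boxes are counted in the background row, and unmatched predictions are counted in the background column.

```python
import numpy as np

def detection_confusion_matrix(matched_pairs, unmatched_gt, unmatched_pred, num_classes):
    """Hypothetical sketch: append a trailing 'background' row/column.

    matched_pairs  -- (gt_class, pred_class) tuples for boxes matched to a ground truth
    unmatched_gt   -- ground truth classes with no matching prediction
    unmatched_pred -- predicted classes with no matching ground truth
    """
    bg = num_classes  # index of the background class
    matrix = np.zeros((num_classes + 1, num_classes + 1), dtype=int)
    for gt, pred in matched_pairs:
        matrix[pred, gt] += 1   # regular confusion between real classes
    for gt in unmatched_gt:
        matrix[bg, gt] += 1     # object present, but no prediction was made
    for pred in unmatched_pred:
        matrix[pred, bg] += 1   # prediction made where there was no annotation
    return matrix
```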

Additionally, in the default view, the confusion matrix is normalized, with the values shown as percentages calculated row-wise, so that the entries in each row sum to 100%. These normalized values show, for each predicted class, the proportion of confusion with the other classes, as well as the per-class precision along the diagonal. The normalization automatically updates to include or exclude the background class, depending on whether it is toggled on or off. The percentage view makes it easier to perceive the distribution of results. You can still view the raw computed values before normalization by selecting Absolute, which lets you view and verify the underlying counts.
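The row-wise normalization itself is straightforward to reproduce; the sketch below, again an illustration rather than Nexus’ own code, divides each row of the absolute matrix by its row sum to obtain the percentage view.

```python
import numpy as np

def normalize_rows(matrix):
    """Convert absolute counts into row-wise percentages, so the
    entries in each row (one predicted class) sum to 100%.

    The diagonal of the result corresponds to the per-class
    precision described above."""
    row_sums = matrix.sum(axis=1, keepdims=True)
    # Guard against rows for classes that were never predicted.
    return np.divide(matrix * 100.0, row_sums,
                     out=np.zeros(matrix.shape, dtype=float),
                     where=row_sums > 0)
```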

Finally, similar to Advanced Evaluation, you can scroll across the evaluation checkpoints to see how the model’s confusion matrix has evolved. Ideally, the proportions in each row should converge towards the main diagonal, so that predicted and ground truth classes are maximally aligned.
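As a rough illustration of what converging towards the diagonal means, the sketch below, which is an assumption on our part and not a Nexus feature, summarises a row-normalized matrix at each checkpoint by its mean diagonal value, which should trend upwards as training progresses.

```python
import numpy as np

def diagonal_mass(normalized_matrix):
    """Mean of the diagonal of a row-normalized confusion matrix.

    The closer predictions are to the ground truth, the higher this
    number; tracking it across checkpoints summarises the trend you
    can see visually with the checkpoint slider."""
    return float(np.mean(np.diag(normalized_matrix)))

# e.g. trend across checkpoints (hypothetical variable names):
# scores = [diagonal_mass(normalize_rows(m)) for m in checkpoint_matrices]
```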

Training Log

Another tab that provides more transparency for the user is the direct log of the training run as it is ongoing. This makes it easy for you to copy and parse results for your own documentation, and it shows exactly which logs we register and the data we use to render our graphs, giving full visibility into the training process on our end.
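The exact format of the log lines is best taken from the Training Log tab itself; purely as an illustration, the sketch below assumes each copied line contains key=value pairs (for example, step=100 loss=0.42) and collects them into a list of dictionaries for further processing.

```python
import re

# Hypothetical example of lines copied from the Training Log tab; the
# real format may differ, so adjust the pattern to match what you see.
raw_log = """
step=100 loss=0.4213 accuracy=0.8120
step=200 loss=0.3321 accuracy=0.8540
"""

PAIR = re.compile(r"(\w+)=([0-9.]+)")

def parse_log(text):
    """Turn key=value log lines into a list of dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        pairs = PAIR.findall(line)
        if pairs:
            rows.append({key: float(value) for key, value in pairs})
    return rows

print(parse_log(raw_log))
# [{'step': 100.0, 'loss': 0.4213, 'accuracy': 0.812}, ...]
```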

Example of a Training Log in a Training Run (click on image to enlarge)
