The images above compare three cases. The left image shows a perfect model with perfect data, where the calculated scores match the accuracy exactly (in case you were wondering, this does not exist in real life). The middle image shows our current score calculation: most of the scores end up in the region between 50% and 100%. The right image shows the new calculation, where the calculated scores are more in line with the accuracy.
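If you want to make the same comparison on your own reviewed predictions, the idea can be sketched in a few lines of Python. This is only an illustration, not the Metamaze implementation: the function name `reliability_table`, the equal-width bins, and the sample data are all assumptions made for the example. It groups predictions by score and reports how often each group was actually correct.

```python
def reliability_table(scores, correct, n_bins=5):
    """Group predictions into equal-width score bins and report accuracy per bin.

    scores  : list of floats in [0, 1] (hypothetical calculated scores)
    correct : list of bools, True if the prediction was right
    Returns a list of (bin_low, bin_high, count, accuracy) for non-empty bins.
    """
    bins = [[] for _ in range(n_bins)]
    for score, ok in zip(scores, correct):
        # A score of exactly 1.0 belongs in the last bin.
        idx = min(int(score * n_bins), n_bins - 1)
        bins[idx].append(ok)

    table = []
    for i, outcomes in enumerate(bins):
        if not outcomes:
            continue  # skip bins that received no predictions
        low, high = i / n_bins, (i + 1) / n_bins
        accuracy = sum(outcomes) / len(outcomes)
        table.append((low, high, len(outcomes), accuracy))
    return table


# Made-up sample data, for illustration only.
sample_scores = [0.1, 0.15, 0.5, 0.55, 0.9, 0.95]
sample_correct = [False, False, True, False, True, True]

for low, high, count, accuracy in reliability_table(sample_scores, sample_correct):
    print(f"scores {low:.0%}-{high:.0%}: {count} predictions, accuracy {accuracy:.0%}")
```

When scores are well aligned with accuracy, the accuracy in each row sits close to the middle of its score range. When they are not, you would expect something like the middle image: many predictions in the high-score rows, with accuracy that does not match.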
What this means for you as an admin is that the calculated scores will, in general, be lower than before. However, this does not mean the model is doing worse. We aimed for a stronger correlation between scores and accuracy: lower scores are now more likely to correspond to bad predictions.
Since you are an admin in the Metamaze application, we recommend that you review your thresholds after your next training. You may need to lower them to obtain the same automation rate as before.
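To get a feel for how far a threshold might need to move, here is a rough sketch in Python. It is not part of the Metamaze application: the helper names `automation_rate` and `threshold_for_rate` are hypothetical, the scores are made up, and it assumes that a prediction is automated when its score is at or above the threshold.

```python
def automation_rate(scores, threshold):
    """Fraction of predictions whose score is at or above the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


def threshold_for_rate(scores, target_rate):
    """Highest threshold that still automates at least target_rate of predictions."""
    if not scores or not 0 < target_rate <= 1:
        raise ValueError("need at least one score and a target rate in (0, 1]")
    # Try candidate thresholds from strictest to most lenient.
    for candidate in sorted(set(scores), reverse=True):
        if automation_rate(scores, candidate) >= target_rate:
            return candidate
    raise ValueError("no threshold reaches the target rate")


# Made-up scores for the same five predictions, before and after retraining.
old_scores = [0.95, 0.90, 0.85, 0.80, 0.60]
new_scores = [0.90, 0.75, 0.70, 0.55, 0.30]

old_threshold = 0.80
target = automation_rate(old_scores, old_threshold)

print("automation rate before:", target)
print("same threshold on new scores:", automation_rate(new_scores, old_threshold))
print("threshold for the same rate:", threshold_for_rate(new_scores, target))
```

Matching the old automation rate is only half of the picture. Before settling on a lower threshold, it is worth checking the accuracy of the predictions that end up above it.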