Hello,
I have a question regarding the evaluation procedure used to compute the leaderboard scores on the test dataset.
The competition announcement states that an optimal monotonic transformation is applied before evaluation, implying that participants are not required to predict the exact scale of the target values.
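However, the leaderboard scores reported for the test dataset suggest that the scale of the predictions still has a substantial impact on the reported MSE. This is difficult to reconcile with the evaluation description: if an optimal monotonic transformation were fitted before scoring, any strictly increasing rescaling of the predictions should leave the MSE unchanged.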
Could you please clarify whether the announced optimal monotonic transformation is currently being applied when computing the leaderboard scores on the test dataset? In addition, could you confirm whether this transformation is an isotonic regression (or another form of monotonic calibration), and provide any details that may help participants better understand how the final MSE score is computed?
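For reference, here is a minimal sketch of the behavior I would expect, assuming the transformation is an isotonic (least-squares, non-decreasing) fit of the targets on the predictions. This is only my reading of the announcement, not the organizers' actual scoring code. The function names and the toy data are made up for illustration, and ties between predictions are not pooled.

```python
import math

def isotonic_fit(pred, target):
    """Least-squares non-decreasing fit of target on pred (pool adjacent violators).

    Assumption: this stands in for the announced "optimal monotonic
    transformation"; the real evaluation may differ.
    """
    order = sorted(range(len(pred)), key=lambda i: pred[i])
    blocks = []  # each block is [sum of targets, count]
    for i in order:
        blocks.append([target[i], 1])
        # Merge backwards while the previous block mean exceeds the current one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = [0.0] * len(pred)
    pos = 0
    for s, n in blocks:
        for i in order[pos:pos + n]:
            fitted[i] = s / n
        pos += n
    return fitted

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Toy data (hypothetical): the same predictions on two different scales.
target = [1.0, 2.0, 1.5, 4.0, 3.0]
pred_a = [0.1, 0.3, 0.4, 0.9, 0.7]
pred_b = [1000 * p + 5 for p in pred_a]  # strictly increasing rescaling

mse_raw_a = mse(pred_a, target)
mse_raw_b = mse(pred_b, target)
mse_cal_a = mse(isotonic_fit(pred_a, target), target)
mse_cal_b = mse(isotonic_fit(pred_b, target), target)

print("raw MSE:       ", mse_raw_a, mse_raw_b)
print("calibrated MSE:", mse_cal_a, mse_cal_b)
```

In this sketch the raw MSE changes drastically with the scale, while the MSE after the monotonic fit is identical for both sets of predictions, because only their ordering matters. The leaderboard scores do not seem to behave this way, hence my question.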
Thank you.