Normalized Cross Entropy

$$\text{NCE} = \frac{\text{logloss(model)}}{\text{logloss(rate)}}$$

  1. Always non-negative.
  1. Only 0 if your predictions match the labels perfectly.
  1. Unbounded; can grow arbitrarily large.
  1. Intuitive scale: NCE < 1 means the model has learned something; NCE > 1 means the model is less accurate than always predicting the average rate.

$$\text{NCE} = \frac{-\frac{1}{N}\sum_{i=1}^{N}\left(\frac{1+y_i}{2}\log(p_i) + \frac{1-y_i}{2}\log(1-p_i)\right)}{-\left(p\log(p) + (1-p)\log(1-p)\right)}$$

where $y_i \in \{-1, +1\}$ are the labels, $p_i$ is the predicted probability of a click, and $p$ is the background CTR (the average empirical click rate).
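A minimal Python sketch of this computation (the function name `normalized_cross_entropy` is my own, and labels are taken in {0, 1} rather than the {−1, +1} convention of the formula above; the two are equivalent via $(1+y_i)/2$):

```python
import numpy as np

def normalized_cross_entropy(y_true, p_pred, eps=1e-15):
    """NCE: the model's average log loss divided by the log loss of
    always predicting the background CTR (the label mean)."""
    y_true = np.asarray(y_true, dtype=float)                        # labels in {0, 1}
    p_pred = np.clip(np.asarray(p_pred, dtype=float), eps, 1 - eps)

    # Numerator: average log loss of the model's predictions.
    model_logloss = -np.mean(
        y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred)
    )

    # Denominator: entropy of the background CTR p.
    p = np.clip(y_true.mean(), eps, 1 - eps)
    baseline_logloss = -(p * np.log(p) + (1 - p) * np.log(1 - p))

    return model_logloss / baseline_logloss
```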

  1. The lower the value, the better the model’s prediction.
  1. The reason for this normalization is that the closer the background CTR is to 0 or 1, the easier it is to achieve a low raw log loss.
  1. Dividing by the entropy of the background CTR makes the NCE insensitive to the background CTR.
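As an illustrative check of the last two points (my own example, reusing the `normalized_cross_entropy` sketch above): a constant predictor that always outputs the background CTR gets an NCE of exactly 1 no matter how skewed that CTR is, while its raw log loss shrinks as the CTR approaches 0.

```python
rng = np.random.default_rng(0)
for ctr in (0.01, 0.20):
    # Simulated labels with the given background CTR.
    y = (rng.random(100_000) < ctr).astype(float)
    p_const = np.full_like(y, y.mean())      # always predict the base rate
    raw_logloss = -np.mean(y * np.log(p_const) + (1 - y) * np.log(1 - p_const))
    nce = normalized_cross_entropy(y, p_const)
    print(f"CTR={ctr:.2f}  raw logloss={raw_logloss:.3f}  NCE={nce:.3f}")
# Raw log loss looks much "better" at 1% CTR (~0.056 vs ~0.50),
# but NCE is 1.0 in both cases: the constant predictor learned nothing.
```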