In class I said that the entropy is always maximized by the uniform probability distribution, but I realize this statement needs some clarification:
- Firstly, what I mean is: fix an alphabet with k outcomes. Among all probability distributions on these k outcomes, Shannon entropy is maximized by the uniform distribution p_i = 1/k.
- Secondly, what is this maximal value? Using the formula I gave in class, you can compute that it's \log_2(k). This could certainly be greater than 1!
(So: to whoever was asking me about the entropy of a 6-sided die, you were suspicious because you got an answer greater than 1 but that's actually correct!)
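If you want to double-check the die example numerically, here is a quick Python sketch (the helper name `entropy` is my own, not a standard library function) that computes H = -Σ p_i log₂(p_i) directly:

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum of p_i * log2(p_i)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Uniform distribution on a 6-sided die: H = log2(6) ≈ 2.585 bits
die = [1/6] * 6
print(entropy(die))   # ≈ 2.585 — greater than 1, and that's correct!
print(math.log2(6))   # the same value: the maximum for k = 6

# Any non-uniform distribution on 6 outcomes has strictly lower entropy
biased = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]
print(entropy(biased))  # ≈ 2.161 bits, less than log2(6)
```

Note that entropy in bits only stays at most 1 for a two-outcome alphabet (k = 2), where the maximum is log₂(2) = 1.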
