These solutions accompany the original English edition of the textbook. The problem numbering differs somewhat from the translated Chinese edition, but the content is the same; the Chinese translation also adds more problems.
2.2 Entropy of functions. Let $X$ be a random variable taking on a finite number of values. What is the (general) inequality relationship between $H(X)$ and $H(Y)$ if
(a) $Y = 2^X$?
(b) $Y = \cos X$?
Solution: Let $y = g(x)$. Then
$$p(y) = \sum_{x:\, g(x)=y} p(x).$$
Consider any set of $x$'s that map onto a single $y$. For this set
$$\sum_{x:\, g(x)=y} p(x)\log p(x) \;\le\; \sum_{x:\, g(x)=y} p(x)\log p(y) \;=\; p(y)\log p(y),$$
since $\log$ is a monotone increasing function and $p(x) \le \sum_{x:\, g(x)=y} p(x) = p(y)$. Extending this argument to the entire range of $X$ (and $Y$), we obtain
$$H(X) = -\sum_x p(x)\log p(x) \;\ge\; -\sum_y p(y)\log p(y) = H(Y),$$
with equality iff $g$ is one-to-one with probability one.
(a) $Y = 2^X$ is one-to-one, and hence the entropy, which is just a function of the probabilities (and not of the values of the random variable), does not change, i.e., $H(X) = H(Y)$.
(b) $Y = \cos X$ is not necessarily one-to-one. Hence all that we can say is that $H(X) \ge H(Y)$, with equality if cosine is one-to-one on the range of $X$.
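To make the inequality concrete, here is a small numerical check (a sketch, not part of the original solution; the example distribution on $\{-1, 0, 1\}$ and the helper names are my own). A one-to-one map such as $2^X$ preserves the entropy, while $\cos X$ merges $x = -1$ and $x = 1$ and so loses entropy.

```python
from collections import defaultdict
from math import cos, log2

def entropy(pmf):
    """Shannon entropy (in bits) of a pmf given as {value: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def pushforward(pmf, g):
    """Distribution of Y = g(X): p(y) = sum of p(x) over all x with g(x) = y."""
    p_y = defaultdict(float)
    for x, p in pmf.items():
        p_y[g(x)] += p
    return dict(p_y)

# Hypothetical example: X uniform on {-1, 0, 1}.
p_x = {-1: 1/3, 0: 1/3, 1: 1/3}

p_2x  = pushforward(p_x, lambda x: 2 ** x)             # one-to-one: entropy unchanged
p_cos = pushforward(p_x, lambda x: round(cos(x), 9))   # cos(-1) = cos(1): entropy drops
                                                        # (rounding merges numerically equal values)

print(entropy(p_x), entropy(p_2x), entropy(p_cos))
# ~1.585, 1.585, 0.918  --  H(X) = H(2^X) >= H(cos X)
```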
2.16. Example of joint entropy. Let $p(x,y)$ be given by

    p(x,y)    Y=0    Y=1
     X=0      1/3    1/3
     X=1       0     1/3

Find
(a) $H(X)$, $H(Y)$.
(b) $H(X|Y)$, $H(Y|X)$.
(c) $H(X,Y)$.
(d) $H(Y) - H(Y|X)$.
(e) $I(X;Y)$.
(f) Draw a Venn diagram for the quantities in (a) through (e).
Solution:
Fig. 1. Venn diagram for the quantities in (a) through (e).
(a) $H(X) = \tfrac{2}{3}\log\tfrac{3}{2} + \tfrac{1}{3}\log 3 = 0.918$ bits $= H(Y)$.
(b) $H(X|Y) = \tfrac{1}{3}H(X|Y=0) + \tfrac{2}{3}H(X|Y=1) = \tfrac{1}{3}\cdot 0 + \tfrac{2}{3}\cdot 1 = 0.667$ bits $= H(Y|X)$.
(c) $H(X,Y) = 3 \times \tfrac{1}{3}\log 3 = 1.585$ bits.
(d) $H(Y) - H(Y|X) = 0.918 - 0.667 = 0.251$ bits.
(e) $I(X;Y) = H(Y) - H(Y|X) = 0.251$ bits.
(f) See Figure 1.
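All of the numbers in (a) through (e) can be reproduced directly from the joint table; the following sketch does so (the helper names are my own).

```python
from math import log2

# Joint pmf p(x, y) from the table above.
p_xy = {(0, 0): 1/3, (0, 1): 1/3, (1, 0): 0.0, (1, 1): 1/3}

def H(probs):
    """Entropy in bits of a collection of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)

p_x = {x: sum(p for (a, b), p in p_xy.items() if a == x) for x in (0, 1)}
p_y = {y: sum(p for (a, b), p in p_xy.items() if b == y) for y in (0, 1)}

H_X  = H(p_x.values())        # (a) 0.918 bits
H_Y  = H(p_y.values())        #     0.918 bits
H_XY = H(p_xy.values())       # (c) 1.585 bits
H_X_given_Y = H_XY - H_Y      # (b) 0.667 bits
H_Y_given_X = H_XY - H_X      #     0.667 bits
I_XY = H_Y - H_Y_given_X      # (d), (e) 0.251 bits

print(H_X, H_Y, H_X_given_Y, H_Y_given_X, H_XY, I_XY)
```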
2.29 Inequalities. Let $X$, $Y$ and $Z$ be joint random variables. Prove the following inequalities and find conditions for equality.
(a) $H(X,Y|Z) \ge H(X|Z)$
(b) $I(X,Y;Z) \ge I(X;Z)$
(c) $H(X,Y,Z) - H(X,Y) \le H(X,Z) - H(X)$
(d) $I(X;Z|Y) \ge I(Z;Y|X) - I(Z;Y) + I(X;Z)$
Solution:
(a) Using the chain rule for conditional entropy,
$$H(X,Y|Z) = H(X|Z) + H(Y|X,Z) \ge H(X|Z),$$
with equality iff $H(Y|X,Z) = 0$, that is, when $Y$ is a function of $X$ and $Z$.
(b) Using the chain rule for mutual information,
$$I(X,Y;Z) = I(X;Z) + I(Y;Z|X) \ge I(X;Z),$$
with equality iff $I(Y;Z|X) = 0$, that is, when $Y$ and $Z$ are conditionally independent given $X$.
(c) Using first the chain rule for entropy and then the definition of conditional mutual information,
$$H(X,Y,Z) - H(X,Y) = H(Z|X,Y) = H(Z|X) - I(Y;Z|X) \le H(Z|X) = H(X,Z) - H(X),$$
with equality iff $I(Y;Z|X) = 0$, that is, when $Y$ and $Z$ are conditionally independent given $X$.
(d) Using the chain rule for mutual information,
$$I(X;Z|Y) + I(Z;Y) = I(X,Y;Z) = I(Z;Y|X) + I(X;Z),$$
and therefore $I(X;Z|Y) = I(Z;Y|X) - I(Z;Y) + I(X;Z)$, so this inequality is actually an equality in all cases.
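The two-sided chain-rule identity in (d) can also be checked numerically on an arbitrary joint pmf. The sketch below (a randomly generated $p(x,y,z)$ on $\{0,1\}^3$; the helper names are my own) computes conditional mutual informations from entropies and confirms that both sides equal $I(X,Y;Z)$.

```python
import itertools, random
from math import log2

random.seed(0)
vals = (0, 1)
# Hypothetical random joint pmf p(x, y, z) on {0,1}^3.
w = {xyz: random.random() for xyz in itertools.product(vals, repeat=3)}
total = sum(w.values())
p = {xyz: v / total for xyz, v in w.items()}

def H(margin):
    """Entropy in bits of the marginal over the given coordinate indices."""
    m = {}
    for xyz, q in p.items():
        key = tuple(xyz[i] for i in margin)
        m[key] = m.get(key, 0.0) + q
    return -sum(q * log2(q) for q in m.values() if q > 0)

def I(a, b, given=()):
    """Conditional mutual information I(A;B|C), computed from joint entropies."""
    return H(a + given) + H(b + given) - H(a + b + given) - (H(given) if given else 0.0)

X, Y, Z = (0,), (1,), (2,)
lhs = I(X, Z, given=Y) + I(Z, Y)   # I(X;Z|Y) + I(Z;Y)
rhs = I(Z, Y, given=X) + I(X, Z)   # I(Z;Y|X) + I(X;Z)
print(abs(lhs - rhs) < 1e-9)       # True: both equal I(X,Y;Z)
```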
4.5  Entropy rates of Markov chains.
(a) Find the entropy rate of the two-state Markov chain with transition matrix
$$P = \begin{bmatrix} 1-p_{01} & p_{01} \\ p_{10} & 1-p_{10} \end{bmatrix}.$$
(b) What values of $p_{01}$, $p_{10}$ maximize the rate of part (a)?
(c) Find the entropy rate of the two-state Markov chain with transition matrix
$$P = \begin{bmatrix} 1-p & p \\ 1 & 0 \end{bmatrix}.$$
(d) Find the maximum value of the entropy rate of the Markov chain of part (c). We expect that the maximizing value of $p$ should be less than $1/2$, since the 0 state permits more information to be generated than the 1 state.
Solution:
(a) The stationary distribution is easily calculated:
$$\mu_0 = \frac{p_{10}}{p_{01}+p_{10}}, \qquad \mu_1 = \frac{p_{01}}{p_{01}+p_{10}}.$$
Therefore the entropy rate is
$$H(\mathcal{X}) = H(X_2|X_1) = \mu_0 H(p_{01}) + \mu_1 H(p_{10}) = \frac{p_{10}H(p_{01}) + p_{01}H(p_{10})}{p_{01}+p_{10}}.$$
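As a quick sanity check of this closed form (a sketch with made-up transition probabilities; the function names are my own), the rate is the stationary-weighted average of the two binary entropies:

```python
from math import log2

def Hb(p):
    """Binary entropy H(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def entropy_rate(p01, p10):
    """Entropy rate of the two-state chain with P(0->1) = p01, P(1->0) = p10."""
    mu0 = p10 / (p01 + p10)   # stationary probability of state 0
    mu1 = p01 / (p01 + p10)   # stationary probability of state 1
    return mu0 * Hb(p01) + mu1 * Hb(p10)

print(entropy_rate(0.5, 0.5))   # 1.0 bit: the i.i.d. fair-coin case of part (b)
print(entropy_rate(0.2, 0.7))   # an asymmetric example, strictly less than 1 bit
```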
(b) The entropy rate is at most 1 bit because the process has only two states. This rate can be achieved if (and only if) $p_{01} = p_{10} = 1/2$, in which case the process is actually i.i.d. with
$$\Pr(X_i = 0) = \Pr(X_i = 1) = \tfrac{1}{2}.$$
(c) As a special case of the general two-state Markov chain, the entropy rate is
$$H(\mathcal{X}) = \mu_0 H(p) + \mu_1 H(1) = \frac{H(p)}{p+1}.$$
(d) By straightforward calculus, we find that the maximum value of the entropy rate of part (c) occurs for $p = (3-\sqrt{5})/2 \approx 0.382$, which is indeed less than $1/2$. The maximum value is
$$\frac{H(p)}{p+1} = \log_2\frac{\sqrt{5}+1}{2} \approx 0.694 \text{ bits},$$
where $(\sqrt{5}+1)/2$ is the golden ratio.
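The calculus in part (d) can be confirmed numerically (a sketch; the crude grid search is my own) by maximizing $H(p)/(1+p)$ over $p$ and comparing with $(3-\sqrt{5})/2$ and $\log_2$ of the golden ratio:

```python
from math import log2, sqrt

def Hb(p):
    """Binary entropy H(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def rate(p):
    """Entropy rate of the part (c) chain: H(p) / (1 + p)."""
    return Hb(p) / (1 + p)

# Crude grid search over p in (0, 1).
best_p = max((i / 10_000 for i in range(1, 10_000)), key=rate)

print(best_p)                    # ~0.382  =  (3 - sqrt(5)) / 2
print(rate(best_p))              # ~0.694 bits
print((3 - sqrt(5)) / 2)         # 0.3819...
print(log2((1 + sqrt(5)) / 2))   # 0.6942... = log2 of the golden ratio
```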