The achievable communication rates for many of the tasks of network information theory can be derived using the packing lemma, which in turn depends on the joint typicality lemma.
We derive the proofs of these two results from first principles.
Let $(U,X,Y)\sim p_{UXY}(u,x,y)$. Let $p_{\tilde U^n \tilde Y^n}(\tilde u^n,\tilde y^n)$ be some probability distribution, not necessarily equal to $\prod_{i=1}^{n} p_{UY}(\tilde u_i,\tilde y_i)$, and let $(\tilde U^n,\tilde Y^n)\sim p_{\tilde U^n \tilde Y^n}$.
Let $X^n(m)$, $m\in\mathcal{M}$, where $|\mathcal{M}|\le 2^{nR}$, be random sequences, each distributed according to $\prod_{i=1}^{n} p_{X|U}(x_i|\tilde u_i)$. Assume that each $X^n(m)$, $m\in\mathcal{M}$, is pairwise conditionally independent of $\tilde Y^n$ given $\tilde U^n$, but may be arbitrarily dependent on the other $X^n(m)$ sequences.
Then there exists $\delta(\epsilon)\to 0$ as $\epsilon\to 0$ such that
$$\Pr\{(\tilde U^n, X^n(m), \tilde Y^n)\in\mathcal{T}_\epsilon^{(n)} \text{ for some } m\in\mathcal{M}\}\to 0 \quad\text{as } n\to\infty,$$
if $R < I(X;Y|U)-\delta(\epsilon)$.
This corresponds to a situation where superposition encoding is used to encode messages $\ell$ and $m$ into $X^n(\ell,m)$ using random codebooks $\{u^n(\ell)\}\sim p_{U^n}(u^n)$ and $\{x^n(\ell,m)\}\sim \prod_{i=1}^{n} p_{X|U}(x_i|u_i(\ell))$, in which case we will have $p_{\tilde U^n \tilde Y^n}=p_{U^n Y^n}$. The packing lemma then gives us a bound on the error associated with decoding the message $m$ (the superimposed message) given that we correctly decoded $\ell$ (the cloud centre).
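As a concrete illustration of the codebook construction just described, here is a minimal sketch of superposition codebook generation. The binary alphabets and the particular distributions `p_U` and `p_X_given_U` are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy alphabets and distributions (illustrative assumptions):
# U ~ p_U on {0, 1}, and X | U ~ p_{X|U} on {0, 1}.
p_U = np.array([0.5, 0.5])
p_X_given_U = np.array([[0.9, 0.1],   # p_{X|U}(. | u = 0)
                        [0.2, 0.8]])  # p_{X|U}(. | u = 1)

def superposition_codebooks(n, R_cloud, R_sat):
    """Generate cloud centres u^n(l) and satellite codewords x^n(l, m).

    Each u^n(l) is drawn i.i.d. from p_U; each x^n(l, m) is then drawn
    i.i.d. from p_{X|U}(. | u_i(l)), as in the superposition scheme."""
    L = int(2 ** (n * R_cloud))   # number of cloud centres
    M = int(2 ** (n * R_sat))     # satellite codewords per cloud
    u = rng.choice(2, size=(L, n), p=p_U)
    x = np.empty((L, M, n), dtype=int)
    for l in range(L):
        for m in range(M):
            # draw x_i ~ p_{X|U}(. | u_i(l)) independently in each position
            x[l, m] = [rng.choice(2, p=p_X_given_U[ui]) for ui in u[l]]
    return u, x

u, x = superposition_codebooks(n=8, R_cloud=0.25, R_sat=0.5)
print(u.shape, x.shape)
```

Note that, for a fixed cloud centre $u^n(\ell)$, the satellite codewords are conditionally i.i.d., which is exactly the pairwise conditional independence structure the packing lemma requires.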
Assume $U=\emptyset$. The random sequences $X^n(m)$, $m\in\mathcal{M}$, represent different codewords. The sequence $\tilde Y^n$ represents the received sequence resulting from sending a codeword not in the set $\{X^n(m)\}$.
The lemma shows that, under any probability mass function on $\tilde Y^n$, the probability that some codeword in $\mathcal{M}$ is jointly typical with $\tilde Y^n$ goes to zero as $n\to\infty$ if the rate of the code satisfies $R<I(X;Y)$.
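This point-to-point interpretation can be checked numerically. The sketch below estimates, by Monte Carlo, the probability that some codeword in a random codebook is jointly typical with an independently drawn $\tilde Y^n$. The doubly symmetric binary source (crossover $q=0.1$, so $I(X;Y)=1-h(q)\approx 0.53$ bits) and the relative-deviation typicality test are illustrative assumptions, not from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative toy joint pmf (an assumption): X ~ Bern(1/2),
# Y = X XOR N with N ~ Bern(q), so I(X;Y) = 1 - h(q) ≈ 0.53 bits for q = 0.1.
q = 0.1
p_XY = np.array([[0.5 * (1 - q), 0.5 * q],
                 [0.5 * q,       0.5 * (1 - q)]])

def some_codeword_typical(n, R, eps, trials=100):
    """Estimate Pr{(X^n(m), Y~^n) jointly typical for some m}, where the
    2^{nR} codewords and Y~^n are all drawn independently (the U = empty-set
    case of the packing lemma)."""
    M = int(2 ** (n * R))
    hits = 0
    for _ in range(trials):
        y = rng.integers(2, size=n)        # Y~^n, independent of every codeword
        x = rng.integers(2, size=(M, n))   # M independent codewords ~ Bern(1/2)^n
        ok = np.ones(M, dtype=bool)
        for a in (0, 1):
            for b in (0, 1):
                emp = ((x == a) & (y == b)).sum(axis=1) / n  # empirical joint pmf
                ok &= np.abs(emp - p_XY[a, b]) <= eps * p_XY[a, b]
        hits += bool(ok.any())
    return hits / trials

# With R = 0.25 < I(X;Y), the estimated probability should shrink as n grows.
for n in (16, 24, 32, 40):
    print(n, some_codeword_typical(n, R=0.25, eps=0.2))
```

The typicality test here uses the relative-deviation convention $|\pi(a,b) - p(a,b)| \le \epsilon\, p(a,b)$ on the empirical joint pmf; other typicality definitions would change the constants but not the qualitative decay.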
Define the event
$$\mathcal{E}_m \equiv \{(\tilde U^n, X^n(m), \tilde Y^n)\in\mathcal{T}_\epsilon^{(n)}(U,X,Y)\},\qquad m\in\mathcal{M}.$$
By the union bound, the probability of the event of interest can be bounded as $\Pr\{\cup_{m\in\mathcal{M}}\mathcal{E}_m\}\le \sum_{m\in\mathcal{M}}\Pr\{\mathcal{E}_m\}$.
Now consider
$$\begin{aligned}
\Pr\{\mathcal{E}_m\} &= \Pr\{(\tilde U^n, X^n(m), \tilde Y^n)\in\mathcal{T}_\epsilon^{(n)}(U,X,Y)\}\\
&= \sum_{(\tilde u^n,\tilde y^n)\in\mathcal{T}_\epsilon^{(n)}(\tilde U,\tilde Y)} p_{\tilde U^n \tilde Y^n}(\tilde u^n,\tilde y^n)\,\Pr\{(\tilde u^n, X^n(m), \tilde y^n)\in\mathcal{T}_\epsilon^{(n)}(U,X,Y)\}\\
&= \sum_{(\tilde u^n,\tilde y^n)\in\mathcal{T}_\epsilon^{(n)}(\tilde U,\tilde Y)} p_{\tilde U^n \tilde Y^n}(\tilde u^n,\tilde y^n) \sum_{x^n\in\mathcal{T}_\epsilon^{(n)}(X|\tilde u^n,\tilde y^n)} p(x^n|\tilde u^n)\\
&\le \sum_{(\tilde u^n,\tilde y^n)\in\mathcal{T}_\epsilon^{(n)}(\tilde U,\tilde Y)} p_{\tilde U^n \tilde Y^n}(\tilde u^n,\tilde y^n)\,\bigl|\mathcal{T}_\epsilon^{(n)}(X|\tilde u^n,\tilde y^n)\bigr| \cdot 2^{-n[H(X|U)-\delta'(\epsilon)]}\\
&\le \sum_{(\tilde u^n,\tilde y^n)\in\mathcal{T}_\epsilon^{(n)}(\tilde U,\tilde Y)} p_{\tilde U^n \tilde Y^n}(\tilde u^n,\tilde y^n)\, 2^{n[H(X|U,Y)+\delta''(\epsilon)]}\cdot 2^{-n[H(X|U)-\delta'(\epsilon)]}\\
&= 2^{-n[I(X;Y|U)-\delta(\epsilon)]} \sum_{(\tilde u^n,\tilde y^n)\in\mathcal{T}_\epsilon^{(n)}(\tilde U,\tilde Y)} p_{\tilde U^n \tilde Y^n}(\tilde u^n,\tilde y^n)\\
&\le 2^{-n[I(X;Y|U)-\delta(\epsilon)]}\cdot 1.
\end{aligned}$$
The third equality follows from the way $X^n(m)$ is constructed: $X^n(m)\sim\prod_{i=1}^{n} p_{X|U}(x_i|\tilde u_i)$. The two inequalities follow from the properties of typical sequences (also known as the joint typicality lemma), with $\delta(\epsilon)=\delta'(\epsilon)+\delta''(\epsilon)$.
Hence
$$\Pr\{\cup_{m\in\mathcal{M}}\mathcal{E}_m\}\le \sum_{m\in\mathcal{M}}\Pr\{\mathcal{E}_m\}\le |\mathcal{M}|\,2^{-n[I(X;Y|U)-\delta(\epsilon)]}\le 2^{-n[I(X;Y|U)-R-\delta(\epsilon)]}.$$
Thus if we choose $R<I(X;Y|U)-\delta(\epsilon)$, the probability of error tends to zero as $n\to\infty$.
Let $(U,X,Y)\sim p_{UXY}(u,x,y)$. Then there exists $\delta(\epsilon)\to 0$ as $\epsilon\to 0$ such that the following holds: if $(\tilde U^n,\tilde Y^n)$ is distributed according to an arbitrary pmf $p(\tilde u^n,\tilde y^n)$ and $\tilde X^n\sim\prod_{i=1}^{n} p_{X|U}(\tilde x_i|\tilde u_i)$, conditionally independent of $\tilde Y^n$ given $\tilde U^n$, then
$$\Pr\{(\tilde U^n,\tilde X^n,\tilde Y^n)\in\mathcal{T}_\epsilon^{(n)}(U,X,Y)\}\le 2^{-n[I(X;Y|U)-\delta(\epsilon)]}.$$
Consider the following argument, conditioned on any pair $(\tilde u^n,\tilde y^n)$:
$$\begin{aligned}
\Pr\{(\tilde U^n,\tilde X^n,\tilde Y^n)\in\mathcal{T}_\epsilon^{(n)}(U,X,Y)\mid \tilde u^n,\tilde y^n\} &= \sum_{\tilde x^n\in\mathcal{T}_\epsilon^{(n)}(X|\tilde u^n,\tilde y^n)} p(\tilde x^n|\tilde u^n)\\
&\le \bigl|\mathcal{T}_\epsilon^{(n)}(X|\tilde u^n,\tilde y^n)\bigr|\cdot 2^{-n[H(X|U)-\delta'(\epsilon)]}\\
&\le 2^{n[H(X|U,Y)+\delta''(\epsilon)]}\cdot 2^{-n[H(X|U)-\delta'(\epsilon)]}\\
&= 2^{-n[I(X;Y|U)-\delta(\epsilon)]},
\end{aligned}$$
where $\delta(\epsilon)=\delta'(\epsilon)+\delta''(\epsilon)$. Averaging over $(\tilde u^n,\tilde y^n)$ gives the unconditional bound.
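The single-pair bound can also be checked empirically. The sketch below estimates $\Pr\{(\tilde X^n,\tilde Y^n)\in\mathcal{T}_\epsilon^{(n)}\}$ for independently drawn sequences (the $U=\emptyset$ case) and prints the analytic exponent $2^{-nI(X;Y)}$ alongside it for comparison; the binary joint pmf and the relative-deviation typicality test are illustrative assumptions, not from the text.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy DSBS joint pmf (an illustrative assumption): X ~ Bern(1/2),
# Y = X XOR Bern(q); then I(X;Y) = 1 - h(q).
q = 0.1
p_XY = np.array([[0.5 * (1 - q), 0.5 * q],
                 [0.5 * q,       0.5 * (1 - q)]])
h = lambda p: -p * np.log2(p) - (1 - p) * np.log2(1 - p)
I_XY = 1 - h(q)  # mutual information in bits

def pair_typical_prob(n, eps, trials=200_000):
    """Estimate Pr{(X~^n, Y~^n) jointly typical} when X~^n and Y~^n are
    drawn independently with the correct marginals (U = empty-set case)."""
    x = rng.integers(2, size=(trials, n))
    y = rng.integers(2, size=(trials, n))
    ok = np.ones(trials, dtype=bool)
    for a in (0, 1):
        for b in (0, 1):
            emp = ((x == a) & (y == b)).sum(axis=1) / n  # empirical joint pmf
            ok &= np.abs(emp - p_XY[a, b]) <= eps * p_XY[a, b]
    return ok.mean()

# The lemma says the probability is at most 2^{-n[I(X;Y) - delta(eps)]};
# the second column is the delta-free exponent 2^{-n I(X;Y)} for reference.
for n in (12, 20, 28):
    print(n, pair_typical_prob(n, eps=0.3), 2 ** (-n * I_XY))
```

Because of the $\delta(\epsilon)$ slack, the empirical estimate can sit above $2^{-nI(X;Y)}$ for any fixed $\epsilon$; the lemma only promises the exponential decay rate approaches $I(X;Y)$ as $\epsilon\to 0$.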
The packing lemma also works for quantum states and for superposition encoding.
Modifying the packing lemma to handle two senders yields simultaneous decoding.
A quantum analogue exists, but because its proof is slightly different we cannot simply invoke a "quantum packing lemma" — we have to look at the error analysis in each case.