ENGG5301 Information Theory 2025 Midterm Exam P3: Causal Encoding
The problem is reconstructed from memory; the solution was written by GPT-5.
During the exam I went off track right at part (1), and seeing GPT-5's answer afterwards was infuriating: failing to think of the inequality sank parts (1)(2)(3) in a row, and as a result I never even got to (4) and (5).
My inequality skills, which have been terrible since middle school, struck again; but I honestly never put much effort into the textbook or exercises for this course, so this was about what I expected.
Problem
Let random variables \(X_1,\dots,X_n\) be i.i.d. with distribution \(p_X\).
We define an encoding sequence \(M_1,\dots,M_n\) subject to the following causality constraints:
- \(M_i\) is a function of \(X_1,\dots,X_i\), i.e., \(M_i = f_i(X_1,\dots,X_i)\);
- \(X_i\) is a function of \(M_1,\dots,M_i\), i.e., \(X_i = g_i(M_1,\dots,M_i)\).
Answer the following:
- Find the minimum value of \(\sum_{i=1}^n H(M_i)\) in terms of \(p_X\).
- Show that, under the optimal encoding, \(M_1,\dots,M_n\) are mutually independent.
- Show that, under the optimal encoding, \(H(M_i \mid X_i) = 0\).
- If we drop optimality but keep the causality and prefix-free constraints, show that there exists an encoding such that
\[\mathbb{E}\!\left(\frac{1}{n}\sum_{i=1}^n L(M_i)\right) < H(p_X) + 1. \]
- Under the same causality and prefix-free constraints, show that for any \(\varepsilon>0\), there exists an encoding such that
\[\mathbb{E}\!\left(\frac{1}{n}\sum_{i=1}^n L(M_i)\right) < H(p_X) + \varepsilon. \]
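To make the two constraints concrete before the solution, here is a minimal Python sketch of the encoder/decoder interface they describe; the function names and the identity scheme \(M_i = X_i\) are just an illustration, not part of the problem.

```python
# Minimal sketch of the causal encoding interface (identity scheme for illustration).
from typing import Any, List


def encode_causally(xs: List[Any]) -> List[Any]:
    """M_i = f_i(X_1, ..., X_i); here simply M_i = X_i."""
    messages = []
    for i in range(len(xs)):
        seen = xs[: i + 1]          # the encoder may only look at X_1..X_i
        messages.append(seen[-1])   # identity: emit X_i itself
    return messages


def decode_causally(ms: List[Any]) -> List[Any]:
    """X_i = g_i(M_1, ..., M_i); here simply X_i = M_i."""
    decoded = []
    for i in range(len(ms)):
        seen = ms[: i + 1]          # the decoder may only look at M_1..M_i
        decoded.append(seen[-1])    # identity: recover X_i = M_i
    return decoded


xs = ["a", "b", "b", "a", "c"]
assert decode_causally(encode_causally(xs)) == xs
```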
Solution
(a) Minimum of \(\sum_i H(M_i)\) under optimal encoding
Result:
\[\min \sum_{i=1}^{n} H(M_i) = n\,H(p_X).\]
Proof:
By the data processing inequality and lossless reconstruction (each \(X_i\) is a function of \(M_1,\dots,M_i\)),
\[\sum_{i=1}^{n} H(M_i) \ge H(M_1,\dots,M_n) \ge H(X_1,\dots,X_n).\]
Since \(X_1,\dots,X_n\) are i.i.d.,
\[H(X_1,\dots,X_n) = n\,H(p_X).\]
Thus,
\[\sum_{i=1}^{n} H(M_i) \ge n\,H(p_X).\]
Choosing \(M_i = X_i\) (or \(M_i = f(X_i)\) for any injective \(f\)), i.e., encoding each symbol separately, achieves \(H(M_i) = H(X)\) and satisfies both causality and recoverability, hence the bound is tight.
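As a numerical sanity check (with a made-up toy \(p_X\)), the sketch below computes \(H(p_X)\), shows the per-symbol encoding meeting the bound \(n\,H(p_X)\) with equality, and shows that a redundant but still causal and decodable encoding \(M_i = (X_{i-1}, X_i)\) overshoots it.

```python
# Numerical illustration of part (a) on a toy alphabet (p_X is made up here).
# The per-symbol encoding M_i = X_i attains sum_i H(M_i) = n * H(p_X),
# while a redundant causal encoding M_i = (X_{i-1}, X_i) exceeds the bound.

from math import log2

p_X = {"a": 0.5, "b": 0.25, "c": 0.25}   # assumed toy distribution
n = 10


def entropy(p):
    return -sum(q * log2(q) for q in p.values() if q > 0)


H = entropy(p_X)
print(f"H(p_X) = {H:.4f}, lower bound n*H(p_X) = {n * H:.4f}")

# M_i = X_i: each H(M_i) = H(p_X), so the sum meets the bound with equality.
print(f"identity encoding:  sum H(M_i) = {n * H:.4f}")

# M_1 = X_1 and M_i = (X_{i-1}, X_i) for i >= 2: H(M_i) = 2 H(p_X) by independence,
# so the sum is strictly larger than n * H(p_X) -- the bound is only a minimum.
redundant = H + (n - 1) * 2 * H
print(f"redundant encoding: sum H(M_i) = {redundant:.4f}")
```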
(b) Independence of \(M_1,\dots,M_n\) under optimal encoding
Result:
Under the optimal encoding, \(M_1,\dots,M_n\) are mutually independent (and thus pairwise independent).
Proof:
Equality in part (a) requires that
\[\sum_{i=1}^{n} H(M_i) = H(M_1,\dots,M_n).\]
Equality between the sum of marginal entropies and the joint entropy holds if and only if the variables are mutually independent.
Hence, the optimal \(M_1,\dots,M_n\) are independent.
(c) Determinism: \(H(M_i\mid X_i)=0\) under optimal encoding
Result:
Under the optimal scheme, \(M_i\) is a deterministic function of \(X_i\).
Proof:
From the reconstruction condition, \(X_i\) is a function of \(M_1,\dots,M_i\), so
\[H(X_i \mid M_1,\dots,M_i) = 0.\]
Since \(X_i\) is independent of the previous messages \((M_1,\dots,M_{i-1})\) (they are functions of \(X_1,\dots,X_{i-1}\) alone),
\[H(X_i \mid M_1,\dots,M_{i-1}) = H(X_i).\]
Hence the mutual information satisfies
\[I(X_i; M_i \mid M_1,\dots,M_{i-1}) = H(X_i \mid M_1,\dots,M_{i-1}) - H(X_i \mid M_1,\dots,M_i) = H(X_i).\]
On the other hand,
\[I(X_i; M_i \mid M_1,\dots,M_{i-1}) = H(M_i \mid M_1,\dots,M_{i-1}) - H(M_i \mid X_i, M_1,\dots,M_{i-1}).\]
Under the optimal encoding, \(H(M_i \mid M_1,\dots,M_{i-1}) = H(X_i)\) (by the independence from part (b), together with \(H(M_i) = H(X)\) forced by equality in part (a)), so
\[H(M_i \mid X_i, M_1,\dots,M_{i-1}) = 0.\]
Because \(M_i\) is independent of \((M_1,\dots,M_{i-1})\), it follows that
\[H(M_i \mid X_i) = H(M_i \mid X_i, M_1,\dots,M_{i-1}) = 0.\]
Therefore, each \(M_i\) is a deterministic function of \(X_i\).
(d) Existence of a causal, prefix-free encoding with average length \(< H(p_X)+1\)
We want to construct \(M_1,\dots,M_n\) satisfying causality and a prefix-free constraint, such that
\[\mathbb{E}\!\left(\frac{1}{n}\sum_{i=1}^n L(M_i)\right) < H(p_X) + 1.\]
Construction (symbol-by-symbol / Huffman coding):
- Since \(X_1,\dots,X_n\) are i.i.d., we can encode each symbol independently.
- Build a prefix-free code \(C\) (e.g., Huffman code) for \(p_X\).
- For each \(i\), set
\[M_i = C(X_i).\]
This clearly satisfies the causality constraints:
- \(M_i\) depends only on \(X_i\) (and trivially on \(X_1,\dots,X_{i-1}\)), so \(M_i = f_i(X_1,\dots,X_i)\);
- \(X_i\) can be decoded from \(M_i\) alone, so \(X_i = g_i(M_1,\dots,M_i)\).
Average length: by the standard bound for Huffman coding,
\[\mathbb{E}[L(M_i)] = \mathbb{E}[L(C(X_i))] < H(p_X) + 1 \quad \text{for each } i.\]
Hence
\[\mathbb{E}\!\left(\frac{1}{n}\sum_{i=1}^n L(M_i)\right) = \frac{1}{n}\sum_{i=1}^n \mathbb{E}[L(M_i)] < H(p_X) + 1.\]
This proves part (d). ∎
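Here is a small Python sketch of this construction: it builds a binary Huffman code for an assumed toy \(p_X\) (the `huffman_lengths` helper is illustrative, written from scratch rather than taken from any library) and checks \(H(p_X) \le \mathbb{E}[L] < H(p_X) + 1\).

```python
# Sketch of the symbol-by-symbol construction in part (d): build a Huffman
# code for an assumed toy p_X and check E[L(C(X))] < H(p_X) + 1.

import heapq
from math import log2

p_X = {"a": 0.5, "b": 0.25, "c": 0.15, "d": 0.1}   # assumed toy distribution


def huffman_lengths(p):
    """Return codeword lengths of a binary Huffman code for distribution p."""
    # heap of (probability, counter, set_of_symbols); the counter breaks ties
    heap = [(q, i, {s}) for i, (s, q) in enumerate(p.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in p}
    counter = len(heap)
    while len(heap) > 1:
        q1, _, s1 = heapq.heappop(heap)
        q2, _, s2 = heapq.heappop(heap)
        for s in s1 | s2:          # every symbol under the merged node
            lengths[s] += 1        # gains one more bit of depth
        heapq.heappush(heap, (q1 + q2, counter, s1 | s2))
        counter += 1
    return lengths


H = -sum(q * log2(q) for q in p_X.values())
L = huffman_lengths(p_X)
avg_len = sum(p_X[s] * L[s] for s in p_X)
print(f"H(p_X) = {H:.4f}, E[L] = {avg_len:.4f}")
assert H <= avg_len < H + 1   # standard Huffman bound
```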
(e) Existence of a causal, prefix-free encoding with average length \(< H(p_X)+\varepsilon\)
To achieve \(H(p_X)+\varepsilon\), we need block coding.
- Treat the whole sequence \(X^n=(X_1,\dots,X_n)\) as a block.
- Use an \(n\)-symbol prefix-free code \(C_n\) (e.g., a Huffman code on the block alphabet, or arithmetic coding), which satisfies
\[\mathbb{E}\big[L(C_n(X^n))\big] < H(X^n) + 1.\]
Since \(X^n\) is i.i.d., \(H(X^n) = n\,H(p_X)\), so
\[\mathbb{E}\big[L(C_n(X^n))\big] < n\,H(p_X) + 1.\]
- Implement \(C_n\) sequentially (arithmetic coding emits its output incrementally, symbol by symbol):
- Let \(M_1,\dots,M_n\) be the incremental outputs, so that the concatenation \(M_1 M_2 \cdots M_n = C_n(X^n)\).
- Then causality is preserved:
- \(M_i\) depends on \(X_1,\dots,X_i\).
- \(X_i\) can be recovered from \(M_1,\dots,M_i\).
- Average length per symbol:
\[\mathbb{E}\!\left(\frac{1}{n}\sum_{i=1}^n L(M_i)\right) = \frac{1}{n}\,\mathbb{E}\big[L(C_n(X^n))\big] < H(p_X) + \frac{1}{n}.\]
- By choosing \(n > 1/\varepsilon\), we guarantee
\[\mathbb{E}\!\left(\frac{1}{n}\sum_{i=1}^n L(M_i)\right) < H(p_X) + \frac{1}{n} < H(p_X) + \varepsilon.\]
This proves part (e). ∎
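To see the \(1/n\) redundancy shrink numerically, the sketch below Huffman-codes blocks of \(k\) i.i.d. symbols from an assumed toy \(p_X\) (block Huffman stands in here for the arithmetic coder above; the per-block bound \(< H + 1\) is the same) and prints the per-symbol expected length, which stays below \(H(p_X) + 1/k\).

```python
# Sketch of the rate argument in part (e): per-symbol expected length of an
# optimal prefix-free code on blocks of k symbols is below H(p_X) + 1/k.
# (p_X is an assumed toy distribution.)

import heapq
from itertools import product
from math import log2

p_X = {"a": 0.6, "b": 0.3, "c": 0.1}


def huffman_avg_length(p):
    """Expected codeword length of a binary Huffman code for p."""
    heap = [(q, i, {s}) for i, (s, q) in enumerate(p.items())]
    heapq.heapify(heap)
    depth = {s: 0 for s in p}
    counter = len(heap)
    while len(heap) > 1:
        q1, _, s1 = heapq.heappop(heap)
        q2, _, s2 = heapq.heappop(heap)
        for s in s1 | s2:          # symbols under the merged node gain one bit
            depth[s] += 1
        heapq.heappush(heap, (q1 + q2, counter, s1 | s2))
        counter += 1
    return sum(p[s] * depth[s] for s in p)


H = -sum(q * log2(q) for q in p_X.values())
print(f"H(p_X) = {H:.4f}")

for k in (1, 2, 3, 4):
    # distribution of a block of k i.i.d. symbols
    block = {blk: 1.0 for blk in product(p_X, repeat=k)}
    for blk in block:
        for s in blk:
            block[blk] *= p_X[s]
    rate = huffman_avg_length(block) / k
    print(f"k = {k}: per-symbol E[L] = {rate:.4f}  (< H + 1/k = {H + 1/k:.4f})")
```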
