
Entropy

H(X) = -\sum_x p(x) \log p(x)
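
As a quick illustration, here is a minimal Python sketch of this definition (the function name entropy and the example distributions are my own additions, not part of the original notes):

import math

def entropy(p, base=2):
    # Shannon entropy H(X) = -sum_x p(x) log p(x); zero-probability terms contribute 0.
    return -sum(px * math.log(px, base) for px in p if px > 0)

print(entropy([0.5, 0.5]))   # fair coin: 1.0 bit
print(entropy([0.9, 0.1]))   # biased coin: about 0.469 bits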

 

Properties of H

  1. H(X) \geq 0
  2. H_b(X) = (\log_b a) H_a(X)
  3. Conditioning reduces entropy
    H(X|Y) \leq H(X)   (w/ eq. iff X and Y are independent)
  4. H(X_1, X_2, ..., X_n) \leq \sum_i H(X_i)   (w/ eq. iff the X_i are independent)
  5. H(X) \leq \log|\mathcal{X}|   (w/ eq. iff X is uniformly distributed over \mathcal{X})
  6. H(p) is concave in p

 

Relative Entropy

D(p||q) = \sum_x p(x) \log \frac{p(x)}{q(x)}
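
A small Python sketch of this sum (the function name kl_divergence is my own; terms with p(x) = 0 are treated as contributing 0):

import math

def kl_divergence(p, q, base=2):
    # D(p||q) = sum_x p(x) log(p(x)/q(x)); assumes q(x) > 0 wherever p(x) > 0.
    return sum(px * math.log(px / qx, base) for px, qx in zip(p, q) if px > 0)

p, q = [0.75, 0.25], [0.5, 0.5]
print(kl_divergence(p, q))   # about 0.189 bits
print(kl_divergence(q, p))   # about 0.208 bits; D is not symmetric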

 

Mutual Information

I(X:Y) = \sum_x \sum_y p(x,y) \log \frac{p(x,y)}{p(x)p(y)}
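
A sketch computing I(X:Y) directly from a joint probability table (the table joint_probs is an invented example, not from the notes):

import math

def mutual_information(joint, base=2):
    # joint[i][j] = p(x_i, y_j); marginals are obtained by summing rows/columns.
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(pxy * math.log(pxy / (px[i] * py[j]), base)
               for i, row in enumerate(joint)
               for j, pxy in enumerate(row) if pxy > 0)

joint_probs = [[0.4, 0.1],
               [0.1, 0.4]]
print(mutual_information(joint_probs))   # about 0.278 bits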

 

Alternative expressions

  • H(X) = E_p \log \frac{1}{p(X)}
  • H(X,Y) = E_p \log \frac{1}{p(X,Y)}
  • H(X|Y) = E_p \log \frac{1}{p(X|Y)}
  • I(X:Y) = E_p \log \frac{p(X,Y)}{p(X)p(Y)}
  • D(p||q) = E_p \log \frac{p(X)}{q(X)}

 

Properties of D and I

  1. I(X:Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X,Y)
  2. D(p||q) \geq 0   (w/ eq. iff p(x) = q(x))
  3. I(X:Y) = D(p(x,y) || p(x)p(y)) \geq 0   (w/ eq. iff p(x,y) = p(x)p(y))
  4. If |\mathcal{X}| = m and u is the uniform distribution, then D(p||u) = \log m - H(p)   (worked example below)
  5. D(p||q) is convex in the pair (p,q)
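
As a quick check of property 4 (the numbers are my own illustration), take m = 2 and p = (3/4, 1/4):

D(p||u) = \frac{3}{4}\log_2\frac{3/4}{1/2} + \frac{1}{4}\log_2\frac{1/4}{1/2} \approx 0.439 - 0.250 = 0.189

\log m - H(p) = 1 - 0.811 = 0.189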

 

Chain rules

Entropy : H(X_1, X_2, ..., X_n) = \sum_{i=1}^{n} H(X_i | X_{i-1}, ..., X_1)

Mutual Information : I(X_1, X_2, ..., X_n : Y) = \sum_{i=1}^{n} I(X_i : Y | X_1, ..., X_{i-1})

Relative entropy : D(p(x,y)||q(x,y))=D(p(x)||q(x))+D(p(y|x)||q(y|x))

 

Jensen's inequality

If f is a convex function, then E[f(X)] \geq f(E[X]).
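
For example (an illustration added here), taking the convex function f(x) = x^2 gives

E[X^2] \geq (E[X])^2

i.e. \mathrm{Var}(X) \geq 0.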

 

Log sum inequality

\sum_i a_i \log \frac{a_i}{b_i} \geq \left( \sum_i a_i \right) \log \frac{\sum_i a_i}{\sum_i b_i}

w/ eq. iff a_i / b_i = constant
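
A standard application (a short sketch added here, not spelled out in the original notes): setting a_i = p(x_i) and b_i = q(x_i) gives

\sum_x p(x) \log \frac{p(x)}{q(x)} \geq \left( \sum_x p(x) \right) \log \frac{\sum_x p(x)}{\sum_x q(x)} = 1 \cdot \log 1 = 0

which is exactly D(p||q) \geq 0.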

 

Data-processing inequality

If X \rightarrow Y \rightarrow Z form a Markov chain, then I(X:Y) \geq I(X:Z).

 

Sufficient statistic

T(X) is sufficient for the family f_\theta(x) iff I(\theta : X) = I(\theta : T(X)).

 

Fano's inequality

Let P_e = \Pr\{\hat{X}(Y) \neq X\}. Then

H(P_e) + P_e \log|\mathcal{X}| \geq H(X|Y)
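
Since H(P_e) \leq 1 bit, the inequality can be rearranged into the weaker but convenient bound P_e \geq \frac{H(X|Y) - 1}{\log|\mathcal{X}|}. As a numerical illustration (the numbers are my own choice): with |\mathcal{X}| = 4 and H(X|Y) = 1.5 bits,

P_e \geq \frac{1.5 - 1}{\log_2 4} = 0.25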

 

Inequality

If X and X' are independent and identically distributed, then

\Pr(X = X') \geq 2^{-H(X)}
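
A one-line proof sketch via Jensen's inequality (added here, using the convexity of 2^t):

\Pr(X = X') = \sum_x p(x)^2 = E\left[2^{\log_2 p(X)}\right] \geq 2^{E[\log_2 p(X)]} = 2^{-H(X)}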

 

AEP

"Almost all events are almost equally surprising"

In particular, if X_1, X_2, ... are i.i.d. with distribution p(x), then

-\frac{1}{n} \log p(X_1, X_2, ..., X_n) \rightarrow H(X) in probability
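
A simulation sketch of this convergence for an i.i.d. Bernoulli source (the parameter 0.3, the seed, and the sample sizes are arbitrary choices for illustration):

import math, random

random.seed(0)
p1 = 0.3                                                     # Pr(X = 1)
H = -(p1 * math.log2(p1) + (1 - p1) * math.log2(1 - p1))     # about 0.881 bits

for n in (10, 100, 10000):
    xs = [1 if random.random() < p1 else 0 for _ in range(n)]
    log_p = sum(math.log2(p1) if x == 1 else math.log2(1 - p1) for x in xs)
    print(n, -log_p / n)   # approaches H as n grows

print("H(X) =", H)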

 

Typical set

The typical set A_\epsilon^{(n)} is the set of sequences (x_1, ..., x_n) satisfying

2^{-n(H(X)+\epsilon)} \leq p(x_1, x_2, ..., x_n) \leq 2^{-n(H(X)-\epsilon)}

 

Properties of the typical set

  1. If (x_1, x_2, ..., x_n) \in A_\epsilon^{(n)}, then p(x_1, x_2, ..., x_n) = 2^{-n(H \pm \epsilon)}
  2. For n sufficiently large, \Pr\{A_\epsilon^{(n)}\} > 1 - \epsilon
  3. |A_\epsilon^{(n)}| \leq 2^{n(H(X)+\epsilon)}

 

Definition

As n \rightarrow \infty, a_n \doteq b_n means \frac{1}{n} \log \frac{a_n}{b_n} \rightarrow 0.

 

Entropy rate

There are two definitions:

H(X) = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, ..., X_n)

H'(X) = \lim_{n \to \infty} H(X_n |X_{n-1}, ..., X_1)

 

For a stationary process, H(X) = H'(X).

 

Entropy rate of stationary Markov chain

H(X) = -\sum_{i,j} \mu_i P_{ij} \log P_{ij}, where \mu is the stationary distribution and P_{ij} the transition probabilities.
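
A sketch for a two-state chain (the transition matrix P is an arbitrary example; the stationary distribution is found by simply iterating the chain):

import math

# P[i][j] = Pr(next state = j | current state = i); rows sum to 1.
P = [[0.9, 0.1],
     [0.4, 0.6]]

# Find the stationary distribution mu by repeatedly applying mu <- mu P.
mu = [0.5, 0.5]
for _ in range(1000):
    mu = [sum(mu[i] * P[i][j] for i in range(2)) for j in range(2)]

# Entropy rate H = -sum_{i,j} mu_i P_ij log P_ij, in bits per step.
H_rate = -sum(mu[i] * P[i][j] * math.log2(P[i][j])
              for i in range(2) for j in range(2) if P[i][j] > 0)
print(mu)       # about [0.8, 0.2]
print(H_rate)   # about 0.57 bits per step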

 

Functions of a Markov chain

If X_1, X_2, ..., X_n is a stationary Markov chain and Y_i = \phi(X_i), then

H(Y_n|Y_{n-1},...,Y_1, X_1) \leq H(Y) \leq H(Y_n|Y_{n-1}, ..., Y_1)

 

and

\lim_{n \to \infty} H(Y_n|Y_{n-1}, ..., Y_1, X_1) = H(Y) = \lim_{n \rightarrow \infty} H(Y_n | Y_{n-1}, ..., Y_1)
