In information theory, Shannon–Fano–Elias coding is a precursor to arithmetic coding, in which probabilities are used to determine codewords.[1] It is named for Claude Shannon, Robert Fano, and Peter Elias.
Algorithm description
Given a discrete random variable $X$ of ordered values to be encoded, let $p(x)$ be the probability for any $x$ in $X$. Define a function

$\bar F(x) = \sum_{x_i < x} p(x_i) + \tfrac{1}{2} p(x)$
Algorithm:
- For each x in X:
  - Let Z be the binary expansion of $\bar F(x)$.
  - Choose the length of the encoding of x, $L(x)$, to be the integer $\left\lceil \log_2 \tfrac{1}{p(x)} \right\rceil + 1$.
  - Choose the encoding of x, $\operatorname{code}(x)$, to be the first $L(x)$ most significant bits after the decimal point of Z.
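A minimal Python sketch of this procedure (the function name sfe_code and the dict-based probability representation are illustrative, not part of the original description):

```python
from fractions import Fraction
from math import ceil, log2

def sfe_code(probs):
    """Shannon-Fano-Elias codewords for an ordered {symbol: probability} mapping (illustrative sketch)."""
    codes = {}
    before = Fraction(0)                # sum of p(x_i) for symbols preceding x
    for symbol, p in probs.items():
        p = Fraction(p)
        f_bar = before + p / 2          # modified CDF value F_bar(x)
        length = ceil(log2(1 / p)) + 1  # codeword length L(x)
        bits, frac = [], f_bar
        for _ in range(length):         # first L(x) bits after the binary point of F_bar(x)
            frac *= 2
            bits.append("1" if frac >= 1 else "0")
            frac -= int(frac)
        codes[symbol] = "".join(bits)
        before += p
    return codes
```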
Example
Let X = {A, B, C, D}, with probabilities p = {1/3, 1/4, 1/6, 1/4}.
- For A
  - $\bar F(A) = \tfrac{1}{2} p(A) = \tfrac{1}{2} \cdot \tfrac{1}{3} = 0.1666\ldots$
  - In binary, Z(A) = 0.0010101010...
  - $L(A) = \left\lceil \log_2 \tfrac{1}{1/3} \right\rceil + 1 = 3$
  - code(A) is 001
- For B
  - $\bar F(B) = p(A) + \tfrac{1}{2} p(B) = \tfrac{1}{3} + \tfrac{1}{8} = 0.4583\ldots$
  - In binary, Z(B) = 0.01110101010101...
  - $L(B) = \left\lceil \log_2 \tfrac{1}{1/4} \right\rceil + 1 = 3$
  - code(B) is 011
- For C
  - $\bar F(C) = p(A) + p(B) + \tfrac{1}{2} p(C) = \tfrac{1}{3} + \tfrac{1}{4} + \tfrac{1}{12} = 0.6666\ldots$
  - In binary, Z(C) = 0.101010101010...
  - $L(C) = \left\lceil \log_2 \tfrac{1}{1/6} \right\rceil + 1 = 4$
  - code(C) is 1010
- For D
  - $\bar F(D) = p(A) + p(B) + p(C) + \tfrac{1}{2} p(D) = \tfrac{1}{3} + \tfrac{1}{4} + \tfrac{1}{6} + \tfrac{1}{8} = 0.875$
  - In binary, Z(D) = 0.111
  - $L(D) = \left\lceil \log_2 \tfrac{1}{1/4} \right\rceil + 1 = 3$
  - code(D) is 111
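Running the sketch above on this distribution reproduces these codewords (sfe_code is the illustrative function defined earlier, not a library routine):

```python
from fractions import Fraction

probs = {"A": Fraction(1, 3), "B": Fraction(1, 4), "C": Fraction(1, 6), "D": Fraction(1, 4)}
print(sfe_code(probs))  # {'A': '001', 'B': '011', 'C': '1010', 'D': '111'}
```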
Prefix code
Shannon–Fano–Elias coding produces a binary prefix code, allowing for direct decoding.
Let bcode(x) be the rational number formed by adding a decimal point before a binary code. For example, if code(C) = 1010, then bcode(C) = 0.1010. For all x, if no y exists such that

$\operatorname{bcode}(x) \le \operatorname{bcode}(y) < \operatorname{bcode}(x) + 2^{-L(x)}$

then all the codes form a prefix code.
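This condition can be checked directly. A small sketch in the same illustrative Python style (the helper names bcode and is_prefix_code are hypothetical):

```python
from fractions import Fraction

def bcode(codeword):
    """Value of a codeword read as a binary fraction, e.g. '1010' -> 0.1010 (binary) = 5/8."""
    return Fraction(int(codeword, 2), 2 ** len(codeword))

def is_prefix_code(codes):
    """True if no bcode(y) lies in [bcode(x), bcode(x) + 2**-L(x)) for any other codeword y."""
    for x, cx in codes.items():
        lo, hi = bcode(cx), bcode(cx) + Fraction(1, 2 ** len(cx))
        if any(y != x and lo <= bcode(cy) < hi for y, cy in codes.items()):
            return False
    return True

print(is_prefix_code({"A": "001", "B": "011", "C": "1010", "D": "111"}))  # True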
By comparing $\bar F$ to the CDF of X, this property may be demonstrated graphically for Shannon–Fano–Elias coding.
By definition of L it follows that

$2^{-L(y)} \le \tfrac{1}{2} p(y)$

And because the bits of $\bar F(y)$ after $L(y)$ are truncated to form code(y), it follows that

$\bar F(y) - \operatorname{bcode}(y) \le 2^{-L(y)}$

so, for any x preceding y, $\operatorname{bcode}(y) \ge \bar F(y) - \tfrac{1}{2} p(y) \ge \operatorname{CDF}(x)$; thus bcode(y) must be no less than CDF(x).

Since $\operatorname{bcode}(x) + 2^{-L(x)} \le \bar F(x) + \tfrac{1}{2} p(x) = \operatorname{CDF}(x) \le \operatorname{bcode}(y)$ for every y that follows x (and, symmetrically, $\operatorname{bcode}(y) < \operatorname{bcode}(x)$ for every y that precedes x), no bcode(y) falls in the interval $[\operatorname{bcode}(x), \operatorname{bcode}(x) + 2^{-L(x)})$; therefore the prefix property holds.
Code length
The average code length is

$LC(X) = \sum_{x \in X} p(x) L(x) = \sum_{x \in X} p(x) \left( \left\lceil \log_2 \tfrac{1}{p(x)} \right\rceil + 1 \right)$

Thus for $H(X)$, the entropy of the random variable $X$,

$H(X) + 1 \le LC(X) < H(X) + 2$
Shannon–Fano–Elias coding therefore uses 1 to 2 extra bits per symbol from X beyond the entropy, so the code is not used in practice.
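For the four-symbol example above, this bound can be checked numerically (a sketch in the same illustrative style):

```python
from math import ceil, log2

p = [1/3, 1/4, 1/6, 1/4]
lengths = [ceil(log2(1/pi)) + 1 for pi in p]          # L(x) = ceil(log2(1/p(x))) + 1
avg_len = sum(pi * li for pi, li in zip(p, lengths))  # average code length LC(X)
entropy = -sum(pi * log2(pi) for pi in p)             # entropy H(X)
print(avg_len, entropy)  # about 3.17 vs. about 1.96, so H(X) + 1 <= LC(X) < H(X) + 2
```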