Copyright © 2001 by Neal R. Wagner. All rights reserved.
NOTE: This site is obsolete. See book draft (in PDF):
This section starts with work on an answer to the exercise at the end of the section on Coding and Information Theory. The exercise asks for a simulation of Shannon's random codes, so the simulation starts with a random code table. This is not the same as Shannon's proof of his theorem, which uses the average of all possible code tables. In practice however, a single specific random table shows the behavior expected by Shannon's theorem. (I don't know if it's possible to find an analytic solution to this simulation.)
The simulation is a challenge in several ways, since to get a reasonable code, a very large blocksize is required by Shannon's theorem. Large blocksizes require exponentially larger code tables, and these quickly exhaust the memory of any actual system.
Please keep in mind that all this discussion assumes a binary symmetric channel with probability p = 0.75, that is, on the average, one out of every four bits transmitted is reversed -- an extremely high error rate. The corresponding channel capacity is 0.18872.
Here are some sample sizes for the random codeword table, where the block sizes are just arbitrarily set to multiples of 5, that is 5, 10, 15, 20, 25, ..., and where the codeword lengths are taken to be at the channel capacity (which would be the minimum possible value for a good code).
| Blocksize (bits) |
Number of Entries |
Minimum Codeword Length (at capacity, bits) |
Minimum Code Table Size (in K bytes) |
| 5 | 32 | 27 | 0.1 K |
| 10 | 1 024 | 53 | 6 K |
| 15 | 32 768 | 80 | 327 K |
| 20 | 1 048 576 | 106 | 13 894 K |
| 25 | 33 554 432 | 132 | 553 648 K |
| 30 | 1 073 741 842 | 159 | 21 206 401 K |
In practice, simulation runs need longer codeword lengths, perhaps up to twice the channel capacity values. I was able to run the simulation for blocksize = 20, using up to 30 Megabytes of main memory on an iMac. Hypothetical runs for blocksizes of 25 or 30 would take up to 1 Gigabyte or 40 Gigabytes of memory. (The memory requirements grow exponentially as the blocksize grows: adding 5 to the blocksize makes the memory requirements about 40 times as great.)
Initial results of simulation runs were discouraging and equivocal.
It seemed that I would need a far larger blocksize than my computer
could run in order to get reasonable results. For example,
the results of runs answering the original questions in the
exercise were:
For a blocksize of 15, runs with code word lengths of 64, 80, and 104 produce respective error rates of 56%, 36%, and 15%.Thus at substantially over the channel capacity for codeword lengths (104 bits or 131% of capacity), the error rate is still 15%, indicating that the blocksize is not remotely close to a value needed for reasonable accuracy. More discouraging, these runs gave no hint of a magic channel capacity value behind the scenes.
I continued with simulation runs for block sizes of 5, 10, 15, and 20 and for a variety of codeword lengths (all divisible by 8 for convenience of the simulation program). Eventually I plotted all the data in a graph, and got an astonishing picture:
Graph displaying the simulation results: Postscript, or PDF. The Postscript file that creates the graph is here.
The above remarkable graph shows how far away the simulation is from the good results predicted by Shannon's theorem. At the same time, the table gives a clear idea of what the results would be if one could keep increasing the blocksize: the graph would be increasingly vertical as it crosses the vertical dashed line marking the channel capacity.
This specific graph shows the actual simulation results, with points connected by straight lines. (No curve-fitting was carried out.) The black line (for blocksize = 5), is accurate since each point represents 10 000 000 trials, but there are only 14 points, so this graph has an angular look. The graphs for blocksizes of 10 and 15 are smoother (because more points are plotted), but less accurate (smaller number of trials per point). Finally, the red graph (blocksize = 20) is somewhat irregular because only 10 000 trials per point did not produce the accuracy of the other graphs.
It is important to realize that the graph only illustrates what Shannon's theorem proves. The graph and these simulation results prove nothing, but just give an indication of what might be true.
The graph also illustrates another theorem known as the Converse of Shannon's Noisy Coding Theorem. Roughly speaking, the converse says that if one is signaling at a fixed rate more than channel capacity (to the left of the vertical dashed line in the picture), and if the block size gets arbitrarily large, then the error rate will get arbitrarily close to 100%. Stated another way, at more than channel capacity, as the block size gets larger and larger, the error rate must get closer and closer to 100%. Contrast this with Shannon's theorem, which says that if one signals at less than channel capacity, and if the block size get arbitrarily large, then the error rate will get arbitrarily close to 0. In the limit as the block size tends to infinity, the graph will look like a step function: at 100 to the left of the dashed vertical line and at 0 to the right.
The proofs of both of Shannon's theorems are found in books on information theory. In the terms of this course they are very difficult and technical.
The Java simulation program in three files: simulation program.