Information Theory

Intermediate

Information theory is a mathematical framework for quantifying, storing, and communicating information, originally developed by Claude Shannon in his landmark 1948 paper 'A Mathematical Theory of Communication.' At its core, the theory provides precise definitions for intuitive concepts like information content, uncertainty, and redundancy using the language of probability and logarithms. Shannon demonstrated that information can be measured in bits, established fundamental limits on data compression (the source coding theorem), and proved that reliable communication is possible over noisy channels up to a calculable maximum rate (the channel capacity theorem). These results laid the theoretical groundwork for the entire digital age.
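The channel capacity result has a closed form for the simplest noisy channel, the binary symmetric channel, where each bit is flipped with probability p: the capacity is C = 1 − H₂(p) bits per channel use, with H₂ the binary entropy function. A minimal sketch (function names are illustrative, not from any library):

```python
# Capacity of a binary symmetric channel (BSC) with crossover probability p:
# C = 1 - H2(p), where H2 is the binary entropy function.
import math

def binary_entropy(p: float) -> float:
    """H2(p) in bits; H2(0) = H2(1) = 0 by the convention 0*log(0) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p: float) -> float:
    """Capacity in bits per channel use of a BSC with crossover probability p."""
    return 1.0 - binary_entropy(p)

print(bsc_capacity(0.0))  # noiseless channel: 1.0 bit per use
print(bsc_capacity(0.5))  # output independent of input: 0.0 bits per use
```

Shannon's theorem guarantees that any rate below this capacity is achievable with arbitrarily small error probability, and no rate above it is.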

The central quantity in information theory is entropy, which measures the average uncertainty or surprise associated with a random variable's outcomes. High entropy indicates greater unpredictability and thus more information content per message, while low entropy signals redundancy and predictability. Related measures such as mutual information, relative entropy (Kullback-Leibler divergence), and conditional entropy allow researchers to quantify how much one variable reveals about another, how different two probability distributions are, and how much uncertainty remains after partial observation. These tools have proven indispensable not only in communications engineering but also in statistics, machine learning, neuroscience, and physics.
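These definitions are short enough to compute directly for discrete distributions. A minimal sketch in Python (helper names are illustrative):

```python
# Shannon entropy and Kullback-Leibler divergence for discrete
# distributions, measured in bits (base-2 logarithms).
import math

def entropy(p):
    """H(p) = -sum p_i log2 p_i, skipping zero-probability outcomes."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """D(p || q) = sum p_i log2(p_i / q_i); assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

fair = [0.5, 0.5]
biased = [0.9, 0.1]
print(entropy(fair))                 # 1.0 bit: maximally unpredictable
print(entropy(biased))               # ~0.469 bits: mostly predictable
print(kl_divergence(biased, fair))   # extra bits paid for wrongly assuming fairness
```

The biased coin carries less than half a bit of uncertainty per flip, and the KL divergence quantifies the coding penalty for modeling it with the wrong (fair) distribution.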

Beyond its origins in electrical engineering, information theory has become a unifying language across the sciences. In machine learning, cross-entropy and KL divergence are standard loss functions for training classification models and variational autoencoders. In biology, information-theoretic measures help quantify genetic diversity and neural coding efficiency. In physics, the Bekenstein-Hawking entropy connects information to black hole thermodynamics, while quantum information theory extends Shannon's framework to qubits and entanglement. Whether one is designing a 5G cellular network, compressing video, training a large language model, or studying the thermodynamics of computation, information theory provides the essential quantitative toolkit.
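The cross-entropy loss mentioned above is itself an information-theoretic quantity: H(p, q) = H(p) + D(p ∥ q), so minimizing it drives the model's predicted distribution q toward the true label distribution p. A small sketch under that definition (the distributions here are made-up examples):

```python
# Cross-entropy H(p, q) = -sum p_i log2 q_i, the standard classification loss.
# It is minimized exactly when the predicted distribution q matches p.
import math

def cross_entropy(p, q):
    """H(p, q) in bits; assumes q_i > 0 wherever p_i > 0."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

true_label = [1.0, 0.0, 0.0]           # one-hot: the correct class is class 0
confident  = [0.9, 0.05, 0.05]         # a well-calibrated prediction
uncertain  = [0.4, 0.3, 0.3]           # a hedging prediction
print(cross_entropy(true_label, confident))  # lower loss
print(cross_entropy(true_label, uncertain))  # higher loss
```

For a one-hot target, cross-entropy reduces to −log₂ of the probability assigned to the true class, which is why confident correct predictions are rewarded.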

Curriculum alignment

Grade level

College+

Learning objectives

  • Analyze Shannon entropy, mutual information, and channel capacity to quantify information content and transmission limits
  • Apply source coding theorems including Huffman coding and arithmetic coding to achieve optimal data compression rates
  • Evaluate error-correcting codes including Hamming, Reed-Solomon, and turbo codes for reliable communication over noisy channels
  • Distinguish between lossless and lossy compression techniques and their information-theoretic bounds for practical applications

Recommended Resources

Books

Elements of Information Theory

by Thomas M. Cover & Joy A. Thomas

Information Theory, Inference, and Learning Algorithms

by David J.C. MacKay

The Mathematical Theory of Communication

by Claude E. Shannon & Warren Weaver

Information Theory and Reliable Communication

by Robert G. Gallager

A Student's Guide to Entropy

by Don S. Lemons

Courses

Information Theory

MIT OpenCourseWare

Information Theory, Pattern Recognition, and Neural Networks

University of Cambridge (David MacKay lectures)

Information Theory

Khan Academy / Stanford (via Coursera)