# Introduction to Information Theory (Fall 2021)

Program: Bachelor Mathematics, Informatics, AI, Physics & Astronomy (WI, WInf, IN, KI, NS)
Lecturer: Michael Walter
Teaching assistants: Yachen Liu, Misho Yanakiev
Schedule: Nov-Dec 2021 (period 2)
Further information: DataNose, Canvas

The Canvas course page is the primary source for all course material. This page is offered only for your convenience.

## Course description

This course will give an introduction to information theory – the mathematical theory of information. Ever since its inception, information theory has had a profound impact on society. It underpins important technological developments, from reliable memories to mobile phone standards, and its versatile mathematical toolbox has found use in computer science, machine learning, physics, and even pure mathematics.

Starting from probability theory, we will discuss how to mathematically model information sources and communication channels, how to optimally compress information, and how to design error-correcting codes that allow us to reliably communicate over noisy communication channels. We will also see how techniques used in information theory can be applied more generally to make predictions from noisy data.
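As a first taste of the "noisy channels" mentioned above, consider the binary symmetric channel, which flips each transmitted bit with some probability f. Shannon's noisy channel coding theorem (covered in Lectures 9–11) says its capacity is C = 1 − H₂(f), where H₂ is the binary entropy function. A minimal sketch in Python (function names are ours, for illustration only):

```python
from math import log2

def h2(p):
    """Binary entropy function H2(p) = -p*log2(p) - (1-p)*log2(1-p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(f):
    """Capacity of a binary symmetric channel with flip probability f:
    C = 1 - H2(f) bits per channel use."""
    return 1 - h2(f)

# A channel that flips 10% of the bits still carries about 0.53 bits per use.
print(bsc_capacity(0.1))
```

So even a fairly noisy channel has substantial capacity, provided one is willing to use clever error-correcting codes; designing and analyzing such codes is a central theme of the course.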

See also last year’s course homepage for an impression of what we will cover in this course.

We will discuss rigorous definitions and proofs, hence some “mathematical maturity” will be assumed. We will use basic probability theory throughout the course (and will remind you of the most important facts). In addition, part of the homework will require programming in Python.

## Lecture notes and video recordings

Video recordings are available on Canvas.

• Lecture 1 (Nov 1, 2021): Welcome, Introduction to Information Theory
• Lecture 2 (Nov 3, 2021): Probability Theory Refresher
• Lecture 3 (Nov 8, 2021): Numerical Random Variables, Convexity and Concavity, Entropy
• Lecture 4 (Nov 10, 2021): Symbol Codes: Lossless Compression, Huffman Algorithm
• Lecture 5 (Nov 15, 2021): Block Codes: Shannon’s Source Coding Theorem, its Proof, and Variations
• Lecture 6 (Nov 17, 2021): Stream Codes: Lempel-Ziv Algorithm
• Lecture 7 (Nov 22, 2021): Stream Codes: Arithmetic Coding
• Lecture 8 (Nov 24, 2021): Joint Entropies & Communication over Noisy Channels
• Lecture 9 (Nov 29, 2021): Shannon’s Noisy Coding Theorem
• Lecture 10 (Dec 1, 2021): Proof of the Noisy Coding Theorem
• Lecture 11 (Dec 6, 2021): Proof of the Converse, Shannon’s Theory vs Practice
• Lecture 12 (Dec 8, 2021): Reed-Solomon Codes
• Lecture 13 (Dec 13, 2021): Message Passing for Decoding and Inference, Outlook
• Student Presentations (Dec 15, 2021)

Our main reference (which is optional, since handwritten lecture notes and video recordings will be provided) is David J. C. MacKay’s beautiful textbook “Information Theory, Inference, and Learning Algorithms”, Cambridge (2003). It is available online 📖 and contains much more material than what we will be able to discuss in class.

## Practice Problems

We will also offer many practice problems (one set for each exercise class) so that you can test your understanding as the course progresses.

## Learning objectives

At the end of the course, you will be able to:

• mathematically model memoryless and Markov information sources using probability theory
• mathematically model memoryless information channels using probability theory
• compute basic properties of probability distributions, such as entropies and essential bit contents, using mathematical techniques such as Jensen’s inequality
• define typical sets and estimate their size and probability
• prove and apply Shannon’s source coding theorem
• prove and apply the Kraft inequality for uniquely decodable (UD) codes, its converse for prefix codes, and use it to bound the average length for optimal UD or prefix codes
• design optimal symbol codes using Huffman’s coding algorithm
• analyze and apply the Lempel-Ziv algorithm for stream coding
• analyze and apply arithmetic coding for different probabilistic models
• compute basic properties of joint probability distributions, such as conditional entropies and mutual informations
• prove and apply Shannon’s noisy channel coding theorem
• understand basic properties of block codes, such as their rate and various probabilities of error
• encode Reed-Solomon codes and decode them for erasure errors
• understand the maximum-likelihood approach to codeword and bitwise decoding
• design and apply message passing algorithms for encoding and inference problems

We will check off these objectives as we move along in the course.
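To illustrate two of these objectives — computing entropies and designing optimal symbol codes with Huffman's algorithm — here is a small self-contained sketch (our own illustrative implementation, not course-provided code). It computes the Shannon entropy H(X) of a distribution and the expected length L of a binary Huffman code, which the course will show satisfies H ≤ L < H + 1:

```python
import heapq
from math import log2

def entropy(p):
    """Shannon entropy H(X) = -sum_i p_i * log2(p_i), in bits."""
    return -sum(q * log2(q) for q in p if q > 0)

def huffman_lengths(p):
    """Codeword lengths of an optimal binary Huffman code for distribution p."""
    # Heap entries: (probability, tiebreaker, symbol indices in this subtree).
    heap = [(q, i, [i]) for i, q in enumerate(p)]
    heapq.heapify(heap)
    lengths = [0] * len(p)
    counter = len(p)
    while len(heap) > 1:
        # Repeatedly merge the two least probable subtrees;
        # every symbol inside them moves one level deeper in the code tree.
        qa, _, a = heapq.heappop(heap)
        qb, _, b = heapq.heappop(heap)
        for i in a + b:
            lengths[i] += 1
        heapq.heappush(heap, (qa + qb, counter, a + b))
        counter += 1
    return lengths

p = [0.5, 0.25, 0.125, 0.125]        # a dyadic distribution
H = entropy(p)                        # 1.75 bits
L = sum(q * l for q, l in zip(p, huffman_lengths(p)))
print(H, L)                           # here H = L exactly, since p is dyadic
```

For dyadic distributions (all probabilities powers of 1/2) the Huffman code meets the entropy bound exactly; in general it is optimal among symbol codes but may lose up to one bit per symbol, which motivates the block and stream codes of Lectures 5–7.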

## Format

Lectures, exercise classes, self-study, and a short presentation.

## Assessment

The final grade will be determined by the following calculation: