Fundamentals of ML and MAP Decoding
1. Introduction
In digital communication:
- A transmitter sends a symbol s ∈ 𝒮 (from a finite set of possible symbols).
- The channel adds noise, so the receiver observes y.
- The goal of the receiver is to decode y to the most likely transmitted symbol s.
This is where ML and MAP decoding come in.
2. Maximum Likelihood (ML) Decoding
Idea: Choose the symbol s that maximizes the likelihood of receiving y, assuming all symbols are equally likely.
Mathematically:
ŝ_ML = argmax_{s ∈ 𝒮} P(y | s)
Where P(y|s) = probability of observing y given that s was transmitted (likelihood function).
Intuition: Pick the symbol that makes the received signal y “most probable” based on the channel.
Notes:
- ML decoding does not use prior probabilities of symbols; it implicitly treats all symbols as equally likely.
- In AWGN channels, ML decoding reduces to minimum Euclidean distance decoding: the Gaussian likelihood P(y|s) is maximized by the constellation point nearest to y.
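As a concrete sketch of the AWGN case, the minimum-distance rule takes only a few lines of Python. The 4-PAM constellation below is illustrative (not from the text); any finite symbol set 𝒮 works the same way:

```python
import numpy as np

# Illustrative 4-PAM constellation; stands in for the finite symbol set S.
CONSTELLATION = np.array([-3.0, -1.0, 1.0, 3.0])

def ml_decode(y, symbols=CONSTELLATION):
    """ML decoding over AWGN: pick the symbol closest to y in
    Euclidean distance, which maximizes the Gaussian likelihood P(y|s)."""
    return symbols[np.argmin((symbols - y) ** 2)]

# A noisy observation near +1 decodes to +1:
print(ml_decode(0.7))   # -> 1.0
```

Note that no priors appear anywhere: the decision depends only on the distance between y and each candidate symbol.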
3. Maximum a Posteriori (MAP) Decoding
Idea: Choose the symbol s that maximizes the posterior probability given the observation y.
Mathematically:
ŝ_MAP = argmax_{s ∈ 𝒮} P(s | y)
By Bayes’ theorem:
P(s | y) = (P(y | s) P(s)) / P(y)
Since P(y) is constant for all symbols:
ŝ_MAP = argmax_{s ∈ 𝒮} P(y | s) P(s)
Where:
- P(s) = prior probability of symbol s
- P(y|s) = likelihood
Intuition: MAP combines channel observation and prior knowledge of symbol probabilities.
Notes:
- If all symbols are equally likely: P(s) = const → MAP = ML.
- MAP is Bayesian optimal, minimizing the probability of error when priors are known.
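A minimal sketch of the MAP rule ŝ = argmax_s P(y|s) P(s), assuming an AWGN (Gaussian) likelihood and BPSK symbols — both illustrative choices, as are the prior values:

```python
import math

def gaussian_likelihood(y, s, sigma=1.0):
    # P(y|s) under an assumed AWGN model: Gaussian centered at s.
    # The normalizing constant is omitted; it cancels in the argmax.
    return math.exp(-(y - s) ** 2 / (2 * sigma ** 2))

def map_decode(y, symbols, priors, sigma=1.0):
    # argmax_s P(y|s) P(s); P(y) is dropped since it is the same for all s.
    return max(symbols, key=lambda s: gaussian_likelihood(y, s, sigma) * priors[s])

symbols = [-1.0, 1.0]             # BPSK (illustrative)
priors = {-1.0: 0.9, 1.0: 0.1}    # strongly skewed prior
# y = 0.3 lies closer to +1, so ML would output +1,
# but the strong prior on -1 flips the MAP decision:
print(map_decode(0.3, symbols, priors))  # -> -1.0
```

With uniform priors ({-1.0: 0.5, 1.0: 0.5}) the same function returns +1.0, illustrating that MAP collapses to ML when the prior is flat.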
4. Comparison Table
| Feature | ML Decoding | MAP Decoding |
|---|---|---|
| Goal | Maximize likelihood P(y\|s) | Maximize posterior P(s\|y) |
| Uses prior | No | Yes, P(s) |
| Optimality | Optimal if symbols equally likely | Optimal in Bayesian sense |
| Simplification | Often Euclidean distance minimization | Likelihood × Prior weighting |
5. Intuition
- ML: “Which symbol would most likely produce what I received?”
- MAP: “Considering what I know about symbol probabilities, which symbol is most probable given the received signal?”
Think of ML as purely observation-driven and MAP as observation + prior knowledge-driven.
6. Example Setup
Suppose we have a binary communication system:
- Transmitted symbols: 𝒮 = {0, 1}
- Channel: Binary Symmetric Channel (BSC) with crossover probability p = 0.1
- Observed symbol at receiver: y ∈ {0, 1}
- Goal: Decide which symbol was transmitted.
6.1. Maximum Likelihood (ML) Decoding
- Assume all symbols are equally likely: P(0) = P(1) = 0.5
Likelihoods:
P(y=0 | s=0) = 0.9,  P(y=0 | s=1) = 0.1
P(y=1 | s=1) = 0.9,  P(y=1 | s=0) = 0.1
ML rule: choose s that maximizes P(y|s)
Case 1: Receiver sees y = 0
P(y=0|s=0) = 0.9 > P(y=0|s=1) = 0.1  ⇒  ŝ_ML = 0
Case 2: Receiver sees y = 1
P(y=1|s=1) = 0.9 > P(y=1|s=0) = 0.1  ⇒  ŝ_ML = 1
ML just picks the symbol most likely to produce the received bit, assuming equal probability of 0 and 1.
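Both cases can be checked with a short Python sketch (function names are illustrative):

```python
def bsc_likelihood(y, s, p=0.1):
    # P(y|s) for a binary symmetric channel with crossover probability p:
    # the bit arrives intact with probability 1-p, flipped with probability p.
    return 1 - p if y == s else p

def ml_decode(y, p=0.1):
    # ML rule: pick the s that maximizes P(y|s).
    return max([0, 1], key=lambda s: bsc_likelihood(y, s, p))

print(ml_decode(0), ml_decode(1))  # -> 0 1
```

For p < 0.5 the ML decision on a BSC is simply "trust the received bit", as the output above shows.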
6.2. Maximum a Posteriori (MAP) Decoding
- Now, suppose priors are unequal: P(0) = 0.8, P(1) = 0.2
Posterior:
P(s|y) ∝ P(y|s) * P(s)
Case 1: Receiver sees y = 0
P(s=0|y=0) ∝ 0.9 × 0.8 = 0.72,  P(s=1|y=0) ∝ 0.1 × 0.2 = 0.02  ⇒  ŝ_MAP = 0
Case 2: Receiver sees y = 1
P(s=0|y=1) ∝ 0.1 × 0.8 = 0.08,  P(s=1|y=1) ∝ 0.9 × 0.2 = 0.18  ⇒  ŝ_MAP = 1
Notice how MAP incorporates the priors. If the prior were more extreme (e.g., P(0) = 0.99), MAP would decode y = 1 as 0, since 0.1 × 0.99 = 0.099 > 0.9 × 0.01 = 0.009, while ML would still pick 1.
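The same numbers, including the more extreme prior, can be reproduced in a short sketch (names are illustrative):

```python
def map_decode(y, priors, p=0.1):
    # MAP rule on the BSC: argmax_s P(y|s) P(s),
    # where P(y|s) is 1-p if y == s and p otherwise.
    return max([0, 1], key=lambda s: ((1 - p) if y == s else p) * priors[s])

print(map_decode(0, {0: 0.8, 1: 0.2}))    # -> 0  (0.72 vs 0.02)
print(map_decode(1, {0: 0.8, 1: 0.2}))    # -> 1  (0.08 vs 0.18)
print(map_decode(1, {0: 0.99, 1: 0.01}))  # -> 0  (0.099 vs 0.009): prior wins
```

The last line shows the prior overriding the channel observation, exactly the behavior ML can never exhibit.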
Summary
- ML ignores priors, MAP uses them.
- ML = MAP when all symbols are equally likely.
- In practical communication, MAP can reduce probability of error when symbol probabilities are unequal.