Linear Predictive Coding (LPC) in Speech Signal Processing
What is LPC?
Linear Predictive Coding (LPC) is a method that represents a speech signal using a small number of parameters. It models the speech signal as the output of a linear filter excited by a source (a periodic pulse train for voiced sounds, or noise for unvoiced sounds).
LPC is widely used in speech coding (compression), speech synthesis, and speech recognition.
1. Core Idea of LPC
LPC assumes that the current speech sample can be approximated by a linear combination of past samples:
x[n] ≈ a₁ x[n−1] + a₂ x[n−2] + ... + aₚ x[n−p]
The coefficients a₁, a₂, ..., aₚ are chosen to minimize the prediction error.
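As a small worked illustration of the idea above (not taken from the original text), a pure sinusoid is predicted exactly by an order-2 predictor with a₁ = 2cos(ω) and a₂ = −1. The 200 Hz tone and 8 kHz sampling rate below are assumptions chosen only for the demonstration, and `predict` is a hypothetical helper name.

```python
import numpy as np

def predict(x, a):
    """Return x_hat[n] = sum_k a[k-1] * x[n-k] for n >= p, where p = len(a)."""
    p = len(a)
    x_hat = np.zeros(len(x))
    for n in range(p, len(x)):
        past = x[n - p:n][::-1]          # x[n-1], x[n-2], ..., x[n-p]
        x_hat[n] = np.dot(a, past)
    return x_hat

# Assumed test signal: 200 Hz cosine sampled at 8 kHz
omega = 2 * np.pi * 200 / 8000
x = np.cos(omega * np.arange(160))

# cos(wn) = 2cos(w) cos(w(n-1)) - cos(w(n-2)), so two coefficients suffice
a = np.array([2 * np.cos(omega), -1.0])
x_hat = predict(x, a)
max_error = np.max(np.abs(x[2:] - x_hat[2:]))   # numerically negligible
```

Real speech is not a single sinusoid, so the prediction is only approximate there and more coefficients are needed.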
2. Why This Works for Speech
The human vocal tract behaves like an all-pole acoustic filter. Thus speech can be approximated by the model:
x[n] = Σₖ aₖ x[n−k] + G e[n],  k = 1, ..., p
Where:
- aₖ = LPC coefficients (vocal tract shape)
- e[n] = excitation (voiced/unvoiced source)
- G = gain
So LPC effectively models the vocal tract filter.
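The source-filter model above can be sketched as a direct recursion. This is a minimal illustration, assuming the sign convention in which x[n] is the sum of aₖ x[n−k] plus G e[n]; the single 500 Hz resonance, the 100 Hz pitch, and the function name `synthesize` are hypothetical choices, not values from the text.

```python
import numpy as np

def synthesize(a, excitation, gain=1.0):
    """All-pole synthesis: x[n] = sum_k a[k-1] * x[n-k] + gain * e[n]."""
    p = len(a)
    x = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = gain * excitation[n]
        for k in range(1, p + 1):
            if n - k >= 0:
                acc += a[k - 1] * x[n - k]
        x[n] = acc
    return x

fs = 8000                                  # assumed sampling rate (Hz)
n_samples = 400

# Voiced source: impulse train at an assumed 100 Hz pitch
voiced = np.zeros(n_samples)
voiced[::fs // 100] = 1.0

# Unvoiced source: white noise
unvoiced = np.random.default_rng(0).standard_normal(n_samples)

# Hypothetical stable filter with one resonance near 500 Hz (pole radius 0.97)
r, f = 0.97, 500.0
a = np.array([2 * r * np.cos(2 * np.pi * f / fs), -r * r])

voiced_speech = synthesize(a, voiced, gain=1.0)
unvoiced_speech = synthesize(a, unvoiced, gain=0.1)
```

The same filter coefficients shape both sources; only the excitation differs between the voiced and unvoiced outputs.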
3. Does LPC Predict Future Samples?
Yes. LPC predicts the current speech sample from the p previous samples:
x̂[n] = Σₖ aₖ x[n−k],  k = 1, ..., p
However, prediction is not the end goal; the prediction process is only used to estimate the vocal tract filter.
Prediction Error
e[n] = x[n] − x̂[n]
Error Minimization
The LPC coefficients minimize the total squared error:
E = Σₙ e[n]²
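The error minimization can be checked numerically. The sketch below is an illustration under stated assumptions: it fits the coefficients by ordinary least squares with `numpy.linalg.lstsq` (a covariance-style formulation, used here only for brevity) on a synthetic signal generated with known coefficients 1.3 and −0.6. The helper names are hypothetical.

```python
import numpy as np

def past_matrix(x, p):
    """Column k-1 holds x[n-k] for n = p .. N-1."""
    return np.column_stack([x[p - k:len(x) - k] for k in range(1, p + 1)])

def squared_error(x, a):
    """E = sum over n >= p of (x[n] - sum_k a[k-1] * x[n-k])^2."""
    p = len(a)
    e = x[p:] - past_matrix(x, p) @ a
    return float(np.sum(e ** 2))

def least_squares_lpc(x, p):
    """Coefficients that minimize E, found by ordinary least squares."""
    a, *_ = np.linalg.lstsq(past_matrix(x, p), x[p:], rcond=None)
    return a

# Synthetic signal with known coefficients (an assumption for the demo)
rng = np.random.default_rng(0)
true_a = np.array([1.3, -0.6])
noise = rng.standard_normal(2000)
x = np.zeros(2000)
for n in range(2, len(x)):
    x[n] = true_a[0] * x[n - 1] + true_a[1] * x[n - 2] + noise[n]

a_hat = least_squares_lpc(x, 2)
E_best = squared_error(x, a_hat)
E_perturbed = squared_error(x, a_hat + np.array([0.05, 0.0]))  # any other choice gives a larger E
```

The estimated coefficients land close to the ones used to generate the signal, and nudging them away from the least-squares solution increases E.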
4. How LPC Coefficients Are Computed
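LPC coefficients are typically computed using:
- Autocorrelation method (builds a set of Toeplitz normal equations from the autocorrelation of the windowed frame)
- Levinson–Durbin recursion (solves those equations efficiently, in O(p²) operations)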
These coefficients describe the vocal tract filter:
H(z) = G / (1 − Σₖ aₖ z⁻ᵏ),  k = 1, ..., p
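The poles of this filter correspond to the formants: the angle of each complex-conjugate pole pair gives a resonance frequency, and its distance from the unit circle sets the bandwidth.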
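A minimal sketch of the autocorrelation method with the Levinson–Durbin recursion follows. It assumes the convention x̂[n] = Σ aₖ x[n−k] used in this document, a Hamming window, and a synthetic resonant frame instead of real speech; the function names are illustrative rather than taken from any particular library.

```python
import numpy as np

def autocorrelation(x, p):
    """r[k] = sum_n x[n] * x[n + k] for k = 0..p."""
    return np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(p + 1)])

def levinson_durbin(r, p):
    """Solve sum_j a[j] * r[|i - j|] = r[i], i = 1..p, in O(p^2).

    Returns (a, err): a[k-1] multiplies x[n-k]; err is the final
    prediction error energy.
    """
    a = np.zeros(p)
    err = r[0]
    for i in range(1, p + 1):
        # Reflection coefficient for order i
        acc = r[i] - np.dot(a[:i - 1], r[i - 1:0:-1])
        k = acc / err
        # Update lower-order coefficients, then append the new one
        a_prev = a[:i - 1].copy()
        a[:i - 1] = a_prev - k * a_prev[::-1]
        a[i - 1] = k
        err *= (1.0 - k * k)
    return a, err

# Example frame: a synthetic resonant signal (an assumption, not real speech)
rng = np.random.default_rng(0)
x = np.zeros(240)
for n in range(2, len(x)):
    x[n] = 1.3 * x[n - 1] - 0.6 * x[n - 2] + rng.standard_normal()

frame = x * np.hamming(len(x))
p = 10
r = autocorrelation(frame, p)
a, err = levinson_durbin(r, p)
```

The recursion builds the order-p solution from the order-1 solution upward, which is why it avoids a general matrix inversion.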
5. What LPC Represents
- Resonance structure of the vocal tract
- Formant frequencies
- Spectral envelope of speech
So LPC is primarily a vocal tract model, not a prediction tool.
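To make the link between coefficients, formants, and spectral envelope concrete, the sketch below reads resonance frequencies off the pole angles and evaluates the envelope G / |A(e^{jω})| on an FFT grid. The two resonances (500 Hz and 1500 Hz, pole radius 0.97) and the 8 kHz sampling rate are assumptions for the example, and the function names are hypothetical.

```python
import numpy as np

def formant_candidates(a, fs):
    """Resonance frequencies and bandwidths (Hz) from the poles of 1 / A(z)."""
    # A(z) = 1 - sum_k a[k-1] z^-k, so the poles are roots of [1, -a1, ..., -ap]
    poles = np.roots(np.concatenate(([1.0], -np.asarray(a))))
    poles = poles[np.imag(poles) > 0]            # keep one pole per conjugate pair
    freqs = np.angle(poles) * fs / (2 * np.pi)
    bandwidths = -np.log(np.abs(poles)) * fs / np.pi
    order = np.argsort(freqs)
    return freqs[order], bandwidths[order]

def spectral_envelope(a, gain, fs, n_fft=512):
    """|H(e^{jw})| = gain / |A(e^{jw})| on an FFT frequency grid."""
    A = np.fft.rfft(np.concatenate(([1.0], -np.asarray(a))), n_fft)
    return np.fft.rfftfreq(n_fft, d=1.0 / fs), gain / np.abs(A)

fs = 8000
# Build a hypothetical 4th-order filter from two chosen resonances
upper = [0.97 * np.exp(1j * 2 * np.pi * f / fs) for f in (500.0, 1500.0)]
poly = np.real(np.poly(upper + [np.conj(q) for q in upper]))   # [1, c1, ..., c4]
a = -poly[1:]

freqs, bandwidths = formant_candidates(a, fs)
env_freqs, envelope = spectral_envelope(a, 1.0, fs)
```

On real speech, not every pole pair is a formant; poles far from the unit circle mostly shape the overall spectral tilt.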
6. Applications of LPC
- Speech coding and compression (CELP, GSM, VoIP)
- Text-to-speech synthesis
- Speech recognition
- Pitch and formant analysis
LPC-Based Speech Compression
Linear Predictive Coding (LPC) compresses speech by modeling it as a linear filter excited by a source:
x[n] = Σₖ aₖ x[n−k] + G e[n],  k = 1, ..., p
- aₖ: LPC coefficients (vocal tract filter)
- e[n]: excitation (glottal pulses or noise)
- G: gain
Instead of sending every speech sample, LPC transmits only the model parameters, drastically reducing the bit rate.
How LPC Compression Works Step by Step
- Frame the speech signal: divide into short frames (10–30 ms), assuming stationarity.
- Compute LPC coefficients: use the autocorrelation method with the Levinson–Durbin recursion; typical order 10–20.
- Determine the excitation: voiced frames are modeled by a periodic pulse train at the pitch period; unvoiced frames by noise.
- Quantize coefficients and excitation: convert LPC coefficients, gain, pitch to bits for transmission.
- Transmit parameters: only coefficients, gain, pitch, and voiced/unvoiced flag are sent.
- Synthesize speech at decoder: use LPC filter with excitation signal to reconstruct waveform.
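The steps above can be sketched as a toy encoder and decoder. This is a simplified illustration, not a standard codec: quantization is omitted, the normal equations are solved directly with `numpy.linalg.solve` instead of Levinson–Durbin, and the voicing decision uses a crude, assumed threshold (0.3) on the residual autocorrelation. The frame length, order, and all function names are hypothetical.

```python
import numpy as np

FS, FRAME, ORDER = 8000, 160, 10       # assumed: 8 kHz, 20 ms frames, order 10
RNG = np.random.default_rng(0)

def lpc_coeffs(frame, order):
    """Autocorrelation method on a Hamming-windowed frame, solved directly."""
    w = frame * np.hamming(len(frame))
    r = np.array([np.dot(w[:len(w) - k], w[k:]) for k in range(order + 1)])
    r[0] = r[0] * (1 + 1e-9) + 1e-12   # tiny regularization for silent frames
    idx = np.arange(order)
    return np.linalg.solve(r[np.abs(idx[:, None] - idx[None, :])], r[1:])

def residual(frame, a):
    """e[n] = x[n] - sum_k a[k-1] * x[n-k], with zero history before the frame."""
    p = len(a)
    padded = np.concatenate((np.zeros(p), frame))
    return np.array([padded[n + p] - np.dot(a, padded[n:n + p][::-1])
                     for n in range(len(frame))])

def analyze_frame(frame):
    a = lpc_coeffs(frame, ORDER)
    e = residual(frame, a)
    lags = np.arange(FS // 400, FS // 60)          # lags for roughly 60-400 Hz
    ac = np.array([np.dot(e[:-lag], e[lag:]) for lag in lags])
    best = int(np.argmax(ac))
    return {"a": a,
            "gain": float(np.sqrt(np.mean(e ** 2))),
            "voiced": bool(ac[best] > 0.3 * np.dot(e, e)),   # assumed threshold
            "pitch_lag": int(lags[best])}

def synthesize_frame(params, state, phase):
    """Rebuild one frame from its parameters; state = last ORDER output samples."""
    if params["voiced"]:
        exc = np.zeros(FRAME)
        n = phase
        while n < FRAME:                           # pulse train at the pitch lag
            exc[n] = 1.0
            n += params["pitch_lag"]
        phase = n - FRAME
    else:
        exc, phase = RNG.standard_normal(FRAME), 0
    exc *= params["gain"] / (np.sqrt(np.mean(exc ** 2)) + 1e-12)  # match residual RMS
    out = np.concatenate((state, np.zeros(FRAME)))
    for n in range(FRAME):
        out[n + ORDER] = np.dot(params["a"], out[n:n + ORDER][::-1]) + exc[n]
    return out[ORDER:], out[-ORDER:], phase

def encode(signal):
    return [analyze_frame(signal[i * FRAME:(i + 1) * FRAME])
            for i in range(len(signal) // FRAME)]

def decode(frames):
    state, phase, out = np.zeros(ORDER), 0, []
    for params in frames:
        y, state, phase = synthesize_frame(params, state, phase)
        out.append(y)
    return np.concatenate(out)

# Assumed test input: two tones plus a little noise, five frames long
t = np.arange(5 * FRAME)
signal = (np.sin(2 * np.pi * 150 * t / FS) + 0.5 * np.sin(2 * np.pi * 450 * t / FS)
          + 0.01 * np.random.default_rng(1).standard_normal(len(t)))
params = encode(signal)
reconstructed = decode(params)
```

Each frame is reduced to ten coefficients, a gain, a voicing flag, and a pitch lag; a practical coder would additionally quantize these values before transmission.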
Example Compression Ratio
- Original: 8 kHz, 16-bit PCM → 128 kbps
- LPC: 10 coefficients + gain + pitch → ~2–4 kbps
- Compression ratio: roughly 32–64× (128 kbps divided by 4 kbps and 2 kbps, respectively)
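The arithmetic behind these figures can be reproduced in a few lines. The per-frame bit allocation below (4 bits per coefficient, 5 bits for gain, 7 for pitch, 1 for the voicing flag, 20 ms frames) is a hypothetical example chosen for illustration and is not the allocation of any specific standard.

```python
sample_rate_hz = 8000
bits_per_sample = 16
pcm_bps = sample_rate_hz * bits_per_sample          # 128000 bits per second

# Hypothetical allocation per 20 ms frame (illustrative only)
frame_ms = 20
bits_per_frame = 10 * 4 + 5 + 7 + 1                 # coefficients + gain + pitch + voicing flag
frames_per_second = 1000 // frame_ms
lpc_bps = bits_per_frame * frames_per_second        # 53 bits x 50 frames per second

compression_ratio = pcm_bps / lpc_bps
```

With these assumed numbers the parametric stream needs 2650 bits per second, which falls inside the 2–4 kbps range quoted above.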
Why LPC Achieves Compression
- Speech is highly correlated; samples are predictable.
- LPC models this correlation with a few coefficients:
x[n] ≈ Σₖ aₖ x[n−k]
- Only the residual excitation carries new information, reducing bandwidth.
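One way to quantify this predictability is the prediction gain: the ratio of signal energy to residual energy. The sketch below compares a synthetic correlated signal with white noise; both signals, the order of 10, and the least-squares fit are assumptions made for the illustration.

```python
import numpy as np

def prediction_gain_db(x, p):
    """10 * log10(signal energy / residual energy) for an order-p least-squares predictor."""
    past = np.column_stack([x[p - k:len(x) - k] for k in range(1, p + 1)])
    a, *_ = np.linalg.lstsq(past, x[p:], rcond=None)
    e = x[p:] - past @ a
    return 10 * np.log10(np.sum(x[p:] ** 2) / np.sum(e ** 2))

rng = np.random.default_rng(0)
white = rng.standard_normal(4000)

# Strongly correlated signal: white noise through a resonant all-pole filter
correlated = np.zeros(4000)
for n in range(2, len(correlated)):
    correlated[n] = 1.8 * correlated[n - 1] - 0.9 * correlated[n - 2] + white[n]

gain_correlated = prediction_gain_db(correlated, 10)   # large: most energy is predictable
gain_white = prediction_gain_db(white, 10)             # near 0 dB: nothing to predict
```

A high prediction gain means the residual has far less energy than the signal, so it can be described with fewer bits; for white noise the predictor offers essentially no saving.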
Advanced LPC-Based Coders
- CELP (Code-Excited Linear Prediction): selects the excitation from a codebook using analysis-by-synthesis → bit rates 4.8–16 kbps.
- G.729, GSM: LPC-based coders used in fixed and mobile telephony.
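The codebook idea behind CELP can be illustrated with a toy analysis-by-synthesis search: each candidate excitation is passed through the LPC synthesis filter, and the one whose output best matches the target is selected. This sketch deliberately omits the adaptive (pitch) codebook and perceptual weighting that practical CELP coders use; the random codebook, its size, and the filter are hypothetical.

```python
import numpy as np

def synth(a, exc):
    """All-pole synthesis with zero initial state: y[n] = sum_k a[k-1] * y[n-k] + exc[n]."""
    p = len(a)
    y = np.zeros(len(exc) + p)
    for n in range(len(exc)):
        y[n + p] = np.dot(a, y[n:n + p][::-1]) + exc[n]
    return y[p:]

def search_codebook(target, a, codebook):
    """Return (index, gain, error) minimizing ||target - gain * synth(a, codeword)||^2."""
    best = (None, 0.0, np.inf)
    for i, codeword in enumerate(codebook):
        y = synth(a, codeword)
        gain = np.dot(target, y) / np.dot(y, y)      # optimal gain for this codeword
        err = np.sum((target - gain * y) ** 2)
        if err < best[2]:
            best = (i, gain, err)
    return best

rng = np.random.default_rng(0)
codebook = rng.standard_normal((64, 40))             # 64 candidate excitations, 40 samples each
a = np.array([1.3, -0.6])                            # hypothetical stable LPC filter

# Target built from a known entry so the search result can be verified
target = 0.7 * synth(a, codebook[17])
index, gain, err = search_codebook(target, a, codebook)
```

Only the winning index and gain need to be transmitted, which is how a codebook keeps the excitation cheap to encode while still matching the waveform more closely than a plain pulse-or-noise source.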
Summary
LPC predicts the current speech sample using previous samples. This prediction is used to estimate the vocal tract filter. LPC coefficients represent filter parameters, not speech samples. The result is an efficient, compact model of speech.
- LPC compresses speech by sending parameters instead of raw samples.
- Key parameters: LPC coefficients, gain, excitation.
- Works because speech is predictable and highly correlated.
- Drastically reduces bit rate while maintaining intelligibility.