Voice Coding in 3G Networks
S-38.130 Postgraduate Course in Telecommunications
Spring 2001
Tommi Koistinen
Nokia Networks
Contents
PART I
• Short introduction to 3GPP reference architecture models
• Media Gateway (MG)
• Multimedia Resource Functions (MRF)
PART II
• Speech compression – why?
• Tandem avoidance
• Adaptive Multirate (AMR) speech codec
• Wideband speech coding (AMR-WB)
• Demonstrations
© NOKIA
Backgrounds_c.PPT/ 27.01.2000 / ao
3GPP Release 99
• R99; first phase of 3G
• entities involved in speech processing are circled in red
[Figure: 3GPP R99 reference architecture. Elements shown: MT, UTRAN, Iu-PS, SGSN, GGSN, multimedia IP networks, HLR, Iu-CS, 3G MSC with transcoder, PSTN/legacy networks.]
3GPP R4
• splits the MSC into an MSC Server and a Media Gateway
[Figure: 3GPP R4 reference architecture. Elements shown: MT, UTRAN, Iu-PS, SGSN, GGSN, HSS/CSCF, multimedia IP networks, Iu-CS user data towards the MGWs, Iu-CS control towards the MSC Servers, PSTN/legacy networks.]
3GPP R4…R5
• IP Multimedia Subsystem (IMS)
[Figure: 3GPP R4…R5 reference architecture with IMS. Elements shown: MT, UTRAN, Iu-PS, SGSN, GGSN, HSS/CSCF, MRF, MGW, multimedia IP networks, PSTN/legacy networks.]
Media Gateway
• support for several interfaces (A-interface for 2G and Iu-interface for 3G) and for several transmission protocols (ATM, IP, TDM)
• support for several codecs, including the Adaptive Multirate (AMR) codec and forthcoming wideband codecs
• electrical and acoustic echo cancellation
• announcement services
• DTMF and call progress tone generation and detection
• support for fax/modem/data protocols
• support for Tandem Free Operation (TFO) and Transcoder Free Operation (TrFO)
• bad frame handling
• IP protocol handling (RTP/RTCP, encryption, QoS support)
Media Resource Functions Unit
• audio/video conferencing services
• speech enhancements?
[Figure: Media Resource Functions example. Elements shown: two MTs with their UTRANs, SGSN, GGSN, MRF, IP terminal, multimedia IP networks.]
Tandem Avoidance in 2G
[Figure: mobile-to-mobile call path without TFO: MS, BSS, transcoder (64 ↔ 16 kbps), MSC, PSTN (64 kbps), MSC, transcoder (64 ↔ 16 kbps), BSS, MS.]
Current status: no Tandem Free Operation (TFO)
Tandem Avoidance in 2G
[Figure: mobile-to-mobile call path with TFO: MS, BSS, transcoder (16 ↔ 16 kbps), MSC, PSTN (48(16) kbps), MSC, transcoder (16 ↔ 16 kbps), BSS, MS.]
Better speech quality with Tandem Free Operation (TFO)
Tandem Avoidance in 3G
[Figure: tandem avoidance in 3G. Elements shown: MT, UTRAN, three MGWs, two MSC Servers, GSM BSS, PSTN/legacy networks. Link annotations: "AMR", "EFR!" and "AMR ?" (twice).]
Transcoder Free Operation (TrFO)
AMR modes are negotiated by an in-band procedure.
Speech Compression – Why?
• to save transmission capacity
• to save radio resources
• to save storage capacity
• more compression (40%) with voice activity detection (VAD) and discontinuous transmission (DTX)
• error robustness with bad frame handling (BFH)
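To put rough numbers on the savings listed above, the following sketch compares a few AMR source rates with 64 kbit/s PCM. The 64 kbit/s reference (G.711) and the reading of "40%" as a reduction in average bit-rate are assumptions made for illustration; the slide does not define either.

```python
# Illustrative arithmetic only; the 64 kbit/s reference and the reading of
# "40%" as a reduction in average bit-rate are assumptions, not slide content.
PCM_RATE_KBPS = 64.0  # G.711 PCM: 8000 samples/s x 8 bits per sample


def compression_ratio(codec_rate_kbps, reference_kbps=PCM_RATE_KBPS):
    """Return how many times smaller the codec rate is than the reference."""
    return reference_kbps / codec_rate_kbps


def average_rate_with_dtx(codec_rate_kbps, saving=0.40):
    """Average rate if VAD/DTX removes the given fraction of the bits."""
    return codec_rate_kbps * (1.0 - saving)


if __name__ == "__main__":
    for rate in (12.2, 7.4, 4.75):
        print(f"{rate:5.2f} kbit/s: ratio {compression_ratio(rate):.1f}, "
              f"with DTX {average_rate_with_dtx(rate):.2f} kbit/s on average")
```

The `saving` parameter is kept explicit so that a different interpretation of the 40% figure can be tried without touching the rest of the sketch.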
Speech coding techniques
• Waveform coders
• correlation between adjacent samples
• G.711, G.726 ADPCM etc.
• Analysis-by-synthesis types of coders
• Code Excited Linear Prediction (CELP)
• G.723, G.729, GSM EFR, GSM AMR
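The "correlation between adjacent samples" that waveform coders exploit can be illustrated with a first-order predictor: if each sample is predicted from the previous one, the prediction error carries much less energy than the signal itself and needs fewer bits. This is a minimal sketch with a synthetic 200 Hz tone at an assumed 8 kHz sampling rate; it is not the G.726 ADPCM algorithm, which also adapts its quantizer and predictor.

```python
import math


def first_order_residual(samples, coeff=1.0):
    """Prediction error e(n) = s(n) - coeff * s(n-1), taking s(-1) = 0."""
    previous = 0.0
    residual = []
    for sample in samples:
        residual.append(sample - coeff * previous)
        previous = sample
    return residual


def energy(values):
    """Sum of squared samples."""
    return sum(v * v for v in values)


# Synthetic test signal (assumption): 200 Hz sine sampled at 8 kHz, 100 ms long.
SAMPLE_RATE_HZ = 8000
TONE_HZ = 200
signal = [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE_HZ) for n in range(800)]
residual = first_order_residual(signal)

if __name__ == "__main__":
    print("signal energy:  ", round(energy(signal), 2))
    print("residual energy:", round(energy(residual), 2))
```

For a slowly varying signal the residual is far smaller than the input, which is the redundancy a differential coder removes before quantization.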
The CELP model
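[Figure: CELP synthesis model. The adaptive codebook output v(n), scaled by gain gp, and the fixed codebook output c(n), scaled by gain gc, are summed to form the excitation u(n) ("glottis"). The excitation drives the LP synthesis filter 1/A(z) ("vocal tract") to give ŝ(n), which is post-filtered to give ŝ'(n).]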
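As a minimal sketch of the CELP model on this slide, the function below forms the excitation u(n) = gp·v(n) + gc·c(n) and passes it through the all-pole synthesis filter 1/A(z). It assumes the convention A(z) = 1 + a1·z^-1 + … + ap·z^-p, takes the codebook vectors as ready-made inputs, and omits the post-filter; a real codec derives v(n) from past excitation and updates the gains every subframe.

```python
def celp_synthesize(v, c, g_p, g_c, a):
    """Synthesize speech samples from codebook vectors.

    v, c : adaptive and fixed codebook vectors (same length)
    g_p, g_c : adaptive (pitch) and fixed codebook gains
    a : LP coefficients [1, a1, ..., ap] of A(z) = 1 + sum(a_k * z^-k)
    """
    order = len(a) - 1
    history = [0.0] * order  # past outputs, most recent first
    output = []
    for v_n, c_n in zip(v, c):
        u_n = g_p * v_n + g_c * c_n  # excitation ("glottis")
        # All-pole filter 1/A(z): s(n) = u(n) - sum_k a_k * s(n-k)
        s_n = u_n - sum(a[k] * history[k - 1] for k in range(1, order + 1))
        output.append(s_n)
        history = ([s_n] + history)[:order]
    return output


if __name__ == "__main__":
    # Unit impulse through a one-pole filter, 1 / (1 - 0.5 z^-1)
    print(celp_synthesize([1, 0, 0, 0], [0, 0, 0, 0], 1.0, 0.0, [1.0, -0.5]))
```

With an impulse as excitation the output is simply the impulse response of the "vocal tract" filter, which makes the sketch easy to check by hand.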
Adaptive Multirate (AMR) speech codec
• the only mandatory speech codec for 3G
• improved speech quality in both half-rate and full-rate modes by means of codec mode adaptation, i.e. varying the balance between speech and channel coding for the same gross bit-rate
• ability to trade speech quality against capacity smoothly and flexibly by a combination of channel and codec mode adaptation; the network operator can control this on a cell-by-cell basis
[Figure: MOS versus C/I curves for codec Modes 1, 2 and 3.]
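Codec mode adaptation can be pictured as a threshold decision on the measured channel quality: a good channel gets a high source rate, a poor channel gets a low source rate and therefore more error protection. The sketch below is illustrative only. The three-mode set and the C/I thresholds are hypothetical values chosen for the example; they come neither from the slides nor from the AMR specifications, and real link adaptation also applies hysteresis.

```python
# Hypothetical three-mode set, lowest to highest source rate (kbit/s).
MODES_KBPS = [4.75, 7.4, 12.2]
# Hypothetical C/I switching thresholds in dB (illustrative values only).
THRESHOLDS_DB = [7.0, 13.0]


def select_mode(ci_db):
    """Pick the source rate for a measured carrier-to-interference ratio."""
    # Count how many thresholds the measurement reaches; that is the mode index.
    index = sum(1 for threshold in THRESHOLDS_DB if ci_db >= threshold)
    return MODES_KBPS[index]


if __name__ == "__main__":
    for ci in (26, 13, 10, 7, 4):
        print(f"C/I {ci:2d} dB -> AMR {select_mode(ci)} kbit/s")
```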
AMR source rates
Codec mode    Source codec bit-rate
AMR_12.20     12.20 kbit/s FR
AMR_10.20     10.20 kbit/s FR
AMR_7.95      7.95 kbit/s FR / HR
AMR_7.40      7.40 kbit/s FR / HR
AMR_6.70      6.70 kbit/s FR / HR
AMR_5.90      5.90 kbit/s FR / HR
AMR_5.15      5.15 kbit/s FR / HR
AMR_4.75      4.75 kbit/s FR / HR
AMR_SID       1.80 kbit/s FR / HR
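The phrase "balance between speech and channel coding for the same gross bitrate" can be made concrete by subtracting each source rate from the gross rate of the traffic channel. The sketch assumes GSM traffic channels with gross rates of 22.8 kbit/s (full rate) and 11.4 kbit/s (half rate); these figures are not given in the slides, and the split is different on a 3G radio bearer.

```python
# Assumed GSM traffic channel gross rates in kbit/s (not stated in the slides).
GROSS_FR_KBPS = 22.8
GROSS_HR_KBPS = 11.4

# Source rates from the table above, with the channels each mode can use.
AMR_MODES = {
    "AMR_12.20": (12.20, ("FR",)),
    "AMR_10.20": (10.20, ("FR",)),
    "AMR_7.95": (7.95, ("FR", "HR")),
    "AMR_7.40": (7.40, ("FR", "HR")),
    "AMR_6.70": (6.70, ("FR", "HR")),
    "AMR_5.90": (5.90, ("FR", "HR")),
    "AMR_5.15": (5.15, ("FR", "HR")),
    "AMR_4.75": (4.75, ("FR", "HR")),
}


def channel_coding_rate(source_kbps, gross_kbps):
    """Bit-rate left for channel coding once the speech bits are placed."""
    if source_kbps > gross_kbps:
        raise ValueError("source rate exceeds the gross channel rate")
    return gross_kbps - source_kbps


if __name__ == "__main__":
    for name, (rate, channels) in AMR_MODES.items():
        parts = []
        if "FR" in channels:
            parts.append(f"FR {channel_coding_rate(rate, GROSS_FR_KBPS):.2f}")
        if "HR" in channels:
            parts.append(f"HR {channel_coding_rate(rate, GROSS_HR_KBPS):.2f}")
        print(f"{name}: protection kbit/s -> " + ", ".join(parts))
```

Lower source rates leave more of the fixed gross rate for error protection, which is why the low modes hold up better at low C/I.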
Structure of AMR encoder
[Figure: AMR encoder block diagram, summarised below.]
Per frame:
• pre-processing of the input speech s(n)
• LPC analysis (twice per frame): windowing and autocorrelation R[ ]; Levinson-Durbin recursion R[ ] → A(z); conversion A(z) → LSP; LSP quantization (output: LSP indices); interpolation for the 4 subframes of both the unquantized and the quantized LSPs, giving A(z) and Â(z)
• open-loop pitch search (twice per frame): compute weighted speech (4 subframes); find open-loop pitch To
Per subframe:
• compute impulse response h(n)
• adaptive codebook search: compute target x(n) for adaptive codebook; find best delay and gain (output: pitch index); quantize LTP gain (output: LTP gain index); compute adaptive codebook contribution
• innovative codebook search: compute target x2(n) for innovation; find best innovation (output: code index)
• fixed codebook gain quantization (output: fixed codebook gain index)
• filter memory update: compute excitation; update filter memories for next subframe
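The encoder diagram names the Levinson-Durbin step that turns the autocorrelations R[ ] into the LP coefficients of A(z). The following is a textbook sketch of that recursion in floating point, assuming the convention A(z) = 1 + a1·z^-1 + … + ap·z^-p. It is not the fixed-point reference implementation, and it leaves out the windowing and lag weighting that a real encoder applies first.

```python
def autocorrelation(samples, max_lag):
    """R[k] = sum_n s(n) * s(n-k) for k = 0..max_lag."""
    return [
        sum(samples[n] * samples[n - k] for n in range(k, len(samples)))
        for k in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LP normal equations for A(z) = 1 + sum(a_k * z^-k).

    Returns (a, err): a = [1, a1, ..., a_order] and the final
    prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update coefficients 1..i-1 from the previous stage, then append k
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, err


if __name__ == "__main__":
    # Autocorrelation of an ideal first-order process: R[k] = 0.9 ** k
    coeffs, error = levinson_durbin([1.0, 0.9, 0.81], order=2)
    print(coeffs, error)
```

For the first-order example the recursion finds a1 = -0.9 and a second coefficient of essentially zero, as expected when one past sample already explains all the correlation.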
Encoder output
Parameter      1st   2nd   3rd   4th   Total
2 LSP sets                              38
Pitch delay     9     6     9     6     30
Pitch gain      4     4     4     4     16
Fixed code     35    35    35    35    140
Fixed gain      5     5     5     5     20
Total                                  244
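The totals in the table can be checked with a few lines of arithmetic. The sketch reads the 1st to 4th columns as bits per subframe and assumes a 20 ms speech frame, which the slide does not state; under that assumption 244 bits per frame corresponds to the 12.2 kbit/s mode.

```python
FRAME_MS = 20  # assumed frame length in milliseconds (not stated on the slide)

LSP_BITS = 38  # two LSP sets, sent once per frame
# Bits per subframe (1st..4th), copied from the table above.
SUBFRAME_BITS = {
    "pitch delay": [9, 6, 9, 6],
    "pitch gain": [4, 4, 4, 4],
    "fixed code": [35, 35, 35, 35],
    "fixed gain": [5, 5, 5, 5],
}


def total_bits():
    """Bits per frame: frame-level LSP bits plus all subframe parameters."""
    return LSP_BITS + sum(sum(bits) for bits in SUBFRAME_BITS.values())


def bitrate_kbps(bits_per_frame, frame_ms=FRAME_MS):
    """Bits per millisecond is numerically the same as kbit/s."""
    return bits_per_frame / frame_ms


if __name__ == "__main__":
    for name, bits in SUBFRAME_BITS.items():
        print(f"{name}: {sum(bits)} bits")
    print("total:", total_bits(), "bits ->", bitrate_kbps(total_bits()), "kbit/s")
```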
Structure of AMR decoder
[Figure: AMR decoder block diagram, summarised below.]
Per frame:
• decode LSP from the LSP indices
• interpolation of LSP for the 4 subframes; conversion LSP → Â(z)
Per subframe:
• decode adaptive codebook from the pitch index
• decode gains from the gains indices
• decode innovative codebook from the code index
• construct excitation
• synthesis filter, output ŝ(n)
Post-processing:
• post filter, output ŝ'(n)
Demonstration I: Full Rate vs. AMR-NB
Channel with errors (C/I = 26…4 dB):
1. sample: FR 13 kbps
2. sample: AMR-NB 5.9-12.2 kbps
Wideband speech coding
• Narrowband 300 – 3400 Hz
• Wideband 50 – 7000 Hz
• Wideband AMR speech codec (3GPP R5)
AMR-WB source rates
Codec mode      Source codec bit-rate
AMR-WB_23.85    23.85 kbit/s
AMR-WB_23.05    23.05 kbit/s
AMR-WB_19.85    19.85 kbit/s
AMR-WB_18.25    18.25 kbit/s
AMR-WB_15.85    15.85 kbit/s
AMR-WB_14.25    14.25 kbit/s
AMR-WB_12.65    12.65 kbit/s
AMR-WB_8.85     8.85 kbit/s
AMR-WB_6.60     6.60 kbit/s
AMR-WB_SID      1.75 kbit/s
EFR vs. AMR-NB vs. AMR-WB
[Figure: subjective speech quality (in a 16 kbps full-rate traffic channel) of AMR-WB, AMR-NB and EFR versus carrier-to-interference ratio. Vertical axis: Excellent, Very good, Good, Poor, Unacceptable. Horizontal axis: carrier-to-interference ratio (dB), with points at error-free, 13, 10, 7 and 4.]
Demonstration II: AMR-NB vs. AMR-WB
Clean speech (highest modes):
1. sample: AMR-NB 12.2 kbps
2. sample: AMR-WB 23.85 kbps
Demonstration III: GSM EFR vs. AMR-WB
Channel with errors:
1. sample: GSM EFR 12.2 kbps
2. sample: AMR-WB 6.6-14.25 kbps
Demonstration IV: AMR-NB vs. AMR-WB
Music (highest modes):
1. sample: AMR-NB 12.2 kbps
2. sample: AMR-WB 23.85 kbps