Introduction to Video Coding

Academic year: 2022

schmidt@informatik.haw-hamburg.de

Introduction to Video Coding

o Motivation & Fundamentals
o Principles of Video Coding
o Coding Standards

Special thanks to Hans L. Cycon from FHTW Berlin for providing first-hand knowledge and much of the material!

(2)

Video Data – the Problem

o PAL uncompressed
- 768x576 pixels per frame
- x 3 bytes per pixel (24 bit colour)
- x 25 frames per second
- ≈ 32 MB per second
- ≈ 1.9 GB per minute

→ Raw video data not device compliant!

→ Even cameras need immediate compression
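The PAL figures above can be checked with a few lines of arithmetic. This is only a sketch; reading "MB" and "GB" as binary units (2^20 and 2^30 bytes) is an assumption, since the slide does not say which convention it uses.

```python
# Raw (uncompressed) PAL data rate, using the figures from the slide.
WIDTH, HEIGHT = 768, 576      # pixels per frame
BYTES_PER_PIXEL = 3           # 24 bit colour
FRAMES_PER_SECOND = 25

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
bytes_per_second = bytes_per_frame * FRAMES_PER_SECOND
bytes_per_minute = bytes_per_second * 60

MIB = 2 ** 20  # assumption: binary units
GIB = 2 ** 30

print(f"{bytes_per_second / MIB:.1f} MiB per second")
print(f"{bytes_per_minute / GIB:.2f} GiB per minute")
```

With decimal units the values come out slightly higher (about 33 MB per second and 2.0 GB per minute); either way the order of magnitude is the point.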

(3)

Signal Transmission Scheme

[Figure: Coder → channel → Decoder. Coder: saving of bit rate. Decoder: reconstruction of signal.]

(4)

Fundamentals

Why don’t we just use *zip?

o Suppose our video pixels attain N values i with probability $p_i$
o and we know nothing about them (just iid random)
o Then (Shannon): the entropy

$$H = -\sum_{i=1}^{N} p_i \log_2(p_i)$$

is the lower bound for the data needed (mean of information)
o For individually encoded pixels this results in optimal compression rates around 1.33 …

! Image and video pixels are not iid random, but highly correlated
! Correlations are hidden from the individual pixel level
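The entropy bound can be tried out on a pixel histogram. The sketch below estimates the probabilities from relative frequencies; the sample values and the 8 bit per pixel baseline are assumptions made for illustration only.

```python
import math
from collections import Counter

def entropy_bits(pixels):
    """Shannon entropy H = -sum(p_i * log2(p_i)) in bits per pixel."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical sample: four grey values with probabilities 1/2, 1/4, 1/8, 1/8.
sample = [0] * 8 + [64] * 4 + [128] * 2 + [255] * 2
h = entropy_bits(sample)

# Best possible ratio for coding each pixel on its own, relative to 8 bit.
print(f"entropy: {h:.2f} bit/pixel, ratio vs. 8 bit: {8 / h:.2f}")
```

Read the other way round, a compression rate of about 1.33 for 8-bit pixels would correspond to an entropy of roughly 6 bit per pixel.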

(5)

Image Compression Concepts

o lossless, by removing redundancies
- spatial redundancies
- temporal redundancies
- spatial-temporal correlations
- statistical redundancies

o lossy, by removing (visually) irrelevant information
- reduction of accuracy in colors, contours and motion

(6)

Image Quality Measure

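$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( f_i^{\mathrm{org}} - f_i^{\mathrm{cmp}} \right)^2$$

$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} = 20 \log_{10} \frac{255}{\sqrt{\mathrm{MSE}}}$$

Higher PSNR → higher performance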
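A minimal sketch of the quality measure, assuming 8-bit samples (peak value 255) and images given as flat lists of pixel values. The two short pixel lists are hypothetical.

```python
import math

def mse(original, compressed):
    """Mean squared error between two equally long pixel sequences."""
    if len(original) != len(compressed):
        raise ValueError("images must have the same number of pixels")
    return sum((o - c) ** 2 for o, c in zip(original, compressed)) / len(original)

def psnr(original, compressed, peak=255):
    """PSNR in dB; infinite if the two images are identical."""
    error = mse(original, compressed)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)

org = [52, 55, 61, 66, 70, 61, 64, 73]    # hypothetical original pixels
cmp_ = [50, 56, 60, 68, 70, 62, 65, 70]   # hypothetical reconstruction
print(f"MSE = {mse(org, cmp_):.3f}, PSNR = {psnr(org, cmp_):.2f} dB")
```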

(7)

The Idea of Transformation

o Mathematically an image can be considered as a matrix in some high dimensional space
o Transformations rotate this matrix into an advantageous position (of sparse population)
o This results in ‘compactification of energy’: most of the coefficients will be (nearly) zero
o Leads to simplified separation of irrelevant information

(8)

Transform Coding

[Figure: compressor chain: initial image → T → Q → PC/C → bitstream]

o Transformation (T): De-correlation, compactification of energy, reversible
o Quantisation (Q): Elimination of psycho-visually irrelevant information, not reversible
o Pre-Coder (PC): Pre-processing for additional elimination of statistical redundancies, reversible
o Coder (C): Generation of variable length codes, reversible

(9)

Spatial Decorrelation:

Discrete Cosine Transform - DCT

Transformation of spatial into frequency coordinates

$$F(u,v) = \frac{\Lambda(u)\,\Lambda(v)}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} \cos\frac{(2i+1)\,u\,\pi}{16} \cdot \cos\frac{(2j+1)\,v\,\pi}{16} \cdot f(i,j)$$

$$\Lambda(\xi) = \begin{cases} \frac{1}{\sqrt{2}} & \text{for } \xi = 0 \\ 1 & \text{otherwise} \end{cases}$$
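The transform can be evaluated directly from its definition. The sketch below is deliberately naive (four nested loops, no fast algorithm) and uses a hypothetical constant block as input; real codecs use fast factorised variants.

```python
import math

N = 8

def lam(xi):
    """Normalisation factor Λ(ξ): 1/sqrt(2) for ξ = 0, otherwise 1."""
    return 1 / math.sqrt(2) if xi == 0 else 1.0

def dct_8x8(f):
    """Direct evaluation of F(u, v) for an 8 x 8 block f[i][j]."""
    F = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for i in range(N):
                for j in range(N):
                    s += (math.cos((2 * i + 1) * u * math.pi / 16)
                          * math.cos((2 * j + 1) * v * math.pi / 16)
                          * f[i][j])
            F[u][v] = lam(u) * lam(v) / 4 * s
    return F

flat = [[100] * N for _ in range(N)]   # hypothetical block of constant grey
F = dct_8x8(flat)
print(round(F[0][0], 6))               # DC coefficient
```

For a constant block all the energy ends up in the single coefficient F(0,0); every other coefficient vanishes up to rounding error, which is the decorrelation effect in its simplest form.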

(10)

Concept of conventional DCT coding (JPEG, MPEG, H.26x)

o Block: 8 x 8 x 8 bit = 512 bit
o Block DCT: 8 x 8 x 10 bit = 640 bit

  90 72 11 31  0  0  0  0
  14 13  5  0  0  0  0  0
  15  6  3  0  0  0  0  0
   4  4  2  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0

o Quantisation: 8 x 8 x 4 bit = 256 bit

  90 70 10 30  0  0  0  0
  10 10 10  0  0  0  0  0
  20 10  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0

o Zig-zag scanning: 90, 70, 10, 20, 10, 10, 30, 10, 10, 0, 0, 0, ....
o VLC: 01, 00111, 01, 01, 01, 01, 010, 01, 000001 (000001 = EOB) → 26 bit
o Channel

compression factor = 512/26 ≈ 20

Source: Schäfer HHI [W2]
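The quantisation and zig-zag steps of this example can be reproduced as follows. The uniform step size of 10 with halves rounded up is an assumption inferred from the numbers on the slide, not something the slide states; the variable-length coding step is left out.

```python
STEP = 10  # assumption: uniform quantiser step inferred from the example values

dct_block = [
    [90, 72, 11, 31, 0, 0, 0, 0],
    [14, 13,  5,  0, 0, 0, 0, 0],
    [15,  6,  3,  0, 0, 0, 0, 0],
    [ 4,  4,  2,  0, 0, 0, 0, 0],
] + [[0] * 8 for _ in range(4)]

def quantise(block, step=STEP):
    """Round non-negative coefficients to the nearest multiple of step (halves up)."""
    return [[(c + step // 2) // step * step for c in row] for row in block]

def zigzag(block):
    """Read an n x n block along its anti-diagonals, alternating direction."""
    n = len(block)
    out = []
    for d in range(2 * n - 1):
        cells = [(i, d - i) for i in range(n) if 0 <= d - i < n]
        if d % 2 == 0:
            cells.reverse()   # even diagonals run from bottom-left to top-right
        out.extend(block[i][j] for i, j in cells)
    return out

quantised = quantise(dct_block)
scanned = zigzag(quantised)
print(scanned[:12])
```

The scan places the few non-zero low-frequency coefficients at the front and leaves one long run of zeros at the end, which a single end-of-block code word can then represent.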

(11)

Transformed Representation

o Concentration of information in few spectral coefficients (decorrelation)

[Figure: amplitudes (0 to 350) of the 8 x 8 spectral coefficients of a block, with panels labelled "16 of 64 coefficients", "4 of 64 coefficients" and "1 of 64 coefficients"]

Source: Schäfer HHI [W2]

(15)

Problem of DCT: Blocking Artefacts

[Figure: original image next to its DCT-coded version at 1:64]

(16)

Alternative Transformation: DWT

[Figure: original image next to its DCT-coded (1:64) and wavelet-coded (WLT 1:64) versions]

(17)

Transform Coding Decoding (DCT- or Wavelet- based)

[Figure: Image → T (decorrelation, lossless) → Q (quantizer, lossy) → C (entropy coder) → compressed bitstream → IC → IQ → IT → Rec. image]

(18)

Temporal Decorrelation:

Difference Coding

In slow moving scenes many subsequent images are nearly alike:

→ Temporal Redundancy is eliminated by coding only the difference of subsequent images (Inter-Frames).

→ To limit accumulating errors, full images (Intra-Frames) are coded regularly (≈ one in 50 frames)

[Figure: frame sequence over time t: I P P P P I P P P (I = Intra, P = Inter), grouped into GOPs]
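Difference coding with regular intra frames can be sketched as follows. Frames are flat lists of pixel values, and `gop_size` and the sample frames are hypothetical. The differences here are stored exactly, so nothing accumulates in this toy version; in a real codec the differences are quantised, which is what makes the regular I frames necessary.

```python
def encode(frames, gop_size):
    """Return ('I', full frame) or ('P', difference to the previous frame) entries."""
    stream = []
    previous = None
    for index, frame in enumerate(frames):
        if index % gop_size == 0:
            stream.append(("I", list(frame)))          # intra frame: full image
        else:
            diff = [cur - prev for cur, prev in zip(frame, previous)]
            stream.append(("P", diff))                 # inter frame: difference only
        previous = frame
    return stream

def decode(stream):
    """Rebuild the frames by adding each difference to the previous frame."""
    frames = []
    for kind, data in stream:
        if kind == "I":
            frames.append(list(data))
        else:
            frames.append([prev + d for prev, d in zip(frames[-1], data)])
    return frames

frames = [                       # hypothetical slow-moving scene, 4 pixels per frame
    [10, 10, 10, 10],
    [10, 11, 10, 10],
    [10, 11, 12, 10],
    [10, 11, 12, 10],
    [9, 11, 12, 10],
]
stream = encode(frames, gop_size=4)
print([kind for kind, _ in stream])
```

Most entries of the P frames are zero, which is exactly the redundancy the subsequent transform and entropy coding stages exploit.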

(19)

Hybrid Decorrelation: Difference Coding with Motion Prediction

Source: Schäfer HHI [W2]

(20)

Block Motion Compensation Prediction

[Figure: blocks 1 to 16 of frame k-1 and of frame k, matched by block displacement]

Block Matching

o Decomposition of previous picture into blocks

o Move & match blocks on top of next picture

o Simplify by motion vector discretisation
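The three steps above can be sketched as an exhaustive block search. The sum of absolute differences (SAD) as matching criterion, the integer-pixel search window and all function names are assumptions chosen for illustration; the slide does not prescribe a criterion.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame given as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]

def best_motion_vector(prev_frame, next_frame, top, left, size, search):
    """Move the block of prev_frame at (top, left) over next_frame, keep the best fit."""
    reference = get_block(prev_frame, top, left, size)
    height, width = len(next_frame), len(next_frame[0])
    best = None
    for dy in range(-search, search + 1):          # discrete (integer) displacements
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= height - size and 0 <= x <= width - size:
                cost = sad(reference, get_block(next_frame, y, x, size))
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best[1], best[2], best[0]

def make_frame(top, left, size=6):
    """Hypothetical test frame: black background with a bright 2 x 2 patch."""
    frame = [[0] * size for _ in range(size)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            frame[y][x] = 200
    return frame

prev_frame = make_frame(1, 1)
next_frame = make_frame(2, 3)
dy, dx, cost = best_motion_vector(prev_frame, next_frame, top=1, left=1, size=2, search=2)
print(dy, dx, cost)
```

Only the motion vector and the (here vanishing) residual need to be coded for the block, instead of its pixel values.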

(21)

Bidirectional Prediction Coding

[Figure: frame sequence I B B P B B P ...]

o I frames - Intracoding (JPEG)
o P frames - Uni-directional predictive coding
o B frames - Bi-directional predictive coding

(27)

Bi-directional Prediction

(28)

Statistical Coding Principles / Entropy Coding

Huffman Coder (variable length symbolic coder)

• Assign to every fixed word a variable length code word

• Frequent words → short code word, rare words → long code

Improvement: Arithmetic Coder

• Map entire sequences of symbols on [0,1] (also binary mapping)

Run-Length Coder

• abbbbbbbbcc → a8b!cc

Pattern Substitution: Dictionary Coding

• Represent repeating sequences of symbols by pointers

Context Modelling (Pre-Coding)

• Determine local conditional probabilities for symbols, instead of global frequencies
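As an illustration of the first principle, here is a minimal Huffman coder sketch. It derives the symbol frequencies from the input itself and builds the code with a priority queue; the function name and the way ties are broken are choices made for this example.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a prefix code in which frequent symbols get short code words."""
    counts = Counter(text)
    if len(counts) == 1:                       # degenerate case: a single symbol
        return {symbol: "0" for symbol in counts}
    # Heap entries: (weight, tie_breaker, {symbol: code word built so far})
    heap = [(weight, index, {symbol: ""})
            for index, (symbol, weight) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w0, _, codes0 = heapq.heappop(heap)    # the two lightest subtrees
        w1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]

text = "abbbbbbbbcc"
code = huffman_code(text)
encoded = "".join(code[symbol] for symbol in text)
print(code, len(encoded), "bit")
```

For the run-length example string, the frequent symbol b receives a one-bit code word and the rare symbols a and c two-bit code words, so the eleven symbols take 14 bits.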

(29)

Layered Coding

Scalability and adaptability to varying play-out scenarios may be achieved through coding layers:

o Spatial layers → range of (pixel) resolutions
o Data partitioning layers → high and low priority data
o SNR layers → range of ‘visual’ resolutions
o Temporal layers → range of frame rates

(30)

Video Coding Standards

Video Coding Standards are defined in ranges of applicability (image resolution, bandwidth, computational complexity, power consumption …), initially for specific target groups:

o ISO Moving Picture Experts Group (MPEG)
- MPEG-1 (1989): CD-ROM applications at ≈ 1.5 Mb/s
- MPEG-2 (1991): High quality coding at 2 – 50 Mb/s
- MPEG-4 (1998): Scalable ≈ 64 kb/s – 4 Mb/s – 100 Mb/s (V3)

o ITU-T
- H.261 (1991): Video telephony, video conferencing ≈ 64 kb/s – 1 Mb/s
- H.263 (1996): Low bit rate coding (ISDN) ≈ 8 kb/s – 1 Mb/s
- H.26L (2001): Low bit rate, low complexity
- H.264/AVC (2003): Joint with ISO, doubled compression compared to MPEG-4, 8 kb/s – 100 Mb/s

(31)

Milestones in Video Compression

[Figure: PSNR [dB] (26 to 38) over bit rate [kbps] (0 to 500) for the Foreman sequence (10 Hz, QCIF, 133 frames encoded). Curves: DCT (Motion JPEG, 1985), H.120 (1988), H.261 (1991), MPEG1/2 (1994), MPEG4/H263 (1998), H.26L (2001), H264 (2002). Annotations: bit rate reduction 85%, visual gain 10 dB.]

(32)

MPEG-2

o Aiming at TV quality (interlacing), but generic picture format: The ‘DVD-Standard’
o Discrete Cosine Transform (8 x 8 blocks)
o Motion compensation and prediction (I, P, B – Frames)
o Supports coding layers
o Error resilience by interpolation
o Supports multiple audio and video flows

(33)

H.263

o Aiming at telecommunication: CIF + QCIF formats. The ‘old’ video conferencing standard
o Discrete Cosine Transform (8 x 8 blocks)
o Improved motion compensation (precision, variable block size, overlapping blocks)
o Prediction with PB-frame (interpolated B component)
o Advanced negotiability
o Arithmetic coding

(34)

MPEG-4

o Ambitious standard to encode ‘multimedia streams’ (including interactivity)
o Focus of interest on video compression, based on a collection of profiles: Simple, advanced simple, …
o Content based compression, motion prediction, scaling
o Concept of Video Object Planes (I/P/B-VOPs)
- Motion estimation and compensation
- Shape coding
- Texture coding (DCT, but also wavelet based)
- Sprite coding
o Adaptive techniques (motion comp., arithmetic coding, error resilience …)

(35)

MPEG4 Generic Coding Scheme

(36)

MPEG4 System Model

(37)

H.264/AVC

o Aiming at full scalability: from 3GPP to HDTV
o Approval May 2003 (Editor T. Wiegand, HHI)
o New 4x4 integer transform (of DCT kind)
o Many modes:
- Adaptive block size for transform
- Adaptive blocking for motion compensation
- Adaptive Intra prediction
- Two VL Entropy codings: CAVLC + CABAC (D. Marpe, HHI)
o Content adaptive deblocking filters
o Complexity:
- 8 – 10 times MPEG-2 for encoding
- 3 times MPEG-2 for decoding

(38)

H.264: Structure

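[Figure: H.264 encoder block diagram with coder control, transform/quantizer, dequantizer/inverse transform, motion-compensated predictor, motion estimator, intra/inter switch and entropy coding. The embedded decoder is marked. Outputs: control data, quantised transform coefficients, motion data.]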

(39)

Deblocking Filter

Source: Schäfer HHI [W2]

(40)

What else?

o MPEG-7: Multimedia Content Description Interface
- Meta data standard
- Goal: describe multimedia data for search, retrieval and (combined/synchronized) play out
o MPEG-21: Multimedia Framework (just finishing)
- Meta data standard for multimedia applications
o Proprietary codecs:
- RealNetworks: Helix
- Microsoft: VC-1
- a few more …
- (at most) similar performance, similar ‘ideas’ visible
- pay per ???

(41)

References

• Hans L. Cycon: Digitale Audio- und Videotechnik, lecture notes (Vorlesungsskript), 2005.
• Ralf Schäfer, HHI: http://bs.hhi.de/presentations/presentations.htm
• W. Effelsberg, R. Steinmetz: Video Compression Techniques, dpunkt.verlag, 1998.
• Y. Shi, H. Sun: Image and Video Compression for Multimedia Engineering, CRC Press, Boca Raton, 2000.
• N. Chapman, J. Chapman: Digital Multimedia, 2nd edition, Wiley, Chichester, GB, 2004.
• Detlev Marpe, Thomas Wiegand, and Gary J. Sullivan: The H.264/MPEG4-AVC Standard and its Fidelity Range Extensions, IEEE Communications Magazine, September 2005.
