Shrink to Fit

By Philip De Lancie

Video Systems, Jan 1, 2002

MPEG Video FAQ

When the International Organization for Standardization (ISO) established the Moving Picture Experts Group (MPEG) back in 1988, there was no QuickTime, Web, intranets, DVD, or digital cable. A committee open to accredited experts, the group's initial goal was to develop a standardized compression/decompression algorithm (codec) allowing playback of video at the data rate of a single-speed CD drive (1.4Mbps). As new media delivery technologies have evolved in the intervening years, MPEG's mission has broadened.

In 1992, MPEG finalized MPEG-1, the video standard for such products as Video CD and the source of the MP3 audio codec (which is MPEG-1, Layer 3 audio). MPEG-2, the video standard behind DVD and DTV set-top boxes, was finalized in 1994. The MPEG-2 standard is backward-compatible, meaning that an MPEG-2 decoder must also be able to decode MPEG-1.

MPEG-4, the standard for coding of audiovisual objects (often described as multimedia for the fixed and mobile Web) was approved in 1998 and extended in 1999. While MPEG-4 incorporates existing video codecs and includes new methodologies for encoding moving pictures at very low data rates, its main focus is defining a container format for scalable, synchronized integration of media in multiple formats.

MPEG standards still under development include MPEG-7, a “Multimedia Content Description Interface” for description and search of audio and visual content, and MPEG-21, a “Multimedia Framework” for an environment supporting “delivery and use of all content types by different categories of users in multiple application domains.”

Overall, MPEG's purview today is “the development of international standards for compression, decompression, processing, and coded representation of moving pictures, audio, and their combination, in order to satisfy a wide variety of applications.”

These applications include: DVD, Video CD, DTV set-top boxes, digital cable, nonlinear editing, and media for wired and wireless networks.

Even as the variety of applications has widened, the foundation of MPEG's efforts remains the principle that open standards are preferable to competing, incompatible, proprietary approaches. But because the group's name now pops up in many different contexts, it's not always clear what the label “MPEG” means in a given setting. In the questions and answers below, we'll look specifically at MPEG standards for video — how they work and how they are used.

What is MPEG video? Why is it needed?

The MPEG-1 and -2 video standards define methods for decoding video and audio information that has been coded into digital form for storage and transmission. Their primary purpose is to reduce the bandwidth (bit-rate) required to store and transmit digital video at acceptable quality. A series of digital images requires a tremendous amount of storage space and bandwidth (248Mbps to express a series of uncompressed 720×480 images at 30fps in 24-bit RGB color). MPEG's data compression allows video to be used in many applications where the available bandwidth is much too small for uncompressed video.
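The uncompressed figure quoted above follows directly from the frame dimensions. As a quick check, the sketch below multiplies it out; the function name is ours, chosen for illustration only.

```python
def uncompressed_bitrate_bps(width, height, bits_per_pixel, fps):
    """Bits per second needed to send every pixel of every frame."""
    return width * height * bits_per_pixel * fps


# 720x480 pixels, 24-bit RGB, 30 frames per second
rate = uncompressed_bitrate_bps(720, 480, 24, 30)
print(f"{rate / 1e6:.1f} Mbps")  # 248.8 Mbps
```

The result, just under 249Mbps, is the figure the article rounds down to 248Mbps.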

The encoding methodology used to achieve MPEG video compression is left up to the developers of encoding tools, as long as the data streams generated during encoding may be correctly decoded by a standards-compliant decoder.

What are the basic principles underlying MPEG video compression?

The MPEG video compression process is based on two fundamental assumptions. One is that much of the information in moving pictures is redundant, both within each frame and between a series of consecutive frames. This redundancy is part of what makes it possible to express picture information in a way that requires fewer bits.


During encoding, the data in each frame is organized in blocks. Each block is an 8×8 matrix of values. These are compressed, and the compressed blocks are grouped into macroblocks, which make up a picture. The pictures are grouped into a sequence.

In a single frame, for example, if you've got eight pixels in a row of a given color, it's more efficient to store the count (8) and the color value than to store the same color value eight times in a row. Known as run-length encoding, this “lossless” compression reduces bandwidth requirements while retaining all the information necessary to re-create the original exactly. The higher the redundancy in the source (e.g., large areas of the same color), the lower the bit-rate needed to encode it accurately.
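The eight-pixel example can be written out in a few lines. This is a minimal sketch of generic run-length encoding on a row of color names, not the exact scheme used inside an MPEG stream; the function names are illustrative.

```python
def rle_encode(pixels):
    """Collapse runs of identical values into (count, value) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][1] == value:
            runs[-1][0] += 1          # extend the current run
        else:
            runs.append([1, value])   # start a new run
    return [(count, value) for count, value in runs]


def rle_decode(runs):
    """Expand (count, value) pairs back into the original sequence."""
    pixels = []
    for count, value in runs:
        pixels.extend([value] * count)
    return pixels


row = ["blue"] * 8 + ["white"] * 2
encoded = rle_encode(row)
print(encoded)                      # [(8, 'blue'), (2, 'white')]
print(rle_decode(encoded) == row)   # True
```

Because decoding returns exactly the original row, nothing is lost: ten stored values have become two pairs.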

The same principle applies to a series of frames. Within a given camera shot, the bulk of the information from frame to frame is the same. It takes much less data to store only the differences between frames than to store every frame in its entirety.
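To see why differences are cheaper to store than whole frames, consider the toy sketch below. It treats a frame as a flat list of pixel values and records only the positions that changed. This is a simplification for illustration; MPEG itself works on blocks of pixels rather than individual ones.

```python
def frame_delta(previous, current):
    """Return {position: new_value} for the pixels that changed."""
    return {i: new
            for i, (old, new) in enumerate(zip(previous, current))
            if old != new}


def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous)
    for position, value in delta.items():
        frame[position] = value
    return frame


frame1 = [10] * 16
frame2 = list(frame1)
frame2[5] = frame2[6] = 200          # a small object appears

delta = frame_delta(frame1, frame2)
print(delta)                         # {5: 200, 6: 200}
print(apply_delta(frame1, delta) == frame2)  # True
```

Two changed values are stored instead of sixteen, and the second frame is still reproduced exactly.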

MPEG's other basic assumption is that there is some information in each video frame that may be discarded without noticeably affecting the way that image is perceived when played back. Based on our understanding of visual perception, this observation makes it possible to store only a subset of the information contained in the source, while re-creating from that subset an image that seems identical. Because the result is not actually identical, this approach to compression is referred to as “lossy.”

The lower the target bit-rate for an encoded stream, the more lossy compression will be needed to supplement the work of lossless compression techniques. The trick for an encoding system is to make good decisions about what information to discard, leaving intact the parts of the source signal that are most crucial to visual perception.

How does MPEG compression work within a video frame?

The starting point of the MPEG encoding process is intraframe compression. If you digitize a single-frame color image, the result is a grid of pixels. Each pixel may be thought of as a measure (sample) of the source image at a point in the grid. In a typical 24-bit RGB image, the value of each pixel is expressed as the intensity of each of the three primary colors of light (red, green, and blue, corresponding to the cones in our retinas). Each color is allocated 8 bits of the 24-bit sample, making 256 possible intensity values (0 to 255) for each color.

Mix red, green, and blue in every possible combination of intensities, and you've theoretically got a color range of more than 16 million shades. However, the human eye can't accurately discern all of those shades. Some are simply out of range (too dark or too bright). But even within range, our vision is more sensitive to brightness than to color, so some shades may be too subtly different from one another to be distinguished by eye. Thus, some color information may be discarded without noticeably affecting the perceived quality of the resultant image.

MPEG encoding operates on images in the YUV color space, also referred to as YCrCb. In YUV, brightness information (luminance) is separated from color information (chrominance). The luminance value (Y) for each pixel is kept at full resolution at this stage. However, because the eye is less sensitive to color, the red and blue chrominance values (Cr and Cb) of a given pixel may each be averaged with those of adjacent pixels, thereby reducing the volume of data. (Green, meanwhile, is recovered during playback by subtracting the weighted red and blue contributions from luminance.) MPEG-1 uses 4:1 chrominance compression (commonly written 4:2:0); MPEG-2 supports 4:1 (4:2:0), 2:1 (4:2:2), or none (4:4:4).
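The separation of luminance from chrominance, and the 4:1 averaging of chrominance, can be sketched as follows. The weights are the familiar ITU-R BT.601 luminance coefficients. As a simplifying assumption, the conversion uses the full 0 to 255 range rather than the narrower range used in broadcast video, and the function names are ours.

```python
def rgb_to_ycbcr(r, g, b):
    """Split an RGB pixel into luminance (Y) and two color-difference values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 + 0.5 * (b - y) / (1 - 0.114)   # blue difference, centered on 128
    cr = 128 + 0.5 * (r - y) / (1 - 0.299)   # red difference, centered on 128
    return y, cb, cr


def subsample_2x2(plane):
    """Average each 2x2 group of samples into one value (the 4:1 case)."""
    rows, cols = len(plane), len(plane[0])
    return [[(plane[r][c] + plane[r][c + 1]
              + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
             for c in range(0, cols, 2)]
            for r in range(0, rows, 2)]


# A 4x4 chrominance plane shrinks to 2x2: 16 samples become 4.
cb_plane = [[100, 102, 140, 142],
            [104, 106, 144, 146],
            [120, 120, 160, 160],
            [120, 120, 160, 160]]
print(subsample_2x2(cb_plane))   # [[103.0, 143.0], [120.0, 160.0]]
```

A neutral gray pixel converts to chrominance values of about 128 for both Cb and Cr, which is why gray areas carry almost no color data.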

During encoding, the data in each frame is organized into blocks. Each block is an 8×8 matrix of values. A luminance block contains data for an 8×8-pixel area of the original image; a chrominance block at 4:1 compression contains data for a 16×16-pixel area covering four luminance blocks. The values in each matrix are subjected to a series of mathematical operations (discrete cosine transform or DCT, quantizing, zigzag scanning, run-length encoding, Huffman encoding) designed to minimize the data required to approximately re-create the area from which they were derived. The set of compressed blocks for each 16×16-pixel area (four luminance blocks plus one chrominance block each for Cr and Cb) is then grouped into data units called “macroblocks.”
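The first stages of that chain can be illustrated on a single block. The sketch below applies an 8×8 DCT, quantizes the result, and reads it out in zigzag order. It assumes one uniform quantizer step for every coefficient, which is a simplification: real encoders weight each frequency differently. All names are illustrative.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Forward 8x8 discrete cosine transform (orthonormal DCT-II)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Divide by the step and round; this is where information is discarded."""
    return [[round(value / step) for value in row] for row in coeffs]


def zigzag(block):
    """Read the block along its anti-diagonals, low frequencies first."""
    order = sorted(((r, c) for r in range(N) for c in range(N)),
                   key=lambda rc: (rc[0] + rc[1],
                                   rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [block[r][c] for r, c in order]


# A flat block (every sample 100) has energy only in the first coefficient.
flat = [[100] * N for _ in range(N)]
scanned = zigzag(quantize(dct_2d(flat), step=16))
print(scanned[:4])   # [50, 0, 0, 0]
```

After the zigzag scan, the 63 trailing zeros form one long run, which is exactly the kind of data that run-length and Huffman coding reduce to a handful of bits.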

The overall effect of these operations, particularly the quantizing, is to average adjacent pixels of similar shades into areas of a single shade. The lower the target data rate, the more heavy-handed this averaging will be, making such artifacts as blockiness and banding more noticeable to the naked eye.

How does MPEG compression work across a series of frames?

So far, we've looked only at intraframe compression, resulting in an I-frame that can be re-created during decoding without reference to any surrounding frames. An I-frame uses a lot less data than an uncompressed frame (an average of about 1 bit per pixel in the typical MPEG-1 stream). But while a series of I-frames offers the highest possible quality of compression, the data rate is still much too high for most applications for which MPEG video was devised.

What's required to squeeze even more bits out of the data stream is a method of interframe compression that takes advantage of the temporal nature of moving pictures. The solution is to use I-frames sparingly as key frames in a group of pictures (GOP) that also contains frames that are coded only in terms of how they differ from a reference frame.

Frame-to-frame differences in MPEG are largely defined in terms of the location of macroblocks. For each area defined by a macroblock in a given I-frame, the encoder looks for similar areas in earlier and later frames in the same GOP. If a match is found, what's stored is a value representing the change (if any) in the location of the macroblock (plus changes in individual DCT coefficients). Thus the areas of continuity between frames do not need to be stored independently for every frame in which they appear. Where no matching areas are found, new macroblocks are generated by the same process used in I-frames.
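The search for a matching area can be sketched as a brute-force block match. The example below uses the sum of absolute differences as the similarity measure and a tiny 2×2 block in a 6×6 frame to keep the numbers readable. Both choices are assumptions for illustration, since the standard leaves the search method to the encoder designer.

```python
def sad(frame, ref, top, left, dy, dx, size):
    """Sum of absolute differences between a block and a shifted reference area."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(frame[top + r][left + c]
                         - ref[top + dy + r][left + dx + c])
    return total


def best_motion_vector(frame, ref, top, left, size, search):
    """Try every offset within +/-search and return (cost, dy, dx) of the best."""
    rows, cols = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= top + dy and top + dy + size <= rows
                      and 0 <= left + dx and left + dx + size <= cols)
            if not inside:
                continue  # candidate area falls outside the reference frame
            cost = sad(frame, ref, top, left, dy, dx, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


def make_frame(top, left, size=6):
    """Black frame with a bright 2x2 square at the given position."""
    frame = [[0] * size for _ in range(size)]
    for r in range(top, top + 2):
        for c in range(left, left + 2):
            frame[r][c] = 200
    return frame


reference = make_frame(1, 1)   # square in the reference frame
current = make_frame(2, 3)     # same square, moved down 1 and right 2

print(best_motion_vector(current, reference, top=2, left=3, size=2, search=2))
# (0, -1, -2): a perfect match one row up and two columns left in the reference
```

A cost of zero means only the two-number offset needs to be stored for this block, with no picture data at all.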

MPEG supports two types of difference frames: P-frames (predictive) and B-frames (bi-directional). In a P-frame, the difference information is relative to the most recent past reference frame, either an I- or P-frame. In a B-frame, the difference information may be relative to the nearest past reference frame, next reference frame, or both (averaged), depending on which is most efficient to code.

A GOP may contain I-frames only, I- and P-frames, or all three types. A typical GOP with all three contains either 12 or 15 frames, starting with an I-frame and then alternating two B-frames with a P-frame (IBBPBBPBBPBB in the 12-frame case). The source frame for the I-frame is compressed first, followed by the processing of P-frames, and finally the intervening B-frames. The frame order is then shuffled so that each reference frame precedes the B-frames that depend on it, which minimizes memory requirements for buffering in the decoder; the original order is restored by the decoder during playback.
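The reordering can be sketched in a few lines. The function below moves each I- or P-frame ahead of the B-frames that precede it in display order. It handles a single GOP in isolation, so the trailing B-frames are simply left at the end; in a continuous stream they would follow the next GOP's I-frame. The function name is illustrative.

```python
def coding_order(display_order):
    """Reorder frame types so each reference frame precedes its B-frames."""
    ordered = []
    waiting_b = []                      # B-frames waiting for their next reference
    for index, frame_type in enumerate(display_order):
        if frame_type in "IP":
            ordered.append(f"{frame_type}{index}")
            ordered.extend(waiting_b)   # now these B-frames can be decoded
            waiting_b = []
        else:
            waiting_b.append(f"{frame_type}{index}")
    ordered.extend(waiting_b)           # trailing B-frames (see note above)
    return ordered


print(" ".join(coding_order("IBBPBBPBBPBB")))
# I0 P3 B1 B2 P6 B4 B5 P9 B7 B8 B10 B11
```

The numbers are display positions, so the decoder can put the frames back in their original sequence for playback.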

What are CBR and VBR?

CBR stands for constant bit-rate encoding, meaning that bits are allocated at the same rate over the entire length of the program material. CBR does not take into account the fact that it requires more bits to accurately encode fast-moving or complex scenes than scenes that are static (temporally redundant) or simple in their composition (spatially redundant).

In variable bit-rate (VBR) encoding, the bits that are not needed for acceptable coding of simple scenes are allocated instead to improve quality in more complex scenes. However, while CBR is generally a one-pass, realtime process, high-quality VBR requires two or three passes.
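The difference between the two approaches comes down to how a fixed budget of bits is divided. In the sketch below, the complexity scores stand in for what a first analysis pass might measure. They are made-up values, and real encoders use far more elaborate rate control.

```python
def cbr_allocation(complexities, total_bits):
    """Constant bit-rate: every scene gets the same share."""
    share = total_bits / len(complexities)
    return [share] * len(complexities)


def vbr_allocation(complexities, total_bits):
    """Variable bit-rate: shares are proportional to measured complexity."""
    total_complexity = sum(complexities)
    return [total_bits * c / total_complexity for c in complexities]


# Hypothetical scores for four scenes of equal length:
# two static shots, one fast action scene, one moderately busy scene.
scene_complexity = [1, 1, 4, 2]

print(cbr_allocation(scene_complexity, 8000))  # [2000.0, 2000.0, 2000.0, 2000.0]
print(vbr_allocation(scene_complexity, 8000))  # [1000.0, 1000.0, 4000.0, 2000.0]
```

Both schemes spend the same 8,000 bits overall, but VBR takes bits the static shots do not need and gives them to the action scene. The need to measure complexity before allocating is also why VBR takes more than one pass.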

What is the difference between encoding and transcoding?

Encoding generally refers to the application of a coding algorithm to a signal, such as a video stream playing back from videotape. There may actually be several processes going on at the same time: conversion to digital (if the source stream is analog), data compression, and capture (storage) of the resultant stream to a file on a workstation.

Transcoding refers to the application of a coding algorithm during the conversion of an existing file from one format to another.

What are the differences between MPEG-1 and MPEG-2, and what are the main applications for each?

FirstView.com was one of the first websites to incorporate MPEG-4. The site uses e-Vue still-image MPEG-4 technology to stream footage of fashion shows in New York and Paris.

The MPEG-1 standard was developed to encode video at bit-rates less than the 1.4Mbps data transfer rate of a single-speed CD drive, thus enabling the Video CD format. MPEG-1 delivers a full 30fps, but a spatial resolution of only 352×240 pixels (for NTSC), one-quarter of the standard (ITU-R BT.601) broadcast resolution of 720×480. In addition to Video CD, which never caught on in the United States but is very well established in Asia, MPEG-1 is frequently used for video over networks at low bit-rates.

MPEG-2 was developed for applications allowing much higher bit-rates, and thus supports much better quality. It is commonly used for DVD, direct broadcast satellite, and digital cable. There are also many MPEG-2 nonlinear video editing systems, often supporting GOPs using all I-frames or only I- and P-frames.

The video coding scheme used in MPEG-2 is similar to that of MPEG-1, but MPEG-2 supports full broadcast resolution, as well as high definition (HDTV). MPEG-2 also offers greater flexibility. MPEG-1 handles progressive video only, while MPEG-2 handles interlaced video as well. MPEG-1 officially handles only CBR, while MPEG-2 also supports VBR. MPEG-2 also supports scalability of conformance, using profiles (describing functionalities) and levels (describing resolutions) so that every device is not required to support every level of performance.

What can be done in production and post to optimize video for MPEG-encoding?

At a given target bit-rate, MPEG encoding yields the best results when intraframe and interframe redundancy is highest. If you are shooting with MPEG delivery in mind, you can take this into account with static backgrounds, minimal camera movement (fixed angles, using a tripod), solid-color sets and costumes, etc.

Even when it's not possible or desirable to tailor your footage to compress well, there is still much that can be done to maintain image quality in compressed material. Always maintain the footage at the best possible quality right up to the point of encoding, preferably in a component format such as Digital Betacam, D1, or professional DV. If a program was finished on a nonlinear editing system that uses video compression, make a new source master for encoding in an online suite or on an uncompressed NLE. Also, if the original acquisition format was film, perform inverse telecine to remove the extra fields that were added when the material was transferred to video. Finally, preprocess the footage with video noise reduction to remove grain, stray pixels, etc.
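The extra fields mentioned above come from the 2:3 pulldown used to fit 24fps film into 60 video fields per second: film frames are held alternately for two fields and three fields. The toy sketch below shows the pattern and a naive way to undo it. It assumes each film frame carries a distinct label, whereas a real inverse telecine must detect the repeated fields from the picture content itself.

```python
def pulldown_2_3(film_frames):
    """Hold frames alternately for 2 and 3 fields (24 frames become 60 fields)."""
    fields = []
    for i, frame in enumerate(film_frames):
        fields.extend([frame] * (2 if i % 2 == 0 else 3))
    return fields


def inverse_telecine(fields):
    """Toy inverse: collapse each run of repeated fields back to one frame."""
    frames = []
    for field in fields:
        if not frames or frames[-1] != field:
            frames.append(field)
    return frames


film = ["A", "B", "C", "D"]
fields = pulldown_2_3(film)
print(fields)                    # ['A', 'A', 'B', 'B', 'B', 'C', 'C', 'D', 'D', 'D']
print(inverse_telecine(fields))  # ['A', 'B', 'C', 'D']
```

Removing the duplicates before encoding means the encoder spends its bits on 24 unique pictures per second instead of 30 frames that include repeated material.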

Prepared with care and encoded on a high-quality system, MPEG video compression allows moving pictures to be stored and delivered at acceptable (and often brilliant) quality in a small fraction of their original bandwidth. All in all, that makes MPEG video not only a valuable tool, but also a very impressive achievement.


Philip De Lancie is a freelance writer covering media production and delivery technologies in areas including DVD, professional audio, multimedia, sound for picture, and the Web. He can be contacted at delphi@mindspring.com.

SIDEBAR

MPEG at the High End

Most people are familiar with the MPEG transmission standard for broadcast HDTV. But the ATSC standard, which references MPEG-2 Main Profile @ High Level, calls for a transmission rate of approximately 19.4Mbps. This represents a significantly compressed signal: approximately 80:1 compression.
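The 80:1 figure can be sanity-checked with simple arithmetic. The sidebar does not state the source format, so the sketch below assumes 1920×1080 pixels at 30 frames per second in 24-bit color; other HD formats would give somewhat different ratios.

```python
def compression_ratio(width, height, bits_per_pixel, fps, channel_bps):
    """Uncompressed source rate divided by the rate of the delivery channel."""
    source_bps = width * height * bits_per_pixel * fps
    return source_bps / channel_bps


# Assumed source: 1920x1080, 24-bit color, 30 fps; channel: 19.4 Mbps
ratio = compression_ratio(1920, 1080, 24, 30, 19.4e6)
print(f"about {ratio:.0f}:1")   # about 77:1
```

Under those assumptions the ratio comes out just under 80:1, consistent with the approximate figure quoted above.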

But for purposes other than broadcast, there is no need to compress HD or film-acquired images to the ATSC standard, which must accommodate the limitations of broadcast bandwidth. In fact, there are people working with large, high-resolution digital images who use MPEG compression at 25Mbps, 35Mbps, and even 50Mbps to produce excellent large-format images. These images are more than acceptable for many large-screen display applications, in some cases replacing 70mm film projection systems in venues such as museums and theme parks. The top end of the MPEG specification provides for rates up to 100Mbps (High Profile @ High Level). Some claim that even the most expert eyes would not see any compression artifacts at 80Mbps (High Profile @ High-1440 Level).

Does that make MPEG compression suitable for digital cinema? No decision has been made about the suitability of the MPEG algorithm for digital cinema, because as yet no standards exist for digital cinema. In theory, MPEG could be suitable at the top end of the spec, and in fact Grass Valley Group, a leading manufacturer of servers, is banking on MPEG-2 as the answer for digital cinema. Two other respected cinema server companies, Avica and EVS, have also embraced MPEG for its interoperability. However, until recently, the bulk of proprietary encoding in place or under development for digital cinema has revolved around wavelet-based and non-MPEG DCT (Discrete Cosine Transform) algorithms. The best-known of the digital cinema servers, from QuVis, uses wavelet-based compression.

At this point, the matter is by no means settled. In addition to compression issues, encoding for digital cinema is further complicated by issues of transport, security, storage, and piracy.

In short, while MPEG-2 is often confused with the low-resolution tradition of MPEG-1, or judged on the quality of broadcast HDTV, the specification can produce a wide range of excellent images at a range of resolutions, depending on application, level and profile, the power of the encoding and decoding technology, and other factors.



© 2008, PRIMEDIA Business Magazines & Media Inc. All rights reserved. This article is protected by United States copyright and other intellectual property laws and may not be reproduced, rewritten, distributed, redisseminated, transmitted, displayed, published or broadcast, directly or indirectly, in any medium without the prior written permission of PRIMEDIA Business Magazines & Media Inc.


©2008, Penton Media, Inc. All rights reserved.