Multimedia

Multimedia is the largest section of the site. Why so big? Because there is so much in today's media-rich world that needs covering. It includes how your sound card pumps out that sweet sound, and how to make it sound better through speaker positioning. It also covers the image compression techniques at work on the screen in front of you, and perhaps the audio compression at work in the MP3s you may be listening to. Not to mention how your monitor displays all these videos and images, and how your video card evolved into a powerful machine capable of spraying the screen with pretty pictures.

Audio

Sound
Sound is an important part of the computer gaming experience. It is also a large part of a computer theater system, and it has a lot of smaller uses, like playing CDs or encoded music on a computer. The average human ear can hear sounds between roughly 20Hz and 20kHz, and can detect sound direction and orientation. This makes a good speaker setup imperative for high end audiophiles and audio enthusiasts.

Stereo and Mono
There is a distinct difference between monophonic (mono) sound and stereophonic (stereo) sound. The difference between one and two channels of sound is as clear as day and night. But stereo is only good for knowing whether sounds are coming from the right or from the left. It is hard to tell if sounds are coming from in front or behind. Stereo encoding also has a limited sweet spot where sounds are correct in stereo. Different audio samples might have different sweet spot locations, making it extremely hard to be perfectly lined up.

3D Sound
For theater-like sound while watching movies on DVD, or when playing very engaging games, a good sound setup is very important to "complete" the experience.

Analog to Digital Conversions
Analog audio is converted to digital audio by sampling. The number of samples taken each second is known as the sampling rate. The precision of each sample is known as the bit depth (often loosely called the bit rate). The bit depth determines how many different signal levels the conversion is able to distinguish. A bit depth of 8 bits allows 256 different levels to be detected, while 16 bits allows 65536 different levels, giving a much truer sound. All music on audio CDs is sampled at 44.1kHz at 16-bit. Radio music is comparable to 22kHz at 8-bit.
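The figures above follow from simple arithmetic, sketched here in Python. The function names are made up for this example, and the two-channel (stereo) count used for the CD data rate is an assumption that the text does not state.

```python
def quantization_levels(bits: int) -> int:
    """Number of distinct values a sample of the given size can take."""
    return 2 ** bits


def pcm_data_rate(sample_rate_hz: int, bits: int, channels: int) -> int:
    """Uncompressed data rate in bits per second."""
    return sample_rate_hz * bits * channels


print(quantization_levels(8))    # 256
print(quantization_levels(16))   # 65536

# Audio CD: 44.1kHz at 16-bit; two channels (stereo) is assumed here.
print(pcm_data_rate(44_100, 16, 2))  # 1411200
```

Doubling either the sampling rate or the sample size doubles the amount of data that has to be stored for each second of sound.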


Analog audio is converted to digital audio by an Analog-to-Digital Converter, known as an ADC, which is part of most sound cards. When audio is in uncompressed digital form, it is known as Pulse Code Modulation (PCM). Digital audio is converted to analog by a Digital-to-Analog Converter, which is known as a DAC. A DAC is located in every sound card, and in some newer speakers. Some of the newer speakers allow digital audio to bypass the sound card's DAC and be output through an S/PDIF or USB port directly to the speakers, where it is converted by a DAC. The advantage of this is that the digital signal suffers less distortion than an analog signal normally would while being output to the speakers.

3D Virtualization
Stereo sound is limited to two channels, but so are our ears. Many companies are researching what are known as head related transfer functions. The human ear detects sound orientation by measuring the Interaural Intensity Difference (the strength of the sound in each ear) and the Interaural Time Difference (the delay between sound reaching each ear), and uses the Doppler effect with moving sounds to judge distance.
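As a rough illustration of the time difference, the sketch below uses a very simple model in which a distant sound travels an extra distance of (ear spacing × sin(angle)) to reach the far ear. The ear spacing, the speed of sound, and the function name are all assumptions chosen for the example; real head related transfer functions are far more detailed.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # assumed: air at about 20 degrees C
EAR_SPACING_M = 0.18         # assumed: rough distance between the ears


def interaural_time_difference(azimuth_deg: float) -> float:
    """Approximate delay in seconds between the ears for a distant source.

    0 degrees is straight ahead, 90 degrees is directly to one side.
    """
    extra_path = EAR_SPACING_M * math.sin(math.radians(azimuth_deg))
    return extra_path / SPEED_OF_SOUND_M_S


for angle in (0, 30, 90):
    delay_us = interaural_time_difference(angle) * 1e6
    print(f"{angle:>2} degrees: {delay_us:.0f} microseconds")
```

Even in the most extreme case the delay is only about half a millisecond, which gives an idea of how precise a 3D virtualization algorithm has to be.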

There are many hardware and software products which will simulate 3D sound out of 2 speakers. It is hard to simulate this sound without multiple channel encodings. Many sound cards are capable of multiple speaker simulation out of 2 speakers, by analyzing the multiple channels which are produced by a computer game or encoded in a movie. This is good, but the result often doesn't sound as real as a complete sound system.

3D Sound Standards
Dolby Pro Logic Surround
This sound standard uses 4 channels: front-left, front-center, front-right, and a limited-frequency ambience channel for the rear. The rear channel is meant to be played equally through 2 rear speakers. It is limited to frequencies between 100Hz and 7000Hz, which is enough because this channel only adds ambient noise. All main sound comes from the front 3 speakers. A subwoofer is often added by connecting it through a crossover to the front channels.

Dolby Pro Logic sound is encoded into a stereo signal and can be played through normal speakers with minimal distortion, although a difference can be noticed. The center channel is added equally to the two stereo channels, so it cancels out of the difference between them. The rear ambient signal is added to the stereo signal by dividing it into two sounds perfectly 180 degrees out of phase with each other. This encoding suffers limitations, but is still capable of producing realistic sounds. Any distortion or errors can usually be compensated for by robust decoder analysis.
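The idea can be sketched on single sample values. This is a simplified illustration only, not the actual Dolby encoder: it inverts the sign of the rear signal in one channel to get the 180 degree phase difference, ignores the band limiting of the rear channel, and uses an assumed mixing gain of 1/√2. The function names are made up for the example.

```python
import math

G = 1 / math.sqrt(2)  # assumed mixing gain for the center and rear signals


def encode(left, center, right, rear):
    """Fold four channel samples into a stereo pair."""
    lt = left + G * center + G * rear
    rt = right + G * center - G * rear   # rear is inverted: 180 degrees out of phase
    return lt, rt


def passive_decode(lt, rt):
    """Recover rough center and rear signals from the sum and the difference."""
    center = G * (lt + rt)
    rear = G * (lt - rt)
    return center, rear


# A center-only sound lands equally in both stereo channels
# and disappears from the difference (rear) signal.
lt, rt = encode(0.0, 1.0, 0.0, 0.0)
print(lt == rt, passive_decode(lt, rt))

# A rear-only sound is equal and opposite in the two channels
# and disappears from the sum (center) signal.
lt, rt = encode(0.0, 0.0, 0.0, 1.0)
print(lt == -rt, passive_decode(lt, rt))
```

A plain stereo system simply plays the two encoded channels as they are, which is why the encoded signal remains listenable without a decoder.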

Being able to be encoded into a stereo signal is a major advantage of Dolby Pro Logic. This means it can be played and stored in regular stereo formats, such as VHS or radio. When played, either through a sound card or other hardware, the stereo signal is sent to a Pro Logic decoder before reaching the speakers. The Pro Logic decoder separates the stereo signal into its 4 channels for playback.

Dolby Digital 5.1
Dolby Digital, also known as AC-3, is top of the line for audio recording and playback. It uses 6 discrete channels and provides true surround sound. It is called 5.1 because there are 5 full frequency channels (front-right, front-center, front-left, rear-right, and rear-left) and one low frequency channel which operates at frequencies between 3Hz and 120Hz. The low frequency effects channel is specifically designed for subwoofer output, and provides truer bass sound because the low frequencies are not overpowered by high frequencies.

Dolby Digital is a compressed digital signal, which uses a lossy compression scheme. Most sound cards are able to downmix the digital 6 channel signal to 2 speakers, though some are not. Computers are currently not able to process and play back 6 channel AC-3 or DTS, but instead have to output the signal to a device that can. The signal has to be output as a digital signal because it is compressed and specially encoded, so the sound card must have a digital S/PDIF connector which is connected to an external decoder.

Digital Theater Systems (DTS) is very similar to AC-3, but is a higher quality signal. It is used in most theaters and in some high quality audio recordings. AC-3 is used for high end audio and in DVD movies, and is planned to be used in the upcoming High Definition TV.

Quad Output
A 4 discrete channel sound system is used for 3D games. To support quad speakers, the game must use either the EAX or A3D APIs. Because computer sound is made with respect to a 3-dimensional rendered world, it is very easy to position sound on a 4 speaker system. There is a discrete channel for each speaker: left and right speakers for the front, and 2 more for the rear. A subwoofer is often connected through a crossover between the two front speakers.

This setup is not only efficient, but also economical. There is no external decoder needed because the signals are never encoded like the Dolby audio. Sound is output from the sound card using 2 ordinary stereo outs, one for the front and one for the rear.

3D Audio API and Algorithms
DirectSound3D API
DirectSound3D is part of the DirectX API. It is a more basic form of 3D positional audio, and requires no proprietary hardware to implement. Almost all newer sound cards support DirectSound3D in hardware, but for those that don't, the exact same sounds can be processed by the CPU. This isn't a good idea in most cases because DirectSound3D is more complicated than it looks, and can become a burden to even the fastest CPUs.

DirectSound3D is an improvement over DirectSound, in that it is capable of more than stereo sounds. DirectSound3D is capable of an almost unlimited number of speakers, depending on how it is implemented in the sound card hardware. What DirectSound3D does is receive information about sounds, such as their content, their 3 dimensional coordinates in X, Y, and Z space, the sound receiver's X, Y, and Z coordinates, and the sound emission properties. The sound properties include whether the sound was emitted spherically or conically, whether the sound source was in motion, its orientation relative to the receiver, and atmosphere conductivity.

There are currently 3 different DirectSound3D sound renderers. The first is a software only based solution, which is the fastest to process but least accurate. There are 2 other algorithms, which were designed to run only in hardware but are also capable of being processed by the CPU, if the game developer finds it necessary. They both have different implementations of HRTF, head related transfer functions, and the choice is up to whichever the program finds is best suited for the task.

Environmental Audio Extensions (EAX)
EAX is only found enabled through hardware in Creative Labs cards, but can be done through other sound card drivers which emulate the correct functions. There is no software-only solution because the sound rendering tasks are far too complicated to run without a serious performance hit. What this API is designed to do is add preset reverberation, that is, a blend of different echoes, to simulate different environments. This makes a clear distinction in sounds depending on whether they are in a small room, large theatre, cave, outside, and so on.

A3D
This is a hardware-only solution made by Aureal that is based on highly accurate, mathematically modelled sound reflections. Unlike all other sound APIs, this actually models a 3 dimensional sound stage and allows for sound reflections, instead of just pre-programmed generic echoes. This was originally designed by NASA, for use in their simulations, but has found a good home in 3D gaming. A3D 2.0 is capable of 16 3D generic sound streams, plus another 60 sound reflections. For games that do not use the sound reflections, the 60 extra sound channels can be combined with the original 16 for up to 76 discrete sounds.

Other Sound Algorithms
Sensaura algorithms are original reverberation algorithms that pay more attention to multi speaker set-ups, with emphasis on sound timing instead of on the strength of the sounds out of each speaker. They are also compatible with EAX, although the quality of the implementation isn't as good as a true Creative Labs card.

QSound Algorithms (QSound) is another renderer that is designed to make more out of less. It can make mono sounds sound as if they were stereo, and can virtualize multi speaker setups out of only 2 speakers. This is the sound format used in Sega Dreamcast systems, where the average television is equipped with only 1 or 2 speakers.

Speaker Setups
Having the right speaker set-up is just as important as the number of speakers. Without the right speaker positioning and orientation, the sweet spot of the sound could be lost. The most important thing to remember is symmetry. Always have the speakers at the same height, preferably ear height (except for the sub, which doesn't matter), and always have each corresponding speaker the same distance from its respective ear. The subwoofer sound is too low for human ears to be able to position, so the subwoofer can be placed anywhere. The subwoofer is usually placed in front, where it is loudest.

2 Stereo Speakers (with or without Subwoofer)
The optimum positioning for stereo speakers is an equilateral triangle formation. This means that the distance between the two speakers is equal to the distance between each speaker and you. Each speaker should sit 30 degrees off to either side of straight ahead.
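The geometry can be checked with a few lines of Python. This is a small sketch with made-up names: the listener sits at the origin facing along the y axis, and the 2 metre listening distance is just an example value.

```python
import math


def stereo_speaker_positions(distance_m: float):
    """Return (x, y) positions of the left and right speakers.

    The listener is at the origin facing +y, and each speaker is
    30 degrees off centre at the given distance.
    """
    angle = math.radians(30)
    x = distance_m * math.sin(angle)
    y = distance_m * math.cos(angle)
    return (-x, y), (x, y)


left, right = stereo_speaker_positions(2.0)
spacing = math.dist(left, right)   # distance between the two speakers
print(left, right, spacing)
```

The spacing between the speakers comes out equal to the listening distance, which is exactly what makes the triangle equilateral.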

4 Speakers (Quad and 4.1)
This is the most common set-up for high end gaming. Most newer gaming sound cards are capable of quad output, and are able to inexpensively use two pairs of speakers for a gaming solution. The correct orientation for this is a square pattern with the sweet spot in the direct center. The speakers can either face each other like in the diagram below, or they can all face the center. It is up to the user's preference.

5 Speakers (Dolby Digital 5.1, AC-3, Dolby Pro Logic)
This is the high end home theatre solution. There are many ways to set up this configuration; the most popular is in the diagram below. The two front speakers, L and R, are orientated so that they are 45 degrees apart, that is, 22.5 degrees to each side. The front speakers could all be orientated forward, like the center speaker is, and the rear speakers can be angled.

Compression Forms
Format Specific Details
Lossy Compression
Lossy compression schemes save the most important data with little or no noticeable loss of quality. Lossy schemes reduce data size by making sacrifices and removing unnoticeable information. Once uncompressed, the data will not be exactly the same as the original. This is ok for some media, such as sound or video, but it cannot be used on data that has to remain exact.

Most lossy compression schemes are able to trade output quality against file size. A good example is the JPEG lossy compression scheme.

Image              Size         Bits/pixel
Uncompressed       32450 bytes  24
JPEG (setting 1)   15810 bytes  11.7
JPEG (setting 50)  3483 bytes   2.6
JPEG (setting 99)  1025 bytes   0.76

As you can see, the compression setting 1 looks very similar to the original, and a setting that lies somewhere in between setting 1 and setting 50 would be the best size/quality trade off.
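The bits/pixel figures quoted above can be reproduced from the file sizes alone, since the uncompressed image uses 24 bits for every pixel. The short sketch below does the arithmetic; the function name is made up for this example.

```python
UNCOMPRESSED_BYTES = 32450
UNCOMPRESSED_BITS_PER_PIXEL = 24


def bits_per_pixel(compressed_bytes: int) -> float:
    """Scale the uncompressed bits/pixel by the size reduction."""
    return compressed_bytes / UNCOMPRESSED_BYTES * UNCOMPRESSED_BITS_PER_PIXEL


for setting, size in ((1, 15810), (50, 3483), (99, 1025)):
    ratio = UNCOMPRESSED_BYTES / size
    print(f"setting {setting}: {bits_per_pixel(size):.2f} bits/pixel, "
          f"{ratio:.1f} times smaller")
```

Setting 1 only halves the file, while setting 99 shrinks it to about one thirtieth of the original size.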

Lossless Compression
With lossless compression, no data is lost. Information is compressed without making compromises in quality. This is why any information that needs to remain perfect is compressed using lossless compression. General-purpose data compression formats like RAR and GZip are lossless, because the information has to be exact.

Image         Size        Bits/pixel
Uncompressed  7200 bytes  4
Lossless      902 bytes   0.5
Lossy         6523 bytes  3.6

Lossless compression can be smaller in size, and a lot higher in quality, than lossy compression. This is the case when few colours are used, there is a lot of repetition, and there are straight lines. The JPEG image has severe problems in quality because it compresses using blending and averages. The straight lines in the pattern are smudged, and the end result is that the original colours look smudged too. Lossless does not suffer from these limitations, because the original information remains intact. Lossless compression is built more around compressing patterns than random images, so for this example it resulted in a smaller file size because of its greater compression ratio.
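The point about patterns versus random data is easy to try with Python's standard zlib module. This is only a sketch on raw bytes rather than on an image file; the byte values, the seed, and the variable names are arbitrary choices for the example.

```python
import random
import zlib

SIZE = 7200  # bytes; an arbitrary size for the example

pattern = bytes([0x12, 0x34] * (SIZE // 2))               # highly repetitive data
rng = random.Random(0)                                     # fixed seed for repeatability
noise = bytes(rng.getrandbits(8) for _ in range(SIZE))    # no repetition to exploit

packed_pattern = zlib.compress(pattern)
packed_noise = zlib.compress(noise)

print(len(pattern), "->", len(packed_pattern), "bytes (pattern)")
print(len(noise), "->", len(packed_noise), "bytes (random)")

# Lossless: decompressing gives back exactly the original bytes.
assert zlib.decompress(packed_pattern) == pattern
assert zlib.decompress(packed_noise) == noise
```

The repetitive data shrinks to a tiny fraction of its size, while the random data barely changes at all, because there are no patterns for the compressor to find.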

With lossy compression, data is lost every time the image is encoded so it is bad to edit from a lossy format. Instead, an original copy should be saved in a lossless format, and every time that the image needs to be edited, the lossless image should be used.

Quality Loss of Level 50 JPEG Compression
[Figure: original image, compressed image, and the difference between the two]

There is obviously some image quality loss. The difference was calculated by using a difference filter in a graphics editing application, and then creating a negative image to make the difference easier to see.

Zip Compression
Zipping is by far the most widely used compression scheme. Despite its less effective compression algorithm, ZIP has become tremendously popular, with hundreds of programs based around it. The most popular is WinZip, but almost every compression program ever made supports this format. The reason for this is that the algorithm is free. This means any program can use ZIP to compress files without having to pay royalties.

GZip Compression
GZip is also very popular, and for the same reason: it is free. Many programming languages have gzip support built in. It is versatile because data can be compressed and decompressed rapidly on the fly, which is good for slower connections like the Internet. GZip can compress files in less than half of the time it takes other programs to do it. Unlike the other more featured compression formats, GZip is unable to combine multiple files into one file. GZip was designed to compress Unix and *nix TAR files. TAR files are collections of files which are neatly organized and searchable, but without compression.
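Python is one of the languages with gzip support built in, through the standard gzip module. The sketch below compresses and restores a block of made-up sample data in memory.

```python
import gzip

text = b"multimedia " * 1000          # sample data, made up for this example

packed = gzip.compress(text)          # compress in memory
restored = gzip.decompress(packed)    # and back again

print(len(text), "->", len(packed), "bytes")
print(restored == text)
```

Note that the functions take a single stream of bytes. To store several files, they are first bundled together (for example with TAR) and the bundle is then compressed.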

RAR Compression
RAR is made by only one company, RARLAB. Despite not having a freely distributed algorithm or support across many programs, RAR is very popular because of its compression and speed. A version of RAR is available for almost every OS.

Ace Compression
WinAce was designed by a German company called WinAce. It offers incredible compression like WinRAR, but takes a little longer. It isn't as popular as the other file types (excluding CAB). So far it is only available for Microsoft Windows operating systems.

Microsoft Cabinet Compression
This was created as a storage format used by Windows operating systems to store files. It is hardly ever used outside of a Microsoft program because Microsoft owns the exclusive rights to it. So far it is only available for Microsoft Windows operating systems, and I don't see that as anything that will change in the near future.

Compression Formats

Images can use a variety of colour depths and alpha depths. Colour depths of 8-bit (256 colours) and below usually use what is called a lookup table or palette. Colour depths above this do not. The common ones are 16-bit colour, also known as "High Colour" or "Thousands of Colours", with support for 65536 colours, and 24-bit colour, also known as "True Colour" or "Millions of Colours", with support for 16777216 colours. There is no physical 32-bit colour, only 24-bit colour plus 8 extra bits.
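The colour counts are simply powers of two, and a palette is nothing more than a small table of colours that pixels refer to by number. A short sketch follows; the function name and the four-entry palette are invented for the example.

```python
def colour_count(bits_per_pixel: int) -> int:
    """Number of distinct colours a pixel of this size can describe."""
    return 2 ** bits_per_pixel


for bits in (8, 16, 24):
    print(f"{bits}-bit: {colour_count(bits)} colours")

# A palette (lookup table): each pixel stores a small index, not a full colour.
palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (255, 255, 255)]  # hypothetical
pixels = [0, 1, 1, 3, 2]                    # indices into the palette
rgb = [palette[index] for index in pixels]  # the colours actually displayed
print(rgb)
```

With a 4-entry palette each pixel needs only 2 bits, even though every colour in the table is described with a full 24 bits.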

Bitmap Compression (BMP)
Bitmaps are images where each pixel is mapped directly to a value. Bitmaps are able to store true colour pixels without a palette, although low colour depths can use one. Bitmaps are generally unable to store transparencies or alpha channels, and the only form of compression they commonly use is run length encoding.

Run Length Encoding
RLE compresses losslessly by replacing runs of consecutive identical values with a single codeword. For example, if a sequence of values existed like 34 34 34 34 34 34, it could be RLE encoded as 6 34, because there are 6 "34"s. This is good for images that have a low colour depth and a lot of similar blocks. It works at a low level and is lossless, so it can be used with almost any form of data.
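A minimal run length encoder and decoder might look like the following. This is a sketch of the general idea using (count, value) pairs, not the exact byte layout used inside BMP files, and the function names are made up.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal values into a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (count, value) pairs back into the original sequence."""
    out = []
    for count, value in pairs:
        out.extend([value] * count)
    return out


encoded = rle_encode([34, 34, 34, 34, 34, 34])
print(encoded)               # [(6, 34)]
print(rle_decode(encoded))   # [34, 34, 34, 34, 34, 34]
```

Data with no repetition gets larger under this scheme, since every single value turns into a pair, which is why RLE suits simple images far better than photographs.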

GIF Compression
An image format introduced by CompuServe, built on the patented LZW compression algorithm. It is able to losslessly compress images with up to an 8-bit palette, with palette transparency. No alpha channel transparency is available for this format, but it is able to store multiple layers. GIFs are able to create simple animations by successively alternating which layer is being displayed. GIFs can be interlaced to allow progressive loading. GIF does not support true colour images, alpha channels, or gamma correction. In addition, all use of the GIF format must be licensed, and royalties paid.

Portable Network Graphics Compression (PNG)
PNG is a lossless image format designed as the successor to the GIF image format. PNG compression allows for up to 48-bit colours, and up to 8-bit palette colours. It supports 2 dimensional interlacing, which allows the image to be read progressively, along with a full alpha channel of up to 16 bits, or palette transparency. It supports gamma correction. It offers better compression than the GIF format in almost every situation. Unlike GIF, it does not support multiple layers.

Joint Photographic Experts Group Compression (JPEG)
JPEG is a lossy compression that is only able to compress 24-bit colour (or 8-bit greyscale) images. This makes it best suited for high colour images. JPEG compresses by removing extraneous data that would go unnoticed by most people. This extraneous data can be in the form of small detailed lines, colour hues, or brightness levels.

JPEG is able to progressively load images by dividing the JPEG file into a series of scans. Each scan layer adds more data to the image, and this data will pile up to create the image. Simple JPEGs, which are known as baseline JPEGs, can easily be converted to progressive JPEGs with special software. This software will re-arrange the data into the order in which it is stored in a progressive JPEG file, without decompressing and recompressing it. JPEG does not support transparencies of any kind.

Lossless Joint Photographic Experts Group Compression (JPEG-LS, LOCO)
This compression is a slightly different format from JPEG, but has a very similar name. It is a lossless compression that doesn't support transparencies or palettes of any kind, so it is still limited to a 24-bit colour depth. This makes it inefficient for simple coloured images. Very little software supports lossless JPEG because it is a really inefficient encoding.

JPEG File Interchange Format (JFIF)
This is the most commonly used format for JPEG files. It is only able to store a lossy compressed image.

Audio Compression
Audio, like video, is rarely uncompressed, the exception being CD audio. Audio can be very large in file size, so compression schemes have been made. The simplest is RLE, but this was designed to be a low level compression scheme that operated through sound card hardware. Sound can also be compressed with lossy compression. Like video, there are certain features that humans cannot detect. In sound, most humans cannot hear sounds below roughly 20Hz or higher than 20kHz. Sound outside this range is included in uncompressed recordings, but is removed by almost all lossy audio compressions.

MP3
The most popular sound compression is MPEG Audio Layer 3. This is the sound portion of the MPEG video compression format. It compresses by allocating a certain number of bits for each group of sound samples. This is called its bitrate. What it does with those bits is assign them to the most important sound changes in the sample first, and then progressively record the rest with the remaining bits. With low bitrates, most small sound changes are removed, leaving only the more prominent ones.
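Because the bitrate fixes how much data is spent on each second of sound, the file size can be worked out before encoding. The sketch below compares an MP3 with uncompressed CD audio; the 128kbps bitrate, the four minute track length, and the stereo channel count are all assumptions chosen for illustration.

```python
def audio_size_bytes(bitrate_bps: int, seconds: int) -> int:
    """Size of an audio stream at a constant bitrate."""
    return bitrate_bps * seconds // 8


CD_BPS = 44_100 * 16 * 2      # uncompressed CD audio, assuming stereo
MP3_BPS = 128_000             # assumed MP3 bitrate
SECONDS = 240                 # a hypothetical four minute track

cd_size = audio_size_bytes(CD_BPS, SECONDS)
mp3_size = audio_size_bytes(MP3_BPS, SECONDS)

print(cd_size, "bytes uncompressed")
print(mp3_size, "bytes as MP3")
print(f"about {cd_size / mp3_size:.1f} times smaller")
```

Halving the bitrate halves the file again, at the cost of throwing away more of the smaller sound changes.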

Video Compression
Video files are extremely large when uncompressed, so numerous different video codecs have been designed. Of these, the MPEG series tends to offer the best balance of quality and performance. Almost all video compression formats are lossy, to make sure file sizes stay low. In order to keep good compression while maintaining quality, some formats, such as MPEG-2, are very complex. MPEG-2 requires dedicated hardware for decompression on computers slower than 300MHz, and requires expensive dedicated hardware for real-time compression on even most high-end workstations.

MPEG-1 Compression
This was the first video compression scheme designed by the Moving Picture Experts Group. It was designed for all purpose video compression, and like JPEG compression, it is lossy and allows for quality scaling. It was designed to compress full broadcast quality video at rates of 3Mbps to 5Mbps. MPEG-1 is more lossy than MPEG-2, but it is also much quicker to compress and decompress. It is everything you would expect from a standard designed for all around use.

MPEG-2 Compression
A lossy compression standard that is used to store audio-visual content at a much higher quality than MPEG-1. MPEG-2 is capable of encoding at rates of 3Mbps to 10Mbps, and of encoding full broadcast quality 720x486 resolutions with AC-3 audio.
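To get a feel for how much work the codec is doing, the quoted rates can be compared with the raw size of the same video. The sketch below assumes 24 bits per pixel and 30 frames per second, neither of which is stated in the text, and ignores the audio track.

```python
WIDTH, HEIGHT = 720, 486     # resolution quoted above
BITS_PER_PIXEL = 24          # assumed: true colour
FRAMES_PER_SECOND = 30       # assumed frame rate

raw_bps = WIDTH * HEIGHT * BITS_PER_PIXEL * FRAMES_PER_SECOND


def compression_ratio(target_mbps: float) -> float:
    """How many times smaller the encoded stream is than the raw video."""
    return raw_bps / (target_mbps * 1_000_000)


print(f"{raw_bps / 1_000_000:.1f}Mbps uncompressed")
for mbps in (3, 10):
    print(f"{mbps}Mbps is about {compression_ratio(mbps):.0f} times smaller")
```

Under these assumptions, even the highest quoted rate carries only about one twenty-fifth of the raw data.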

MPEG-4 Compression
This is the latest codec, which is specifically designed for the low bandwidth of the internet. It tries to balance quality and file size, allowing movies to be streamed instead of having to be completely downloaded beforehand. It can range from 10kbps to 10Mbps, but because it is optimized for low bitrates, it will not have the quality that MPEG-2 has even at its highest setting. For this codec, there are post-processing filters which will smooth edges, remove small artifacts, and attempt to reconstruct what is lost in compression.

MJPEG Compression
This is a codec that compresses each frame individually as a series of JPEG images. This allows for quick editing, but doesn't allow for the 3 dimensional compression (compression across frames as well as within them) that other continuous compressions have.

Audio Video Interleave Compression
AVI is a standard made by Microsoft. AVI can use many codecs and also can save video feed in uncompressed form, called full frames. AVI video can range in colour depth, from 8-bit to 32-bit.
