CHAPTER ONE
INTRODUCTION TO MULTIMEDIA SYSTEMS
1.1. What Is Multimedia?
Storage – refers to the type of physical means to store information, such as:
Magnetic tape
Hard disk
Optical disk
DVDs
CD-ROMs, etc.
Presentation – refers to the type of physical means to reproduce information to the user.
Speakers
Video windows, etc.
Speech
Music
Film
Multi – multiple/many
Media – source
This includes:
text
graphics
audio
video
images
Multimedia means multiple sources of information; a multimedia system is one that integrates all of the above types.
Multimedia is computer information that can be represented in audio, video, and animated formats, in addition to the traditional formats. The traditional formats are text and graphics.
Multimedia is the field concerned with the computer controlled integration of text,
graphics, drawings, still and moving images (video), animation, and any other media
where every type of information can be represented, stored, transmitted, and processed
digitally.
Multimedia is closely tied to the World Wide Web (WWW). Without networks,
multimedia is limited to simply displaying images, videos, and sounds on your local
machine. The true power of multimedia is the ability to deliver this rich content to a large
audience.
History of Multimedia
The newspaper was perhaps the first mass communication medium; it used mostly text, graphics, and images. In 1895, Guglielmo Marconi sent his first wireless radio transmission at Pontecchio, Italy. A few years later (in 1901), he detected radio waves beamed across the Atlantic. Initially invented for telegraphy, radio is now a major medium for audio broadcasting. Television was the new medium of the 20th century: it brought video and has since changed the world of mass communication.
1945 - Vannevar Bush (1890-1974) wrote about Memex. MEMEX stands for MEMory EXtension, and it amounts to a hypermedia system. A Memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.
1960s - Ted Nelson started the Xanadu project (Xanadu: a kind of deep hypertext). Project Xanadu was the explicit inspiration for the World Wide Web, for Lotus Notes, and for HyperCard, as well as for less-well-known systems.
1.4. Hypermedia/Multimedia
Hypertext: text which contains links to other texts.
Hypermedia: not constrained to be text-based; it can include other media such as graphics, images, sound, and video.
The World Wide Web (WWW) is the best example of a hypermedia application.
Other hypermedia applications include:
PowerPoint
Adobe Acrobat
Macromedia Director
Desirable features for a multimedia system include:
1. Very high processing speed (processing power): Why? Because there is a large amount of data to be processed. Multimedia systems deal with large data volumes, and to process data in real time, the hardware must have high processing capacity.
2. It should support different file formats: Why? Because we deal with different data
types (media types).
3. Efficient and fast input/output: input and output to the file subsystem need to be efficient and fast, allowing real-time recording as well as playback of data, e.g., direct-to-disk recording systems.
4. Special Operating System: to allow access to file system and process data efficiently
and quickly. It has to support direct transfers to disk, real-time scheduling, fast
interrupt processing, I/O streaming, etc.
5. Storage and Memory: large storage units and large memory are required. Large
Caches are also required.
6. Network Support: client-server systems are common, since multimedia applications are often distributed.
7. Software Tools: user-friendly tools are needed to handle media, and to design and develop multimedia applications.
Multimedia systems must also deal with the following issues:
a) Synchronization: since a variety of media are used at the same instant, there should be a temporal relationship between the media, e.g., between a movie (video) and its sound.
b) Data conversion: since computers process data digitally, analog data has to be converted into digital form.
c) Compression and decompression: Why? Because multimedia deals with large amounts of data (e.g., movies, sound, etc.) which take a lot of storage space.
d) Rendering different data at the same time (continuous data).
CHAPTER TWO
Authoring tools provide an integrated environment for binding together the different
elements of a multimedia production.
Multimedia authoring tools provide tools for making a complete multimedia presentation
where users usually have a lot of interactive controls.
When selecting an authoring tool, we consider its:
Orientation
Capabilities, and
Learning curve: how easy it is to learn how to use the application
Some of the features that we have to take into consideration when selecting authoring
tools are:
1) Editing features – editing features for multimedia data, especially image and text, are often included in authoring tools. The more editors in your authoring system, the fewer specialized editing tools you need. However, the editors that come with authoring tools offer only a subset of the features found in dedicated editing tools; if you need more capability, you still have to go to a dedicated editing tool (e.g., a sound editing tool for sound editing).
2) Organizing features – organizing the media in your project involves navigation diagrams, flowcharts, etc. Some authoring tools provide a visual flowcharting facility. Such features help you organize the project. E.g., IconAuthor and Authorware use the flowcharting and navigation diagram method to organize media.
3) Programming feature – there are different types of programming approach:
3.1 Visual programming: this is programming using cues, icons, and objects, done by drag and drop. To include a sound in our project, we drag and drop it onto the stage.
Advantage: the simplest and easiest authoring process. It is particularly useful for slide shows and presentations.
3.2 Programming with a scripting language: some authoring tools provide a very high level scripting language and an interpreted scripting environment. This helps with navigation control and enabling user input.
3.3 Programming with a traditional language, such as Basic or C: some authoring tools can call programs written in traditional languages such as C. Some authoring tools also allow calling DLLs (Dynamic Link Libraries).
3.4 Document development tools
4) Interactivity features – interactivity gives the end user of the project control over the content and flow of information. Some of the interactivity levels are:
i) Simple branching: enables the user to go to any location in the presentation using a key press, mouse click, etc.
ii) Conditional branching: branching based on if-then decisions.
iii) Structured branching: supports complex programming logic, such as nested if-thens and subroutines.
5) Performance-tuning features – accomplishing synchronization of multimedia elements is sometimes difficult because performance varies across computers. In such cases we need to use the authoring tool's own scripting language to specify timing and sequencing on each system.
6) Playback features – allow easy testing of the project. Testing enables us to debug the system and find out how the user interacts with it, without wasting time assembling a runtime version for every test.
7) Delivery features – delivering our project requires building a runtime version of it using the authoring tool.
2.3. Multimedia System Requirement
A. Software Requirement
B. Hardware Requirement
A. Software Requirement
1. 3D Modeling and Animation Tools:
These tools provide 3D clip art objects such as people, furniture, buildings, cars, airplanes, trees, etc. We can easily use these objects in our project.
Examples:
3Ds Max
Maya
Logo motion
Softimage
2. Text Editing and Word Processing Tools:
Word processors are used for writing letters, invoices, project content, etc. They include
features like:
spell check
table formatting
thesaurus
templates ( e.g. letters, resumes, & other common documents)
Examples:
Microsoft Word,
Word perfect,
Open Office Word
Notepad
In word processors, we can actually embed multimedia elements such as sound, image,
and video.
3. Sound Editing Tools:
They are used to edit sound (music, speech, etc.). The user can see the representation of the sound in fine increments, as a score or as a waveform, and can cut, copy, and paste any portion of the sound to edit it. We can also add effects such as distortion, echo, and pitch shifts.
Examples:
Sound Forge
Audacity
Cool Edit
4. Multimedia Authoring Tools:
Multimedia authoring tools provide the important framework needed for organizing and editing the objects included in a multimedia project (e.g., graphics, animation, sound, video, etc.). They provide editing capability only to a limited extent.
Examples:
Macromedia Flash
Macromedia Director
Macromedia Authorware
5. OCR Software:
This software converts printed documents into electronically recognizable ASCII characters. It is used with scanners: the scanner converts a printed document into a bitmap, and the OCR software then breaks the bitmap into pieces according to whether each contains text or graphics, by examining the texture and density of areas of the bitmap and by detecting edges. To do this, the software uses probability and expert systems. OCR software is used:
To include printed documents in our project without typing them in from the keyboard.
To include documents in their original format, e.g., signatures, drawings, etc.
6. Painting and Drawing Tools:
To create graphics for the web and other purposes, painting and drawing tools are crucial.
Painting Tools – are also called image-editing tools. They are used to edit images of different formats, and help us retouch and enhance bitmap images. Some painting tools also allow editing of vector-based graphics.
Examples:
Macromedia Fireworks
Adobe Photoshop
Drawing Tools – are used to create and edit vector-based graphics.
Examples:
Macromedia Freehand
CorelDraw
Adobe Illustrator
7. Video Editing
Animation and digital video movies are sequences of bitmapped graphic frames rapidly played back. Some of the tools used to edit video include:
Adobe Premiere
Adobe After Effects
DeskShare Video Edit Magic
VideoShop
These applications display time references (relationship between time & the video),
frame counts, audio, transparency level, etc.
B. Hardware Requirement
Multimedia products require higher storage capacity than text-based data. Huge drives are essential for the enormous files used in multimedia and audiovisual creation.
II. Storage Devices – Large capacity storage devices are necessary to store multimedia
data. These are:
Floppy Disk: not sufficient to store multimedia data. Because of this, they are not used to
store multimedia data.
Hard Disk: the capacity of hard disk should be high to store large data.
CD: CDs are important for multimedia because they are used to deliver a wide variety of multimedia data to users.
DVD: DVDs have higher capacity than CDs. Similarly, they are also used to distribute multimedia data to users.
To interact with a multimedia system, we use a keyboard, mouse, trackball, touch screen, etc.
Wireless mouse – It is important when the presenter has to move around during
presentation.
Touch Screen – We use fingers instead of mouse to interact with touch screen
computers.
o Infrared light: such touch screens use beams of invisible infrared light projected across the surface of the screen. A finger touching the screen interrupts the beams, generating an electronic signal. The controller then identifies the x-y coordinate of the screen where the touch occurred and sends signals to the operating system for processing.
o Texture-coated: such monitors are coated with a textured material that is sensitive to pressure. When the user presses the monitor, the texture material extracts the x-y coordinate of the location and sends signals to the operating system.
o TouchMate
Application areas of touch screen: Touch screens are used to display/provide information
in public areas such as:
Air ports
Museums
Transport service areas
Hotels, etc
Touch screens are preferred in such areas because they are:
user friendly
easy to use even for non-technical people
easy to learn how to use
III. Information Entry Devices:
Graphics Tablets/Digitizers – both are used to convert points, lines, and curves from a sketch into digital format. They use a movable device called a stylus.
Scanners – they enable us to use OCR software to convert printed documents into ASCII files.
Microphones – they are important because they enable us to record speech, music, etc. A microphone is designed to pick up and amplify incoming acoustic waves or harmonics precisely and convert them to electrical signals. We should purchase a superior, high-quality microphone, because our recordings will depend on its quality.
Digital Camera and Video Cassette Recorder (VCR) – are important for recording and including images and video in a multimedia system, respectively. Digital video cameras store images as digital data; they do not record on film. Video taken with a video camera or VCR can be edited using video editing tools.
Depending on the content of the project and how the information is presented, we need
different output devices. Some of the output devices are:
Speaker – If our project includes speeches that are meant to convey message to audience,
or background music, using speaker is obligatory.
Projector – it is used to display the presentation to a large audience.
Types of projector:
LCD projector
CRT projector
Plotter/Printer – when the situation arises to present using paper, we use printers and/or plotters. In such cases, the print quality of the device should be taken into consideration.
i) Modem: the name stands for modulator/demodulator. A modem converts a digital signal into an analog signal so that data can be carried over a telephone line, which can only carry analog signals. At the receiving end, it does the reverse, i.e., converts the analog signal back into digital data.
Currently, the standard modem is called V.90, which has a speed of 56 kbps. Older standards include V.34, which has a speed of 28.8 kbps.
Types of modem:
External
Internal
Data is transferred through modem in compressed format to save time and cost.
ii) ISDN: stands for Integrated Services Digital Network. It is a circuit-switched telephone network system designed to allow digital transmission of voice and data over ordinary telephone copper wires. This has the advantage of better quality and higher speeds than are available with analog systems.
ISDN has a higher transmission speed, i.e., a faster data transfer rate, but it uses additional hardware and is therefore more expensive.
iii) Cable modem: uses the existing cables laid for television broadcast reception. The data transfer rate of such devices is very fast, i.e., they provide high bandwidth. They are primarily used to deliver broadband internet access, taking advantage of unused bandwidth on a cable television network.
iv) DSL: provides digital data transmission over the wires of the local telephone network. DSL is faster than a telephone line with a modem. How? It carries a digital signal over the unused frequency spectrum (analog voice transmission uses only a limited range of the spectrum) available on the twisted-pair cables running between the telephone company's central office and the customer premises.
CHAPTER THREE
DATA REPRESENTATIONS
A pixel (picture element) contains the color or hue and relative brightness of that point in the image. The number of pixels in the image determines the resolution of the image.
Types of Images
There are two basic forms of computer graphics: bit-maps and vector graphics.
The kind we use determines the tools we choose. Bitmap formats are the ones used for
digital photographs. Vector formats are used only for line drawings.
Bitmap images are formed from pixels: a matrix of dots with different colors. They are defined by their dimensions in pixels as well as by the number of colors they represent. For example, a 640x480 image contains 640 pixels in the horizontal direction and 480 pixels in the vertical direction. If we enlarge a small area of a bitmapped image, we can clearly see the pixels that were used to create it.
a) Monochrome/Bit-Map Images
Each pixel is stored as a single bit (0 or 1).
The value of the bit indicates whether it is light or dark.
A 640 x 480 monochrome image requires 37.5 KB of storage.
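As an illustration (not part of the original notes), these storage figures can be reproduced with a small Python sketch; the function name is ours, and 1 KB is taken as 1024 bytes:

def image_storage_bytes(width, height, bits_per_pixel):
    # Uncompressed bitmap storage: total bits divided by 8 bits per byte.
    return width * height * bits_per_pixel / 8

# 640 x 480 monochrome, 1 bit per pixel:
print(image_storage_bytes(640, 480, 1) / 1024)   # -> 37.5 (KB)
# The same image in 24-bit color would take 900 KB:
print(image_storage_bytes(640, 480, 24) / 1024)  # -> 900.0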
Vector images are really just a list of graphical objects, such as lines, rectangles, ellipses, arcs, or curves, called primitives. Draw programs, also called vector graphics programs, are used to create and edit these vector graphics. These programs store the primitives as a set of numerical coordinates and mathematical formulas that specify their shape and position in the image. This format is widely used by computer-aided design programs to create detailed engineering and design drawings. It is also used in multimedia when 3D animation is desired. Draw programs have a number of advantages over paint-type programs.
Image Resolution
Image resolution refers to the spacing of pixels in an image and is measured in pixels per inch (ppi), sometimes called dots per inch (dpi). The higher the resolution, the more pixels in the image. A printed image that has a low resolution may look pixelated: made up of small, visible squares, with jagged edges and without smoothness.
Image size refers to the physical dimensions of an image. Because the number of pixels
in an image is fixed, increasing the size of an image decreases its resolution and
decreasing its size increases its resolution.
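To make this inverse relationship concrete, here is a minimal Python sketch (the helper name is ours): resolution in pixels per inch is simply the fixed pixel count divided by the physical print size.

def print_resolution_ppi(pixels, inches):
    # Fixed pixel count: a larger print spreads the same pixels thinner.
    return pixels / inches

# A 3000-pixel-wide image printed 10 inches wide is a crisp 300 ppi;
# enlarged to 30 inches, it drops to a visibly coarse 100 ppi.
print(print_resolution_ppi(3000, 10))  # -> 300.0
print(print_resolution_ppi(3000, 30))  # -> 100.0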
Choosing the right file type in which to save our image is of vital importance. If we are, for example, creating images for web pages, then they should load fast, so such images should be small in size. Another criterion for choosing a file type is the image quality that is possible with that file type. We should also be concerned about the portability of the image.
The most common formats used on the internet are GIF, JPG, and PNG.
GIF
PNG
PNG is a replacement for GIF, not for JPG, because it is a lossless compression format, which results in large file sizes.
Provides transparency using an alpha value.
Supports interlacing.
PNG can be animated through the MNG extension of the format, but browser support for MNG is limited.
JPEG/JPG
TIFF
Tagged Image File Format (TIFF) stores many different types of images (e.g., monochrome, grayscale, 8-bit & 24-bit RGB, etc.).
Uses tags: keywords defining the characteristics of the image that are included in the file. For example, a picture 320 by 240 pixels would include a 'width' tag followed by the number '320' and a 'depth' tag followed by the number '240'.
Developed by the Aldus Corp. in the 1980s and later supported by Microsoft.
TIFF is a lossless format (when not utilizing the new JPEG tag, which allows for JPEG compression).
It does not provide any major advantages over JPEG and is not as user-controllable.
Do not use TIFF for web images: TIFFs produce big files, and, more importantly, most web browsers will not display them.
PAINT was originally used in the MacPaint program, initially only for 1-bit monochrome images.
PICT is a file format that was developed by Apple Computer in 1984 as the native format for Macintosh graphics.
The PICT format is a meta-format that can be used for both bitmap images and vector images, though it was originally used in MacDraw (a vector-based drawing program) for storing structured graphics.
PICT remained an underlying Mac format (although it was superseded by PDF on OS X).
X-windows: XBM
What is Sound?
Sound is produced by a rapid variation in the average density or pressure of air molecules above and below the current atmospheric pressure. We perceive sound as these pressure fluctuations cause our eardrums to vibrate. These usually minute changes in atmospheric pressure are referred to as sound pressure, and the fluctuations in pressure as sound waves. Sound waves are produced by a vibrating body, be it a guitar string, loudspeaker cone, or jet engine. The vibrating sound source causes a disturbance to the surrounding air molecules, causing them to bounce off each other with a force proportional to the disturbance. The back and forth oscillation of pressure produces a sound wave.
Digitizing Sound
Computers cannot process analog signals directly, so analog audio must be converted to digital audio using specialized hardware. This conversion is also known as sampling.
There are two basic types of audio files: the traditional discrete audio file that we can
save to a hard drive or other digital storage medium, and the streaming audio file that we
listen to as it downloads in real time from a network/internet server to our computer.
Common discrete audio file formats include WAV, AIF, AU and MP3. A fifth format,
called MIDI is actually not a file format for storing digital audio, but a system of
instructions for creating electronic music.
i) WAV
The WAV format is the standard audio file format for Microsoft Windows applications,
and is the default file type produced when conducting digital recording within Windows.
It supports a variety of bit resolutions, sample rates, and channels of audio. This format is very popular on IBM PC (clone) platforms, and is widely used as a basic format for saving and modifying digital audio data.
ii) AIF/AIFF
The Audio Interchange File Format (AIFF) is the standard audio format employed by
computers using the Apple Macintosh operating system. Like the WAV format, it
supports a variety of bit resolutions, sample rates, and channels of audio and is widely
used in software programs used to create and modify digital audio.
iii) AU
The AU file format is a compressed audio file format developed by Sun Microsystems and popular in the Unix world. It is also the standard audio file format for the Java programming language. It only supports 8-bit depth and thus cannot provide CD-quality sound.
iv) MP3
MP3 stands for MPEG (Moving Picture Experts Group) Audio Layer 3 compression. MP3 files
provide near-CD-quality sound but are only about 1/10th as large as a standard audio CD
file. Because MP3 files are small, they can easily be transferred across the Internet and
played on any multimedia computer with MP3 player software.
v) MIDI
MIDI (Musical Instrument Digital Interface) is not a file format for storing or
transmitting recorded sounds, but rather a set of instructions used to play electronic music
on devices such as synthesizers. MIDI files are very small compared to recorded audio
file formats. However, the quality and range of MIDI tones is limited.
Definition of MIDI:
MIDI is a protocol that enables computers, synthesizers, keyboards, and other musical devices to communicate with each other. This protocol is a language that allows
interworking between instruments from different manufacturers by providing a link that
is capable of transmitting and receiving digital data. MIDI transmits only commands; it
does not transmit an audio signal.
a) Synthesizer:
It is a sound generator (various pitch, loudness, tone color).
A good (musician’s) synthesizer often has a microprocessor, keyboard, control
panels, memory, etc.
b) Sequencer:
It can be a stand-alone unit or a software program for a personal computer. (It
used to be a storage server for MIDI data. Nowadays it is more a software music
editor on the computer.)
It has one or more MIDI INs and MIDI OUTs.
1) Track:
Track in sequencer is used to organize the recordings.
Tracks can be turned on or off on recording or playing back.
2) Channel:
MIDI channels are used to separate information in a MIDI system.
3) Timbre:
The quality of the sound, e.g., a flute sound versus a cello sound.
Multitimbral – capable of playing many different sounds at the same time (e.g., piano, brass, drums, etc.)
4) Pitch:
The Musical note that the instrument plays
5) Voice:
Voice is the portion of the synthesizer that produces sound.
Synthesizers can have many (12, 20, 24, 36, etc.) voices.
Each voice works independently and simultaneously to produce sounds of
different timbre and pitch.
6) Patch:
The control settings that define a particular timbre.
MIDI connectors:
MIDI IN: the connector via which the device receives all MIDI data.
MIDI OUT: the connector through which the device transmits all the MIDI data it
generates itself.
MIDI THRU: the connector by which the device echoes the data it receives from MIDI IN.
MIDI Messages
MIDI messages are used by MIDI devices to communicate with each other. For example, a Note On command specifies:
Which key is pressed
Which MIDI channel to use (what sound to play)
The command is sent as 3 hexadecimal numbers.
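As a sketch of what those three numbers look like in practice (this example is ours, not from the notes), the MIDI specification encodes Note On as a status byte (0x90 combined with the channel number), followed by the key number and the velocity:

def note_on(channel, key, velocity):
    # Status byte 0x90 | channel (0-15), then key (0-127), then velocity (0-127).
    return bytes([0x90 | channel, key, velocity])

# Middle C (key 60) on channel 0, moderately loud:
print(note_on(0, 60, 100).hex())  # -> '903c64', i.e., the hex numbers 90 3C 64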
Advantages of MIDI:
Because MIDI is a digital signal, it is very easy to interface electronic instruments to computers and then manipulate the MIDI data on the computer with software. For example, software can store MIDI messages to the computer's disk drive. The software can also play back MIDI messages on all 16 channels with the same rhythms as the human who originally caused the instrument(s) to generate those messages.
A MIDI file stores MIDI messages. These messages are commands that tell a musical
device what to do in order to make music. For example, there is a MIDI message that
tells a device to play a particular note. There is another MIDI message that tells a device
to change its current "sound" to a particular patch or instrument.
A MIDI file also stores timestamps and other information that a sequencer needs to
play some "musical performance" by transmitting all of the MIDI messages in the file to
all MIDI devices. In other words, a MIDI file contains hundreds (to thousands) of
instructions that tell one or more sound modules (either external ones connected to your
sequencer's MIDI Out, or sound modules built into your computer's sound card) how to
reproduce every single, individual note and nuance of a musical performance.
WAVE and MP3 files store a digital audio waveform. This data is played back by a device with a Digital-to-Analog Converter (DAC), such as a computer sound card's DAC. There are no timestamps or other information concerning musical rhythm or tempo stored in a WAVE or MP3 file; there is only digital audio data.
CHAPTER FOUR
Figure 4.2: White light composed of all wavelengths of visible light incident on a pure blue
object. Only blue light is reflected from the surface
The Human Retina
The eye is basically just a camera. Each photoreceptor cell in the retina is either a rod or a cone.
Rods: are not sensitive to color. They are sensitive only to intensity of light. They are effective in
dim light and sense differences in light intensity - the flux of incident photons. Because rods are
not sensitive to color, in dim light we perceive colored objects as shades of grey, not shades of
color.
Cones: allow us to distinguish between different colors. Three types of cones:
Red cones: respond to red light.
Green cones: respond to green light.
Blue cones: respond to blue light.
5.1. Color Spaces
Color space specifies how color information is represented. It is also called color model. Any
color could be described in a three dimensional graph, called a color space.
In the CMY (cyan, magenta, yellow) color space, pure white (0, 0, 0) is the absence of all three colors (because none of the light has been subtracted from the white light), and pure black (255, 255, 255) is the presence of all three colors (because all of the light has been subtracted from the white light).
CMY is used to produce all colors, and its components are the complementary colors of RGB. They are mostly used in printing devices and in painting.
In hue-based color spaces such as HLS, colors are arranged around a central axis whose ends correspond to black and white. The angular parameter corresponds to hue, distance from the axis corresponds to saturation, and distance along the black-white axis corresponds to lightness.
One neat aspect of YUV is that it is possible to throw out the U and V components and still get a grey-scale image. A black-and-white TV receives only the Y (luminance) component and ignores the others; this is what keeps color broadcasts compatible with black-and-white TVs.
CHAPTER FIVE
Video is a series of images displayed on screen in rapid succession to create the impression of motion. A single image is called a frame, and a video is a series of frames. The rate at which these images are presented is referred to as the frame rate. Each screenful of video is made up of thousands of pixels.
There are two types of video: a) Analog Video and b) Digital Video.
a) Analog Video
Analog technology carries the information representing images and sound in the form of electrical signals between sources and destinations. Analog formats are susceptible to loss due to transmission noise effects. Quality loss is also possible from one generation to another; this type of loss is like photocopying, in which a copy of a copy is never as good as the original.
b) Digital Video
Digital technology is based on images represented in the form of bits. Digital video is just a digital representation of the analog video signal. With a digital video signal, there is no variation in the original signal once it is captured onto computer disk. Therefore, the image does not lose any of its original sharpness and clarity; it is an exact copy of the original. A computer is the most common form of digital technology.
An analog video copy can be very similar to the original video, but it is not identical. Digital copies will always be identical and will not lose their sharpness and clarity over time.
Digital video is limited by the amount of RAM available, whereas this is not a factor with analog video.
Digital technology allows for easy editing and enhancing of videos.
Storing analog video tapes is much more cumbersome than storing digital video CDs.
5.2. Displaying Video
I) Interlaced Scanning
Interlacing is the splitting of an image into two parts, called fields. A field is basically a picture with every second line blank. Interlaced scanning writes every second line of the picture during a scan, and writes the other half during the next sweep. During the first scan the upper field is written on screen: the 1st, 3rd, 5th, etc. lines are written, and after writing each line the electron beam moves back to the left before writing the next line.
II) Progressive Scanning
In progressive scanning, the monitor writes a whole picture per scan: all the lines on the screen are updated at the same time, typically 60 times every second.
Computer vs. television displays:
- Computer: scans 480 horizontal lines from top to bottom; Television: scans 525 or 625 horizontal lines.
- Computer: scans each line progressively; Television: scans lines using the interlacing system.
- Computer: scans a full frame at a rate of typically 66.67 Hz or higher; Television: scans full frames at 25-30 Hz.
- Computer: uses the RGB color model; Television: uses a limited color palette and restricted luminance (lightness or darkness).
Recording Video
CCDs (Charge-Coupled Devices) are chips containing a series of tiny, light-sensitive photosites. CCDs can be thought of as the film of electronic cameras. A CCD consists of thousands or even millions of cells, each of which is light-sensitive and capable of producing varying amounts of charge in response to the amount of light it receives.
There are three types of video signals: Component, Composite and S-video.
1) Component Video
Component video takes the different components of the video and breaks them into separate
signals. Each primary is sent as a separate video signal. The primaries can either be RGB or YIQ,
YUV. It supports best color reproduction. It requires more bandwidth and good synchronization
of the three components.
2) Composite Video
Color (chrominance) and luminance signals are mixed into a single carrier wave. Some
interference between the two signals is inevitable. Composite analog video has all its components
(brightness, color, synchronization information, etc.) combined into one signal. Due to the
compositing (or combining) of the video components, the quality of composite video is marginal
at best. The results are color bleeding, low clarity and high generational loss.
3) S-Video
S-Video is a compromise between component analog video and composite video. It uses two lines, one for the luminance signal and another for a composite chrominance signal.
The most common video broadcasting standards are: PAL, SECAM, NTSC, and HDTV.
NTSC uses 525 scan lines; the lines at the beginning of each field control vertical retrace and sync, so a maximum of 485 lines carry visible data. Similarly, 1/6 of the raster at the left is reserved for horizontal retrace and sync. NTSC uses the YIQ color model.
With digital video, four factors have to be kept in mind. These are:
a) Frame Rate – The standard for displaying any type of non-film video is 30 frames per second
(film is 24 frames per second).
b) Color Resolution – The number of colors displayed on the screen at a time.
c) Spatial Resolution – The size of picture.
d) Image Quality – The clarity and sharpness of an image.
Digital video quality ranges from low (¼ screen, 15 frames per second (fps), at 8 bits per pixel) to high (full screen (640 by 480), full frame rate video, at 24 bits per pixel).
Chapter 6
Digitizing Sound
Sampling Audio
Analog Audio
Most natural phenomena around us are continuous; they are continuous transitions between two different states. Sound is no exception to this rule, i.e., sound also varies constantly. Continuously varying signals are represented by analog signals.
A signal is a continuous function f in the time domain. For a value y=f(t), the argument t of the function f represents time. If we graph f, the result is called a wave. A wave is characterized by three properties:
Amplitude
Frequency, and
Phase
Amplitude: the intensity of the signal, which can be determined by looking at the height of the signal. If the amplitude increases, the sound becomes louder. Amplitude measures how high or low the voltage of the signal is at a given point in time.
Frequency: the number of times the wave cycle is repeated, which can be determined by counting the number of cycles in a given time interval. Frequency is related to the pitch of the sound: increased frequency means a higher pitch.
When sound is recorded using a microphone, the microphone changes the sound into an analog representation of it. A computer cannot deal with analog signals, which makes it necessary to change analog audio into digital audio. How? Read the next topic.
Converting analog audio to digital audio requires that the analog signal be sampled. Sampling is the process of taking periodic measurements of the continuous signal: samples are taken at a regular time interval, i.e., every T seconds. The number of samples taken per second is called the sampling frequency or sampling rate.
Digitized audio is sampled audio. Many times each second, the analog signal is sampled. How
often these samples are taken is referred to as sampling rate. The amount of information stored
about each sample is referred to as sample size.
Analog signal is represented by amplitude and frequency. Converting these waves to digital
information is referred to as digitizing. The challenge is to convert the analog waves to
numbers (digital information).
In digital form, the measure of amplitude (the 7 point scale - vertically) is represented with
binary numbers (bottom of graph). The more numbers on the scale the better the quality of
the sample, but more bits will be needed to represent that sample. The graph below only
shows 3-bits being used for each sample, but in reality either 8 or 16-bits will be used to
create all the levels of amplitude on a scale. (Music CDs use 16-bits for each sample).
In digital form, the measure of frequency is referred to as how often the sample is taken. In the
graph below the sample has been taken 7 times (reading across). Frequency is talked about in
terms of Kilohertz (KHz).
Music CDs use a frequency of 44.1 KHz. A frequency of 22 KHz for example, would mean
that the sample was taken less often.
Sampling means measuring the value of the signal at given points in time. The samples are then quantized. Quantization is rounding the value of each sample to the nearest amplitude number on the scale; for example, if the amplitude of a specific sample is 5.6, it is rounded either up to 6 or down to 5. Quantization is thus assigning a value (from a fixed set) to a sample. The quantized values are then changed to binary patterns, which are stored in the computer.
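A minimal Python sketch of this step (the sample values here are invented for illustration): each measured amplitude is rounded to the nearest level and printed as a 3-bit pattern, as in the discussion above.

def quantize(sample, levels=8):
    # Round to the nearest amplitude number, clamped to the 0..levels-1 scale.
    return max(0, min(levels - 1, round(sample)))

for sample in [2.6, 1.2, 5.6, 0.4]:
    level = quantize(sample)
    print(sample, "->", level, "->", format(level, "03b"))
# 2.6 -> 3 -> 011, 5.6 -> 6 -> 110, and so on.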
Example:
The value of the sample at point A falls between 2 and 3, say 2.6. This value should be represented by the nearest number, so we round the sample value to 3. Then this 3 is converted into binary and stored inside the computer.
Similarly, the values of other sampling points are:
B=1
C=3
D=1
E=3
F=1
G=2
H=3
I=1
The values of the other sample points are quantized in the same way. After quantization, we convert the sample values to binary patterns.
Sample Rate
A sample is a single measurement of amplitude. The sample rate is the number of these
measurements taken every second. In order to accurately represent all of the frequencies in a
recording that fall within the range of human perception, generally accepted as 20Hz–20KHz, we
must choose a sample rate high enough to represent all of these frequencies. At first
consideration, one might choose a sample rate of 20 KHz since this is identical to the highest
frequency. This will not work, however, because every cycle of a waveform has both a positive
and negative amplitude and it is the rate of alternation between positive and negative amplitudes
that determines frequency. Therefore, we need at least two samples for every cycle resulting in a
sample rate of at least 40 KHz.
Sampling Theorem
Nyquist’s Theorem:
The Sampling frequency for a signal must be at least twice the highest frequency component in
the signal.
When the sampling rate is lower than the Nyquist rate, the condition is called undersampling. It is impossible to rebuild the original signal according to the sampling theorem when such a sampling rate is used.
Aliasing
What exactly happens to frequencies that lie above the Nyquist frequency? First, we’ll look at
a frequency that was sampled accurately:
In this case, there are more than two samples for every cycle, and the measurement is a good approximation of the original wave: we will get back the same signal we put in when it is later converted back to analog.
Remember: speakers can play only analog sound. You have to convert digital audio back to analog when you play it.
In this diagram, the blue wave (the one with short cycles) is the original frequency. The red wave
(the one with lower frequency) is the aliased frequency produced from an insufficient number of
samples. This frequency, which was in all likelihood a high partial in a complex timbre, has
samples. This frequency, which was in all likelihood a high partial in a complex timbre, has folded over and is now below the Nyquist frequency. For example, an 11 KHz frequency sampled at 18 KHz would produce an alias frequency of 7 KHz (18 - 11 = 7). This will alter the timbre of the recording in an unacceptable way.
Under sampling causes frequency components that are higher than half of the sampling
frequency to overlap with the lower frequency components. As a result, the higher frequency
components roll into the reconstructed signal and cause distortion of the signal. This type of
signal distortion is called aliasing.
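The fold-over rule for the simple case discussed here (a component lying between half the sampling rate and the sampling rate itself) can be sketched in a few lines of Python; the 11 KHz / 18 KHz example above falls out directly:

def alias_frequency(f, fs):
    # Components at or below fs/2 are represented accurately;
    # components between fs/2 and fs fold over to fs - f.
    return f if f <= fs / 2 else fs - f

print(alias_frequency(11000, 18000))  # -> 7000 (the 7 KHz alias above)
print(alias_frequency(5000, 18000))   # -> 5000 (below Nyquist: unchanged)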
Each sample can only be measured to a certain degree of accuracy. The accuracy is dependent on
the number of bits used to represent the amplitude, which is also known as the sample resolution.
Examples:
Tolassa sampled audio for 10 seconds. How much storage space is required if
a) a 22.05 KHz sampling rate is used, with 8-bit resolution and mono recording?
b) a 44.1 KHz sampling rate is used, with 8-bit resolution and mono recording?
c) a 44.1 KHz sampling rate is used, with 16-bit resolution and stereo recording?
d) an 11.025 KHz sampling rate is used, with 16-bit resolution and stereo recording?
Solution: using m = sampling rate * resolution * time (seconds) * channels, in bits:
a) m=22050*8*10*1
m = 1,764,000 bits = 220,500 bytes = 220.5 KB
b) m=44100*8*10*1
m = 3,528,000 bits = 441,000 bytes = 441 KB
c) m=44100*16*10*2
m = 14,112,000 bits = 1,764,000 bytes = 1,764 KB
d) m=11025*16*10*2
m = 3,528,000 bits = 441,000 bytes = 441 KB
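The same formula can be wrapped in a small Python sketch (function name ours) that reproduces the answers:

def audio_size_bytes(rate_hz, bits, seconds, channels):
    # Total bits divided by 8 gives bytes.
    return rate_hz * bits * seconds * channels / 8

print(audio_size_bytes(22050, 8, 10, 1))   # -> 220500.0 bytes (220.5 KB, case a)
print(audio_size_bytes(44100, 8, 10, 1))   # -> 441000.0 bytes (441 KB, case b)
print(audio_size_bytes(44100, 16, 10, 2))  # -> 1764000.0 bytes (1,764 KB, case c)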
Clipping
Both analog and digital media have an upper limit beyond which they can no longer accurately
represent amplitude. Analog clipping varies in quality depending on the medium: the upper amplitudes are altered, distorting the waveform and changing the timbre, but the alterations differ from one medium to another. Digital clipping, in contrast, is always the same. Once an amplitude of 1111111111111111 (the maximum value at 16-bit resolution) is reached, no higher amplitudes can be represented. The result is not the smooth, rounded flattening of analog clipping, but a harsh slicing off of the top of the waveform, with an unpleasant timbral result.
An Ideal Recording
We should all strive for an ideal recording. First, don’t ignore the analog stage of the process.
Use a good microphone, careful microphone placement, high quality cables, and a reliable
analog-to-digital converter. Strive for a hot (high levels), clean signal.
Second, when you sample, try to get the maximum signal level as close to zero as possible
without clipping. That way you maximize the inherent signal-to-noise ratio of the medium.
Third, avoid conversions to analog and back if possible. You may need to convert the signal to
run it through an analog mixer or through the analog inputs of a digital effects processor. Each
time you do this, though, you add the noise in the analog signal to the subsequent digital re-
conversion.
Chapter 7
Data Compression
Introduction
Data compression is often referred to as coding, where coding is a very general term
encompassing any special representation of data which satisfies a given need.
Definition: Data compression is the process of encoding information using fewer bits so that it takes less memory (storage) or bandwidth during transmission.
Lossless Data Compression: in lossless data compression, the original content of the data is not lost/changed when it is compressed (encoded).
Examples:
Lossy data compression: the original content of the data is lost to a certain degree when compressed. The part of the data that is less important is discarded/lost. The loss factor
determines whether there is a loss of quality between the original image and the image after it
has been compressed and played back (decompressed). The more compression, the more likely
that quality will be affected. Even if the quality difference is not noticeable, these are considered
lossy compression methods.
Examples
Information Theory
Information theory is defined to be the study of efficient coding and its consequences. It is the
field of study concerned about the storage and transmission of data. It is concerned with source
coding and channel coding.
Data compression may be viewed as a branch of information theory in which the primary
objective is to minimize the amount of data to be transmitted.
With more colors, higher resolution, and faster frame rates, you produce better quality video, but you need more computer power and more storage space for your video. Doing some simple calculations (see below), it can be shown that 24-bit color video, at 640 by 480 resolution and 30 fps, requires an astonishing 26 megabytes of data per second! Not only does this surpass the capabilities of many home computer systems, it also overburdens existing storage systems.
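Those "simple calculations" can be written out as a short Python sketch (function name ours); with 1 MB taken as 1024 x 1024 bytes, it gives roughly the 26 MB per second quoted above:

def video_rate_mb_per_sec(width, height, bits_per_pixel, fps):
    # Bytes per frame times frames per second, converted to megabytes.
    bytes_per_frame = width * height * bits_per_pixel / 8
    return bytes_per_frame * fps / (1024 * 1024)

print(video_rate_mb_per_sec(640, 480, 24, 30))  # -> about 26.4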
Claude Shannon and R.M. Fano created one of the first compression algorithms in the late 1940s. The algorithm assigns a variable number of bits to letters/symbols.
Shannon-Fano Coding
The steps to encode data using the Shannon-Fano coding algorithm are as follows. First, order the source letters into a sequence according to their probability of occurrence, in non-increasing (i.e., decreasing) order. Then apply:
ShannonFano(sequence S)
  if S has more than one letter
    divide S into two subsequences S1 and S2 with the minimal difference between their total probabilities;
    extend the codeword of each letter in S1 by attaching 0, and of each letter in S2 by attaching 1;
    ShannonFano(S1);
    ShannonFano(S2);
Example:
S={A,B,C,D,E}
P={0.35,0.17,0.17,0.16,0.15}
Message to be encoded=”ABCDE”
The probabilities are already arranged in non-increasing order. First we divide the message into AB and CDE. Why? Because this gives the smallest difference between the total probabilities of the two groups.
S1={A,B} P={0.35,0.17}=0.52
S2={C,D,E} P={0.17,0.16,0.15}=0.48
The difference is only 0.52-0.48=0.04. This is the smallest possible difference when we divide the message. Attach 0 to each letter in S1 and 1 to each letter in S2. Since S1 has more than one letter, it is subdivided:
S11={A} attach 0
S12={B} attach 1
S2 is likewise subdivided:
S21={C} P={0.17}=0.17
S22={D,E} P={0.16,0.15}=0.31
Attach 0 to S21 and 1 to S22. Since S22 has more than one letter in it, we have to subdivide it:
S221={D} attach 0
S222={E} attach 1
The message is transmitted using the following code (obtained by traversing the tree):
A=00 B=01
C=10 D=110
E=111
So the message "ABCDE" is encoded as 00 01 10 110 111.
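The pseudocode above translates into a short, runnable Python sketch (the implementation details, such as scanning for the best split point, are ours); running it on the example reproduces the codewords A=00 through E=111:

def shannon_fano(symbols):
    # symbols: list of (letter, probability) in non-increasing probability order.
    codes = {letter: "" for letter, _ in symbols}

    def split(seq):
        if len(seq) < 2:
            return
        # Choose the split point minimizing the difference between
        # the total probabilities of the two subsequences.
        total = sum(p for _, p in seq)
        running, best_i, best_diff = 0.0, 1, float("inf")
        for i in range(1, len(seq)):
            running += seq[i - 1][1]
            diff = abs(running - (total - running))
            if diff < best_diff:
                best_i, best_diff = i, diff
        s1, s2 = seq[:best_i], seq[best_i:]
        for letter, _ in s1:
            codes[letter] += "0"   # extend codewords in S1 with 0
        for letter, _ in s2:
            codes[letter] += "1"   # extend codewords in S2 with 1
        split(s1)
        split(s2)

    split(symbols)
    return codes

codes = shannon_fano([("A", 0.35), ("B", 0.17), ("C", 0.17), ("D", 0.16), ("E", 0.15)])
print(codes)                                 # {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
print("".join(codes[ch] for ch in "ABCDE"))  # -> 000110110111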
Dictionary Encoding
Dictionary coding replaces groups of symbols, words, and phrases with corresponding abbreviations: it transmits the index of the symbol/word in a dictionary instead of the word itself. There are several dictionary coding algorithms, including:
LZSS
LZW (Lempel-Ziv-Welch)
LZW Compression
LZW compression has its roots in the work of Jacob Ziv and Abraham Lempel. In 1977, they
published a paper on "sliding-window" compression, and followed it with another paper in 1978
on "dictionary" based compression. These algorithms were named LZ77 and LZ78, respectively.
Then in 1984, Terry Welch made a modification to LZ78 which became very popular and was called LZW.
The Concept
Many files, especially text files, have certain strings that repeat very often, for example " the ". With the spaces, the string takes 5 bytes, or 40 bits, to encode. But what if we were to add the whole string to the list of characters? Then every time we came across " the ", we could send its code instead of 32,116,104,101,32. This would take fewer bits.
The Algorithm:
LZWEncoding()
  s = empty;
  while more symbols remain
    read symbol c;
    if s+c is in the dictionary then
      s = s+c;
    else
      add s+c to the dictionary;
      output codeword(s);
      s = c;
  end loop
  output codeword(s);
The program reads one character at a time. If the current work string plus the new character is in the dictionary, it appends the character to the work string and waits for the next one. (This occurs on the first character as well.) If the extended work string is not in the dictionary (such as when the second character comes across), it adds the extended work string to the dictionary and sends over the wire the codeword of the work string without the new character. It then sets the work string to the new character.
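Before the worked example, here is a runnable Python sketch of the encoder. The initial dictionary here is a two-symbol assumption; the walkthrough below uses the larger dictionary built in its step 1 (not shown here), which is why its codeword numbers differ:

def lzw_encode(message, alphabet):
    # Pre-load the dictionary with the single symbols, coded 1, 2, 3, ...
    dictionary = {ch: i + 1 for i, ch in enumerate(alphabet)}
    next_code = len(dictionary) + 1
    s, output = "", []
    for c in message:
        if s + c in dictionary:
            s = s + c                      # keep growing the work string
        else:
            dictionary[s + c] = next_code  # add the new string to the dictionary
            next_code += 1
            output.append(dictionary[s])   # emit the code of the known prefix
            s = c                          # restart from the new character
    if s:
        output.append(dictionary[s])       # flush the final work string
    return output

# A message beginning "aababa" with initial dictionary {a:1, b:2}:
print(lzw_encode("aababa", "ab"))  # -> [1, 1, 2, 4, 1]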
Encoding
Check if s+c (s+c=aa) is found in the dictionary (the one created above in step 1). It is not found, so add s+c (s+c=aa) to the dictionary and output the codeword for s (s=a). The code for a is 1 from the dictionary. Then initialize s to c (s=c=a).
Read the next letter into c (c=b). Check if s+c (s+c=ab) is found in the dictionary. It is not found, so add s+c (s+c=ab) to the dictionary and output the codeword for s (s=a), which is 1. Then initialize s to c (s=c=b).
Read the next letter into c (c=a). Check if s+c (s+c=ba) is found in the dictionary. It is not found, so add s+c (s+c=ba) to the dictionary and output the codeword for s (s=b), which is 2. Then initialize s to c (s=c=a).
Read the next letter into c (c=b). Check if s+c (s+c=ab) is found in the dictionary. It is there, so initialize s to s+c (s=s+c=ab). Read the next letter into c (c=a). Check if s+c (s+c=aba) is found in the dictionary. It is not there, so add it to the dictionary and transmit the codeword for s (s=ab). The code is 6. Initialize s to c (s=c=a).
Continue reading letters into c in the same way till the end of the message. At the end you will have the following encoding table.
Now, instead of the original message, you transmit the indexes of its substrings in the dictionary. The code for the message is 112613791145.