
2 imageAudioVideo

The document discusses multimedia basics and representation, focusing on text, images, audio, and video. It covers topics like:
- Text representation and typography principles: typefaces, fonts, and font attributes.
- Digital image representation: bitmap and grayscale images and the RGB color model.
- Audio representation, moving from analog to digital, using formats like MIDI, WAV, and MP3.
- Video representation as a series of framed images put together to simulate motion, stored in files with extensions like MOV, AVI, and MPEG.


Chapter TWO
Multimedia Basics and Representation
Text
• Text is a vital element of multimedia
presentations.
• Words and symbols in any form, spoken or
written, are the most common system of
communication. They deliver the most
widely understood meaning to the greatest
number of people— accurately and in
detail.
…continued
• Text is a visual representation of language,
as well as a graphic element in its own
right.
• The study of how to display text is known
as typography. It concerns the precise shape
of characters, their spacing, the layout of
lines and paragraphs, etc.
Typefaces and fonts
• To display text, we need to have a visual
representation of the characters stored as codes in the
computer.
• A typeface is a family of graphic characters with a
coherent design and usually includes many sizes and
styles.
• A font is a set of graphic characters with a specific
design in a specific size and style.
• For example, the typeface used in this paragraph
is ‘Arial’. The font is ‘Arial 28pt’. Arial may contain
many fonts such as Arial Black, Arial Narrow, etc.
Classification of typeface
• Serif refers to the small tick marks that you often
see at the endings of strokes in some fonts such as:
– Times New Roman
– Courier
• Sans serif usually refers to those fonts that do
not have tick marks at the character endings.
Sans means without. Examples include:
– Arial
– Helvetica
– Univers 55
• Note: serif typefaces may look
beautiful but can be difficult to
read in some environments, such as
in multimedia images.
Example fonts and sizes
• Arial 32: Multimedia systems
• Times New Roman 22: Multimedia systems
• Albertus extra bold 24: Multimedia systems
• Algerian 28: Multimedia systems
• Abyssinica 30: Multimedia systems
• Apple chancery 30: Multimedia systems
• Bauhaus 93 22: Multimedia systems

• Vladimir Script 40: Multimedia systems


Font measurement
• When putting characters onto a
page, we need to know some basic
measurements of the type we use.
• Each character has a bounding box.
This is the rectangle enclosing the
entire character.
• Each character has an origin, usually
placed on the baseline. (The part of a
character that extends below the
baseline is its descender.)
• The width of the character determines
where the origin of the next character
will be.
• The distance between the origin and
the left side of the bounding box is
called the left side bearing.
Bitmap and outline fonts
Font formats can be divided into two main categories:
bitmap fonts and outline fonts.
• Bitmap fonts come in specific sizes and resolutions.
Because the font contains the bitmaps of the character
shapes, the result will be very poor if they are scaled
to different sizes.
• Outline fonts contain the outlines of the characters.
They can be scaled across a large range of sizes
and still look reasonable. They need a
rasterizing process to be displayed on screen.
[Slide shows the letter ‘W’ rendered as a bitmap font and as an outline font.]
Font attributes
Five attributes are often used for specifying a font:
• Family — fonts in the same family have similar design, look and feel.
Here are some of the common families:
– Times, Helvetica, Courier, Garamond, Univers
• Shape — refers to the different appearance within a family.
– normal (upright), italic, SMALL CAP
• Weight — measures the darkness of the characters, or the thickness of
the strokes. The commonly used names are:
– ultra light, extra light, light, semi light, medium, semi bold, bold,
extra bold, etc.
• Width — the amount of expansion or contraction with respect to the
normal or medium in the family.
• Size — unit of measure is point.
– 1 inch = 72.27 point in printing industry.
– 1 inch = 72 point in PostScript systems.

– tiny, small, normal, large, larger, even larger, huge (rendered at increasing sizes on the slide)


Image
• Image:
– It is a 2-D object, which is stored as a specific
arrangement of dots, or pixels.
• The picture elements known as pixels can
be either on or off, as in the 1-bit bitmap, or,
by using more bits to describe them, can
represent varying shades of color (4 bits for
16 colors; 8 bits for 256 colors; 15 bits for
32,768 colors; 16 bits for 65,536 colors; 24
bits for 16,777,216 colors).
Fig 2.2 - A bitmap is a data matrix that describes the characteristics of all
the pixels making up an image. Here, each cube represents the data required
to display a 4 × 4–pixel image (the face of the cube) at various color depths
(with each cube extending behind the face indicating the number of bits—zeros or ones—
used to represent the color for that pixel).
Digital Images
• To store an image, it is represented as 2-D samples of a surface,
which can be arranged as a matrix.
• Each sample in an image is called a pixel, the smallest image
resolution element.
• Each pixel has a numerical value; the number of bits available
to code a pixel depends on the type of image.
Image content
• An image contains a header and a bunch of (integer) numbers.
Types of Digital Images
• Grayscale image
– Usually we use 256 levels for each pixel.
That means the numerical values for gray
levels range from 0 (for black pixels) to
255 (0xFF) for white. Thus we need 8 bits
to represent each pixel (2^8 = 256).
– Gray scale ranges from black through grays
and finally to white. [Slide shows an 8-bit
grayscale image.]
• Binary image
– A binary image has only two values (0 or
1). A numerical value represents
either a black (0) or a white (1) dot/pixel.
– Binary images are quite important in image
analysis and object detection
applications.
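The two image types above can be sketched in a few lines of plain Python (no imaging library; the 4 x 4 pixel values are made up for illustration): an 8-bit grayscale image is just a matrix of integers in 0..255, and thresholding it yields a binary image.

```python
# A tiny 4x4 grayscale image: 0 = black, 255 = white.
gray = [
    [  0,  64, 128, 255],
    [ 32,  96, 160, 224],
    [ 16,  80, 144, 208],
    [  8,  72, 136, 200],
]

def to_binary(image, threshold=128):
    """Map each pixel to 1 (white) if >= threshold, else 0 (black)."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]

binary = to_binary(gray)
print(binary[0])  # first row: [0, 0, 1, 1]
```

Thresholding like this is the simplest way to obtain the binary images used in object detection; real systems choose the threshold adaptively.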
RGB Color Model
• To form a color with RGB, three
separate color signals, one red,
one green, and one blue, must be
mixed. Each of the three signals
can have an arbitrary intensity,
from fully off to fully on, in the
mixture.
– The RGB color model is an additive
color mixing model in which every
color can be encoded as a combination
of red, green, and blue light.
– Projection of the primary color lights on a
screen shows secondary colors where
they overlap; for instance, the
combination of all three of red, green,
and blue at appropriate intensities
makes white.
RGB Color Model
• The main purpose of the RGB color
model is for display of images in
electronic systems, such as televisions
and computers.
– Typical RGB input devices are color TV &
video cameras, image scanners, and digital
cameras.
– Typical RGB output devices are TV sets of
various technologies (CRT, LCD, plasma,
etc.), computer and mobile video projectors,
phone displays, etc.
– Color printers, on the other hand, are usually
not RGB devices, but subtractive color
devices (typically CMYK color model).
RGB color model
 The figure shows an RGB image, along with its
separate R, G and B components. Note that:
– strong red, green, and blue produce white;
likewise, strong red and green with little
blue give brown; strong green with little red or
blue gives dark green; strong blue with
moderately strong red and green gives light
sky blue.
 The number of bits used to represent each pixel in RGB space is
called the pixel depth.
– Consider an RGB image in which each of the red, green, and
blue colors is an 8-bit representation. Under these conditions
each RGB color pixel has a depth of 24 bits.
– Compute the total number of colors in a 24-bit RGB image.
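The question above can be answered directly: with 8 bits per channel, each of R, G and B takes 2**8 = 256 values, so a 24-bit pixel can encode 256 * 256 * 256 distinct colors.

```python
bits_per_channel = 8
levels = 2 ** bits_per_channel   # 256 levels per channel
total_colors = levels ** 3       # three channels: R, G, B
print(total_colors)              # 16777216
assert total_colors == 2 ** 24
```

This is the 16,777,216-color figure quoted for 24-bit images earlier in the chapter.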
Color Image
• Characterization of light is central
to the science of color.
• There are different color models:
RGB, YUV, YIQ, HSV, CMYK
(Cyan, Magenta, Yellow, Black), etc.
[Slide shows the R, G, and B components of a 24-bit image.]
Color Table
[Slide shows clusters of colors and an image reduced to 256 colors.]
• It is possible to use far fewer
colors to represent a
color image without
much degradation.
• Audio: used to record sound.
– In the past 20 years, audio has moved from analog recording on tape
cassettes to totally digital recording using computers.
– Today, the Musical Instrument Digital Interface (MIDI) allows
anyone to create music right on their desktop. MIDI is a digital
standard that defines how to code musical scores, such as sequences
of notes, timing conditions, and the instrument to play each note.

• Video:
– A series of framed images put together, one after another, to
simulate motion and interactivity. A video is characterized by the
number of frames per second and/or the amount of time between
switching frames.
– The difference between video and animation is that video is broken
down into individual frames.
Digital Media
• In computers, audio, image and video are stored as files
just like other text files (e.g. DOC, TXT, TEX, etc.).
– For images, these files can have an extension like
• BMP, JPG/JPEG, GIF, TIF, PNG, PPM, …
– For audios, the file extensions include
• WAV, MP3, m4a, AMR, WMA…
– Video files usually have extensions:
• MOV, AVI, MPEG, MP4, 3gp, …

• What about PDF file? PS file?


Digital Media Capturing
• To get a digital image, an audio or a video clip, we need some
media capturing devices
• Image:
– is captured using devices such as a digital camera or a digital
scanner
• Audio:
– is recorded using a digital audio recorder (or Microphone), such as
Olympus Voice Recorder, MP3 digital recorder, SONY Voice
Recorder, etc.
• Video:
– is recorded using a digital camcorder.
– A camcorder is a video camera that records video and audio using a
built-in recorder unit. The camcorder contains both a video camera
and a video recorder in one unit, hence its compound name.
Advantages of digital media over analog?
• Do digital cameras do things that analog still
cameras cannot?
• The following are some of the advantages of digital
media:
1. Displaying images/audios/videos on a screen immediately
after they are recorded
2. Storing thousands of images/audios/videos on a single small
memory device
3. Deleting images/audios/videos to free storage space
4. A digital camera can also record video with sound, and a
camcorder can also capture still images.
Convert Analog to Digital Media
• Once the media is captured, there is a need to process them to convert
the continuous signal to digital. Hence, all the devices used for
capturing the digital media have to complete the following tasks:
• Sampling: converts a continuous media (analog signal) into a
discrete set of values at regular time and/or space intervals.
–Given an analog media, sampling represents a mapping of the
media from a continuum of points in space (and possibly time, if it
is a moving image) to a discrete set.
• Digitization/quantization: converts a sampled signal into a
signal that can take only a limited number of values (or bit
depth).
–E.g. an 8-bit quantization provides 256 possible values
• Compression: some further compression may be applied
to reduce file size and save space.
– Compression is minimizing the size in bytes of a media file without
degrading the quality.
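The sampling and quantization steps above can be sketched in plain Python. The 440 Hz tone, the 8 kHz rate, and the 8-bit depth are illustrative choices, and the compression step is omitted:

```python
import math

fs = 8000       # sampling rate in Hz (8 kHz, telephone quality)
f = 440         # signal frequency in Hz
n_samples = 16

# Sampling: evaluate the continuous signal at t = n / fs.
samples = [math.sin(2 * math.pi * f * n / fs) for n in range(n_samples)]

# Quantization: map each value in [-1, 1] to one of 2**8 = 256 levels.
def quantize(x, bits=8):
    levels = 2 ** bits
    return min(int((x + 1) / 2 * levels), levels - 1)

digital = [quantize(s) for s in samples]
print(digital)  # 16 integers, each in 0..255
```

The list `digital` is the "digitized sequence" of the conversion figures that follow: regularly spaced samples, each rounded to a limited set of integer values.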
Sampling and Quantization of Audio
[Slide shows an audio signal waveform.]
• The rate at which a continuous waveform is
sampled is called the sampling rate.
Sampling Audio
• Any signal can be represented as a summation of sine waves.
• Good sampling follows the Nyquist theorem.
– If we have a signal with frequency components f1 < f2 < … < fn, what
sampling frequency can we use?
– Nyquist theorem: the necessary condition for reconstructing a continuous
signal from its sampled version is that the sampling frequency satisfies
fs ≥ 2·fmax (where fmax is the highest frequency component in the signal).
• Range of human hearing (music): 20 Hz – 20 kHz
– We lose high-frequency response with age; women generally have
better response than men.
– To reproduce 20 kHz requires a sampling rate of 40 kHz.
• Speech (as in telephony) occupies roughly 5 Hz – 4 kHz.
– According to Nyquist, it takes 8,000 samples per second (2 times 4,000) to
capture a 4,000 Hz signal perfectly.
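The Nyquist condition, commonly stated as fs ≥ 2·fmax, reproduces both rates quoted above:

```python
def min_sampling_rate(f_max_hz):
    """Minimum sampling rate (Hz) to reconstruct a signal whose
    highest frequency component is f_max_hz, per the Nyquist theorem."""
    return 2 * f_max_hz

print(min_sampling_rate(20_000))  # music, 20 kHz ceiling -> 40000 Hz
print(min_sampling_rate(4_000))   # telephony speech, 4 kHz -> 8000 Hz
```

In practice recorders sample slightly above the minimum (e.g. 44.1 kHz for CD) to leave room for the anti-aliasing filter.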
Sampling for an Audio Signal
• When capturing audio covering the entire 20 Hz – 20 kHz range of
human hearing, such as when recording music, audio waveforms are
typically sampled at 44.1 kHz (CD) or 48 kHz (professional audio).
[Slide shows a sinusoid with signal period T (f = 1/T) and amplitude on the vertical axis, sampled at intervals of the sampling period Ts (fs = 1/Ts).]

Problem: there are an infinite number of possible sine waves passing
through the sampling points.
Analog-to-digital conversion process: (a) original analog signal; (b)
sampling pulses; (c) sampled values and quantization intervals; and (d)
digitized sequence.
Reconstruction of audio from
digitized data
Digital-to-analog conversion process: (a) digital sequence; (b) step
signals; (c) smooth signal recovered after passing through a low-pass filter.
Increasing the sampling rate
Sampling vs. Digitization/Quantization
• The samples are continuous and have an infinite number of
possible values at every sampled point (taken at regular time
intervals and/or space intervals).
• The digitization/quantization process approximates each sample
with one of a fixed number of values (determined by the bit depth).
• To represent N numbers, we need log2 N bits.
– For example, an 8-bit quantization represents 256 possible values.
– What about a 16-bit quantization? It handles 65,536 possible values.
• So, what determines the number of bits we need for an
audio clip or an image?
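The log2 rule above can be checked with Python's standard math module:

```python
import math

def bits_needed(n_levels):
    """Bits required to represent n_levels distinct values."""
    return math.ceil(math.log2(n_levels))

print(bits_needed(256))    # 8
print(bits_needed(65536))  # 16
print(2 ** 16)             # 65536 values from a 16-bit quantization
```

The `ceil` matters for level counts that are not powers of two: representing, say, 300 levels still takes 9 full bits.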
Digital Audio
• Music has more high-frequency components than speech.
– 44.1 kHz is the sampling frequency for CD music.
– 8 kHz sampling is good enough for telephone-quality speech,
since all the energy is contained in the 5 Hz – 4 kHz range.
• Audio is typically recorded at 8-, 16-, or 20-bit depth.
CD-quality audio is recorded at 16 bits.
– You often hear audio (music) quantized at 16 bits for
each sampled value at 44.1 kHz.
– 16 bits means each sample is represented as a 16-bit integer,
which gives 65,536 possible values.
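Putting the CD figures above together gives the raw (uncompressed) bit rate. The two stereo channels are an assumption here; the slide discusses a single signal:

```python
sample_rate = 44_100   # samples per second (CD)
bit_depth = 16         # bits per sample
channels = 2           # stereo (assumed)

bit_rate = sample_rate * bit_depth * channels
print(bit_rate)         # 1411200 bits per second
print(2 ** bit_depth)   # 65536 possible values per sample
```

That roughly 1.4 Mbit/s raw rate is why compressed formats such as MP3 exist.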
Sampling and Quantization of Image
• The sampling theorem applies to 2-D signals
(images) too.
[Slide shows pixels as infinitely small point samples on a sampling grid.]
• During sampling we have to determine the sampling rate, e.g.
sampling every third pixel. The intermediate pixels are filled in with
the sampled values.
Representing an Image
• To represent an image without noticeable deterioration, we
would have to use a matrix of at least 640 x 480 pixels.
– How much space is required by a grayscale image with
such a specification?
– 640 × 480 × 8 bits = 2,457,600 bits = 307,200 bytes
– With each pixel represented by an 8-bit integer, this
image specification results in a matrix of 307,200
eight-bit numbers (a total of 2,457,600 bits, or about 307 KB).
– An RGB image needs 640 × 480 × 24 bits (8 × 3),
i.e. 307,200 × 3 = 921,600 bytes.
• This is also the resolution used by the Video Graphics
Array (VGA) standard when configuring the graphics
card of computers.
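The arithmetic above can be wrapped in a small helper (a sketch: sizes are raw pixel data in bytes, with no file-format header overhead counted):

```python
def image_bytes(width, height, bits_per_pixel):
    """Raw storage for an uncompressed image, in bytes."""
    return width * height * bits_per_pixel // 8

print(image_bytes(640, 480, 8))   # grayscale: 307200 bytes
print(image_bytes(640, 480, 24))  # RGB: 921600 bytes (307200 * 3)
```

The same helper answers the per-frame storage questions for the video adapters later in the chapter.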
Image Storing format & compression
• The most popular image storing formats include: BMP, JPG,
GIF, TIF, PNG, PPM, …
–To store an image, the image is represented in a 2-D matrix, in
which each value corresponds to the data associated with one image
pixel.
– The image format also influences the storage requirements of an
image. If storage space is scarce, images should be compressed in a
suitable way. If sufficient memory is available, the image can be
stored uncompressed.
• For instance:
– BMP (Bit Map) format does not compress the original image.
– GIF (Graphics Interchange Format) images are compressed
losslessly, often reducing file size substantially with no loss in
image quality. GIF supports up to 256 colors.
– JPEG (Joint Photographic Experts Group) became an ISO
international standard for compression of images. It applies to color
and gray-scaled images.
Video Sampling and Quantization
• An analog video signal is continuous in space and time, and sampling
considers both time and space.
• Video sampling breaks each frame into 720 x 480 pixels.

[Slide shows a video as a stack of frames, Frame 0 through Frame N-1, along the time axis.]
• Video quantization is essentially the same as
image quantization.
• During video quantization each pixel is
represented by a bit depth of, say, 8 bits,
representing luminance and color information.
Human visual system
• What characteristics of the human visual system can be exploited
in relation to compression of color images and video?
• The eye is basically sensitive to color intensity.
– Each photoreceptor neuron is either a rod or a cone. Rods are not sensitive to color.
– Cones come in 3 types: red, green and blue.
– Each type responds differently (non-linearly, and not equally for
R, G, and B) to various frequencies of light.
Color System in Video
• Video signals are often transmitted to the receiver over a
single television channel
–In order to encode color, a video signal is decomposed into three
sub-signals: a luminance signal and two color signals.
–Since human vision is more sensitive to brightness than to color, a
more suitable color encoding system separates the luminance from
color information. Such models include YUV, YIQ, etc.
 The YUV color model: while the RGB model
separates colors, the YUV model separates
brightness (luminance) information from
the color information. Y is the luminance
component (brightness) and U and V are
color components.
– It is obtained from RGB using the following
equations:
Y = 0.299 R + 0.587 G + 0.114 B
U = B – Y
V = R – Y
[Slide shows the Y, U and V components.]
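The YUV equations can be sketched as a small conversion function. Note that the standard luma weight for blue is 0.114 (ITU-R BT.601); U and V are color-difference signals and vanish for pure grays:

```python
def rgb_to_yuv(r, g, b):
    """Convert an RGB triple to (Y, U, V) using the unscaled
    color-difference form: U = B - Y, V = R - Y."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = b - y
    v = r - y
    return y, u, v

print(rgb_to_yuv(255, 255, 255))  # white: Y ~ 255, U ~ 0, V ~ 0
print(rgb_to_yuv(0, 0, 0))        # black: all zero
```

Because human vision is more sensitive to Y than to U and V, codecs can store the color components at reduced resolution, which is the compression opportunity this section describes.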
Color System in Video
YIQ color model
• The YIQ color model is a similar encoding system to YUV.
• It produces the I and Q colors and adds the
modulated signal to the luminance Y.
– It is obtained from RGB using the following
equations.
Y = 0.3 R + 0.59 G + 0.11 B
I = 0.60 R – 0.28 G – 0.32 B
Q = 0.21 R – 0.52 G + 0.31 B

[Slide shows the I and Q components.]
Video Storing format & compression
• Each video format supports various resolutions and color
presentations. The following are the well-known video formats.
• The Color Graphics Adaptor (CGA):
– Has a resolution of 320 x 200 pixels with simultaneous display of four
colors: 320 × 200 × log2(4) bits = 128,000 bits = 16,000 bytes.
– What is the necessary storage capacity per frame?
• The Enhanced Graphics Adaptor (EGA):
– Supports a display resolution of 640 x 350 pixels with 16 simultaneous
display colors.
– What is the necessary storage capacity per frame?
• The Video Graphics Array (VGA):
– Works mostly with a resolution of 640 x 480 pixels with 256
simultaneous display colors.
– What is the necessary storage capacity per frame?
• The Super Video Graphics Array (SVGA):
– Can present 256 colors at a resolution of 1024 x 768 pixels.
– What is the necessary storage capacity per frame?
– Other SVGA modes include 1280 x 1024 pixels and 1600 x 1280 pixels.
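The per-frame storage questions above can be answered with one helper. Bits per pixel is log2 of the number of simultaneous colors; adapter memory layout and padding are ignored in this sketch:

```python
import math

def frame_bytes(width, height, colors):
    """Raw per-frame storage: pixels * ceil(log2(colors)) bits, in bytes."""
    bits_per_pixel = math.ceil(math.log2(colors))
    return width * height * bits_per_pixel // 8

print(frame_bytes(320, 200, 4))     # CGA:  16000 bytes
print(frame_bytes(640, 350, 16))    # EGA:  112000 bytes
print(frame_bytes(640, 480, 256))   # VGA:  307200 bytes
print(frame_bytes(1024, 768, 256))  # SVGA: 786432 bytes
```

Multiplying any of these by the frame rate gives the raw data rate of uncompressed video, which motivates the compression formats (MPEG etc.) named earlier.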
Exercise
• Suppose we have 24 bits per pixel available for a color
image. We also note that humans are more sensitive to
red and green colors than to blue, by a factor of
approximately 1.5 times. How may we design a simple
color representation to make use of the bits available?
• Why do we use different types of formats for specific media?
• Quite a simple scheme:
– Since blue is less perceptually important, use fewer bits to
represent the blue color; use proportionately more bits for red
and green than for blue.
– Therefore red and green use 9 bits each, and blue uses 6 bits, to
represent values.
– We need to quantize at different levels for blue than for red/green.
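One possible (hypothetical) bit layout for the 9/9/6 scheme above: red in the top 9 bits (512 levels), green in the middle 9, blue in the low 6 (64 levels), 24 bits in total.

```python
def pack_rgb996(r, g, b):
    """Pack r, g in 0..511 and b in 0..63 into one 24-bit integer."""
    assert 0 <= r < 512 and 0 <= g < 512 and 0 <= b < 64
    return (r << 15) | (g << 6) | b

def unpack_rgb996(pixel):
    """Recover the (r, g, b) triple from a packed 24-bit pixel."""
    return (pixel >> 15) & 0x1FF, (pixel >> 6) & 0x1FF, pixel & 0x3F

pixel = pack_rgb996(511, 256, 63)
print(unpack_rgb996(pixel))  # (511, 256, 63)
```

The exact ordering of the channels within the 24 bits is a design choice; only the 9/9/6 split of bits follows from the perceptual argument in the exercise.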
