1.3 Compression
Zain Merchant
It is often necessary to reduce the size of a file, either to save storage space or
to reduce the time taken to stream or transmit data from one device to another.
The two most common forms of file compression are lossless file compression and
lossy file compression.
This means we have five characters with ASCII code 97, four characters with ASCII
code 98, two characters with ASCII code 99, and five characters with ASCII code
100. Assuming each number in the second row requires 1 byte of memory, the RLE
code will need 8 bytes. This is half the original file size.
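The run-length encoding described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `rle_encode` is hypothetical, the input string is reconstructed from the run counts given in the text, and the sketch assumes no run is longer than 255 so that each count fits in 1 byte.

```python
from itertools import groupby

def rle_encode(text):
    """Return a flat list of (run length, ASCII code) pairs, one pair per run."""
    encoded = []
    for char, run in groupby(text):
        # groupby yields each maximal run of identical characters
        encoded.extend([len(list(run)), ord(char)])
    return encoded

# String reconstructed from the counts in the text: 5 x 'a', 4 x 'b', 2 x 'c', 5 x 'd'
original = "aaaaabbbbccddddd"
encoded = rle_encode(original)

print(encoded)                      # [5, 97, 4, 98, 2, 99, 5, 100]
print(len(original), len(encoded))  # 16 8
```

At 1 byte per value, the 16-byte string is stored in 8 bytes, which matches the halving stated above.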
One issue occurs with a string such as ‘cdcdcdcdcd’, where compression is not very
effective. To cope with this we use a flag. A flag preceding data indicates that what
follows is the number of repeating units (for example, 255 05 97 where 255 is the
flag and the other two numbers indicate that there are five items with ASCII code
97). When a flag is not used, the next byte(s) are taken with their face value and a
run of 1 (for example, 01 99 means one character with ASCII code 99 follows).
Consider this example:
The original string contains 32 characters and would occupy 32 bytes of storage.
The coded version contains 18 values and would require 18 bytes of storage.
Introducing a flag (255 in this case) produces:
255 08 97 255 10 98 99 100 99 100 99 100 255 08 101
This has 15 values and would, therefore, require 15 bytes of storage. This is a
reduction in file size of about 53%.
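The comparison between the plain and the flagged coding can be checked with a short Python sketch. Several things here are assumptions rather than part of the original text: the function names, the `min_run` threshold of 4 (a flagged run costs 3 bytes, so shorter runs are left as literal bytes), and the input string, which is reconstructed by decoding the flagged sequence above. The sketch also assumes the data never contains the value 255 itself, which holds for ASCII text.

```python
from itertools import groupby

FLAG = 255  # flag value used in the text

def rle_encode_plain(text):
    """Every run, even of length 1, is stored as (count, ASCII code)."""
    out = []
    for char, run in groupby(text):
        out += [len(list(run)), ord(char)]
    return out

def rle_encode_flagged(text, min_run=4):
    """Runs of at least min_run are stored as (FLAG, count, code);
    shorter runs are written as literal ASCII codes."""
    out = []
    for char, run in groupby(text):
        length = len(list(run))
        if length >= min_run:
            out += [FLAG, length, ord(char)]
        else:
            out += [ord(char)] * length
    return out

# Reconstructed input: 8 x 'a', 10 x 'b', 'cdcdcd', 8 x 'e' (32 characters)
original = "a" * 8 + "b" * 10 + "cdcdcd" + "e" * 8

plain = rle_encode_plain(original)
flagged = rle_encode_flagged(original)

print(len(original), len(plain), len(flagged))  # 32 18 15
print(flagged)
reduction = 100 * (1 - len(flagged) / len(original))
print(round(reduction))  # 53
```

The flag only pays off because the alternating ‘cdcdcd’ section is stored as 6 literal bytes instead of 12 bytes of count and code pairs.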
The 8 × 8 grid would need 64 bytes; the compressed RLE format has 30 values,
and therefore needs only 30 bytes to store the image.
Coloured images
Figure 1.8 shows an object in four colours. Each colour is made up of red, green
and blue (RGB) according to the code on the right.
The original image (8 × 8 square) would need 3 bytes per square (to include all
three RGB values). Therefore, the uncompressed file for this image is 8 × 8 × 3 =
192 bytes.
The RLE code has 92 values, which means the compressed file will be 92 bytes in
size. This gives a reduction in file size of about 52%. It should be noted that the
reductions in reality will not be as large as this due to other data which needs to be
stored with the compressed file (such as a file header).
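The arithmetic for the coloured image, and one plausible way of run-length encoding RGB pixels, can be sketched as follows. The function names, the layout of each run as one count followed by three colour bytes, and the sample row of pixels are all assumptions made for illustration; they are not taken from Figure 1.8.

```python
from itertools import groupby

def reduction_percent(original_bytes, compressed_bytes):
    """Percentage by which the file size has been reduced."""
    return 100 * (1 - compressed_bytes / original_bytes)

def rle_encode_pixels(pixels):
    """Store each run of identical pixels as: count, red, green, blue."""
    out = []
    for colour, run in groupby(pixels):
        out.append(len(list(run)))
        out.extend(colour)
    return out

# Figures quoted in the text: 8 x 8 pixels at 3 bytes each, 92 values after RLE
uncompressed = 8 * 8 * 3
print(uncompressed)                                # 192
print(round(reduction_percent(uncompressed, 92)))  # 52

# Hypothetical row of 8 pixels: 3 red followed by 5 green
RED, GREEN = (255, 0, 0), (0, 255, 0)
row = [RED] * 3 + [GREEN] * 5
encoded_row = rle_encode_pixels(row)
print(encoded_row)  # [3, 255, 0, 0, 5, 0, 255, 0]
```

In this sketch the 24-byte row shrinks to 8 values, but a row in which every pixel differs from its neighbour would grow from 24 to 32 values, which is why RLE suits images with large blocks of a single colour.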
General methods of compressing files
All the above file compression techniques are excellent for very specific types of
file. However, it is also worth considering some general methods to reduce the size
of a file without the need to use lossy or lossless file compression: