This document discusses different techniques for data compression and organizing files for performance. It covers:
1. Ways to compress data by making files smaller through encoding, reducing redundancy, run-length encoding, variable-length codes, and irreversible compression.
2. Reasons for compression including using less storage, faster transmission, and processing speeds.
3. Methods for reclaiming space in files after records are deleted through marking deleted records, finding freed space, and reusing the space through avail lists, linked lists, stacks, and storage compaction.
Fs Mod2
Module 2
Organizing Files for Performance
Data Compression
• Data compression covers ways to make files smaller. It involves encoding the information in a file in such a way that it takes up less space.
• Reasons to make files smaller:
1. They use less storage, which saves cost.
2. They can be transmitted faster, decreasing access time or allowing the same access with lower, cheaper bandwidth.
3. They can be processed faster sequentially.
Different ways of data compression
1. Using a different notation
• A compression technique in which we decrease the number of bits by finding a more compact notation. This is classified as redundancy reduction.
• Example: the state code in a Person file is stored as two ASCII characters (2 bytes, 16 bits). There are only 50 states in the USA, so 6 bits (2^6 = 64 ≥ 50) are enough to store the state code, saving 10 bits per state code.
2. Suppressing repeating sequences
• Example: in an 8-bit image, only objects above a certain brightness are identified, and all other regions are set to pixel value 0.
• Sparse arrays of this sort are very good candidates for a kind of compression called run-length encoding.
• Run-length encoding algorithm:
1. First, choose one special unused byte value to indicate that a run-length code follows.
2. Read through the pixels that make up the image, copying the pixel values to the file in sequence, except where the same pixel value occurs more than once in succession.
3. Where the same value occurs more than once in succession, substitute the following 3 bytes, in order:
   1. The special run-length code indicator
   2. The pixel value that is repeated
   3. The number of times that value is repeated (up to 256 times)
3. Assigning variable-length codes
• Variable-length codes are based on the principle that some values occur more frequently than others, so the codes for those values should take the least amount of space.
• This is another form of redundancy reduction.
• Example: Morse code (using dots (.) and dashes (-)).
• Huffman coding
4. Irreversible compression techniques
• These are based on the assumption that some information can be sacrificed.
Reclaiming Space in Files
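Looking back at the run-length encoding steps in the compression section, the following is a minimal Python sketch of them, not a definitive implementation. The marker value 0xFF, the cap of 255 repetitions per run (so the count fits in one byte as written), and the function names rle_encode and rle_decode are all assumptions made for illustration.

```python
RUN_MARKER = 0xFF  # assumed: a byte value that never occurs as a pixel value
MAX_RUN = 255      # assumed cap so the repeat count fits in a single byte


def rle_encode(pixels: bytes) -> bytes:
    """Copy pixels through, replacing each run with (marker, value, count)."""
    out = bytearray()
    i = 0
    while i < len(pixels):
        value = pixels[i]
        if value == RUN_MARKER:
            raise ValueError("marker byte must not appear in the pixel data")
        # Measure how many times the current value repeats in succession.
        run = 1
        while (i + run < len(pixels)
               and pixels[i + run] == value
               and run < MAX_RUN):
            run += 1
        if run > 1:
            out += bytes([RUN_MARKER, value, run])  # the 3-byte run code
        else:
            out.append(value)                       # single value, copied as-is
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Expand every (marker, value, count) triple back into a run."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == RUN_MARKER:
            value, count = data[i + 1], data[i + 2]
            out += bytes([value]) * count
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)
```

For the 18-pixel sequence 22 23 24 24 24 24 24 24 24 25 26 26 26 26 26 26 25 24, rle_encode produces the 11 bytes 22 23 FF 24 07 25 FF 26 06 25 24 (shown in hexadecimal for the marker and counts), and rle_decode restores the original sequence. Because the steps replace every run of two or more values, a run of exactly two grows from 2 bytes to 3.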
• Record addition alone does not disturb the existing contents of a file, but record updating and deletion modify the file, and we need to reuse the space they free.
Record Deletion and Storage Compaction
• Storage compaction makes files smaller by looking for places in a file where there is no data at all and recovering this space.
• Any record-deletion strategy must provide some way for us to recognize records as deleted. We can place a special mark in the first field of a deleted record.
• Once we recognize a record as deleted, we next need to know how to reuse the space from the record.
• Programs using this file must include logic that causes them to ignore records that are marked as deleted.
• With storage compaction, reclamation of space from the deleted records happens all at once.
• After deleted records have accumulated for some time, a special program is used to reconstruct the file with all deleted records squeezed out (Fig 3(c)).
Deleting Fixed-Length Records for Reclaiming Space Dynamically
• To provide a mechanism for record deletion with subsequent reuse of the freed space, we need to be able to guarantee two things:
1. Deleted records are marked in some special way.
2. We can find the space that deleted records once occupied, so that we can reuse that space when we add records.
• The first requirement can be met by placing an asterisk in the deleted record.
• For the second, we can search sequentially for a deleted record so that the new record can be written in its place. If we reach the end of the file without finding one, the record is appended at the end of the file.
Linked Lists
Stacks
• A stack is a list in which all insertions and removals of nodes take place at one end of the list.
• If we have an avail list managed as a stack that contains relative record numbers (RRNs) 5 and 2, and we then add RRN 3, RRN 3 becomes the new top of the stack.
Linking and Stacking Deleted Records
Deleting Variable-Length Records
• We now have a mechanism for handling an avail list of available space once records are deleted. This mechanism will now be applied to reusing space from variable-length records.
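The avail list for fixed-length records described above, managed as a stack of RRNs, can be sketched in Python as follows. This is an illustrative in-memory stand-in for a file, not a definitive implementation: the class name FixedLengthFile, the use of -1 as the end-of-list value, and the choice to store the link right after the asterisk in the deleted slot are all assumptions.

```python
class FixedLengthFile:
    """Hypothetical in-memory model of a fixed-length record file."""

    def __init__(self):
        self.records = []      # slot i holds the record whose RRN is i
        self.avail_head = -1   # RRN on top of the avail stack; -1 means empty

    def delete(self, rrn: int) -> None:
        # Mark the slot with an asterisk and store the old top of the stack
        # in it, then make this slot the new top (a push).
        self.records[rrn] = f"*{self.avail_head}"
        self.avail_head = rrn

    def add(self, record: str) -> int:
        # No freed slot available: append at the end of the file.
        if self.avail_head == -1:
            self.records.append(record)
            return len(self.records) - 1
        # Otherwise reuse the slot on top of the stack (a pop): the link
        # stored after the asterisk becomes the new top.
        rrn = self.avail_head
        self.avail_head = int(self.records[rrn][1:])
        self.records[rrn] = record
        return rrn
```

Deleting RRN 2, then 5, then 3 leaves RRN 3 on top of the stack, linked to 5 and then 2. The next three additions therefore reuse slots 3, 5, and 2 in that order, and only after that does a new record get appended at the end. Keeping the links inside the deleted slots means no sequential search for free space is needed.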
• To support record reuse through an avail list, we need
Avail list of variable-length records
• Adding and removing records
Storage fragmentation
Binary search vs sequential search
• To find a record in two thousand records file a Binary Search will