ss_ch2
Data Structures for Language Processing
Bhargavi H. Goswami
Assistant Professor
Sunshine Group of Institutions
INTRODUCTION:
• Which operation is frequently used by a
Language Processor?
• Ans: Search.
• This makes the design of data structures a
crucial issue in language processing activities.
• In this chapter we shall discuss the data
structure requirements of language processors (LP)
and suggest efficient data structures to meet these
requirements.
Criteria for Classification of Data
Structure of LP:
• C: float *ptr;
ptr = (float*)calloc(5,sizeof(float));
• The Pascal call new(p) allocates sufficient memory to
hold an integer value and puts the address of this
memory area in p.
• The C statement ptr=… allocates a memory area
sufficient to hold 5 float values and puts its address
in ptr.
• Thus, access to these memory areas is implemented
through the pointers p and ptr.
• Conclusion: No search is involved in accessing the
allocated memory.
SEARCH DATA STRUCTURES
Search Data Structures Topic List
• Entry Formats
• Fixed and variable length entries
• Hybrid entry formats
• Operations on search structures
• Generic Search Procedures
• Table organizations
• Sequential Search Org
• Binary Search Org
• Hash table Org
• Hashing functions
• Collision handling methods
• Linked list
• Tree Structured Org
Search Data Structure:
• When we talk of ‘Search’, what is the basic requirement?
• Ans. A ‘Key’. The key is the symbol field containing the name of an
entity.
• A Search Data Structure (also called a search structure) is a set
of entries, each entry accommodating the information
concerning one entity.
• Each entry in a search structure is a set of fields, i.e., a record
(a row).
• Each entry is divided into two parts:
– Fixed Part
– Variant Part
• The value in fixed (tag) part determines the information to
be stored in the variant part of the entry.
Entries in the symbol table of a
compiler have the following fields:
--------------------------------------------------------------------
Tag Value     Variant Part Fields
--------------------------------------------------------------------
Variable      type, length, dimension info
Procedure     address of parameter list,
              number of parameters
Function      type of returned value, length
              of returned value, address of
              parameter list, number of
              parameters
Label         statement number
--------------------------------------------------------------------
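The way the tag selects the variant part can be sketched in C with a tagged union. This is only an illustrative sketch: the type and field names (`SymEntry`, `make_label`, `variant_field_count`), the 16-character name limit and the use of `int` for every field are assumptions, not something the slides prescribe.

```c
#include <assert.h>
#include <string.h>

/* Tag values taken from the table above. */
enum Tag { TAG_VARIABLE, TAG_PROCEDURE, TAG_FUNCTION, TAG_LABEL };

/* One symbol table entry: a fixed part (name, tag) followed by a
   variant part whose layout is selected by the tag.
   Field names and widths are assumptions made for this sketch. */
struct SymEntry {
    char name[16];          /* fixed part: the key */
    enum Tag tag;           /* fixed part: selects the variant below */
    union {
        struct { int type, length, dim_info; } var;
        struct { int param_list_addr, num_params; } proc;
        struct { int ret_type, ret_length,
                     param_list_addr, num_params; } func;
        struct { int stmt_number; } label;
    } u;
};

/* Build an entry for a label; only the label variant is filled in. */
struct SymEntry make_label(const char *name, int stmt_number)
{
    struct SymEntry e;
    assert(strlen(name) < sizeof e.name);
    memset(&e, 0, sizeof e);
    strcpy(e.name, name);
    e.tag = TAG_LABEL;
    e.u.label.stmt_number = stmt_number;
    return e;
}

/* Number of variant-part fields that are meaningful for a given tag. */
int variant_field_count(enum Tag tag)
{
    switch (tag) {
    case TAG_VARIABLE:  return 3;
    case TAG_PROCEDURE: return 2;
    case TAG_FUNCTION:  return 4;
    case TAG_LABEL:     return 1;
    }
    return 0;
}
```

A caller would read `e.tag` first and only then touch the matching member of `e.u`. Note that a C union always reserves room for its largest member, so every `SymEntry` has the same size regardless of its tag.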
Fixed Length Entry:
1 2 3 4 5 6 7 8 9 10
1. Symbol
2. Class
3. Type
4. Length
5. Dimension Information
6. Parameter List Address
7. No. of Parameters
8. Type of returned value
9. Length of returned value
10. Statement number
Variable Length Entry:
1 2 3
• 1. Name
• 2. Class
• 3. Statement Number
• When class = label, all fields except name, class and statement
number are redundant.
• Here, the search method may require knowledge of the length of an entry.
• So the record would contain the following fields:
– 1. A length field
– 2. Fields in the fixed part, including the tag field
– 3. { fj | fj ∈ SFVj if tag = Vj }
Fixed v/s Variable Length Entry
• For each value Vi in the tag field, the variant part of the entry
consists of the set of fields SFVi.
• Fixed Length Entry Format:
– 1. Fields in the fixed part of the entry.
– 2. ∪Vi SFVi, i.e., the set of fields in all variant parts of the entry.
• In fixed length entries, all the records in search structure have an
identical format.
• This enables the use of homogeneous linear data structures like
arrays.
• Drawback?
• Inefficient use of memory. How?
• Many records may contain redundant fields.
• Solution?
• Variable Length Entry Format:
– Fields in the fixed part of the entry, including the tag field
– { fj | fj ∈ SFVj if tag = Vj }
• This entry format leads to a compact organization in which no
memory is wasted.
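A variable length entry format with a leading length field can be sketched in C as entries packed back to back in a byte array. The byte layout, the 256-byte table size and the names `PackedTable`, `packed_add` and `packed_find` are assumptions chosen for illustration. The point of the sketch is that a sequential search needs the length field to find where the next entry begins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_BYTES 256

/* Variable-length entries packed back to back in one byte array.
   Layout of each entry (an assumption for this sketch):
     byte 0     : total length of the entry in bytes
     byte 1     : tag
     bytes 2..  : name, NUL-terminated
     remaining  : variant-part fields for this tag, one byte each */
struct PackedTable {
    unsigned char bytes[TABLE_BYTES];
    size_t used;            /* number of bytes occupied so far */
};

/* Append an entry; returns its offset, or -1 if it does not fit. */
int packed_add(struct PackedTable *t, unsigned char tag, const char *name,
               const unsigned char *fields, size_t nfields)
{
    assert(t != NULL && name != NULL);
    size_t name_len = strlen(name) + 1;      /* include the NUL */
    size_t len = 2 + name_len + nfields;
    if (len > 255 || t->used + len > TABLE_BYTES)
        return -1;
    unsigned char *p = t->bytes + t->used;
    p[0] = (unsigned char)len;
    p[1] = tag;
    memcpy(p + 2, name, name_len);
    if (nfields > 0)
        memcpy(p + 2 + name_len, fields, nfields);
    t->used += len;
    return (int)(t->used - len);
}

/* Sequential search: the length field says where the next entry starts. */
int packed_find(const struct PackedTable *t, const char *name)
{
    size_t off = 0;
    while (off < t->used) {
        const unsigned char *p = t->bytes + off;
        if (strcmp((const char *)(p + 2), name) == 0)
            return (int)off;
        off += p[0];
    }
    return -1;
}
```

In this layout a variable named `x` with three variant fields takes 7 bytes, while a label named `L1` with one variant field takes 6 bytes, so no space is spent on fields that the tag makes redundant.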
Hybrid Entry Formats:
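[Figure: a hybrid entry. The fixed part, a pointer and a length field are kept in the search data structure; the variable part is kept in an allocation data structure.]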
[Figure: a table with entries #1 to #n. The occupied entries (#1, #2, ..., #f) come first, followed by the free entries (up to #n).]
After Rebalancing:
c, e, f, h, k, p, t
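[Figure: binary search trees over the keys c, e, f, h, k, p, t. After rebalancing, the root is h, its children are e and p, and the leaves are c, f, k, t.]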
Nested Search Structures:
• Nested search structures are used when it is
necessary to support a search along a
secondary dimension within a search
structure.
• Also called multi-list structures.
• Each symbol table entry contains two fields:
– Field list
– Next field
Eg: personal_info :
record
   name : array[1..10] of char;
   gender : char;
   id : integer;
end;
[Figure: symbol table entries for personal_info, name, gender and id, linked through their field list and next field pointers.]
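The two pointer fields of a nested search structure can be sketched in C as follows. The struct name `Sym`, the function `find_field` and the choice to store names as string pointers are assumptions for illustration. The sketch shows the search along the secondary dimension: from a record entry, follow field list once and then next field repeatedly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Symbol table entry carrying the two extra pointer fields of a
   nested (multi-list) structure. */
struct Sym {
    const char *name;
    struct Sym *field_list;   /* first field of this record, or NULL */
    struct Sym *next_field;   /* next field of the same record, or NULL */
};

/* Search along the secondary dimension: look for `field` among the
   fields of `record` by following field_list, then next_field. */
struct Sym *find_field(const struct Sym *record, const char *field)
{
    assert(record != NULL && field != NULL);
    for (struct Sym *f = record->field_list; f != NULL; f = f->next_field)
        if (strcmp(f->name, field) == 0)
            return f;
    return NULL;
}
```

For the personal_info record, the entry for personal_info would have its field list pointing at name, name's next field pointing at gender, and gender's next field pointing at id.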
ALLOCATION DATA STRUCTURES
- Stacks
- Heaps
Stacks
• A stack is a linear data structure which satisfies the following
properties:
– 1. Allocation and de-allocation are performed in a LIFO manner.
– 2. Only the last element is accessible at any time.
• SB – Stack Base points to the first word of the stack.
• TOS – Top Of Stack points to the last entry allocated to the stack.
• In the last figure the stack is empty, so TOS = SB – 1.
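[Figure: four stack states, each with SB at the first word. (a) Entries 10 to 50, TOS at 50. (b) Entries 10 to 60, TOS at 60. (c) Entries 10 to 40, TOS at 40. (d) Empty stack, TOS = SB – 1.]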
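The stack with its SB and TOS pointers can be sketched in C using array indices in place of addresses. The names `Stack`, `stack_push` and `stack_pop` and the 16-word capacity are assumptions for illustration only.

```c
#include <assert.h>

#define STACK_WORDS 16

/* Word-addressed stack. SB and TOS are indices into `words`;
   the empty stack has TOS = SB - 1, as stated in the text. */
struct Stack {
    int words[STACK_WORDS];
    int sb;     /* index of the first word of the stack */
    int tos;    /* index of the last allocated entry */
};

void stack_init(struct Stack *s)
{
    s->sb = 0;
    s->tos = s->sb - 1;
}

int stack_empty(const struct Stack *s)
{
    return s->tos == s->sb - 1;
}

/* Allocation: the new entry always goes just above the current TOS. */
void stack_push(struct Stack *s, int value)
{
    assert(s->tos + 1 < STACK_WORDS);
    s->words[++s->tos] = value;
}

/* De-allocation: only the last entry can be removed (LIFO). */
int stack_pop(struct Stack *s)
{
    assert(!stack_empty(s));
    return s->words[s->tos--];
}
```

Pushing 10, 20, 30, 40, 50 leaves TOS four words above SB; popping every entry brings TOS back to SB – 1.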
Extended Stack Model
• All entries may not be of the same size.
• Record: A set of consecutive stack entries.
• Two new pointers exist in this model, in addition to SB and TOS:
• 1. RB, the Record Base, pointing to the first word of the last record
in the stack.
• 2. A ‘reserved pointer’, which is the first word of each record.
• The allocation and de-allocation time actions are shown below:
[Figure: three stack states showing the positions of SB, RB and TOS.]
Allocation
• 1. TOS := TOS + 1;
• 2. TOS* := RB;
• 3. RB := TOS;
• 4. TOS := TOS + n;
• The 1st statement increments TOS by one stack entry.
• TOS now points to the ‘reserved pointer’ of the new record.
• The 2nd statement deposits the address of the previous record base into
the ‘reserved pointer’.
• The 3rd statement sets RB to point at the first stack entry in the new
record.
• The 4th statement allocates n stack entries to the new
entity. See fig 2 in previous slide.
• The newly created entity now occupies the addresses <RB> + l to
<RB> + l × n, where l is the size of a stack entry.
• <RB> stands for the contents of RB.
De-Allocation
• 1. TOS := RB – 1;
• 2. RB := RB*;
• The 1st statement pops a record off the stack by
resetting TOS to the value it had before the
record was allocated.
• The 2nd statement points RB to the base of the previous
record.
• That was all about allocation and de-allocation in the
extended stack model.
• Now let us see an implementation of this model
in a Pascal program that contains nested
procedures, where many symbol tables must
co-exist during compilation.
Example: Consider Pascal Program
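Program Sample(input,output);
var
   x,y : real;
   i : integer;
Procedure calc(var a,b : real);
var
   sum : real;
begin
   sum := a+b;
   ---
   ---
end calc;
begin {Main Program}
   ----
   ----
end.
[Figure: stack of symbol tables while calc is being compiled. SB points to the record for sample, which holds x, y and i. RB points to the record for calc, which holds a, b and sum. TOS points to sum.]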
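The allocation and de-allocation steps of the extended stack model can be sketched in C, again with array indices standing in for addresses. The names `XStack`, `xs_alloc` and `xs_dealloc`, the marker `NO_RECORD` for "no previous record" and the 32-word capacity are assumptions made for this sketch.

```c
#include <assert.h>

#define STACK_WORDS 32
#define NO_RECORD   (-1)   /* assumed marker: there is no previous record */

struct XStack {
    int words[STACK_WORDS];
    int sb;    /* stack base */
    int tos;   /* last allocated word */
    int rb;    /* reserved-pointer word of the last record */
};

void xs_init(struct XStack *s)
{
    s->sb = 0;
    s->tos = s->sb - 1;
    s->rb = NO_RECORD;
}

/* Allocate a record of n entries, following the four allocation steps. */
void xs_alloc(struct XStack *s, int n)
{
    assert(n >= 0 && s->tos + 1 + n < STACK_WORDS);
    s->tos = s->tos + 1;        /* 1. TOS := TOS + 1 */
    s->words[s->tos] = s->rb;   /* 2. TOS* := RB     */
    s->rb = s->tos;             /* 3. RB := TOS      */
    s->tos = s->tos + n;        /* 4. TOS := TOS + n */
}

/* Pop the last record, following the two de-allocation steps. */
void xs_dealloc(struct XStack *s)
{
    assert(s->rb != NO_RECORD);
    s->tos = s->rb - 1;         /* 1. TOS := RB - 1 */
    s->rb = s->words[s->rb];    /* 2. RB := RB*     */
}
```

For the Pascal program above, a compiler could call `xs_alloc` once for the symbol table of Sample and once more on entering calc, then call `xs_dealloc` at the end of calc so that only Sample's table remains. The record sizes passed in would depend on how many entries each table needs, which is left open here.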
Heaps
• A heap is a non-linear data structure.
• It permits allocation and de-allocation of entities in
random order.
• The heap data structure does not provide any specific means to
access an allocated entity.
• Hence, an allocation request returns a pointer to the allocated
area in the heap.
• Similarly, a de-allocation request must present a pointer
to the area to be de-allocated.
• So, it is assumed that each user of an allocated entity
maintains a pointer to the memory area allocated to
the entity.
• Let us take an example to clarify this.
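Example: Status of Heap after C program
execution
#include <stdlib.h>   /* calloc, free */
int main(void)
{
    float *floatptr1, *floatptr2;
    int *intptr;
    floatptr1 = (float *)calloc(5, sizeof(float));
    floatptr2 = (float *)calloc(5, sizeof(float));
    intptr = (int *)calloc(5, sizeof(int));
    free(floatptr2);   /* this area becomes free; the other two stay allocated */
    return 0;
}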
[Figure: heap after execution. floatptr1 and intptr point to allocated areas; the area that floatptr2 pointed to is free.]
Reuse of Memory:
• After memory compaction, fresh allocations
can be made from the free block of memory.
• The free area descriptor and the count of words in
the free area are updated.
• When a free list is used, two techniques
can be used to perform a fresh allocation:
– 1. First Fit Technique
– 2. Best Fit Technique
Techniques to make fresh allocation:
First Fit
• The first fit technique selects the first free area whose size
is greater than or equal to n words (n is the number of words
to be allocated).
• Problem: Memory areas become successively smaller.
• Result: Requests for large memory areas may have to
be rejected.
Best Fit
• The best fit technique finds the smallest free area
whose size is greater than or equal to n.
• Advantage: This enables more allocation requests
to be satisfied.
• Problem: In the long run, it too may suffer from the
problem of numerous small free areas.
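The two selection rules can be sketched in C over a singly linked free list. The node layout and the names `FreeArea`, `first_fit`, `best_fit` and `allocate_from` are assumptions for illustration; a real allocator would also unlink exhausted areas and merge neighbouring free areas, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* One free area on the free list; sizes are counted in words. */
struct FreeArea {
    int start;
    int size;
    struct FreeArea *next;
};

/* First fit: the first area whose size is >= n, or NULL if none. */
struct FreeArea *first_fit(struct FreeArea *list, int n)
{
    for (struct FreeArea *a = list; a != NULL; a = a->next)
        if (a->size >= n)
            return a;
    return NULL;
}

/* Best fit: the smallest area whose size is >= n, or NULL if none. */
struct FreeArea *best_fit(struct FreeArea *list, int n)
{
    struct FreeArea *best = NULL;
    for (struct FreeArea *a = list; a != NULL; a = a->next)
        if (a->size >= n && (best == NULL || a->size < best->size))
            best = a;
    return best;
}

/* Carve n words off the front of the chosen area and return the start
   address of the allocated words. An area that shrinks to size 0 is
   simply left on the list in this sketch. */
int allocate_from(struct FreeArea *a, int n)
{
    assert(a != NULL && a->size >= n);
    int addr = a->start;
    a->start += n;
    a->size -= n;
    return addr;
}
```

With free areas of 200, 20 and 50 words in that order, a request for 20 words is served from the 200-word area under first fit (shrinking it to 180 words) but from the 20-word area under best fit, which keeps the large area intact for later requests.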
Chapter Ends Here
• Assignment Question: