SS Lession Notes

Uploaded by

abinavabi1806
Copyright
© © All Rights Reserved
SYSTEM SOFTWARE AND OPERATING SYSTEM

SYSTEM SOFTWARE

UNIT I

Introduction - System Software and Machine Architecture - Loaders and Linkers: Basic Loader
Functions - Machine-dependent loader features - Relocation - Program linking -
Machine-independent loader features - Automatic library search - Loader options - Loader design
options - Linkage editor - Dynamic linking - Bootstrap loader.

INTRODUCTION

System software consists of a variety of programs that support the operation of a computer.
It makes it possible for the user to focus on an application or other problem to be solved, without
needing to know the details of how the machine works internally.

You probably wrote programs in a high-level language like C, C++ or VC++, using a text
editor to create and modify the program. You translated these programs into machine language
using a compiler. The resulting machine language program was loaded into memory and
prepared for execution by a loader and linker. You also used a debugger to find errors in the programs.

One characteristic in which most system software differs from application software is
machine dependency.
• System software supports the operation and use of the computer.
• Application software provides the solution to a problem.

An assembler translates mnemonic instructions into machine code, so the instruction formats,
addressing modes, etc., are of direct concern in assembler design. Similarly, compilers must
generate machine language code, taking into account such hardware characteristics as the
number and type of registers and the machine instructions available.
Operating systems are directly concerned with the management of nearly all of the
resources of a computing system.
Some aspects of system software do not directly depend upon the type of computing system:
the general design and logic of an assembler, the general design and logic of a compiler, and
code optimization techniques that are independent of the target machine.
Likewise, the process of linking together independently assembled subprograms does not
usually depend on the computer being used.

The Simplified Instructional Computer (SIC) is a hypothetical computer that includes the
hardware features most often found on real machines. There are two versions of SIC: the
standard model (SIC) and an extended version, SIC/XE (XE stands for "extra equipment" or
"extra expensive").

Later, you probably wrote programs in assembler language, by using macro instructions
to read and write data. You used assembler, which included macro processor, to translate these
programs into machine languages.

You controlled all these processes by interacting with the operating system of the
computer. The operating system took care of all the machine-level details for you, so you could
concentrate on what you wanted to do, without worrying about how it was accomplished.

You will come to understand the processes that were going on “behind the scenes” as
you used the computer in previous courses. By understanding the system software, you will gain
a deeper understanding of how computers actually work.

SYSTEM SOFTWARE AND MACHINE ARCHITECTURE

One characteristic in which most system software differs from application software is
machine dependency. An application program is primarily concerned with the solution of some
problem, using the computer as a tool. The focus is on the application, not on the computing
system. System programs, on the other hand, are intended to support the operation and use of the
computer itself, rather than any particular application. For this reason, they are usually related to
the architecture of the machine on which they are to run.

For example, assemblers translate mnemonic instructions into machine code; the
instruction formats, addressing modes, etc., are of direct concern in assembler design. Similarly,
compilers must generate machine language code, taking into account such hardware
characteristics as the number and type of registers and the machine instructions available.

Operating systems are directly concerned with the management of nearly all of the
resources of a computing system. Many other examples of such machine dependencies may be
found throughout these notes. On the other hand, there are some aspects of system software that
do not directly depend upon the type of computing system being supported.

For example, the general design and logic of an assembler is basically the same on most
computers. Some of the code optimization techniques used by compilers are independent of the
target machine (although there are also machine-dependent optimizations). Likewise, the process
of linking together independently assembled subprograms does not usually depend on the
computer being used.

An assembler is system software that converts an assembly language program to
its equivalent object code.

The input to the assembler is source code written in assembly language (using
mnemonics), and the output is the object code. The design of an assembler depends upon the
machine architecture because the mnemonic language is specific to the machine.

For example,

• Assemblers translate mnemonic instructions into machine code; the instruction formats,
addressing modes, etc., are of direct concern in assembler design.

• Compilers generate machine code, taking into account such hardware characteristics as
the number and type of registers and the machine instructions available.

• Operating systems are concerned with the management of nearly all resources of a computing
system.

Some system software is machine independent; for example, the process of linking together
independently assembled subprograms does not usually depend on the computer being used.
Other system software is machine dependent, so we must include real machines and real pieces
of software in our study.

However, most real computers have certain characteristics that are unusual or even unique. It can
be difficult to distinguish between those features of the software that are truly fundamental and
those that depend only on the peculiarities of a particular machine. To avoid this problem, we
present the fundamental functions of each piece of software through discussion of a Simplified
Instructional Computer (SIC).

SIC is a hypothetical computer that has been carefully designed to include the hardware
features most often found on real machines, while avoiding unusual or irrelevant complexities.

SIC MACHINE ARCHITECTURE

Memory

Memory consists of 8-bit bytes; any three consecutive bytes form a word (24 bits). All addresses
on SIC are byte addresses; words are addressed by the location of their lowest-numbered byte.
There are a total of 32,768 (2^15) bytes in the computer memory.

Registers

There are five registers, all of which have special uses. Each register is 24 bits in length.

Mnemonic  Number  Special Use

A         0       Accumulator; used for arithmetic operations
X         1       Index register; used for addressing
L         2       Linkage register; the Jump to Subroutine (JSUB) instruction
                  stores the return address in this register
PC        8       Program counter; contains the address of the next
                  instruction to be fetched for execution
SW        9       Status word; contains a variety of information, including a
                  condition code (CC)

Data format
• Integers are stored as 24-bit binary numbers; 2’s complement representation is used for
negative numbers.
• Characters are stored using their 8-bit ASCII codes.
• There is no floating-point hardware on SIC.

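Instruction Format
• All machine instructions on SIC have the following 24-bit format:

opcode (8 bits) | x (1 bit) | address (15 bits)

The flag bit x is used to indicate indexed-addressing mode.

Addressing Modes
Only two modes are supported, selected by the x bit of the instruction:

Mode      Indication   Target address (TA) calculation
Direct    x = 0        TA = address
Indexed   x = 1        TA = address + (X)

Parentheses () are used to indicate the contents of a register, so (X) is the contents of
register X.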
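
The two addressing modes can be illustrated with a small Python sketch. It assumes the standard SIC layout of an 8-bit opcode, a 1-bit x flag and a 15-bit address; the function name and the sample instruction words are made up for illustration and are not part of these notes.

```python
def decode_sic(instruction: int, x_register: int = 0):
    """Split a 24-bit standard SIC instruction and compute its target address.

    Assumed layout: opcode = bits 23..16, x flag = bit 15, address = bits 14..0.
    """
    opcode = (instruction >> 16) & 0xFF
    x_flag = (instruction >> 15) & 0x1
    address = instruction & 0x7FFF
    # Direct mode: TA = address. Indexed mode: TA = address + (X).
    target = address + x_register if x_flag else address
    return opcode, x_flag, target


# Hypothetical words: opcode 0x54 with x = 1 and address 0x1039,
# and opcode 0x00 with x = 0 and address 0x1036.
print(decode_sic(0x549039, x_register=5))  # indexed: TA = 0x1039 + 5
print(decode_sic(0x001036))                # direct:  TA = 0x1036
```
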

Instruction Set
SIC provides a basic set of instructions that is sufficient for most simple tasks.
• Load and store registers (LDA, LDX, STA, STX)
• Integer arithmetic (ADD, SUB, MUL, DIV); all involve register A and a word in
memory.
• Comparison (COMP); involves register A and a word in memory.
• Conditional jumps (JLT, JEQ, JGT)
• Subroutine linkage (JSUB, RSUB).
INPUT AND OUTPUT
• Input and output are performed one byte at a time, to or from the rightmost 8 bits of register A.
• Each device has a unique 8-bit ID code.
• Test device (TD): tests whether a device is ready to send or receive a byte of data.
• Read data (RD): reads a byte from the device into register A.
• Write data (WD): writes a byte from register A to the device.

SIC/XE Machine Architecture

Registers
SIC/XE provides the following registers in addition to those of standard SIC.

Mnemonic  Number  Special Use

B         3       Base register; used for addressing
S         4       General-purpose register; no special use
T         5       General-purpose register; no special use
F         6       Floating-point accumulator (this register is 48 bits
                  instead of 24)

Memory
• There are two versions: SIC and SIC/XE (extra equipment). A SIC program can be executed on
SIC/XE.
• Memory consists of 8-bit bytes; 3 consecutive bytes form a word (24 bits).
• In total, there are 2^15 bytes in SIC memory; SIC/XE extends the maximum memory to 2^20 bytes
(1 megabyte).
• Standard SIC has 5 registers, each 24 bits in length.

Instruction Formats and Addressing Flags for SIC/XE

Standard SIC has the single 24-bit instruction format described earlier. SIC/XE provides four
instruction formats, each with a different representation in memory:

• Format 1 (1 byte): an 8-bit opcode with no operand.
• Format 2 (2 bytes): an 8-bit opcode followed by two 4-bit register operands.
• Format 3 (3 bytes): a 6-bit opcode, 6 flag bits, and a 12-bit displacement.
• Format 4 (4 bytes): the same elements as Format 3, but with a 20-bit address instead of the
12-bit displacement.

Both Format 3 and Format 4 contain six flag bits:

• n: indirect addressing flag
• i: immediate addressing flag
• x: indexed addressing flag
• b: base-relative addressing flag
• p: program-counter-relative addressing flag
• e: extended format flag (e = 1 selects Format 4)
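
As a rough illustration of the flag bits, the following Python sketch splits a Format 3 or Format 4 instruction into its opcode, flags and displacement/address field. The bit positions follow the layout listed above; the function name is hypothetical, and the two sample object codes (03201D and 77100004) are the ones that appear in the PROGA listing later in these notes.

```python
def decode_format34(hex_code: str) -> dict:
    """Decode a SIC/XE Format 3 (3-byte) or Format 4 (4-byte) instruction."""
    raw = bytes.fromhex(hex_code)
    if len(raw) not in (3, 4):
        raise ValueError("expected 3 or 4 bytes of object code")
    flags = {
        "n": (raw[0] >> 1) & 1,  # last two bits of the first byte
        "i": raw[0] & 1,
        "x": (raw[1] >> 7) & 1,  # top four bits of the second byte
        "b": (raw[1] >> 6) & 1,
        "p": (raw[1] >> 5) & 1,
        "e": (raw[1] >> 4) & 1,
    }
    if bool(flags["e"]) != (len(raw) == 4):
        raise ValueError("e flag does not match instruction length")
    # 12-bit displacement for Format 3, 20-bit address for Format 4.
    field_mask = 0xFFFFF if flags["e"] else 0xFFF
    field = int.from_bytes(raw[1:], "big") & field_mask
    return {"opcode": raw[0] & 0xFC, "flags": flags, "field": field}


print(decode_format34("03201D"))    # Format 3, PC-relative, displacement 0x01D
print(decode_format34("77100004"))  # Format 4, 20-bit address 0x00004
```
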

SIC PROGRAMMING EXAMPLES:


COPY     START   1000
FIRST    STL     RETADR
CLOOP    JSUB    RDREC
         LDA     LENGTH
         COMP    ZERO
         JEQ     ENDFIL
         JSUB    WRREC
         J       CLOOP
ENDFIL   LDA     EOF
         STA     BUFFER
         LDA     THREE
         STA     LENGTH
         JSUB    WRREC
         LDL     RETADR
         RSUB
EOF      BYTE    C'EOF'
THREE    WORD    3
ZERO     WORD    0
RETADR   RESW    1
LENGTH   RESW    1
BUFFER   RESB    4096
.
.        SUBROUTINE TO READ RECORD INTO BUFFER
.
RDREC    LDX     ZERO
         LDA     ZERO
RLOOP    TD      INPUT
         JEQ     RLOOP
         RD      INPUT
         COMP    ZERO
         JEQ     EXIT
         STCH    BUFFER,X
         TIX     MAXLEN
         JLT     RLOOP
EXIT     STX     LENGTH
         RSUB
INPUT    BYTE    X'F1'
MAXLEN   WORD    4096
.
.        SUBROUTINE TO WRITE RECORD FROM BUFFER
.
WRREC    LDX     ZERO
WLOOP    TD      OUTPUT
         JEQ     WLOOP
         LDCH    BUFFER,X
         WD      OUTPUT
         TIX     LENGTH
         JLT     WLOOP
         RSUB
OUTPUT   BYTE    X'06'
         END     FIRST

LOADERS AND LINKERS


Introduction
A source program written in assembly language or a high-level language is
converted to an object program, which is in machine language form for execution. The
object program, produced by either an assembler or a compiler, contains translated instructions
and data values from the source program and specifies addresses in primary memory where these
items are to be loaded for execution.
Executing the object program involves the following three processes:
Loading - allocates memory locations and brings the object program into memory
for execution (Loader)
Linking - combines two or more separate object programs and supplies the
information needed to allow references between them (Linker)
Relocation - modifies the object program so that it can be loaded at an address
different from the location originally specified (Linking Loader)

Linker:
In high-level languages, many basic functions are provided by built-in header files and libraries.
These libraries are predefined and contain basic functions that are essential for executing the
program. Calls to these functions are connected to their library definitions by a program called
the linker. If the linker cannot find the library that defines a function, it reports the unresolved
reference and the build ends with an error. The
compiler automatically invokes the linker as the last step in compiling a program.
Besides built-in libraries, the linker also links user-defined functions from user-defined
libraries. Usually a longer program is divided into smaller subprograms called modules, and
these modules must be combined to execute the program. The process of combining the modules
is done by the linker.
Loader:
A loader is a program that loads the machine code of a program into system memory.
In computing, a loader is the part of an operating system that is responsible for loading
programs. It is one of the essential stages in the process of starting a program, because it places
programs into memory and prepares them for execution. Loading a program involves reading the
contents of an executable file into memory. Once loading is complete, the operating system starts
the program by passing control to the loaded program code. All operating systems that support
program loading have loaders. In many operating systems the loader is permanently resident in
memory.
BASIC LOADER FUNCTIONS
A loader is a system program that performs the loading function. It brings an object program
into memory and starts its execution. The translator, which may be an assembler or a compiler,
generates the object program, which is later loaded into memory by the loader for execution. In
these notes the translator is an assembler, and the object program it generates becomes the input
to the loader.
Type of Loaders
The different types of loaders are the absolute loader, bootstrap loader, relocating loader
(relative loader), and direct linking loader.

ABSOLUTE LOADER
The operation of an absolute loader is very simple. The object code is loaded into the specified
locations in memory. At the end, the loader jumps to the specified address to begin execution
of the loaded program. The advantage of an absolute loader is that it is simple and efficient. The
disadvantages are that the programmer must specify the actual load address and that it is
difficult to use subroutine libraries.

Fig: Program loaded in memory (columns: memory address, contents)

The algorithm for this type of loader is given here. In our object program, each byte of assembled
code is given using its hexadecimal representation in character form. This is easy for human
beings to read, but the loader must convert each pair of characters into one byte. Most machines
store object programs in a binary form, in which each byte of object code is stored as a single
byte. In that case we must be sure that our file and device conventions do not cause some of
the program bytes to be interpreted as control characters.

begin
   read Header record
   verify program name and length
   read first Text record
   while record type is <> ‘E’ do
      begin
         {if object code is in character form, convert into internal representation}
         move object code to specified location in memory
         read next object program record
      end
   jump to address specified in End record
end
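
The algorithm above can be sketched in Python. This is a minimal illustration rather than a definitive implementation: it assumes records without embedded spaces (H: name in columns 2-7, start address in 8-13, length in 14-19; T: address in 2-7, byte count in 8-9, object code from column 10; E: transfer address in 2-7), models memory as a bytearray, and returns the transfer address instead of jumping to it. The function name and the tiny sample object program are hypothetical.

```python
def absolute_load(records, memory):
    """Load an object program into `memory`; return (program name, transfer address)."""
    header = records[0]
    if not header.startswith("H"):
        raise ValueError("first record must be a Header record")
    name = header[1:7].strip()
    start = int(header[7:13], 16)
    length = int(header[13:19], 16)
    if start + length > len(memory):          # verify program length
        raise MemoryError("program does not fit in memory")
    for record in records[1:]:
        if record.startswith("E"):
            # Transfer address from the End record (default: program start).
            return name, int(record[1:7], 16) if len(record) >= 7 else start
        if record.startswith("T"):
            address = int(record[1:7], 16)
            count = int(record[7:9], 16)
            # Convert the character form (two hex digits per byte) to bytes.
            code = bytes.fromhex(record[9:9 + 2 * count])
            memory[address:address + count] = code
    raise ValueError("missing End record")


memory = bytearray(0x2000)
program = ["HCOPY  001000000006", "T00100006141033482039", "E001000"]
print(absolute_load(program, memory))
print(memory[0x1000:0x1006].hex().upper())
```
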

A SIMPLE BOOTSTRAP LOADER


When a computer is first turned on or restarted, a special type of absolute loader, called a
bootstrap loader, is executed. This bootstrap loads the first program to be run by the computer,
usually an operating system. The bootstrap itself begins at address 0. It loads the OS starting at
address 0x80. There is no Header record or control information; the object code is loaded into
consecutive bytes of memory.
The algorithm for the bootstrap loader is as follows:
Begin
   X <- 0x80 (the address of the next memory location to be loaded)
   Loop
      A <- GETC (and convert it from the ASCII character
                 code to the value of the hexadecimal digit)
      save the value in the high-order 4 bits of S
      A <- GETC
      combine the values to form one byte: A <- (A+S)
      store the value (in A) to the address in register X
      X <- X+1
End
It uses a subroutine GETC, which is
GETC  A <- read one character
      if A = 0x04 then jump to 0x80
      if A < 48 then GETC
      A <- A-48 (0x30)
      if A < 10 then return
      A <- A-7
      return
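
A small Python simulation may make the loop and GETC easier to follow. It is only a sketch under stated assumptions: the input device is modelled as a bytes object, memory as a bytearray, and the "jump to 0x80" on end of input (0x04) is modelled by returning the entry address. The names are hypothetical.

```python
def bootstrap_load(stream: bytes, memory: bytearray, base: int = 0x80) -> int:
    """Load hex-character input into memory from `base`; return the entry address."""
    source = iter(stream)

    def getc():
        """Return the next hex digit value, or None at end of input (0x04)."""
        while True:
            a = next(source)
            if a == 0x04:        # end-of-input marker
                return None
            if a < 48:           # skip control characters such as newlines
                continue
            a -= 48              # '0'..'9' -> 0..9
            if a >= 10:
                a -= 7           # 'A'..'F' -> 10..15
            return a

    x = base
    while True:
        high = getc()
        if high is None:
            return base          # models "jump to 0x80"
        low = getc()
        if low is None:
            return base
        memory[x] = (high << 4) + low   # two hex digits form one byte
        x += 1


memory = bytearray(256)
entry = bootstrap_load(b"0A\n1F\x04", memory)
print(hex(entry), memory[0x80:0x82].hex().upper())
```
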

MACHINE-DEPENDENT LOADER FEATURES


The absolute loader is simple and efficient, but the scheme has potential disadvantages. One
of the most serious is that the programmer has to specify the actual starting address at
which the program is to be loaded. This causes no difficulty when only one program runs at a
time, but it does when several programs share memory. Further, it is difficult to use subroutine
libraries efficiently. These problems call for the
design and implementation of a more complex loader that provides program
relocation and linking, as well as simple loading functions.
RELOCATION
Program relocation means that the object program can execute in any part
of the available memory that is large enough. The object program is loaded into memory wherever
there is room for it, so the actual starting address of the object program is not known until load
time. Relocation allows efficient sharing of a machine with a larger memory when
several independent programs are to be run together. It also supports the efficient use of
subroutine libraries. Loaders that allow for program relocation are called relocating loaders or
relative loaders.
Methods for specifying relocation
Two methods are available for specifying relocation: Modification records and relocation bits.
With the first method, a Modification record (M) is used in the
object program to describe each address field that must be relocated. With the second method,
each instruction word is associated with one relocation bit, and the relocation bits for a Text
record are gathered into a bit mask.
Modification records are used on complex machines and are also called Relocation and
Linkage Directory (RLD) specifications. The format of the Modification record (M) is as follows.
Object programs that use Modification records appear in the program linking examples later in
this unit.
Modification record
col 1: M
col 2-7: relocation address
col 8-9: length (halfbyte)
col 10: flag (+/-)
col 11-17: segment name

The relocation bit method is used on simple machines. A relocation bit of 0 means no
modification is necessary; a bit of 1 means modification is needed. The bits are stored in
columns 10-12 of the Text record (T). The format of the Text record with relocation bits is as
follows.
Text record
col 1: T
col 2-7: starting address
col 8-9: length (byte)
col 10-12: relocation bits
col 13-72: object code
A twelve-bit mask is used in each Text record (col. 10-12, relocation bits). Each Text
record therefore describes at most 12 words of object code; mask bits that correspond to unused
words are set to 0, and any value that is to be
modified during relocation must coincide with one of these 3-byte words. For an absolute
loader, there are no relocation bits, and columns 10-69 contain object code. In an object program
with relocation by bit mask, FFC means that the first ten words of the record are to be
modified, and E00 means that the first three words are to be modified.
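
The bit-mask scheme can be sketched in a few lines of Python. The sketch assumes that the leftmost mask bit corresponds to the first word of the Text record and that relocation adds the program load address to the whole 24-bit word; the function name, the sample words and the load address 0x5000 are made up for illustration.

```python
def relocate_words(words, mask_hex, load_address):
    """Apply a 12-bit relocation mask (3 hex digits) to the words of one Text record."""
    if len(words) > 12:
        raise ValueError("a 12-bit mask covers at most 12 words")
    mask = int(mask_hex, 16)
    relocated = []
    for index, word in enumerate(words):
        bit = (mask >> (11 - index)) & 1     # bit 0 of the mask is the leftmost bit
        if bit:
            word = (word + load_address) & 0xFFFFFF
        relocated.append(word)
    return relocated


# Mask E00 = 1110 0000 0000: only the first three words are relocated.
words = [0x140033, 0x481039, 0x000036, 0x280030]
print([f"{w:06X}" for w in relocate_words(words, "E00", 0x5000)])
```
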

PROGRAM LINKING
The goal of program linking is to resolve external references
(EXTREF) against external definitions (EXTDEF) in different control sections.
EXTDEF (external definition) - The EXTDEF statement in a control section names symbols,
called external symbols, that are defined in this (present) control section and may be
used by other sections.
ex: EXTDEF BUFFER, BUFFEND, LENGTH
EXTDEF LISTA, ENDA
EXTREF (external reference) - The EXTREF statement names symbols that are used in this
(present) control section but are defined elsewhere.
ex: EXTREF RDREC, WRREC
EXTREF LISTB, ENDB, LISTC, ENDC
How to implement EXTDEF and EXTREF
The assembler must include information in the object program that will cause the loader
to insert the proper values where they are required. This information takes the form of Define
records (D) and Refer records (R).
Define record
The format of the Define record (D) along with examples is as shown here.
Col. 1 D
Col. 2-7 Name of external symbol defined in this control section
Col. 8-13 Relative address within this control section (hexadecimal)
Col.14-73 Repeat information in Col. 2-13 for other external symbols
Example records
D LISTA 000040 ENDA 000054
D LISTB 000060 ENDB 000070
Refer record
The format of the Refer record (R) along with examples is as shown here.
Col. 1 R
Col. 2-7 Name of external symbol referred to in this control section
Col. 8-73 Name of other external reference symbols
Example records
R LISTB ENDB LISTC ENDC
R LISTA ENDA LISTC ENDC
R LISTA ENDA LISTB ENDB
The three programs PROGA, PROGB and PROGC are
separately assembled, and each consists of a single control section. LISTA and ENDA in
PROGA, LISTB and ENDB in PROGB, and LISTC and ENDC in PROGC are external definitions.
Similarly, LISTB, ENDB, LISTC, ENDC in PROGA; LISTA, ENDA, LISTC, ENDC in PROGB;
and LISTA, ENDA, LISTB, ENDB in PROGC are external references.
These sample programs are used to illustrate linking and relocation. The
listings that follow give the sample programs and then their corresponding object programs.
Observe that the object programs contain D and R records along with other records.
0000 PROGA START 0
EXTDEF LISTA, ENDA
EXTREF LISTB, ENDB, LISTC, ENDC
………..
……….
0020 REF1 LDA LISTA 03201D
0023 REF2 +LDT LISTB+4 77100004
0027 REF3 LDX #ENDA-LISTA 050014
..
0040 LISTA EQU *
0054 ENDA EQU *
0054 REF4 WORD ENDA-LISTA+LISTC 000014
0057 REF5 WORD ENDC-LISTC-10 FFFFF6
005A REF6 WORD ENDC-LISTC+LISTA-1 00003F
005D REF7 WORD ENDA-LISTA-(ENDB-LISTB) 000014
0060 REF8 WORD LISTB-LISTA FFFFC0
END REF1
0000 PROGB START 0
EXTDEF LISTB, ENDB
EXTREF LISTA, ENDA, LISTC, ENDC
………..
……….
0036 REF1 +LDA LISTA 03100000
003A REF2 LDT LISTB+4 772027
003D REF3 +LDX #ENDA-LISTA 05100000
..
0060 LISTB EQU *
0070 ENDB EQU *
0070 REF4 WORD ENDA-LISTA+LISTC 000000
0073 REF5 WORD ENDC-LISTC-10 FFFFF6
0076 REF6 WORD ENDC-LISTC+LISTA-1 FFFFFF
0079 REF7 WORD ENDA-LISTA-(ENDB-LISTB) FFFFF0
007C REF8 WORD LISTB-LISTA 000060
END
0000 PROGC START 0
EXTDEF LISTC, ENDC
EXTREF LISTA, ENDA, LISTB, ENDB
………..
………..
0018 REF1 +LDA LISTA 03100000
001C REF2 +LDT LISTB+4 77100004
0020 REF3 +LDX #ENDA-LISTA 05100000
..
0030 LISTC EQU *

0042 ENDC EQU *


0042 REF4 WORD ENDA-LISTA+LISTC 000030
0045 REF5 WORD ENDC-LISTC-10 000008
0048 REF6 WORD ENDC-LISTC+LISTA-1 000011
004B REF7 WORD ENDA-LISTA-(ENDB-LISTB) 000000
004E REF8 WORD LISTB-LISTA 000000
END
H PROGA 000000 000063
D LISTA 000040 ENDA 000054
R LISTB ENDB LISTC ENDC
..
T 000020 0A 03201D 77100004 050014
..
T 000054 0F 000014 FFFFF6 00003F 000014 FFFFC0
M 000024 05 +LISTB
M 000054 06 +LISTC
M 000057 06 +ENDC
M 000057 06 -LISTC
M 00005A 06 +ENDC
M 00005A 06 -LISTC
M 00005A 06 +PROGA
M 00005D 06 -ENDB
M 00005D 06 +LISTB
M 000060 06 +LISTB
M 000060 06 -PROGA
E 000020
H PROGB 000000 00007F
D LISTB 000060 ENDB 000070
R LISTA ENDA LISTC ENDC
.
T 000036 0B 03100000 772027 05100000
.
T 000070 0F 000000 FFFFF6 FFFFFF FFFFF0 000060
M 000037 05 +LISTA
M 00003E 05 +ENDA
M 00003E 05 -LISTA
M 000070 06 +ENDA
M 000070 06 -LISTA
M 000070 06 +LISTC
M 000073 06 +ENDC
M 000073 06 -LISTC
M 000076 06 +ENDC
M 000076 06 -LISTC
M 000076 06 +LISTA
M 000079 06 +ENDA
M 000079 06 -LISTA
M 00007C 06 +PROGB
M 00007C 06 -LISTA
E
H PROGC 000000 000051
D LISTC 000030 ENDC 000042
R LISTA ENDA LISTB ENDB
.
T 000018 0C 03100000 77100004 05100000
.
T 000042 0F 000030 000008 000011 000000 000000
M 000019 05 +LISTA
M 00001D 05 +LISTB
M 000021 05 +ENDA
M 000021 05 -LISTA
M 000042 06 +ENDA
M 000042 06 -LISTA
M 000042 06 +PROGC
M 000048 06 +LISTA
M 00004B 06 +ENDA
M 00004B 06 -LISTA
M 00004B 06 -ENDB
M 00004B 06 +LISTB
M 00004E 06 +LISTB
M 00004E 06 -LISTA
E
The following figure shows these three programs as they might appear in memory after
loading and linking. PROGA has been loaded starting at address 4000, with PROGB and
PROGC immediately following.

For example, the value for REF4 in PROGA is located at address 4054 (the beginning
address of PROGA plus 0054, the relative address of REF4 within PROGA). The following
figure shows the details of how this value is computed.

The initial value, taken from the Text record T 000054 0F 000014 FFFFF6 00003F 000014 FFFFC0,
is 000014. To this is added the address assigned to LISTC, which is 4112 (the beginning address
of PROGC plus 30). The result is 004126. That is, REF4 in PROGA is
ENDA-LISTA+LISTC = 4054-4040+4112 = 4126 (all values are hexadecimal). Similarly, the load
addresses of the symbols are LISTA: PROGA+0040 = 4040, LISTB: PROGB+0060 = 40C3, and
LISTC: PROGC+0030 = 4112.
With these details in mind, work through the other references; the value of each of these
references is the same in each of the three programs.
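
The arithmetic above can be checked with a short Python sketch. It takes the initial value of a word from the Text record and applies the signs and symbols of its Modification records, using the load addresses quoted in these notes. The function name and the list-of-tuples representation of Modification records are assumptions made for illustration.

```python
def apply_modifications(initial_value, modifications, symbol_addresses):
    """Add or subtract symbol addresses as directed by Modification records."""
    value = initial_value
    for sign, symbol in modifications:
        address = symbol_addresses[symbol]
        value = value + address if sign == "+" else value - address
    return value & 0xFFFFFF        # keep the result within a 24-bit word


# Load addresses from the example (PROGA loaded at 4000).
addresses = {"LISTA": 0x4040, "ENDA": 0x4054, "LISTB": 0x40C3,
             "ENDB": 0x40D3, "LISTC": 0x4112, "ENDC": 0x4124}

# REF4 in PROGA: initial value 000014, modified by +LISTC.
ref4 = apply_modifications(0x000014, [("+", "LISTC")], addresses)
# REF7 in PROGA: initial value 000014, modified by -ENDB and +LISTB.
ref7 = apply_modifications(0x000014, [("-", "ENDB"), ("+", "LISTB")], addresses)
print(f"{ref4:06X} {ref7:06X}")
```
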

ALGORITHM AND DATA STRUCTURES FOR A LINKING LOADER


The algorithm for a linking loader is considerably more complicated than the absolute
loader algorithm given earlier. The concepts from the program linking section are used
to develop the algorithm for the linking loader. Modification records are used for relocation
so that the linking and relocation functions are performed using the same mechanism. The linking
loader uses two-pass logic, and ESTAB (the external symbol table) is its main data structure.
Pass 1: Assign addresses to all external symbols
Pass 2: Perform the actual loading, relocation, and linking

ESTAB - The ESTAB for the example (the three programs PROGA, PROGB and PROGC)
is shown below.
The ESTAB has four fields: the name of the control section, the symbol appearing in
the control section, its address, and the length of the control section.

Control section  Symbol  Address  Length
PROGA                    4000     0063
                 LISTA   4040
                 ENDA    4054
PROGB                    4063     007F
                 LISTB   40C3
                 ENDB    40D3
PROGC                    40E2     0051
                 LISTC   4112
                 ENDC    4124
Program Logic for Pass 1
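Pass 1 assigns addresses to all external symbols. The variables and data structures used
during Pass 1 are PROGADDR (the program load address, obtained from the OS), CSADDR (the
control section address), CSLTH (the control section length) and ESTAB. Pass 1 processes the
Header and Define records: the control section name from each Header record and the symbols
from each Define record are entered into ESTAB with addresses relative to CSADDR, and CSLTH
is then added to CSADDR to give the starting address of the next control section.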
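
The Pass 1 logic can be sketched in Python as follows. This is an illustrative sketch, not the textbook algorithm verbatim: it assumes the space-separated record layout used in the object program listings of these notes, represents ESTAB as a dictionary from symbol name to address, and uses hypothetical function and variable names.

```python
def pass1(object_programs, progaddr):
    """Build ESTAB (symbol -> address) from the Header and Define records."""
    estab = {}
    csaddr = progaddr                       # address of the current control section
    for records in object_programs:
        header = records[0].split()         # ["H", name, start, length]
        name, cslth = header[1], int(header[3], 16)
        if name in estab:
            raise ValueError(f"duplicate external symbol {name}")
        estab[name] = csaddr
        for record in records[1:]:
            fields = record.split()
            if not fields or fields[0] != "D":
                continue                    # Pass 1 ignores the other record types
            # Define record: pairs of (symbol, relative address).
            for symbol, relative in zip(fields[1::2], fields[2::2]):
                if symbol in estab:
                    raise ValueError(f"duplicate external symbol {symbol}")
                estab[symbol] = csaddr + int(relative, 16)
        csaddr += cslth                     # next control section follows immediately
    return estab


programs = [
    ["H PROGA 000000 000063", "D LISTA 000040 ENDA 000054", "E 000020"],
    ["H PROGB 000000 00007F", "D LISTB 000060 ENDB 000070", "E"],
    ["H PROGC 000000 000051", "D LISTC 000030 ENDC 000042", "E"],
]
for symbol, address in pass1(programs, 0x4000).items():
    print(symbol, f"{address:04X}")
```
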

Program Logic for Pass 2


Pass 2 of the linking loader performs the actual loading, relocation, and linking. For each
Modification record it looks up the symbol in ESTAB to obtain its address. Finally, it uses the End
record of the main program to obtain the transfer address, which is the starting address needed
for the execution of the program. Pass 2 processes the Text and Modification records of the object
programs.

Improving Efficiency
Can we improve the efficiency of the linking loader? Observe
that, even though we have defined the Refer record (R), we have not made use of it. Efficiency
can be improved by replacing repeated searches of ESTAB for the
same symbol with a local lookup. To implement this, we assign a reference number to each
external symbol in the Refer record. This reference number is then used in Modification records
instead of the external symbol name. 01 is assigned to the control section name, and the other
numbers are assigned to the external reference symbols.
In the object programs for PROGA, PROGB and PROGC, the Refer records are modified to carry
these reference numbers.
The symbols and addresses for PROGA are shown below. These
are the entries of ESTAB. The main advantage of the reference number mechanism is that it avoids
multiple searches of ESTAB for the same symbol during the loading of a control section.

Ref No.  Symbol  Address
1        PROGA   4000
2        LISTB   40C3
3        ENDB    40D3
4        LISTC   4112
5        ENDC    4124

MACHINE-INDEPENDENT LOADER FEATURES


Machine-independent loader features are not directly related to machine architecture and
design. Automatic library search and loader options are such machine-independent loader
features.
AUTOMATIC LIBRARY SEARCH
This feature allows a programmer to use standard subroutines without explicitly
including them in the program to be loaded. The routines are automatically retrieved from a
library as they are needed during linking, so the programmer can use subroutines from one or
more libraries. The subroutines called by the program being loaded are automatically fetched
from the library, linked with the main program and loaded. To do this, the loader searches the
library or libraries specified for routines that contain the definitions of the external symbols
that are referenced but not defined in the main program.
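
The search can be pictured with a small Python sketch. It assumes, beyond what these notes state, that a library routine may itself refer to further external symbols, so the search repeats until nothing is unresolved. The library is modelled as a dictionary from routine name to the external symbols that routine references; all names are hypothetical.

```python
def library_search(estab, references, library):
    """Resolve references missing from ESTAB by fetching routines from a library.

    Returns the library routines fetched, in the order they were fetched.
    """
    fetched = []
    pending = [symbol for symbol in references if symbol not in estab]
    while pending:
        symbol = pending.pop()
        if symbol in estab:
            continue                      # already defined or already fetched
        if symbol not in library:
            raise LookupError(f"unresolved external reference: {symbol}")
        estab[symbol] = None              # address is assigned when the routine is loaded
        fetched.append(symbol)
        pending.extend(library[symbol])   # the routine may need further routines
    return fetched


estab = {"MAIN": 0x4000}
library = {"SQRT": ["ABS"], "ABS": [], "PLOT": []}
print(library_search(estab, ["SQRT"], library))
```
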

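Reference numbers and addresses for PROGB (continuing the reference number example above):

Ref No.  Symbol  Address
1        PROGB   4063
2        LISTA   4040
3        ENDA    4054
4        LISTC   4112
5        ENDC    4124

Reference numbers and addresses for PROGC:

Ref No.  Symbol  Address
1        PROGC   40E2
2        LISTA   4040
3        ENDA    4054
4        LISTB   40C3
5        ENDB    40D3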

LOADER OPTIONS
Loader options allow the user to specify options that modify the standard processing. The
options may be specified in three different ways: using a command language,
as part of the job control language that is processed by the operating system, or
using loader control statements in the source program.
Here are some examples of how options can be specified.
INCLUDE program-name (library-name) - read the designated object program from a
library
DELETE csect-name - delete the named control section from the set of programs being
loaded
CHANGE name1, name2 - change the external symbol name1 to name2 wherever it
appears in the object programs
LIBRARY MYLIB - search the library MYLIB before the standard libraries
NOCALL STDDEV, PLOT, CORREL - do not load and link these unneeded
routines

Here is one more example, showing how commands can be specified as part of the object
file, with the respective changes carried out by the loader.

LIBRARY UTLIB
INCLUDE READ (UTLIB)
INCLUDE WRITE (UTLIB)
DELETE RDREC, WRREC
CHANGE RDREC, READ
CHANGE WRREC, WRITE
NOCALL SQRT, PLOT

The commands have the following effect:

• LIBRARY: search UTLIB (say, a utility library),
• INCLUDE: include the READ and WRITE control sections from the library,
• DELETE: delete the control sections RDREC and WRREC from the load,
• CHANGE: change all external references to the symbol RDREC into references
to the symbol READ,
• similarly, change references to WRREC into references to WRITE,
• NOCALL: finally, do not load and link the routines SQRT and PLOT, even if they are
referenced in the program.
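
The effect of these commands can be traced with a toy Python interpreter. This is only a sketch of the bookkeeping, assuming that the load is described by a set of control section names and a list of external references; it does not read any real object file, and the function and variable names are hypothetical.

```python
def apply_loader_options(commands, sections, references):
    """Apply LIBRARY / INCLUDE / DELETE / CHANGE / NOCALL to a description of the load."""
    sections = set(sections)
    references = list(references)
    libraries, nocall = [], set()
    for line in commands:
        keyword, _, rest = line.strip().partition(" ")
        names = [part.strip() for part in rest.split(",") if part.strip()]
        if keyword == "LIBRARY":
            libraries.extend(names)                 # searched before standard libraries
        elif keyword == "INCLUDE":
            name, _, _library = rest.partition("(")  # "READ (UTLIB)" -> "READ"
            sections.add(name.strip())
        elif keyword == "DELETE":
            sections -= set(names)
        elif keyword == "CHANGE":
            old, new = names
            references = [new if ref == old else ref for ref in references]
        elif keyword == "NOCALL":
            nocall |= set(names)                    # leave these unresolved
        else:
            raise ValueError(f"unknown loader option: {keyword}")
    return {"sections": sections, "references": references,
            "libraries": libraries, "nocall": nocall}


commands = ["LIBRARY UTLIB", "INCLUDE READ (UTLIB)", "INCLUDE WRITE (UTLIB)",
            "DELETE RDREC, WRREC", "CHANGE RDREC, READ",
            "CHANGE WRREC, WRITE", "NOCALL SQRT, PLOT"]
result = apply_loader_options(commands, {"COPY", "RDREC", "WRREC"},
                              ["RDREC", "WRREC", "SQRT"])
print(sorted(result["sections"]), result["references"])
```
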

LOADER DESIGN OPTIONS


There are some common alternatives for organizing the loading functions, including
relocation and linking. Linking loaders perform all linking and relocation at load time. The
other alternatives are linkage editors, which perform linking prior to load time, and dynamic
linking, in which the linking function is performed at execution time.

LINKING LOADERS
The diagram below shows the processing of an object program using a linking loader.
The source program is first assembled or compiled, producing an object program. A linking
loader performs all linking and relocation operations and loads the program into memory for
execution.

LINKAGE EDITORS
The above figure shows the processing of an object program using a linkage editor. A
linkage editor produces a linked version of the program, often called a load module or an
executable image, which is written to a file or library for later execution.
The linked program produced is generally in a form that is suitable for processing by a
relocating loader. Some useful functions of a linkage editor are:
• An absolute object program can be created if the starting address is already known.
• New versions of a library can be included without changing the source program.
• Linkage editors can be used to build packages of subroutines or other control sections that
are generally used together.
• Linkage editors often allow the user to specify that external references are not to be
resolved by automatic library search. Linking of those references is done later by a linking
loader; combining a linkage editor with a linking loader in this way saves space.

Distinguish linking loader from linkage editor.

Linking Loader                               Linkage Editor

1. Performs linking and relocation at        1. Performs linking prior to load time.
   load time.
2. Loads the linked program directly         2. Writes a linked version of the program,
   into memory.                                 which is later executed by a relocating
                                                loader.
3. Offers less flexibility and control.      3. Offers more flexibility and control.

DYNAMIC LINKING
Dynamic linking is a scheme that postpones the linking function until execution time: a
subroutine is loaded and linked to the rest of the program when it is first called. It is also
called dynamic loading or load on call. The advantages of dynamic linking are as follows. It
allows several executing programs to share one copy of a subroutine or library. In an
object-oriented system, dynamic linking makes it possible for one object to be shared by several
programs. Dynamic linking also provides the ability to load routines only when (and if) they
are needed. The actual loading and linking can be accomplished using an operating system
service request.

BOOTSTRAP LOADERS
The bootstrap loader loads the first program to be run by the computer, usually an
operating system. It is a small program that runs before any other program can run. It is
stored in non-volatile storage (normally the computer's ROM) so that it can still be used after
the computer has been switched off and then on again. It specifies where the operating system
on a microcomputer is to be found.
How is the loader itself loaded into memory? When the computer is started, with no program
in memory, a program held in ROM at an absolute address is executed. This program may be the
operating system itself or a bootstrap loader, which in turn loads the operating system and
prepares it for execution. The first record (or records) read in is generally referred to as a
bootstrap loader, because it causes the operating system to be loaded. Such a loader is added
to the beginning of all object programs that are to be loaded into an empty and idle system.

 A bootstrap loader is a small program which is held in ROM.
 The processor executes this code when it gets the reset (or power-up) signal.
 The bootstrap loader does a few hardware checks and then causes the processor to load
and execute the code in the boot sector of the start-up hard disk.
 Finally, the processor loads the main part of the operating system from disk into main
memory.

SUMMARY:

 System software - supports the operation and use of the computer.
 Application software - provides the solution to a problem.

 Compilers must generate machine language code, taking into account such hardware
characteristics as the number and type of registers and the machine instructions available.
 Simplified Instructional Computer (SIC) is a hypothetical computer that includes the
hardware features most often found on real machines.
 The simple assembler uses two major internal data structures:
• Operation Code Table(OPTAB)
• Symbol Table (SYMTAB).
 Location counter helps in the assignment of the addresses.
 When the addresses are fixed during assembly itself, the process is called absolute assembly.
 The actual address of a memory location is also called an absolute address.
 Machine-independent assembler features are those that do not depend on the architecture of
the machine. These are:
• Literals
• Symbol-Defining Statements
• Expressions
• Program blocks
• Control sections and program linking
 A literal is defined with a prefix = followed by a specification of the literal value.
 The ORG (origin) directive can be used to indirectly assign values to symbols. Its general
format is: ORG value


 Program blocks allow the generated machine instructions and data to appear in the
object program in a different order from the source program, by using separate blocks for
code, data, stack, and larger data areas.
 A control section is a part of the program that maintains its identity after assembly; each
control section can be loaded and relocated independently of the others.
 Loading - which allocates memory location and brings the object program into memory
for execution - (Loader)
 Linking- which combines two or more separate object programs and supplies the
information needed to allow references between them - (Linker)
 Relocation - which modifies the object program so that it can be loaded at an address
different from the location originally specified - (Linking Loader).
 The different types of loaders are the absolute loader, bootstrap loader, relocating
(relative) loader, and direct linking loader.
 Dynamic linking (also called dynamic loading or load on call) postpones the linking
function until execution time: a subroutine is loaded and linked to the rest of the program
when it is first called.
 The bootstrap loader loads the first program to be run by the computer, usually it is an
operating system.


UNIT I

SECTION A
1. In a two-pass assembler, literal operands are recognized during the ________ pass.
2. When a computer is turned on or restarted, the ____________ program is executed.
3. An assembler is a __________.
4. The _______ loader loads the first program to be run by the computer, usually an
operating system.
5. Codes such as START and END in an assembly program are __________.
6. SIC stands for __________.
7. A reference to a label that is defined later in the program is called a _________.
8. The _______ loader loads the first program to be run by the computer, usually an
operating system.
9. CISC stands for _____________.
10. ____________ is the assembler directive that reserves the indicated number of bytes for a
data area.
SECTION B
1. Write short notes on System Software and Machine Architecture
2. Describe a simple SIC Assembler.
3. What is Program Relocation in Machine-Dependent Assembler?
4. Describe Literals in Machine-Independent Assembler Features.
5. Write short notes on Control Sections and Program Linking in Machine-Independent
Assembler.
6. Explain the Significance of one pass assembler.
7. Explain the internal data structures of an assembler.
8. Explain the operations provided by the loader options.
9. What are the functions of an assembler? Write briefly about each one.
10. Write short notes on linkage editors.
SECTION C
1. Explain the design of a two pass assembler with associated algorithm.
2. List out the various steps that should be followed by the designer to design assembler.
3. Explain the features of Machine independent assembler.
4. Describe the concept of loader design options.
5. Describe the significance of machine independent assembler.
6. Explain the Features of machine dependent loader.
7. Explain the significance of machine-independent loader features.
8. Explain program relocation and program linking.
9. Explain the basic functions of a loader.
10. Discuss the assembler algorithm and data structures in detail.

UNIT II

Machine dependent compiler features - Intermediate form of the program - Machine dependent
code optimization - Machine independent compiler features - Compiler design options - Division
into passes - Interpreters - P-code compilers - Compiler-compilers.

MACHINE INDEPENDENT COMPILER FEATURES


This section describes machine-independent compiler features: methods for handling
structured variables such as arrays, machine-independent code optimization, storage allocation,
and the problems involved in compiling a block-structured language, together with some
possible solutions.
STRUCTURED VARIABLES
Structured variables discussed here are arrays, records, strings and sets. The primary
consideration is the allocation of storage for such variables and then the generation of code to
reference them. Arrays are treated in detail; the same principles can also be applied to the
other types of structured variables.
Arrays: In Pascal, arrays are declared as follows.
(i) Single dimension array:
A : ARRAY [ 1 .. 10 ] OF INTEGER
If each integer variable occupies one word of memory, then we require 10 words of
memory to store this array. In general, for an array declaration
ARRAY [ l .. u ] OF INTEGER
the memory allocated is ( u - l + 1 ) words.
(ii) Two dimension array:
B : ARRAY [ 0 .. 3, 1 .. 3 ] OF INTEGER
Here the first subscript takes 4 values (0 to 3) and the second takes 3 values (1 to 3), so the
array requires 4 x 3 = 12 words of memory.
In general, ARRAY [ l1 .. u1, l2 .. u2 ] OF INTEGER requires ( u1 - l1 + 1 ) * ( u2 - l2 + 1 )
words of memory.
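
The storage formulas above can be checked with a short Python sketch. The function name `words_required` is made up for this illustration, and it assumes one word per element as in the text.

```python
def words_required(bounds):
    """Number of words needed for an array with the given (lower, upper)
    subscript bounds, assuming each element occupies one word."""
    total = 1
    for lower, upper in bounds:
        total *= upper - lower + 1
    return total


print(words_required([(1, 10)]))         # A : ARRAY [1..10]      -> 10
print(words_required([(0, 3), (1, 3)]))  # B : ARRAY [0..3, 1..3] -> 12
```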


The data can be stored in memory in two different ways: row-major and column-major.
When all array elements that have the same value of the first subscript are stored in contiguous
locations, the arrangement is called row-major order. It is shown in fig. 30(a). Another way of
looking at this is to scan the words of the array in sequence and observe the subscript values.
In row-major order, the rightmost subscript varies most rapidly.
Fig. 30(b) shows the column-major way of storing the data in memory. All elements
that have the same value of the second subscript are stored together; this is called column-major
order. In other words, in column-major order, the leftmost subscript varies most rapidly.
To refer to an element, we must calculate the address of the referenced element relative
to the base address of the array. The compiler would generate code to place the relative address
in an index register. Indexed addressing then makes it easy to access the desired array element.
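
The difference between the two storage orders can be seen by listing the subscripts in the order in which the elements would appear in memory. The following Python sketch does this for a small 2 x 3 array; the function names and the sample bounds are chosen only for this illustration.

```python
from itertools import product


def row_major(bounds):
    """Subscripts in storage order when the rightmost subscript varies fastest."""
    ranges = [range(lo, hi + 1) for lo, hi in bounds]
    return list(product(*ranges))


def column_major(bounds):
    """Subscripts in storage order when the leftmost subscript varies fastest."""
    ranges = [range(lo, hi + 1) for lo, hi in reversed(bounds)]
    return [tuple(reversed(t)) for t in product(*ranges)]


bounds = [(0, 1), (1, 3)]   # first subscript 0..1, second subscript 1..3
print(row_major(bounds))
# [(0, 1), (0, 2), (0, 3), (1, 1), (1, 2), (1, 3)]
print(column_major(bounds))
# [(0, 1), (1, 1), (0, 2), (1, 2), (0, 3), (1, 3)]
```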
(1) One Dimensional Array: On a SIC machine, the address of A[ 6 ] is calculated as
starting address of the array + size of each element * number of preceding elements.
Assume that the starting address is 1000H, the size of each element is 3 bytes (one SIC word),
and the number of preceding elements is 5. The address of A[ 6 ] is therefore
1000H + 3 * 5 = 1000H + 15 = 100FH.
In general, for A : ARRAY [ l .. u ] OF INTEGER, if each array element occupies W bytes of
storage and the value of the subscript is S, then the relative address of the referenced element
A[ S ] is given by W * ( S - l ).
The code generation to perform such a calculation is shown in fig. 31.
The notation A[ i2 ] in quadruple 3 specifies that the generated machine code
should refer to A using indexed addressing, after having placed the value of i2 in the index
register. The array in this example is declared as
A : ARRAY [ 1 .. 10 ] OF INTEGER
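
The address calculation for a one-dimensional array can be written out as a short Python sketch. The function name `element_address` is an assumption of this example; the values are those used in the text (base address 1000H, 3-byte elements, lower bound 1). Note that adding the 15-byte offset to the hexadecimal base 1000H gives 100FH.

```python
def element_address(base, lower, width, subscript):
    """Address of A[subscript]: base + W * (S - l)."""
    return base + width * (subscript - lower)


address = element_address(base=0x1000, lower=1, width=3, subscript=6)
print(address - 0x1000)   # relative address: 15
print(hex(address))       # 0x100f
```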
(2) Multi-Dimensional Array: In a multi-dimensional array we assume row-major order.
Consider an array B with 4 rows (first subscript 0 to 3) of 6 elements each (second subscript
1 to 6). To access element B[ 2, 3 ], we must skip over two complete rows (rows 0 and 1) before
arriving at the beginning of row 2. Each row contains 6 elements, so we have to skip
6 x 2 = 12 array elements before we come to the beginning of row 2. We must also skip over the
first two elements of row 2 to arrive at B[ 2, 3 ]. This makes a total of 12 + 2 = 14 elements
between the beginning of the array and element B[ 2, 3 ]. If each element occupies 3 bytes, as
in SIC, then B[ 2, 3 ] is located at relative address 14 x 3 = 42 within the array.
In general, a two-dimensional array can be declared as
B : ARRAY [ l1 .. u1, l2 .. u2 ] OF INTEGER
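
For row-major storage, the relative address of B[ S1, S2 ] in such an array works out to W * ( ( S1 - l1 ) * ( u2 - l2 + 1 ) + ( S2 - l2 ) ). The Python sketch below evaluates this formula. The function name is hypothetical, and the bounds 0..3 and 1..6 are the ones implied by the arithmetic of the worked example (two full rows of 6 elements skipped, then two elements of row 2).

```python
def relative_address_2d(bounds, width, s1, s2):
    """Row-major relative address of B[s1, s2]."""
    (l1, u1), (l2, u2) = bounds
    row_length = u2 - l2 + 1                      # elements per row
    elements_before = (s1 - l1) * row_length + (s2 - l2)
    return width * elements_before


print(relative_address_2d([(0, 3), (1, 6)], width=3, s1=2, s2=3))   # 42
```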
Code Generation for Two Dimensional Array
The symbol - table entry for an array usually specifies the following:


1. The type of the elements in the array
2. The number of dimensions declared
3. The lower and upper limit for each subscript.

This information is sufficient for the compiler to generate the code required for array
references. In some languages, such as FORTRAN 90, an array can be allocated dynamically,
so the bounds of the array (for example, values held in variables ROWS and COLUMNS) are not
known at compilation time. The compiler cannot directly generate code that uses fixed bounds.
Instead, the compiler creates a descriptor, called a dope vector, for the array. The
descriptor includes space for storing the lower and upper bounds for each array subscript.
When storage is allocated for the array, the values of these bounds are computed and
stored in the descriptor. The generated code for an array reference uses the values from
the descriptor to calculate relative addresses as required. The descriptor may also include
the number of dimensions for the array, the type of the array elements and a pointer to the
beginning of the array. This information can be useful if the allocated array is passed as a
parameter to another procedure.
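
A dope vector can be modelled as a small record that is filled in at allocation time and consulted on every array reference. The Python sketch below is one possible layout, not the format used by any particular compiler: the class name `DopeVector`, the `allocate` helper, the 3-byte element width, the base address and the default lower bound of 1 are all assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass
class DopeVector:
    """Run-time descriptor for an array whose bounds are unknown at compile time."""
    element_width: int   # bytes per element
    bounds: list         # [(lower, upper), ...], filled in at allocation time
    base: int = 0        # starting address of the allocated storage

    def relative_address(self, subscripts):
        """Row-major offset of the element with the given subscripts."""
        offset = 0
        for (lower, upper), s in zip(self.bounds, subscripts):
            if not lower <= s <= upper:
                raise IndexError(f"subscript {s} outside {lower}..{upper}")
            offset = offset * (upper - lower + 1) + (s - lower)
        return offset * self.element_width


def allocate(rows, columns, width=3, base=0x2000):
    """Model of allocating a rows x columns array: the bounds become known only now."""
    return DopeVector(width, [(1, rows), (1, columns)], base)


matrix = allocate(rows=4, columns=6)
print(matrix.relative_address((3, 2)))   # ((3-1)*6 + (2-1)) * 3 = 39
```

The same generated code works for any values of the bounds, because the bounds are read from the descriptor at run time rather than being built into the instructions.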


Code Generation For Array References


For example, FORTRAN 90 provides dynamic arrays. Using this feature, a two-dimensional
array could be declared as

INTEGER, ALLOCATABLE, DIMENSION (:,:) :: MATRIX

This specifies that MATRIX is an array of integers that can be allocated dynamically. The
allocation can be accomplished by a statement like

ALLOCATE (MATRIX(ROWS,COLUMNS))

where the variables ROWS and COLUMNS have previously been assigned values. Since the
values of ROWS and COLUMNS are not known at compilation time, the compiler cannot generate
code with fixed bounds and must rely on a dope vector whose bounds are filled in when the
array is allocated.
In the compilation of other structured variables, such as records, strings and sets, the same
type of storage allocation is required. The compiler must store information concerning the
structure of the variable and use this information to generate code to access components of
the structure, and it must construct a descriptor for situations in which the required
information is not known at compilation time.

MACHINE - INDEPENDENT CODE OPTIMIZATION


One important source of code optimization is the elimination of common sub-
expressions. These are sub-expressions that appear at more than one point in the program and
that compute the same value.
X, Y : ARRAY [ 0 .. 10, 1 .. 10 ] OF INTEGER
...
FOR I := 1 TO 10 DO
X[ I, 2 * J - 1 ] := Y[ I, 2 * J ]
The sub-expression 2 * J is calculated twice. An optimizing compiler should generate
code so that the multiplication is performed only once and the result is used in both places.
Common sub-expressions are usually detected through the analysis of an intermediate form of
the program.
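
The detection of common sub-expressions on an intermediate form can be sketched as follows. This is a simplified Python illustration for a single basic block, not the algorithm of any particular compiler. It assumes that quadruples are tuples of the form (operation, operand1, operand2, result), that temporaries such as i1 are assigned only once, and that operand order is not normalized. The four sample quadruples are invented for the example and are not the numbered quadruples discussed in these notes.

```python
def eliminate_common_subexpressions(quads):
    """Remove repeated computations within one basic block."""
    available = {}   # (op, operand1, operand2) -> name that already holds the value
    renamed = {}     # deleted result name -> surviving result name
    kept = []
    for op, a, b, result in quads:
        # Refer to the surviving copy of any value whose quadruple was deleted.
        a, b = renamed.get(a, a), renamed.get(b, b)
        key = (op, a, b)
        if key in available:
            renamed[result] = available[key]      # duplicate: delete this quadruple
            continue
        # Assigning to `result` invalidates expressions that used its old value.
        available = {k: v for k, v in available.items()
                     if result not in k[1:] and v != result}
        if result not in (a, b):
            available[key] = result
        kept.append((op, a, b, result))
    return kept


quads = [
    ("*", "#2", "J", "i1"),    # 2 * J        (subscript of X)
    ("-", "i1", "#1", "i2"),   # 2 * J - 1
    ("*", "#2", "J", "i3"),    # 2 * J again  (subscript of Y)
    ("+", "i3", "I", "i4"),    # hypothetical later use of the second copy
]
for quad in eliminate_common_subexpressions(quads):
    print(quad)
```

In the output the second multiplication is gone and the last quadruple uses i1 in place of i3.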



CODE OPTIMIZATION BY REDUCTION IN STRENGTH OF OPERATIONS

The operand J is not changed in value between quadruples 5 and 12. It is not possible to
reach quadruple 12 without passing through quadruple 5 first, because the quadruples are part of
the same basic block. Therefore, quadruples 5 and 12 compute the same value. This means we
can delete quadruple 12 and replace any reference to its result (i10) with a reference to i3, the
result of quadruple 5. This modification eliminates the duplicate calculation of 2 * J, which we
identified previously as a common sub-expression in the source statement.
After the substitution of i3 for i10, quadruples 6 and 13 are the same except for the
name of the result. Hence quadruple 13 can be removed, and i4 substituted for i11 wherever it
is used. Similarly, quadruples 10 and 11 can be removed because they are equivalent to
quadruples 3 and 4.

STORAGE ALLOCATION
All the program-defined variables and temporary variables, including the location used to
save the return address, can use a simple type of storage assignment called static allocation.
When procedures are called recursively, static allocation cannot be used. This is
explained with an example. Fig. 38(a) shows the operating system calling the program MAIN.
The return address from register L is stored in a static memory location RETADR within
MAIN.


MAIN has then called the procedure SUB (invocation 2). The return address for the call has
been stored at a fixed location RETADR within SUB. If SUB now calls itself recursively
(invocation 3), a problem occurs: SUB stores the return address for invocation 3 into RETADR
from register L. This destroys the return address for invocation 2. As a result, there is no
possibility of ever making a correct return to MAIN.
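
The loss of the return address can be reproduced with a very small Python model in which each procedure has exactly one static cell for its return address. The dictionary `RETADR`, the function `call` and the address strings are hypothetical stand-ins for the static locations and register L described above.

```python
# One fixed (static) cell per procedure for the saved return address.
RETADR = {"MAIN": None, "SUB": None}


def call(procedure, return_address):
    """Model of a call: the callee saves the return address in its static cell."""
    RETADR[procedure] = return_address


call("MAIN", "OS")               # invocation 1: the operating system calls MAIN
call("SUB", "MAIN+12")           # invocation 2: MAIN calls SUB
saved_for_invocation_2 = RETADR["SUB"]
call("SUB", "SUB+30")            # invocation 3: SUB calls itself

print(saved_for_invocation_2)    # MAIN+12
print(RETADR["SUB"])             # SUB+30: the address needed to return to MAIN is gone
```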
There is also no provision for saving register contents or variable values. When the
recursive call is made, the new invocation of SUB may set the variables of SUB, destroying
their previous values. However, these previous values may be needed by invocation 2 of SUB
after the return from the recursive call. Hence it is necessary to preserve the previous values
of any variables used by SUB, including parameters, temporaries, return addresses, and register
save areas, when a recursive call is made. This is accomplished with a dynamic storage
allocation technique.
In this technique, each procedure call creates an activation record that contains storage
for all the variables used by the procedure. If the procedure is called recursively, another
activation record is created. Each activation record is associated with a particular invocation of
the procedure, not with the procedure itself. An activation record is not deleted until a return
has been made from the corresponding invocation.


Fig. 38 (a), (b), (c): Recursive invocation of a procedure using static storage allocation

Activation records are typically allocated on a stack, with the current record at the top of the
stack. The procedure MAIN has been called; its activation record appears on the stack. The base
register B has been set to indicate the starting address of this current activation record. The first
word in an activation record would normally contain a pointer PREV to the previous record on
the stack. Since this record is the first, the pointer value is null. The second word of the
activation record contains a pointer NEXT to the first unused word of the stack, which will be
the starting address for the next activation record created. The third word contains the return
address for this invocation of the procedure, and the remaining words contain the values of
variables used by the procedure.



When a procedure returns to its caller, the current activation record (which corresponds to the
most recent invocation) is deleted. The pointer PREV in the deleted record is used to reestablish
the previous activation record as the current one, and execution continues. After SUB returns
from the recursive call, register B has been reset to point to the activation record for the
previous invocation of SUB.
The return address and all the variable values in this activation record are exactly the
same as they were before the recursive call. This technique is often referred to as automatic
allocation of storage to distinguish it from other types of dynamic allocation that are under the
control of the programmer.
When automatic allocation is used, the compiler must generate code for references to
variables using some sort of relative addressing. In our example the compiler assigns to each
variable an address that is relative to the beginning of the activation record, instead of an actual
location within the object program.
The address of the current activation record is, by convention, contained in register B, so
a reference to a variable is translated as an instruction that uses base relative addressing. The
displacement in this instruction is the relative address of the variable within the activation record.
The compiler must also generate additional code to manage the activation records themselves. At
the beginning of each procedure there must be code to create a new activation record, linking it
to the previous one and setting the appropriate pointers. This code is often called a prologue for
the procedure. At the end of the procedure, there must be code to delete the current activation
record, resetting pointers as needed. This code is often called an epilogue.
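
The prologue and epilogue can be illustrated with a small Python model of the stack. It follows the record layout described above (first word PREV, second word NEXT, third word the return address, then the variables), but everything else is an assumption of the sketch: the class name `Stack`, the attribute `B` standing for the base register, the stack size and the sample return addresses.

```python
class Stack:
    """Activation records kept in a flat list of words, addressed by index."""

    def __init__(self, size=64):
        self.words = [None] * size
        self.B = None      # base register: start of the current record
        self.free = 0      # first unused word of the stack

    def prologue(self, return_address, variables):
        """Create a record: word 0 = PREV, word 1 = NEXT, word 2 = return address."""
        start = self.free
        end = start + 3 + len(variables)
        self.words[start:end] = [self.B, end, return_address, *variables]
        self.B, self.free = start, end

    def epilogue(self):
        """Delete the current record and return the saved return address."""
        return_address = self.words[self.B + 2]
        self.free = self.B              # the record's words become unused again
        self.B = self.words[self.B]     # PREV re-establishes the caller's record
        return return_address

    def variable(self, displacement):
        """Base relative access: the address is B + displacement."""
        return self.words[self.B + displacement]


stack = Stack()
stack.prologue("OS", ["main_var"])      # MAIN
stack.prologue("MAIN+12", ["n=3"])      # SUB, invocation 2
stack.prologue("SUB+30", ["n=2"])       # SUB, invocation 3 (recursive call)
print(stack.epilogue())                 # SUB+30
print(stack.variable(3))                # n=3: invocation 2's variable is intact
print(stack.epilogue())                 # MAIN+12
```

Because every invocation has its own record, the return address and variables of invocation 2 survive the recursive call, which was not the case with static allocation.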

BLOCK – STRUCTURED LANGUAGE


A block is a portion of a program that has the ability to declare its own identifiers.
This definition of a block also covers units such as procedures and functions.
Each procedure corresponds to a block. Note that blocks are nested within other blocks.
Example: procedures B and D are nested within procedure A, and procedure C is nested within
procedure B. Each block may contain a declaration of variables. A block may also refer to
variables that are defined in any block that contains it, provided the same names are not
redefined in the inner block. Variables cannot be used outside the block in which they are
declared.
In compiling a program written in a block-structured language, it is convenient to
number the blocks. As the beginning of each new block is recognized, it is assigned the next
block number in sequence. The compiler can then construct a table that describes the block
structure. The block-level entry gives the nesting depth for each block. The outermost block
has a level number of 1, and each other block has a level number that is one greater than that
of the surrounding block.
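
The construction of such a block table can be sketched in Python. The sketch assumes that the compiler sees a stream of "begin block" and "end block" events; the event format, the function name `build_block_table` and the column order of the table are inventions of this example. The sample events describe the nesting used above (B and D inside A, C inside B).

```python
def build_block_table(events):
    """Number blocks in order of appearance and record their nesting.

    events is a sequence of ("begin", name) and ("end",) tuples.
    Returns rows of (block number, name, level, surrounding block number).
    """
    table, open_blocks = [], []
    for event in events:
        if event[0] == "begin":
            number = len(table) + 1
            parent = open_blocks[-1] if open_blocks else None
            level = 1 if parent is None else table[parent - 1][2] + 1
            table.append((number, event[1], level, parent))
            open_blocks.append(number)
        else:
            open_blocks.pop()
    return table


events = [("begin", "A"), ("begin", "B"), ("begin", "C"), ("end",), ("end",),
          ("begin", "D"), ("end",), ("end",)]
for row in build_block_table(events):
    print(row)
# (1, 'A', 1, None)
# (2, 'B', 2, 1)
# (3, 'C', 3, 2)
# (4, 'D', 2, 1)
```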
Other types of dynamic storage allocation are under the control of the programmer.
A NEW or MALLOC statement would be translated into a request to the operating system for
an area of storage of the required size. Another method is to handle the required allocation
through a run-time support procedure associated with the compiler. With this method, a large
block of free storage called a heap is obtained from the operating system at the beginning of the
program. Allocations of storage from the heap are managed by the run-time procedure. In some
systems, it is not even necessary for the programmer to free storage explicitly. Instead, a
run-time garbage collection procedure scans the pointers in the program and reclaims areas
from the heap that are no longer being used. Dynamic storage allocation, as discussed in this
section, provides another example of delayed binding.


Nesting of blocks in a source program


When a reference to an identifier appears in the source program, the compiler must first
check the symbol table for a definition of that identifier by the current block. If no such
definition is found, the compiler looks for a definition by the block that surrounds the current
one, then by the block that surrounds that, and so on.
If the outermost block is reached without finding a definition of the identifier, then the
reference is an error. The search process just described can easily be implemented within a
symbol table that uses hashed addressing. The hashing function is used to locate one definition of
the identifier. The chain of definitions for that identifier is then searched for the appropriate
entry.
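
The search through the surrounding blocks can be sketched as follows. In this Python illustration a dictionary plays the role of the hashed symbol table, and each entry holds the chain of definitions for one identifier. The function name `lookup`, the table layout and the sample identifiers X and Y are assumptions of the example; the block numbers are those of the nesting A (1), B (2), C (3), D (4).

```python
def lookup(name, current_block, symbol_table, surrounding):
    """Find the definition of name that is visible from current_block.

    symbol_table: {name: [(block number, attributes), ...]}
    surrounding:  {block number: surrounding block number, None for the outermost}
    """
    chain = symbol_table.get(name, [])
    block = current_block
    while block is not None:
        for defining_block, attributes in chain:
            if defining_block == block:
                return defining_block, attributes
        block = surrounding[block]          # try the block that surrounds this one
    raise NameError(f"{name} is not defined in any enclosing block")


surrounding = {1: None, 2: 1, 3: 2, 4: 1}
symbol_table = {"X": [(1, "integer"), (3, "real")], "Y": [(2, "integer")]}

print(lookup("X", 3, symbol_table, surrounding))   # (3, 'real')
print(lookup("X", 4, symbol_table, surrounding))   # (1, 'integer')
print(lookup("Y", 3, symbol_table, surrounding))   # (2, 'integer')
```

A reference to Y from block 4 raises an error, because Y is declared only in block 2, which does not surround block 4.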
There are other symbol-table organizations that store the definitions of identifiers
according to the nesting of the blocks that define them. This kind of structure can make the
search for the proper definition more efficient. Most block-structured languages make use of
automatic storage allocation. That is, the variables that are defined by a block are stored in an
activation record that is created each time the block is entered.
If a statement refers to a variable that is declared within the current block, this variable
is present in the current activation record, so it can be accessed in the usual way. However, it is
also possible for a statement to refer to a variable that is declared in some surrounding block. In
that case, the most recent activation record for that block must be located to access the variable.
One common method for providing access to variables in surrounding blocks uses a data
structure called a display.
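
A display holds, for each nesting level, a pointer to the most recent activation record of the enclosing block at that level, so a variable in a surrounding block can be reached with one table lookup. The Python sketch below uses a common save-and-restore scheme for maintaining the display. It is an illustration under stated assumptions, not necessarily the scheme shown in the figure for these notes: the class `Runtime`, the dictionary-based records and the variable names are all hypothetical.

```python
class Runtime:
    """Activation records plus a display indexed by nesting level."""

    def __init__(self, max_level=8):
        self.display = [None] * (max_level + 1)   # display[k]: record for level k

    def enter(self, level, variables):
        """Block entry: the new record saves the display entry it replaces."""
        record = {"level": level, "saved": self.display[level],
                  "vars": dict(variables)}
        self.display[level] = record
        return record

    def leave(self, record):
        """Block exit: restore the display entry that was in use before entry."""
        self.display[record["level"]] = record["saved"]

    def access(self, level, name):
        """Read a variable declared by the enclosing block at the given level."""
        return self.display[level]["vars"][name]


rt = Runtime()
a = rt.enter(1, {"x": 10})      # procedure A (level 1)
b = rt.enter(2, {"y": 20})      # procedure B, nested in A
c = rt.enter(3, {"z": 30})      # procedure C, nested in B
print(rt.access(1, "x"), rt.access(2, "y"), rt.access(3, "z"))   # 10 20 30
rt.leave(c)
rt.leave(b)
d = rt.enter(2, {"w": 40})      # procedure D, also nested directly in A
print(rt.access(1, "x"), rt.access(2, "w"))                      # 10 40
```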


Use of display for procedures
