UNIT-I Basics of System Programming
Goals of This Class
Reinforce your programming skills
Get you acquainted with programming
in C on the Linux OS
Prepare you for the University exam
and entrance exams
Goals of System software
operating system.
Loader
Macro processor
Text editor
Compiler
Operating system
Debugging system
Need ?
Programming Languages
Machine language:
It is the computer's native language, a sequence of zeroes
and ones (binary). Different computers understand different
sequences, and the code is hard for humans to read, e.g. 0101001...
Assembly language:
It uses mnemonics in place of machine-language codes. Each
instruction does very little, and programs are still hard for humans to read:
e.g. ADD AH, BL
High-level languages:
FORTRAN, Pascal, BASIC, C, C++, Java, etc.
Each instruction is composed of many low-level instructions and is
closer to English, so programs are easier to read and understand:
e.g. hypot = sqrt(opp * opp + adj * adj);
File Handling
FILE:
FILE is a structure type defined in the header file stdio.h.
It is the link between our program and the
operating system.
The FILE structure contains information about
the file being used, such as its current size and
its location in memory.
Declaration: FILE *fp; where fp is a file pointer
(a pointer to FILE).
Format of fopen()
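FILE *fp;
fp = fopen("file_name", "type");
file_name: character string that contains the name of the file to be
opened.
type: character string that gives the mode in which the file is opened.
Type   Meaning
r      Opens an existing file for reading only
w      Opens a new file for writing; if the file exists, its contents are overwritten
a      Opens a file for appending; the file is created if it does not exist
r+     Opens an existing file for reading and writing (modifying)
w+     Opens a new file for reading and writing; existing file contents are
       destroyed
a+     Opens a file for reading and appending; the file is created if it does not exist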
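The mode table can be tried out with a short sketch. The helper names file_exists and append_line are hypothetical, chosen only for this illustration; the sketch relies on "r" failing for a missing file and "a" creating one.

```c
#include <stdio.h>

/* Returns 1 if the named file can be opened for reading, 0 otherwise. */
int file_exists(const char *name)
{
    FILE *fp = fopen(name, "r");   /* "r" fails if the file does not exist */
    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}

/* Appends one line of text to the file, creating the file if needed.
   Returns 0 on success and -1 on failure. */
int append_line(const char *name, const char *text)
{
    FILE *fp = fopen(name, "a");   /* "a" creates the file when it is missing */
    if (fp == NULL)
        return -1;
    int ok = fputs(text, fp) != EOF && fputc('\n', fp) != EOF;
    if (fclose(fp) == EOF)
        ok = 0;
    return ok ? 0 : -1;
}
```

Always compare the result of fopen() with NULL before using the file pointer.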
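getc(): reads the next character from a file opened for reading.
General format: c = getc(fp);
The getc() function returns EOF when the end of file
is reached or when it encounters an error.
putc(): writes a character to a disk file that
has been opened with fopen() in a write mode.
General format: putc(c, fp);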
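A minimal sketch of character-by-character file I/O with putc() and getc(). The function names write_chars and count_chars are made up for this example.

```c
#include <stdio.h>

/* Writes the string s to the file one character at a time with putc().
   Returns 0 on success and -1 on failure. */
int write_chars(const char *name, const char *s)
{
    FILE *fp = fopen(name, "w");
    if (fp == NULL)
        return -1;
    while (*s != '\0') {
        if (putc(*s, fp) == EOF) {
            fclose(fp);
            return -1;
        }
        s++;
    }
    return fclose(fp) == EOF ? -1 : 0;
}

/* Counts the characters read with getc() until EOF.
   Returns the count, or -1 if the file cannot be read. */
long count_chars(const char *name)
{
    FILE *fp = fopen(name, "r");
    if (fp == NULL)
        return -1;
    long n = 0;
    int c;                        /* int, not char, so EOF can be told apart */
    while ((c = getc(fp)) != EOF)
        n++;
    int failed = ferror(fp);      /* EOF is also returned on a read error */
    fclose(fp);
    return failed ? -1 : n;
}
```

Because getc() returns EOF both at end of file and on error, the sketch calls ferror() afterwards to tell the two cases apart.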
• Language processors
• Data structures for language
processing
• Scanning and parsing
• Assembler
Language Processor
Necessity of language processing
Application domain
Execution domain
Semantic gap
Drawbacks of the semantic gap
Long development time
Large development effort
Poor quality of software
A. Language processors
[Figure: application domain, PL domain, and execution domain]
Language processors
A language translator
A detranslator
A preprocessor
Interpreter
Language Processing Activities
[Figure: application domain, program generator domain, target PL domain, and execution domain, with the specification gap marked]
The program generator validates the data
during data entry.
It reduces the testing effort.
It is more economical than developing a problem-
oriented language, which has a large execution gap (EG).
[Figure: example data-entry form with the fields Employee Name, Address, Age]
Program Execution Activities
Characteristics of program
translation model
[Figure: interpretation model, showing the interpreter with its PC, memory holding the source program and data, and error output]
Characteristics of program
interpretation model
Language processing = analysis of source program + synthesis of target program
Analysis of the source program is based on:
Lexical rules
Syntax rules
Semantic rules
Fundamentals of Language
Processing
Lexical Analysis:
• Identifies the lexical units in a source statement, classifies
them, and builds a token for each: a class code and a number within the class
a := b + i;
ID#2 OP#5 ID#3 OP#3 ID#1 OP#10
• ID#2 stands for the identifier occupying entry #2
in the symbol table
• Token classes: keywords, identifiers (variables), and
operators
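The token string above can be reproduced with a small scanner. This is only a sketch: the operator codes (5 for :=, 3 for +, 10 for ;) are read off the example, and the symbol table is assumed to already hold i, a, b as entries 1, 2, 3. The names sym_entry and tokenize are hypothetical.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_SYMS 16

static char symtab[MAX_SYMS][16];   /* entry k is stored at index k-1 */
static int nsyms = 0;

/* Assumed operator table: codes taken from the slide example. */
static const struct { const char *text; int code; } optab[] = {
    { ":=", 5 }, { "+", 3 }, { ";", 10 }
};

/* Returns the 1-based symbol table entry of name, adding it if absent.
   Returns 0 if the table is full. */
int sym_entry(const char *name)
{
    for (int i = 0; i < nsyms; i++)
        if (strcmp(symtab[i], name) == 0)
            return i + 1;
    if (nsyms == MAX_SYMS)
        return 0;
    strncpy(symtab[nsyms], name, 15);
    symtab[nsyms][15] = '\0';
    return ++nsyms;
}

/* Writes the tokens of src to out (outsz > 0) as "CLASS#number" items.
   Returns 0 on success, -1 on an unknown character or lack of space. */
int tokenize(const char *src, char *out, size_t outsz)
{
    size_t used = 0;
    out[0] = '\0';
    while (*src != '\0') {
        if (isspace((unsigned char)*src)) { src++; continue; }
        const char *cls = "OP";
        int num = 0;
        if (isalpha((unsigned char)*src)) {          /* identifier */
            char name[16];
            size_t n = 0;
            while (isalnum((unsigned char)*src)) {
                if (n < sizeof name - 1) name[n++] = *src;
                src++;
            }
            name[n] = '\0';
            cls = "ID";
            num = sym_entry(name);
        } else {                                     /* operator */
            for (size_t k = 0; k < sizeof optab / sizeof optab[0]; k++) {
                size_t len = strlen(optab[k].text);
                if (strncmp(src, optab[k].text, len) == 0) {
                    num = optab[k].code;
                    src += len;
                    break;
                }
            }
        }
        if (num == 0)
            return -1;
        int w = snprintf(out + used, outsz - used, "%s%s#%d",
                         used ? " " : "", cls, num);
        if (w < 0 || (size_t)w >= outsz - used)
            return -1;
        used += (size_t)w;
    }
    return 0;
}
```

A real scanner would also recognize keywords and constants; they are left out to keep the sketch short.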
Syntax analysis:
String of tokens -> statement class ->
build IC -> semantic analysis
• Forward references
• Issues concerning memory requirements and
organization of a language processor
Pass I
Performs analysis of the source program
and notes relevant information
Pass II
Performs synthesis of the target program
Intermediate Representation
Desirable Properties of IR
Memory efficiency: IR must be
compact.
Processing efficiency: Efficient
algorithms must exist for constructing
and analyzing the IR.
Ease of use: IR should be easy to
construct and use
Language Processor
Development Tools
Advantages of assembly language
Reduced errors
Faster translation time
Changes can be made more easily and quickly
Better than an HLL where it is desirable to
use specific architectural features.
Disadvantages
Many instructions are required to
achieve a small task.
The programmer requires knowledge of the
processor architecture and instruction
set.
Programs are machine dependent and
require a complete rewrite if the hardware is
changed.
Design specification of assembler
Pass – II
Synthesize the target program
Pass I databases
Input: source program
LC (location counter): used to keep track of each instruction's
location.
MOT (machine operation table): holds each mnemonic and its length.
ST (symbol table): stores each symbol and its
corresponding location.
LT (literal table): stores each literal and its
corresponding assigned location.
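The tables listed above could be declared in C roughly as follows. This is a sketch only: the mnemonics, the one-word instruction length, the table sizes, and the function names are assumptions made for illustration.

```c
#include <string.h>

#define MAX_SYMBOLS 32

struct mot_row { const char *mnemonic; int length; };  /* MOT entry */
struct st_row  { char symbol[16]; int address; };      /* ST entry  */

/* Assumed MOT contents: every instruction is one word long. */
static const struct mot_row MOT[] = {
    { "MOVER", 1 }, { "MOVEM", 1 }, { "ADD", 1 }, { "STOP", 1 }
};

static struct st_row ST[MAX_SYMBOLS];
static int st_count = 0;
int LC = 0;                         /* location counter */

/* Returns the length of the instruction, or -1 if it is not in MOT. */
int mot_length(const char *mnemonic)
{
    for (size_t i = 0; i < sizeof MOT / sizeof MOT[0]; i++)
        if (strcmp(MOT[i].mnemonic, mnemonic) == 0)
            return MOT[i].length;
    return -1;
}

/* Returns the address stored for symbol, or -1 if it is undefined. */
int st_lookup(const char *symbol)
{
    for (int i = 0; i < st_count; i++)
        if (strcmp(ST[i].symbol, symbol) == 0)
            return ST[i].address;
    return -1;
}

/* Stores symbol with the current LC value.
   Returns 0, or -1 for a duplicate symbol or a full table. */
int st_define(const char *symbol)
{
    if (st_count == MAX_SYMBOLS || st_lookup(symbol) >= 0)
        return -1;
    strncpy(ST[st_count].symbol, symbol, sizeof ST[st_count].symbol - 1);
    ST[st_count].symbol[sizeof ST[st_count].symbol - 1] = '\0';
    ST[st_count].address = LC;
    st_count++;
    return 0;
}
```

The literal table has the same shape as the symbol table, with the literal text in place of the symbol name.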
Data structures of assembler pass I
An assembly program
Algorithm for pass I of a two-pass assembler
1. Read the next statement.
2. If the statement has a label, store the label in ST with the LC value.
3. Search the pseudo-opcode table.
   Found: check the type.
      DS, DC: determine the length of data space required and update LC.
      END: go to pass II.
   Not found: search the machine opcode table (e.g. MOVEM AREG, A).
      Get the length of the instruction.
      Process literals and update LC.
4. Go back to step 1.
Assembler First Pass
1. loc_cntr := 0;
   pooltab_ptr := 1; POOLTAB[1] := 1;
   littab_ptr := 1;
2. While next statement is not an END statement
   a) If label is present then
         this_label := symbol in label field;
         Enter (this_label, loc_cntr) in SYMTAB.
   b) If an LTORG statement then
         Process literals and update the location counter.
   c) If a START or ORIGIN statement then
         loc_cntr := value in operand field;
   d) If an EQU statement then
         i) this_addr := value of <address spec>;
         ii) Correct the SYMTAB entry for this_label to
             (this_label, this_addr).
   e) If a declaration statement then
         i) code := code of the declaration statement;
         ii) size := size of memory area required by DC/DS;
         iii) loc_cntr := loc_cntr + size;
         iv) Generate IC '(DL, code) ...'.
   f) If an imperative statement then
         i) code := machine opcode from OPTAB;
         ii) loc_cntr := loc_cntr + instruction length from OPTAB;
         iii) If operand is a literal then
                 this_literal := literal in operand field;
                 LITTAB[littab_ptr] := this_literal;
                 littab_ptr := littab_ptr + 1;
              else (operand is a symbol)
                 this_entry := SYMTAB entry number of operand;
                 Generate IC '(IS, code)(S, this_entry)'.
3. (Processing of END statement)
   a) Perform step 2(b).
   b) Generate IC '(AD, 02)'.
   c) Go to Pass II.
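The location counter and symbol table part of this algorithm can be sketched in C as follows. It is a simplified illustration, not the full algorithm: literals, EQU, and IC generation are left out, and every imperative instruction and every DC constant is assumed to occupy one word. All names (stmt, pass1, sym_addr) are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_SYM 32

struct stmt { const char *label, *opcode, *operand; };
struct sym  { const char *name; int addr; };

static struct sym symtab[MAX_SYM];
static int nsym = 0;

/* Assumed instruction set; each instruction is one word long. */
static int is_imperative(const char *op)
{
    static const char *const ops[] = { "MOVER", "MOVEM", "ADD", "SUB", "STOP" };
    for (size_t i = 0; i < sizeof ops / sizeof ops[0]; i++)
        if (strcmp(op, ops[i]) == 0)
            return 1;
    return 0;
}

static int num(const char *s) { return s ? atoi(s) : 0; }

/* Returns the address recorded for name, or -1 if it is not in SYMTAB. */
int sym_addr(const char *name)
{
    for (int i = 0; i < nsym; i++)
        if (strcmp(symtab[i].name, name) == 0)
            return symtab[i].addr;
    return -1;
}

/* Runs the simplified pass I over prog[0..n-1].
   Returns the final location counter, or -1 on error. */
int pass1(const struct stmt *prog, int n)
{
    int loc_cntr = 0;
    nsym = 0;
    for (int i = 0; i < n; i++) {
        const struct stmt *s = &prog[i];
        if (strcmp(s->opcode, "END") == 0)
            return loc_cntr;
        if (s->label != NULL) {                    /* step 2(a) */
            if (nsym == MAX_SYM || sym_addr(s->label) >= 0)
                return -1;                         /* full table or duplicate */
            symtab[nsym].name = s->label;
            symtab[nsym].addr = loc_cntr;
            nsym++;
        }
        if (strcmp(s->opcode, "START") == 0 || strcmp(s->opcode, "ORIGIN") == 0)
            loc_cntr = num(s->operand);            /* step 2(c) */
        else if (strcmp(s->opcode, "DS") == 0)
            loc_cntr += num(s->operand);           /* step 2(e): reserve words */
        else if (strcmp(s->opcode, "DC") == 0)
            loc_cntr += 1;                         /* step 2(e): one constant */
        else if (is_imperative(s->opcode))
            loc_cntr += 1;                         /* step 2(f) */
        else
            return -1;                             /* unknown opcode */
    }
    return -1;                                     /* END statement missing */
}
```

After pass1() returns, sym_addr() gives the address that pass II would substitute for each symbol.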
Intermediate code
IC designs are compared on:
Processing efficiency
Memory economy
Each IC unit consists of the following 3
fields:
Address
Representation of mnemonic opcode
Representation of operands
Address | Mnemonic opcode | Operands
Code for declaration statements and
directives
Intermediate code – variant I
Intermediate code – variant II
Memory requirements using variant I
and variant II
Processing of declarations
• All of the literal operands used in the program
are gathered together into one or more literal
pools
• Normally literals are placed into a pool at the
end of the program
• An LTORG statement creates a literal pool that
contains all of the literal operands used since
the previous LTORG
• Most assemblers recognize duplicate literals (the
same literal used in more than one place) and
store only one copy of the specified data value
• LITTAB (literal table): contains the literal name,
the operand value and length, and the address
assigned to the operand when it is placed in a
literal pool
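Literal pools and duplicate detection can be sketched as below. The table sizes, the one-word literal size, and the names lit_use, lit_ltorg, and lit_addr are assumptions made for this example.

```c
#include <string.h>

#define MAX_LIT  32
#define MAX_POOL 8

struct literal { const char *text; int addr; };   /* addr is -1 until placed */

static struct literal LITTAB[MAX_LIT];
static int POOLTAB[MAX_POOL];   /* LITTAB index of the first literal of each pool */
static int nlit = 0;
static int npool = 1;           /* pool 0 starts at LITTAB[0] */

/* Records a literal operand. A literal already present in the current
   pool is reused. Returns the LITTAB index, or -1 if the table is full. */
int lit_use(const char *text)
{
    for (int i = POOLTAB[npool - 1]; i < nlit; i++)
        if (strcmp(LITTAB[i].text, text) == 0)
            return i;                    /* duplicate: keep one copy */
    if (nlit == MAX_LIT)
        return -1;
    LITTAB[nlit].text = text;
    LITTAB[nlit].addr = -1;
    return nlit++;
}

/* LTORG (or END): places the literals of the current pool starting at
   loc_cntr and opens a new pool. Assumes one word per literal.
   Returns the updated location counter, or -1 if POOLTAB is full. */
int lit_ltorg(int loc_cntr)
{
    if (npool == MAX_POOL)
        return -1;
    for (int i = POOLTAB[npool - 1]; i < nlit; i++)
        LITTAB[i].addr = loc_cntr++;
    POOLTAB[npool++] = nlit;
    return loc_cntr;
}

/* Returns the address assigned to LITTAB entry i (-1 if not yet placed). */
int lit_addr(int i)
{
    return (i >= 0 && i < nlit) ? LITTAB[i].addr : -1;
}
```

A literal used again after an LTORG goes into the next pool, because the earlier copy may be too far away to address.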
An assembly program
Two pass assembler
1. Read the next statement.
2. Search the pseudo-opcode table.
   Found: check the type.
      DC, DS: determine the length of data space and update LC.
      END: clean up and exit.
   Not found: search MOT.
      Get the instruction length, format type and binary code.
      Evaluate the operand expression by searching for the values of
      symbols.
      Assemble together the parts of the instruction and update LC.
3. Go back to step 1.
Data structures for two pass
assembler
Input: intermediate code (IC) produced by pass I
Location counter (LC)
Machine operation table (MOT)
Pseudo-operation table (POT)
Symbol table (prepared by pass I)
Base table (indicates which registers are
currently specified as base registers)
Work space INST: used to hold each
instruction as its various parts (binary op-code,
register field, length field, displacement field)
are being assembled together.
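How pass II puts the parts of an instruction together can be sketched as follows. The instruction format (two opcode digits, one register digit, three address digits), the opcode values, and the symbol addresses are all assumptions made for this illustration, not the format of any particular machine.

```c
#include <string.h>

struct mot_row { const char *mnemonic; int opcode; };
struct sym_row { const char *symbol; int address; };

/* Assumed machine operation table. */
static const struct mot_row MOT[] = {
    { "STOP", 0 }, { "ADD", 1 }, { "SUB", 2 },
    { "MULT", 3 }, { "MOVER", 4 }, { "MOVEM", 5 }
};

/* Assumed symbol table as it might be left by pass I. */
static const struct sym_row ST[] = { { "A", 203 }, { "B", 205 } };

static int lookup_opcode(const char *mnemonic)
{
    for (size_t i = 0; i < sizeof MOT / sizeof MOT[0]; i++)
        if (strcmp(MOT[i].mnemonic, mnemonic) == 0)
            return MOT[i].opcode;
    return -1;
}

static int lookup_symbol(const char *symbol)
{
    for (size_t i = 0; i < sizeof ST / sizeof ST[0]; i++)
        if (strcmp(ST[i].symbol, symbol) == 0)
            return ST[i].address;
    return -1;
}

/* Assembles one instruction into a decimal word laid out as
   opcode (2 digits), register (1 digit), address (3 digits).
   Returns -1 for an unknown mnemonic, undefined symbol, or bad field. */
long assemble(const char *mnemonic, int reg, const char *symbol)
{
    int op = lookup_opcode(mnemonic);
    int addr = lookup_symbol(symbol);
    if (op < 0 || addr < 0 || addr > 999 || reg < 0 || reg > 9)
        return -1;
    return op * 10000L + reg * 1000L + addr;
}
```

With these tables, MOVER with register 1 and operand A assembles to 041203 (opcode 04, register 1, address 203).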
Algorithm: Second pass
Error reporting in pass - I
Error reporting in pass - II
predicate -> verb
article -> a
article -> the
noun -> cat
noun -> dog
verb -> runs
verb -> walks
A derivation of "the dog walks":
sentence => noun_phrase predicate
         => noun_phrase verb
         => article noun verb
         => the noun verb
         => the dog verb
         => the dog walks
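The derivation can be checked mechanically with a small recognizer. The sketch assumes the productions sentence -> noun_phrase predicate and noun_phrase -> article noun, which are inferred from the first steps of the derivation; the function names are hypothetical.

```c
#include <string.h>

/* Returns 1 if word w is one of the n words in set. */
static int in_set(const char *w, const char *const set[], int n)
{
    for (int i = 0; i < n; i++)
        if (strcmp(w, set[i]) == 0)
            return 1;
    return 0;
}

static int is_article(const char *w)
{
    static const char *const s[] = { "a", "the" };
    return in_set(w, s, 2);
}

static int is_noun(const char *w)
{
    static const char *const s[] = { "cat", "dog" };
    return in_set(w, s, 2);
}

static int is_verb(const char *w)
{
    static const char *const s[] = { "runs", "walks" };
    return in_set(w, s, 2);
}

/* sentence    -> noun_phrase predicate
   noun_phrase -> article noun
   predicate   -> verb
   Returns 1 if the n words form a sentence of the grammar. */
int is_sentence(const char *const words[], int n)
{
    return n == 3
        && is_article(words[0])
        && is_noun(words[1])
        && is_verb(words[2]);
}
```

Since every sentence of this grammar has exactly three words, a fixed-position check is enough; a grammar with recursion would need a proper parser.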
More Notation
Grammar: G = (V, T, S, P)
V: Set of variables
T: Set of terminal symbols
S: Start variable
P: Set of productions
Allocation data structures
Stacks
A linear data structure which permits
allocation and deallocation of entities in a
Last-in-first-out (LIFO) manner and only
the last entry is accessible at any time
Heaps
A nonlinear data structure which permits
allocation and deallocation of entities in a
random order
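A minimal sketch of LIFO allocation with a fixed-size stack; the size and the names push and pop are assumptions made for this example.

```c
#define STACK_SIZE 8

static int stack[STACK_SIZE];
static int top = 0;              /* number of entries currently allocated */

/* Allocates a new entry on top of the stack.
   Returns 0 on success, -1 if the stack is full. */
int push(int value)
{
    if (top == STACK_SIZE)
        return -1;
    stack[top++] = value;
    return 0;
}

/* Deallocates the last entry and stores its value in *value.
   Returns 0 on success, -1 if the stack is empty. */
int pop(int *value)
{
    if (top == 0)
        return -1;
    *value = stack[--top];       /* only the last entry is accessible */
    return 0;
}
```

Heap allocation in C is done with malloc() and free(), which may be called in any order, so no such last-in-first-out discipline applies.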
Symbol-Defining Statements
EQU: assembler directive that allows the programmer to define symbols
and specify their values
General form: symbol EQU value
Line 133: +LDT #4096
can be written as
MAXLEN EQU 4096
+LDT #MAXLEN
With the symbol, it is much easier to find and change the value of MAXLEN
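What the assembler does with an EQU symbol can be sketched as a table lookup at the point where the operand is evaluated. The table size and the names equ_define and resolve_immediate are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_EQU 16

struct equ { const char *symbol; long value; };

static struct equ table[MAX_EQU];
static int count = 0;

/* Handles "symbol EQU value". Returns 0, or -1 if the table is full. */
int equ_define(const char *symbol, long value)
{
    if (count == MAX_EQU)
        return -1;
    table[count].symbol = symbol;
    table[count].value = value;
    count++;
    return 0;
}

/* Resolves an immediate operand such as "#4096" or "#MAXLEN".
   Stores the value in *out and returns 0, or returns -1 if the
   operand is not immediate or names an undefined symbol. */
int resolve_immediate(const char *operand, long *out)
{
    if (operand[0] != '#')
        return -1;
    const char *p = operand + 1;
    char *end;
    long v = strtol(p, &end, 10);
    if (end != p && *end == '\0') {      /* plain number, e.g. #4096 */
        *out = v;
        return 0;
    }
    for (int i = 0; i < count; i++) {    /* symbol defined with EQU */
        if (strcmp(table[i].symbol, p) == 0) {
            *out = table[i].value;
            return 0;
        }
    }
    return -1;
}
```

Both #4096 and #MAXLEN resolve to the same value, so the generated instruction is identical; only the source is easier to maintain.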
ORG: assembler directive that indirectly assigns values to symbols