Linker and Loader
When you write code in a high-level programming language (like C or C++), it has to go through several
steps before it can actually run on a computer. Here’s an overview of the process:
4. Linker and Loader: These finalize the program so that it can actually run.
2 What is a Linker?
The Linker is a system program that takes one or more object files (output from the compiler) and
combines them into a single, executable program. The main jobs of the linker are:
• Resolve Symbolic References: In code, you often refer to functions or variables that might be
defined in another file. These references are called *symbolic references*. The linker finds where these
symbols (like function names or variables) actually reside in memory and updates the code to point to
the correct locations.
• Combine Object Files: If you have a large program broken into multiple files (like main.c, math.c,
io.c), each file will have its own object file. The linker combines these into one final executable file.
• Address Binding: The linker assigns memory addresses to different parts of the program, such as
where variables and functions will reside in memory during execution.
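The three jobs above can be sketched as a toy linker. This is only a minimal illustration under simplifying assumptions, not how a real linker is implemented: the ObjectModule class, its fields, and the sample symbols main and add are hypothetical, and each module is reduced to a size plus two symbol lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToyLinker {
    // Places the modules back to back starting at linkOrigin and returns the
    // final address of every defined symbol.
    static Map<String, Integer> link(List<ObjectModule> modules, int linkOrigin) {
        Map<String, Integer> symbolTable = new HashMap<>();
        int base = linkOrigin;
        // Combine object files and bind addresses: each module gets the next free base.
        for (ObjectModule m : modules) {
            for (Map.Entry<String, Integer> def : m.defined.entrySet()) {
                if (symbolTable.containsKey(def.getKey())) {
                    throw new IllegalStateException("duplicate symbol: " + def.getKey());
                }
                symbolTable.put(def.getKey(), base + def.getValue());
            }
            base += m.size;
        }
        // Resolve symbolic references: every referenced symbol must now have an address.
        for (ObjectModule m : modules) {
            for (String symbol : m.referenced) {
                if (!symbolTable.containsKey(symbol)) {
                    throw new IllegalStateException(
                        "undefined reference to " + symbol + " in " + m.name);
                }
            }
        }
        return symbolTable;
    }

    public static void main(String[] args) {
        ObjectModule mainObj = new ObjectModule("main.o", 40);
        mainObj.defined.put("main", 0);   // main starts at offset 0 of main.o
        mainObj.referenced.add("add");    // used here, defined elsewhere

        ObjectModule mathObj = new ObjectModule("math.o", 20);
        mathObj.defined.put("add", 4);    // add starts at offset 4 of math.o

        Map<String, Integer> table = link(Arrays.asList(mainObj, mathObj), 1000);
        System.out.println("main -> " + table.get("main")); // main -> 1000
        System.out.println("add  -> " + table.get("add"));  // add  -> 1044
    }
}

// Hypothetical object module: its size, the symbols it defines (as offsets
// within the module), and the symbols it references but does not define.
class ObjectModule {
    final String name;
    final int size;
    final Map<String, Integer> defined = new HashMap<>();
    final List<String> referenced = new ArrayList<>();

    ObjectModule(String name, int size) {
        this.name = name;
        this.size = size;
    }
}
```

Dropping math.o from the list makes link fail with an "undefined reference" error, which mirrors the kind of message a real linker reports when a symbol is never defined.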
3 What is a Loader?
Once the linker has created an executable file, the Loader comes into play. The loader is responsible for
loading the program into memory so it can be executed by the CPU. The loader’s tasks include:
• Loading the Program: The loader copies the program’s code from disk into the system’s memory
(RAM).
• Setting Up Memory Space: Allocates memory for the program, including for global variables and
functions. It sets up the *stack* and *heap* for the program’s dynamic memory needs.
• Adjusting Addresses: For dynamically linked programs, the loader may need to adjust memory
addresses for any libraries that are loaded at runtime. This is known as *relocation*.
• Transfer Control: Finally, the loader hands control of the CPU to the program, which starts executing
at its entry point (typically startup code that in turn calls the program’s main function).
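The loader's tasks can be sketched with memory modelled as a plain int array. This is a simplified illustration under stated assumptions: the ToyLoader class and its parameters are hypothetical, every word is either data or a whole address, and stack and heap setup are not modelled.

```java
class ToyLoader {
    // Copies the program image into memory at loadOrigin, adjusts every
    // address-sensitive word, and returns the address where execution starts.
    static int load(int[] memory, int[] image, int[] addressSensitiveOffsets,
                    int linkedOrigin, int loadOrigin, int entryOffset) {
        // Loading the program: copy the image from "disk" into "RAM".
        System.arraycopy(image, 0, memory, loadOrigin, image.length);
        // Adjusting addresses: shift each stored address by the distance the image moved.
        int factor = loadOrigin - linkedOrigin;
        for (int offset : addressSensitiveOffsets) {
            memory[loadOrigin + offset] += factor;
        }
        // Transfer control: a real loader would jump here; the sketch only returns it.
        return loadOrigin + entryOffset;
    }

    public static void main(String[] args) {
        int[] memory = new int[64];
        // Image linked at origin 10: word 0 holds the address 12, which is
        // word 2 of the image (the value 7).
        int[] image = {12, 0, 7};
        int entry = load(memory, image, new int[] {0}, 10, 30, 0);
        System.out.println(entry);              // 30
        System.out.println(memory[30]);         // 32
        System.out.println(memory[memory[30]]); // 7
    }
}
```

After loading at address 30, the stored address has moved from 12 to 32 and still points at the same data word.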
2. Linking:
• The linker takes all the object files and combines them into one executable.
• It resolves references to variables and functions that are used across files.
• It performs static or dynamic linking as required.
3. Loading:
• The loader loads the executable into memory.
• It allocates memory for program data and prepares any necessary runtime environment.
• For dynamically linked programs, it loads shared libraries and updates memory addresses.
• Finally, it transfers control to the program’s entry point, often the main function.
• Memory Management: Loaders allow efficient use of memory by loading only the parts of code that
are needed and by supporting dynamic linking.
• Reusability: Linking enables code reuse by allowing libraries to be shared across programs.
4 Binding in Programming and Address Relocation
Binding is a fundamental concept in programming that refers to the process of associating a method or
function call with the actual method or function implementation. The method or function invoked in
response to a call is determined based on when and how this association occurs. Binding can take place at
compile-time (static binding) or at runtime (dynamic binding).
Let’s explore both types of binding in detail, and also touch on addressing and relocation, which are
critical in understanding how bindings relate to memory management during execution.
• Example: Function calls where the method is not overridden (no polymorphism).
class Animal {
    public void sound() {
        System.out.println("Some animal sound");
    }
}

public class Main {
    public static void main(String[] args) {
        Animal a = new Animal();
        a.sound(); // Static binding happens here
    }
}
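In this example, the call a.sound() is treated as statically bound: the declared type of a is Animal and no
subclass overrides sound(), so the target method is already known at compile-time. Strictly speaking, Java
dispatches ordinary instance methods at runtime; only static, private, and final methods (and constructors)
are guaranteed to be bound at compile-time.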
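A case where Java really does bind at compile-time is a static method. The following sketch contrasts the two kinds of call; the classes Vehicle, Car, and BindingDemo are hypothetical names chosen for illustration.

```java
class BindingDemo {
    static String describe(Vehicle v) {
        // v.kind() is a static call: the compiler binds it to Vehicle.kind()
        // using the declared type of v. v.sound() is selected at runtime
        // from the actual object type.
        return v.kind() + " / " + v.sound();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Vehicle())); // vehicle / generic engine noise
        System.out.println(describe(new Car()));     // vehicle / vroom
    }
}

class Vehicle {
    static String kind() { return "vehicle"; }        // static: bound at compile-time
    String sound() { return "generic engine noise"; } // instance: bound at runtime
}

class Car extends Vehicle {
    static String kind() { return "car"; }            // hides Vehicle.kind(), does not override it
    @Override
    String sound() { return "vroom"; }
}
```

Even when a Car object is passed in, kind() still yields "vehicle" because the compiler chose the method from the declared type, while sound() follows the actual object.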
4.3.1 Characteristics of Dynamic Binding
• Binding Time: Determined during runtime.
• Method Resolution: Based on the actual object type at runtime.
• Performance: Typically slower than static binding, since the target method is looked up during execution.
• Example: Polymorphic method calls (method overriding).
class Animal {
    public void sound() {
        System.out.println("Some animal sound");
    }
}

class Dog extends Animal {
    @Override
    public void sound() {
        System.out.println("Bark");
    }
}

class Cat extends Animal {
    @Override
    public void sound() {
        System.out.println("Meow");
    }
}

public class Main {
    public static void main(String[] args) {
        Animal a = new Dog(); // a refers to a Dog object
        a.sound();            // Resolves to Dog's sound() method at runtime

        a = new Cat();        // a now refers to a Cat object
        a.sound();            // Resolves to Cat's sound() method at runtime
    }
}
In this example, dynamic binding occurs because at runtime, the sound() method call is resolved to
the method of the actual object (either Dog or Cat) that a is pointing to.
4.4.1 Addressing
Addressing refers to the process of associating a memory address with a variable, function, or method. When
a program is compiled and executed, variables, methods, and functions are stored in memory, and each must
have a unique address. During binding, the address of a method or function is determined based on the
reference type or object type (static or dynamic).
• In static binding, the address is determined at compile-time because the method to be called is fixed.
• In dynamic binding, the address is determined at runtime because the method to be called depends
on the actual object type, which is known only during execution.
4.4.2 Relocation
Relocation refers to the process of adjusting addresses when a program is loaded into memory. Programs
are often compiled to an intermediate form, and their memory locations may change when they are loaded
into memory for execution. Relocation ensures that all memory addresses are updated correctly, including
those that were determined during binding.
• Static binding: The memory addresses for functions or methods are fixed at compile-time. There is
no need for runtime relocation of addresses.
• Dynamic binding: During runtime, methods are selected dynamically, and the program must adjust
addresses as necessary to ensure that the correct method is invoked.
Here, the address of myFunction is not fixed and is dynamically located during the execution, requiring
relocation.
2. Loader: The loader is responsible for loading the program into memory at runtime. If the load origin
(the actual address where the program is loaded) is different from the linked origin, the loader performs
the relocation. In general, the relocation is performed by the linker, but if necessary, the loader can
also handle it.
• Linked Origin: The origin address assigned to the program by the linker (usually a fixed memory location).
• Translated Origin: The origin address assumed by the translator (assembler or compiler) when it
generated the code, for example the operand of a START statement.
• The instruction READ A refers to address 540. This is an address-sensitive instruction because it
assumes that memory location 540 contains the necessary data.
• At the link time, the program is linked at linked origin = 700.
2. Calculating the Relocation Factor: relocation factor = linked origin - translated origin = 700 - 500 = 200.
• The linked address of A is calculated by adding the relocation factor to the translated address:
540 + 200 = 740.
• This means that the program should now use the address 740 for the symbol A instead of 540.
4. Adjusting All Instructions: All the other address-sensitive instructions in the program must also
be adjusted using the relocation factor.
1. Relocation Factor:
In this relocated program, all addresses have been adjusted correctly by adding the relocation factor of
200. This ensures that the program can now execute correctly even if it is loaded at a different address in
memory.
5.7 Key Steps for Relocation
1. Calculate the Relocation Factor: Subtract the translated origin from the linked origin.
2. Adjust the Addresses: For each address-sensitive instruction, add the relocation factor to the
translated address to obtain the linked address.
3. Recalculate the Memory Layout: Make sure all instructions and data are correctly relocated.
4. Load the Program: The relocated program can now be loaded at the origin it was relocated for and
executed correctly.
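The key steps above reduce to simple arithmetic, sketched here with the numbers used in this section (translated origin 500, linked origin 700, and translated addresses 540 for A and 501 for LOOP). The class and method names are hypothetical.

```java
import java.util.Arrays;

class RelocationSketch {
    // Step 1: relocation factor = linked origin - translated origin.
    static int relocationFactor(int linkedOrigin, int translatedOrigin) {
        return linkedOrigin - translatedOrigin;
    }

    // Step 2: add the factor to every translated address used by an
    // address-sensitive instruction. Returns a new array of linked addresses.
    static int[] relocate(int[] translatedAddresses, int factor) {
        int[] linked = new int[translatedAddresses.length];
        for (int i = 0; i < translatedAddresses.length; i++) {
            linked[i] = translatedAddresses[i] + factor;
        }
        return linked;
    }

    public static void main(String[] args) {
        int factor = relocationFactor(700, 500);
        int[] linked = relocate(new int[] {540, 501}, factor);
        System.out.println(factor);                 // 200
        System.out.println(Arrays.toString(linked)); // [740, 701]
    }
}
```

Only operands that hold addresses are adjusted; opcodes, register fields, and constants stay as they are.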
• External Reference: An external reference is a reference to a symbol that is not defined within the
current program unit but is defined in another program unit. These references must be resolved during
linking.
• EXTRN: This statement is used to declare external symbols that will be linked later. For example, if
MAX and ALPHA are defined in another program, their references are declared as EXTRN in the current
program.
This tells the assembler that the symbols MAX and ALPHA are external references that are defined elsewhere.
The assembler will leave these fields as 0 (or unresolved) and the linker will resolve them later.
6.2 Linking
Linking is the process of combining different program units (modules) and resolving all the external
references. When a program unit references an external symbol (like MAX or ALPHA), the linker will replace
these references with the correct addresses. If the program unit is linked at a certain memory address (called
the link origin), the addresses of the symbols are modified accordingly.
Linking occurs in two phases:
Example:
Consider a program Prog-P where MAX is an external symbol, and ALPHA is defined in another program
Prog-Q. When these programs are linked together, the linker resolves MAX and ALPHA by replacing the
references with the correct addresses.
6.3 Assembly Example
Let’s now work through a concrete assembly example to see how these concepts are used in practice.
        START  500          ; Translated origin is 500
        ENTRY  TOTAL        ; Public definition
        EXTRN  MAX, ALPHA   ; External references
        READ   A            ; Code 09 0 540 at address 500: operand is the address of A (540)
        MOVER  AREG, ALPHA  ; Moves data from ALPHA into AREG
        BC     ANY, MAX     ; Code 06 6 000: branch to MAX on any condition; address field left as 000
        BC     LT, LOOP     ; Code 06 1 501: branch to LOOP (address 501) if the condition LT holds
        STOP                ; Code 00 0 000: stop program execution
A       DS     1            ; Allocate 1 word for A
TOTAL   DS     1            ; Allocate 1 word for TOTAL
        END
In this example:
• ENTRY TOTAL: The symbol TOTAL is a public definition, meaning it can be accessed by other
modules.
• EXTRN MAX, ALPHA: These symbols are external references, meaning they are defined in other
program units.
• The assembler will not know the addresses of MAX and ALPHA yet, so it leaves these addresses as 0.
• Prog-P contains an external reference to the symbol ALPHA, which is defined in Prog-Q.
• The program unit Prog-Q has a public definition for ALPHA at a translation-time address of 231.
Now, when Prog-P is linked to Prog-Q, the linker will adjust the addresses to match the link origin and
resolve the external reference.
Let’s assume:
Thus, the reference to ALPHA in Prog-P will be replaced with 773 during linking.
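The arithmetic behind such a result can be sketched as follows. Only the translation-time address of ALPHA (231) and the final value 773 come from the text; the link origin of Prog-P (700), the size of Prog-P (42 words), and the translated origin of Prog-Q (200) are assumed values chosen for illustration so that the numbers work out, and the class and method names are hypothetical.

```java
class ExternalReferenceSketch {
    // Linked address of a symbol = its translation-time address plus the
    // relocation factor of the program unit that defines it.
    static int linkedAddress(int translatedAddress, int translatedOrigin, int linkedOrigin) {
        return translatedAddress + (linkedOrigin - translatedOrigin);
    }

    public static void main(String[] args) {
        int progPLinkedOrigin = 700;     // ASSUMPTION: not stated in the text
        int progPSize = 42;              // ASSUMPTION: not stated in the text
        int progQTranslatedOrigin = 200; // ASSUMPTION: not stated in the text
        int alphaTranslated = 231;       // from the text

        // Prog-Q is placed immediately after Prog-P.
        int progQLinkedOrigin = progPLinkedOrigin + progPSize;
        int alphaLinked = linkedAddress(alphaTranslated, progQTranslatedOrigin, progQLinkedOrigin);

        System.out.println(progQLinkedOrigin); // 742
        System.out.println(alphaLinked);       // 773
    }
}
```

The linker would then write 773 into the address field of every instruction in Prog-P that refers to ALPHA.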
• <link origin>: The memory address where the program will be loaded.
• <object module names>: The names of the object modules to be linked.
• <execution start address>: The address where the program execution will begin (if not specified,
it’s assumed to be the link origin).