Oral Questions 2021 - Architecture
Computer Architecture
2. Cache Types:
1. direct-mapped cache: each memory location is mapped to exactly one location in the cache.
Its miss rate is typically higher than that of the other cache types, although its lookup
hardware is the simplest.
2. fully associative cache: a block can be placed in any location in the cache.
This type significantly decreases the number of cache misses, but it is considered the most
complex cache implementation, since every block must be searched on an access.
3. set-associative cache: A cache that has a fixed number of sets (at least two). An entry can
reside in any block within a set.
The advantage of increasing the degree of associativity is that it usually decreases the miss rate.
The main disadvantage is a potential increase in the hit time (search time increased).
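The placement rules above come down to the standard modulo mapping. Below is a minimal sketch of where a given block address may live under each scheme; the cache sizes used are hypothetical examples, not values from these notes.

```python
# Minimal sketch: where a block address lands in each cache type.
# The parameters (num_blocks, num_sets) are hypothetical examples.

def direct_mapped_index(block_addr: int, num_blocks: int) -> int:
    # Direct-mapped: exactly one possible location in the cache.
    return block_addr % num_blocks

def set_associative_set(block_addr: int, num_sets: int) -> int:
    # Set-associative: the block maps to one set, but may occupy
    # any of the ways (blocks) within that set.
    return block_addr % num_sets

# Fully associative: any block can go anywhere in the cache, so there
# is no index computation -- every block must be searched on a lookup.

print(direct_mapped_index(77, 8))  # -> 5: the only legal slot
print(set_associative_set(77, 4))  # -> 1: any way within set 1
```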
4. What is fragmentation?
Fragmentation refers to the condition of a disk in which files are divided into pieces scattered
around the disk. Fragmentation occurs when the operating system needs to store parts of a file
in noncontiguous spaces. This can slow down the speed at which data is accessed because the
disk drive must search through different parts of the disk to put together a single file.
CPU vs GPU:
• The CPU consumes or needs more memory than the GPU, which requires less.
• The CPU contains a few powerful cores, while the GPU contains many weaker cores.
• The CPU is suitable for serial instruction processing, while the GPU is not.
• The CPU is not suitable for parallel instruction processing, while the GPU is.
9. Replacement:
Page replacement is the process of swapping out an existing page from a frame of main
memory and replacing it with the required page. When all the frames of main memory are
already occupied, a page has to be replaced to make room for the required page.
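As an illustration of the mechanism, here is a minimal sketch of page replacement with a FIFO policy (the notes do not name a specific policy; FIFO is just one common choice, and the function name and inputs are hypothetical).

```python
from collections import deque

def fifo_replace(reference_string, num_frames):
    frames = deque()          # pages currently resident in frames
    faults = 0
    for page in reference_string:
        if page in frames:
            continue          # page already resident: no replacement
        faults += 1
        if len(frames) == num_frames:
            frames.popleft()  # all frames occupied: evict oldest page
        frames.append(page)   # load the required page into a frame
    return faults

print(fifo_replace([7, 0, 1, 2, 0, 3, 0, 4], 3))  # -> 7 page faults
```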
11. What is the difference between uniform memory access and non-uniform memory access?
Single address space multiprocessors come in two styles.
Uniform memory access (UMA) multiprocessors: the latency to a word in memory does not
depend on which processor asks for it.
Nonuniform memory access (NUMA): some memory accesses are much faster than others,
depending on which processor asks for which word, typically because main memory is divided
and attached to different microprocessors or to different memory controllers on the same chip.
The programming challenges are harder for a NUMA multiprocessor than for a UMA
multiprocessor, but NUMA machines can scale to larger sizes and can have lower latency to
nearby memory.
UMA is applicable for general purpose applications and time-sharing applications. NUMA
is applicable for real-time applications and time-critical applications.
12. What happens in instruction fetch and explain the remaining stages?
The MIPS pipeline has five stages, with one step per stage (a short walkthrough follows the list):
• IF: Instruction fetch from memory
• ID: Instruction decode & register read
• EX: Execute operation or calculate address
• MEM: Access memory operand
• WB: Write result back to register
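As a rough walkthrough, the sketch below traces what each stage contributes for a single load word, lw $t0, 8($s1); the register contents and memory values are hypothetical.

```python
# Minimal sketch: the five MIPS stages for lw $t0, 8($s1).
# Register and memory contents below are made-up example values.

regs = {"$s1": 0x1000, "$t0": 0}
mem  = {0x1008: 42}

# IF:  fetch the lw instruction from instruction memory
# ID:  decode lw; read the base register $s1
base = regs["$s1"]
# EX:  execute -- calculate the effective address = base + offset
addr = base + 8
# MEM: access the memory operand at the computed address
value = mem[addr]
# WB:  write the loaded result back to the destination register
regs["$t0"] = value

print(hex(addr), regs["$t0"])  # 0x1008 42
```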
15. What are the types of hazards and how are they solved?
Hazards are situations that prevent starting the next instruction in the next cycle.
1. Structural hazard: the hardware does not support the combination of
instructions that are set to execute in the same cycle. Solved in MIPS by having two separate
instruction/data memories.
2. Data hazard: because data that is needed to execute the instruction is not yet available.
Data hazards arise from the dependence of one instruction on an earlier one that is still in the
pipeline. Solved by:
• Forwarding (bypassing): retrieving the missing data element from internal buffers rather
than waiting for it to arrive from registers or memory.
• Stall (bubble): used for the load-use data hazard, a specific form of data hazard in which
the data being loaded by a load instruction has not yet become available when it is
needed by another instruction. A stall is initiated to resolve this hazard (see the
detection sketch after this list).
• Software that reorders code to try to avoid pipeline stalls.
3. Control hazard (branch hazard): because the instruction that was fetched is not the one
that is needed; that is, the flow of instruction addresses is not what the pipeline expected.
Solved by branch prediction: a method of resolving a branch hazard that assumes a given
outcome for the branch and proceeds from that assumption rather than waiting to ascertain
the actual outcome.
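For the load-use case specifically, a pipeline can detect the need for a stall with the classic check from Patterson & Hennessy: stall when the instruction in EX is a load whose destination register matches a source register of the instruction in ID. A minimal sketch, with illustrative (not real-simulator) field names:

```python
# Minimal sketch of the classic load-use hazard check: stall when the
# instruction in EX is a load (MemRead) whose destination rt matches a
# source register of the instruction in ID. Field names are illustrative.

def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    # The load's destination is needed by the next instruction before
    # the loaded data is available: insert a bubble (stall one cycle).
    return id_ex_mem_read and id_ex_rt in (if_id_rs, if_id_rt)

# lw $2, 20($1) followed by and $4, $2, $5 -> one-cycle stall needed
print(must_stall(True, 2, 2, 5))   # True
# Non-load in EX -> forwarding can cover the dependence, no stall
print(must_stall(False, 2, 2, 5))  # False
```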
Register: stores variables before they are operated on by program instructions during the
stages of instruction execution in the CPU.
Cache: stores frequently used variables in the level of the memory hierarchy nearest to the
CPU.
21. What is the memory hierarchy? Why do we use it, and what technology is at each level?
A memory hierarchy is a structure that uses multiple levels of memories; as the distance from
the processor increases, both the size of the memories and the access time increase.
1. Main memory is implemented from DRAM (dynamic random access memory).
2. Levels closer to the processor (caches) use SRAM (static random access memory).
3. The largest and slowest level in the hierarchy is usually magnetic disk.
By implementing the memory system as a hierarchy, the user has the illusion of a memory that
is as large as the largest level of the hierarchy, but can be accessed as if it were all built from
the fastest memory.
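One way to see this illusion quantitatively is the standard average memory access time formula, AMAT = hit time + miss rate × miss penalty. A minimal worked example with hypothetical numbers:

```python
# Minimal worked example of why the hierarchy feels "fast and large":
# AMAT = hit time + miss rate * miss penalty. All numbers are hypothetical.

sram_hit_time   = 1     # cycles for a cache (SRAM) hit
dram_penalty    = 100   # cycles to fetch from main memory (DRAM)
cache_miss_rate = 0.05  # 5% of accesses miss the cache

amat = sram_hit_time + cache_miss_rate * dram_penalty
print(amat)  # 6.0 cycles -- close to SRAM speed, at DRAM capacity
```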
33. How does a company specify the criteria for hardware specification and configuration?
Define the application requirements by running through a hypothetical example:
1) Estimate Your Bandwidth Requirements
Estimate the monthly data transfer, including average monthly visitors, average number of
page views per visitor, and average page size.
Your bandwidth requirements are based on the amount of data transferred per month (a
worked estimate follows the list).
2) Estimate Your Compute Requirements
CPU, RAM and Virtual Machines (VMs)
3) Determine Storage Requirements
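For step 1, the bandwidth estimate is simple arithmetic: monthly transfer ≈ visitors × page views per visitor × average page size. A minimal sketch with hypothetical example figures:

```python
# Minimal sketch of step 1: estimating monthly data transfer.
# All inputs below are hypothetical example figures.

monthly_visitors     = 50_000
page_views_per_visit = 4
avg_page_size_mb     = 2.5

monthly_transfer_gb = (monthly_visitors * page_views_per_visit
                       * avg_page_size_mb) / 1024
print(f"{monthly_transfer_gb:.1f} GB/month")  # ~488.3 GB/month
```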
35. What is the difference between high performance and high throughput?
High performance: increasing a system's efficiency with regard to:
• response time
• throughput
• resource utilization
Throughput: the number of instructions executed per unit of time, e.g.:
• number of computations per second
• number of data per second
37. What is the difference between SIMD and MIMD? What is SPMD?
Single Instruction Multiple Data (SIMD): a single instruction stream operates on multiple data
elements at the same time, as in vector units; all processing elements execute the same
operation in lockstep on different data.
Multiple Instruction Multiple Data (MIMD): a technique employed to achieve parallelism.
Machines using MIMD have a number of processors that function asynchronously and
independently. At any time, different processors may be executing different instructions on
different pieces of data.
Single Program Multiple Data (SPMD): is a technique employed to achieve parallelism; it is a
subcategory of MIMD. Tasks are split up and run simultaneously on multiple processors with
different input in order to obtain results faster.
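A minimal SPMD sketch using Python's multiprocessing module: the same function (one program) runs on several worker processes, each on a different slice of the data. The worker count and data are hypothetical.

```python
from multiprocessing import Pool

def work(chunk):                              # one program...
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000))
    chunks = [data[i::4] for i in range(4)]   # ...multiple data
    with Pool(processes=4) as pool:
        partials = pool.map(work, chunks)     # run simultaneously
    print(sum(partials))  # same result as the serial computation
```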
43. What is the difference between fine-grained multithreading and coarse-grained
multithreading?
Fine-grained multithreading: A version of hardware multithreading that suggests switching
between threads after every instruction.
Coarse-grained multithreading: A version of hardware multithreading that suggests switching
between threads only after significant events, such as a cache miss.
44. The difference between shared memory & distributed memory in HPC:
• Shared memory: multiple processors all share the same memory.
• Distributed memory: multiple processors, each with its own memory, communicating through a
network.
45. What is the difference between grid computation and hybrid computation?
Grid computing is the use of widely distributed computer resources to reach a common goal; in
grid computers, each node can be set to perform a different task/application.
Hybrid computation: a combination of several computing technologies, such as HPC + Grid +
Cloud.
Pseudo instructions: These are simple assembly language instructions that do not have a direct
machine language equivalent. During assembly, the assembler translates each pseudo
instruction into one or more machine language instructions.
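As a concrete illustration, the sketch below shows how an assembler might expand two well-known MIPS pseudo instructions (move and blt) into real instructions; the expansion table and function names are illustrative, not a real assembler's API.

```python
# Minimal sketch of pseudo-instruction expansion. The table covers just
# two well-known MIPS examples; a real assembler handles many more.

EXPANSIONS = {
    # 'move rd, rs' has no machine encoding; it becomes an add with $zero
    "move": lambda rd, rs: [f"add {rd}, $zero, {rs}"],
    # 'blt rs, rt, label' becomes set-less-than plus a branch, using
    # the register reserved for the assembler, $at
    "blt":  lambda rs, rt, label: [f"slt $at, {rs}, {rt}",
                                   f"bne $at, $zero, {label}"],
}

def expand(line: str):
    op, *operands = line.replace(",", " ").split()
    # Real instructions pass through unchanged; pseudo ones are expanded.
    return EXPANSIONS.get(op, lambda *a: [line])(*operands)

print(expand("move $t0, $t1"))       # ['add $t0, $zero, $t1']
print(expand("blt $s1, $s2, loop"))  # ['slt $at, $s1, $s2',
                                     #  'bne $at, $zero, loop']
```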