
Oral Questions 2021

Computer Architecture

1. What is a single-cycle instruction, and how can its performance be improved?


Single-cycle instruction: all "steps" of executing an instruction are done in one clock cycle, so the clock cycle must be long enough to accommodate the longest instruction path.
Multicycle instruction: the instruction executes in steps, one step per clock cycle; the longest step determines the cycle time.
Pipelining improves performance by executing multiple instructions in an overlapped fashion, resulting in much higher throughput.

2. Cache Types:
1. Direct-mapped cache: each memory location is mapped to exactly one location in the cache. Its performance (miss rate) is worse than the other types.
2. Fully associative cache: a block can be placed in any location in the cache. This significantly decreases the number of cache misses, but it is the most complex type of cache to implement.
3. Set-associative cache: a cache with a fixed number of sets (at least two); an entry can reside in any block within its set. The advantage of increasing the degree of associativity is that it usually decreases the miss rate; the main disadvantage is a potential increase in hit time (longer search).
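The direct-mapped placement rule can be sketched as a short calculation; the block size and cache size below are hypothetical, chosen only for illustration.

```python
def direct_mapped_index(address, block_size_bytes, num_blocks):
    """Cache index for a byte address in a direct-mapped cache."""
    block_number = address // block_size_bytes   # which memory block the byte is in
    return block_number % num_blocks             # exactly one possible cache slot

# Hypothetical cache: 64-byte blocks, 128 blocks total
index = direct_mapped_index(0x1234, 64, 128)
```

Note that two addresses exactly one cache-size apart map to the same slot, which is why direct-mapped caches suffer conflict misses that associative caches avoid.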

3. What is clustering in computer architecture?


A cluster is a set of computers connected over a local area network that functions as a single large multiprocessor.

4. What is fragmentation?
Fragmentation refers to the condition of a disk in which files are divided into pieces scattered
around the disk. Fragmentation occurs when the operating system needs to store parts of a file
in noncontiguous spaces. This can slow down the speed at which data is accessed because the
disk drive must search through different parts of the disk to put together a single file.

5. Difference between CPU and GPU:

1. CPU stands for Central Processing Unit, while GPU stands for Graphics Processing Unit.
2. A CPU consumes or needs more memory than a GPU, while a GPU requires less memory than a CPU.
3. The speed of a CPU is less than a GPU's speed, while a GPU is faster than a CPU.
4. A CPU contains a few powerful cores, while a GPU contains many weaker cores.
5. A CPU is suitable for serial instruction processing, while a GPU is not.
6. A CPU is not suitable for parallel instruction processing, while a GPU is.
7. A CPU emphasizes low latency, while a GPU emphasizes high throughput.

6. Why do we add 4 when incrementing the program counter (PC + 4)?


The program counter (PC) points to the next instruction, and all MIPS instructions are 4 bytes long. So the address of the sequentially following instruction is PC + 4.
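As a tiny illustration, the base address 0x00400000 below is just a conventional MIPS text-segment start used in many teaching environments, not something mandated by the architecture:

```python
INSTRUCTION_BYTES = 4  # every MIPS instruction is 4 bytes long

def next_pc(pc):
    # the sequentially following instruction sits 4 bytes after the current one
    return pc + INSTRUCTION_BYTES

pc = 0x00400000  # illustrative MIPS text-segment start address
first_three = [pc, next_pc(pc), next_pc(next_pc(pc))]
```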

7. Write through and write back:


Write-through: A scheme in which writes always update both the cache and the next lower
level of the memory hierarchy, ensuring that data is always consistent between the two.
This scheme could slow down the processor.
Write-back: A scheme that handles writes by updating values only to the block in the cache,
then writing the modified block to the lower level of the hierarchy when the block is replaced.
This scheme can improve performance, but it is more complex.

8. LRU and random method:


Least recently used (LRU): A replacement scheme in which the block replaced is the one that
has been unused for the longest time. As associativity increases, implementing LRU gets harder.
Random: choose randomly among the blocks.
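Both policies can be sketched by counting misses over a block-access trace; the `OrderedDict` below stands in for the recency-tracking hardware a real LRU cache would need.

```python
import random
from collections import OrderedDict

def lru_misses(accesses, capacity):
    """Misses for a fully associative cache with LRU replacement."""
    cache = OrderedDict()
    misses = 0
    for block in accesses:
        if block in cache:
            cache.move_to_end(block)        # mark as most recently used
        else:
            misses += 1
            if len(cache) == capacity:
                cache.popitem(last=False)   # evict the least recently used
            cache[block] = True
    return misses

def random_misses(accesses, capacity, seed=0):
    """Misses when the victim block is chosen at random."""
    rng = random.Random(seed)
    cache = set()
    misses = 0
    for block in accesses:
        if block not in cache:
            misses += 1
            if len(cache) == capacity:
                cache.remove(rng.choice(sorted(cache)))
            cache.add(block)
    return misses

trace = [1, 2, 3, 1, 4, 1, 2]
```

On this trace with capacity 3, LRU takes five misses; random replacement takes at least the four compulsory misses (one per distinct block) and possibly more, depending on its choices.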

9. Replacement:
Page replacement is the process of swapping an existing page out of a main-memory frame and replacing it with the required page. It occurs when all the frames of main memory are already occupied, so a page has to be replaced to make room for the required page.

10. Loop unrolling:


An important compiler technique to get more performance from loops, where multiple copies
of the loop body are made. After unrolling, there is more ILP available by overlapping
instructions from different iterations.
Loop unrolling is a compiler optimization applied to certain kinds of loops to reduce the
frequency of branches and loop maintenance instructions.
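A sketch of the idea in Python (a compiler would do this on machine code, not source): unrolling by four cuts loop-maintenance work to one check per four elements and exposes independent additions that can be overlapped.

```python
def sum_rolled(xs):
    total = 0
    for x in xs:                      # one loop-maintenance step per element
        total += x
    return total

def sum_unrolled4(xs):
    """Four copies of the loop body per iteration: fewer branches and
    more independent additions available to overlap (more ILP)."""
    total = 0
    n = len(xs) - len(xs) % 4
    for i in range(0, n, 4):          # one loop-maintenance step per 4 elements
        total += xs[i]
        total += xs[i + 1]
        total += xs[i + 2]
        total += xs[i + 3]
    for i in range(n, len(xs)):       # handle any leftover elements
        total += xs[i]
    return total
```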

11. What is the difference between uniform memory access and nonuniform memory access?
Single address space multiprocessors come in two styles.
Uniform memory access (UMA) multiprocessors: the latency to a word in memory does not
depend on which processor asks for it.
Nonuniform memory access (NUMA): some memory accesses are much faster than others,
depending on which processor asks for which word, typically because main memory is divided
and attached to different microprocessors or to different memory controllers on the same chip.
The programming challenges are harder for a NUMA multiprocessor than for a UMA
multiprocessor, but NUMA machines can scale to larger sizes and can have lower latency to
nearby memory.
UMA is applicable for general purpose applications and time-sharing applications. NUMA
is applicable for real-time applications and time-critical applications.
12. What happens in instruction fetch and explain the remaining stages?
The MIPS pipeline with five stages, with one step per stage:
• IF: Instruction fetch from memory
• ID: Instruction decode & register read
• EX: Execute operation or calculate address
• MEM: Access memory operand
• WB: Write result back to register

13. What is Paging?


Paging is a storage mechanism that allows the OS to retrieve processes from secondary storage into main memory in the form of pages. In the paging method, main memory is divided into small fixed-size blocks of physical memory called frames. The size of a frame should be the same as that of a page, to maximize utilization of main memory and to avoid external fragmentation. Paging is used for faster access to data.
Advantages of paging: • easy-to-use memory management algorithm • no external fragmentation • swapping is easy between equal-sized pages and page frames.
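The page/offset arithmetic can be sketched with a hypothetical 4 KB page size and a tiny hand-built page table (real hardware does this split with bit fields, and page faults are ignored here):

```python
PAGE_SIZE = 4096  # hypothetical 4 KB pages

def translate(virtual_address, page_table):
    """Map a virtual address to a physical address via a page table
    (page number -> frame number)."""
    page = virtual_address // PAGE_SIZE
    offset = virtual_address % PAGE_SIZE     # the offset is unchanged by translation
    frame = page_table[page]
    return frame * PAGE_SIZE + offset

page_table = {0: 5, 1: 2}                    # page -> frame
physical = translate(0x1010, page_table)     # page 1, offset 0x10, frame 2
```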

14. What are the types of fragmentation?


Internal fragmentation: a process is allocated a memory block larger than the process needs, so some part of the block is left unused.
External fragmentation: enough total free space exists to satisfy a process's request, but the process still cannot be placed in memory because that space is not contiguous.

15. What are the types of hazards and how are they solved?
Hazards are situations that prevent starting the next instruction in the next cycle.
1. Structural hazard: the hardware does not support the combination of instructions that are set to execute. Solved in MIPS by having two separate instruction/data memories.
2. Data hazard: data needed to execute the instruction is not yet available. Data hazards arise from the dependence of one instruction on an earlier one that is still in the pipeline. Solved by:
• Forwarding (bypassing): retrieving the missing data element from internal buffers rather than waiting for it to arrive from registers or memory.
• Stall (bubble): in a load-use data hazard, a specific form of data hazard, the data being loaded by a load instruction has not yet become available when it is needed by another instruction; a stall is inserted to resolve it.
• Software scheduling: the compiler reorders code to try to avoid pipeline stalls.
3. Control hazard (branch hazard): the instruction that was fetched is not the one that is needed; that is, the flow of instruction addresses is not what the pipeline expected. Solved by branch prediction: a method of resolving a branch hazard that assumes a given outcome for the branch and proceeds from that assumption rather than waiting to ascertain the actual outcome.
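The load-use stall condition above can be sketched as a simple check; the dictionary encoding of an instruction here is a hypothetical stand-in for real MIPS decode logic.

```python
def needs_load_use_stall(prev_instr, curr_sources):
    """True when the previous instruction is a load whose destination
    register is a source of the current instruction."""
    return prev_instr.get("op") == "lw" and prev_instr.get("rt") in curr_sources

# lw $t0, 0($s0) followed by add $t2, $t0, $t1 -> must stall one cycle,
# since even forwarding cannot deliver the loaded value in time
stall = needs_load_use_stall({"op": "lw", "rt": "$t0"}, {"$t0", "$t1"})
```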

16. What is the difference between multiprocessing and multi-threading?


Multiprocessing: more processors per chip are added to increase computing power.
Multithreading: it allows multiple threads to share the functional units of a single processor in
an overlapping fashion.
Both are used to increase the level of parallelism.

17. What is the difference between CISC and RISC?


A complex instruction set computer (CISC) has rich, complex instructions; a single instruction can take several clock cycles.
A reduced instruction set computer (RISC) has simple instructions, each designed to complete in a single cycle (one instruction per cycle when pipelined).

18. What is MIPS and its virtual machine benefit?


Microprocessor without Interlocked Pipelined Stages (MIPS):
The MIPS Virtualization Module is an optional module that is specified in the base MIPS32 and
MIPS64 architectures to support hardware-assisted virtualization. It retains the original
simplicity of the MIPS baseline architecture while providing a host of features in support of
virtualization.
Hardware-assisted virtualization does not require modification of a guest OS. The fact that the
guest OS does not have to be modified adds to the flexibility of hardware-assisted
virtualization. Multiple virtual machines, each with a different unmodified OS, can run on a
common processor, without regard to compatibility.
19. What is the difference between multicore and multiprocessor?
The main difference between multicore and multiprocessor is that the multicore refers to a
single chip with multiple execution units while the multiprocessor refers to a system that has
two or more CPUs.

20. What is the difference between cache and register?

Register: is for storing variables before they are operated on by program instruction during
stages of instruction execution in CPU.
Cache: is for storing frequently used variables in the nearest level of memory hierarchy to the
CPU.

21. What is memory hierarchy? Why use it? And what are the technology in each level? Why
we need it?
A memory hierarchy is a structure that uses multiple levels of memories; as the distance from the processor increases, the size of the memories and the access time both increase.
1. Levels closest to the processor (caches) use SRAM (static random access memory).
2. Main memory is implemented from DRAM (dynamic random access memory).
3. The largest and slowest level in the hierarchy is usually magnetic disk.
By implementing the memory system as a hierarchy, the user has the illusion of a memory that is as large as the largest level of the hierarchy but can be accessed as if it were all built from the fastest memory.

22. What is the difference between superscalar architecture and MIPS?


Superscalar: An advanced pipelining technique that enables the processor to execute more than
one instruction per clock cycle by selecting them during execution by the processor (dynamic
multiple issue). Multiple issue is possible by replicating the internal components of the
computer.
MIPS: a traditional pipeline where only one instruction is issued per clock cycle, or static multiple issue where the issue packet is determined statically by the compiler.
23. What is the first step in processor design?
Determine the instruction set architecture ISA, choose paradigm (RISC / CISC).

24. How do you decide if a processor is better than another?


• Instruction count: Determined by ISA (Instruction Set Architecture) and compiler.
• CPI (Clock Cycles Per Instruction) and clock cycle time: Determined by CPU hardware.
The basic performance equation: CPU time = Instruction count x CPI x Clock cycle time
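The performance equation can be applied directly; the two machines below are hypothetical, chosen to show that a higher clock rate alone does not decide the comparison.

```python
def cpu_time(instruction_count, cpi, clock_cycle_time_s):
    # CPU time = instruction count x CPI x clock cycle time
    return instruction_count * cpi * clock_cycle_time_s

# Same program (10^9 instructions) on two hypothetical machines
time_a = cpu_time(1_000_000_000, 2.0, 0.25e-9)  # 4.0 GHz clock, CPI 2.0
time_b = cpu_time(1_000_000_000, 1.0, 0.40e-9)  # 2.5 GHz clock, CPI 1.0
# B finishes first despite its lower clock rate, because its CPI is much lower
```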

25. What is shared memory?


More than one processor shares the same memory.
Shared memory multiprocessor (SMP) is a parallel processor with a single physical address
space.

26. What is memory coherence? How it is maintained?


Two or more processors or cores share a common area of memory. Coherence is maintained using a cache coherence protocol.
Coherence defines what values can be returned by a read. The key to implementing a cache coherence protocol is tracking the state of any sharing of a data block. The most popular cache coherence protocol is snooping.

27. Compare between cluster and multiprocessor?


Cluster: A set of computers connected over a local area network (LAN) that functions as a single
large multiprocessor.
Multiprocessor: A computer system with at least two processors.

28. What is clock rate?


Clock rate refers to the frequency at which the clock generator of a processor generates pulses, which are used to synchronize the operations of its components; it is used as an indicator of the processor's speed.
Clock rate (e.g., 4 gigahertz, or 4 GHz) is the inverse of the clock period.
Clock period: the length of each clock cycle.
29. What is pipelining vs non-pipeline? How does it improve performance?
Non-pipeline execute instructions sequentially.
Pipelining exploits parallelism such that multiple instructions are overlapped in execution.
Pipelining improves performance by increasing instruction throughput, as opposed to
decreasing the execution time of an individual instruction, but instruction throughput is the
important metric because real programs execute billions of instructions.
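An idealized timing model (no hazards, equal stage times; both assumptions are simplifications) makes the throughput gain concrete:

```python
def nonpipelined_time(n_instructions, n_stages, stage_time):
    # each instruction passes through every stage before the next may start
    return n_instructions * n_stages * stage_time

def pipelined_time(n_instructions, n_stages, stage_time):
    # fill the pipeline once, then (ideally) one instruction completes per cycle
    return (n_stages + n_instructions - 1) * stage_time

t_seq = nonpipelined_time(1000, 5, 1)   # 5000 time units
t_pipe = pipelined_time(1000, 5, 1)     # 1004 time units: nearly 5x throughput
```

Note that a single instruction takes the same five stage-times either way: pipelining improves throughput, not the latency of one instruction.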

30. From a programming point of view, what should we do about parallelism?


Uniprocessor design techniques such as pipelining and superscalar take advantage of
instruction-level parallelism without the involvement of the programmer.
In multiprocessors, programs must be rewritten to get better performance and efficiency from
parallel processing.
It is difficult to write software that uses multiple processors to complete one task faster, and
the problem gets worse as the number of processors increases.
The challenges include scheduling, load balancing, time for synchronization, and overhead for
communication between the parties.
Additionally, getting good speed-up on a multiprocessor while keeping the problem size fixed
(Strong scaling) is harder than getting good speed-up by increasing the size of the problem
(Weak scaling).

31. How does clock cycle differ in Pipeline and non-Pipeline?


In a non-pipelined design, the clock cycle is long to accommodate the longest path (the lw instruction). In a pipelined design, the longest stage determines the clock cycle time.

32. What is HPC? When do we need HPC?


High-performance computing (HPC) is the ability to process data and perform complex
calculations at high speeds. One of the best-known types of HPC solutions is the
supercomputer. A supercomputer contains thousands of compute nodes that work together to
complete one or more tasks. This is called parallel processing.
We need it to solve large problems in science, engineering, and business: image modeling, physics and chemistry simulation, simulating and understanding real-world phenomena, and big data processing.

33. How does a company specify the criteria of hardware specification and configuration?
Define Application Requirements: Run a hypothetical example.
1) Estimate Your Bandwidth Requirements
Estimate Monthly Data Transfer including Average Monthly Visitors, Average Number of Page
Views Per Visitor, Average Page Size
Your bandwidth speed requirements are based on the amount of data transfer per month.
2) Estimate Your Compute Requirements
CPU, RAM and Virtual Machines (VMs)
3) Determine Storage Requirements

34. What is collective synchronization?


Collective synchronization involves data sharing between more than two tasks, which are often specified as members of a common group, or collective.
Tasks wait for each other (do not finish execution) until all have finished, then join together.

35. What is the difference between high performance and high throughput?
High performance: increasing a system's efficiency with regard to:
• response time
• throughput
• resource utilization
Throughput: the number of instructions executed in a unit of time.
• number of computations per second
• number of data per second

36. What are SIMD, MISD?


• SIMD: A type of parallel computer.
Single Instruction: All processing units execute the same instruction at any given clock cycle
Multiple Data: Each processing unit can operate on a different data element.
• MISD: A type of parallel computer (rare).
Multiple Instruction: Each processing unit operates on the data independently via separate
instruction streams.
Single Data: A single data stream is fed into multiple processing units.
• SISD: Single Instruction stream, Single Data stream. A uniprocessor.
• MIMD: Multiple Instruction streams, Multiple Data streams. A multiprocessor.
• SPMD: Single Program, Multiple Data streams. The conventional MIMD programming
model, where a single program runs across all processors.

37. What is the difference between SIMD and MIMD?? What is SPMD?
Multiple Instruction Multiple Data (MIMD): is a technique employed to achieve parallelism.
Machines using MIMD have a number of processors that function asynchronously and
independently. At any time, different processors may be executing different instructions on
different pieces of data.
Single Program Multiple Data (SPMD): is a technique employed to achieve parallelism; it is a
subcategory of MIMD. Tasks are split up and run simultaneously on multiple processors with
different input in order to obtain results faster.

38. Why pipeline is fast?


Because it performs multiple instructions at the same time.

39. Why did you choose the HPC in your solution?


Because it is the suitable solution for distributed systems with big data.
40. What is a Personal Mobile Device? What should we consider when designing its hardware?
It is a computer small enough to hold and operate in the hand. Typically, a handheld device has an LCD or OLED flat screen providing a touchscreen interface with digital buttons and a keyboard, or physical buttons along with a physical keyboard.

41. What is the virtual memory? For what it is used?


A technique that uses main memory as a “cache” for secondary storage.
It is used to allow efficient and safe sharing of memory among multiple programs, and to
remove the programming burdens of a small, limited amount of main memory.
To compile each program into its own address space—a separate range of memory locations
accessible only to this program. Virtual memory implements the translation of a program’s
address space to physical addresses to enforce protection.

42. Is the register a cache?


No. The CPU includes registers, which are a small amount of data storage that facilitates some CPU operations.
The CPU cache is a high-speed volatile memory, bigger than the register file, that helps the processor reduce memory operations.

43. What is the difference between fine-grained multithreading and coarse grained
multithreading?
Fine-grained multithreading: A version of hardware multithreading that suggests switching
between threads after every instruction.
Coarse-grained multithreading: A version of hardware multithreading that suggests switching
between threads only after significant events, such as a cache miss.

44. The difference between shared memory and distributed memory in HPC:
• Shared memory: multiple processors all share the same memory.
• Distributed memory: each processor has its own memory, and processors communicate through a network.
45. What is the difference between grid computation and hybrid computation?
Grid computing: the use of widely distributed computer resources to reach a common goal. In a grid, each node is set to perform a different task/application.
Hybrid computation: it is a combination of several computing technology such as: HPC + Grid +
Cloud.

46. What is the difference between parallel computing and cluster.


A computer cluster is a set of computers that work together so that they can be viewed as a
single system.
Parallel computing is a type of computation where many calculations or the execution of
processes are carried out simultaneously.
The cluster is computing in parallel.
-----------------------------
1. What is register spilling?
If there are not enough registers to hold all the variables, some variables may be moved to and
from RAM. This process is called "spilling" the registers.

2. Is there an instruction that is not implemented in hardware?

Pseudo instructions: These are simple assembly language instructions that do not have a direct
machine language equivalent. During assembly, the assembler translates each pseudo
instruction into one or more machine language instructions.

3. What is pseudo instruction?


These are simple assembly language instructions that do not have a direct machine language
equivalent. During assembly, the assembler translates each pseudo instruction into one or
more machine language instructions.
Example: move $t0, $t1
The assembler will translate it to: add $t0, $zero, $t1
4. What happens if we increase the word size?
• Increasing the word size increases the capability to process larger units of data. The disadvantages are increased CPU, bus, and memory complexity and, therefore, cost; this complexity increases at a faster rate than the word size.
• If we increase the word size, we also have to increase the number of address bits in the address registers and the memory control logic. Thus the word size is an important decision when designing an architecture, with extra memory being required as word sizes increase, along with the corresponding increase in virtual memory size.

5. What is assembly language?


An assembly language is a low-level programming language designed for a specific type of
processor.

6. How will CPU work without registers?


A processor register is a quickly accessible location available to a computer's processor. At the
very least CPU need a program counter.
Without other registers, we would have to read data words from memory and write data words back to memory on every instruction, which is not practical.

7. What are addressing modes?


The term addressing mode refers to the way in which the operand of an instruction is specified. The addressing mode specifies a rule for interpreting or modifying the address field of the instruction before the operand is actually referenced.
Multiple forms of addressing are generically called addressing modes. The MIPS addressing
modes are the following:
1. Immediate addressing, where the operand is a constant within the instruction itself
2. Register addressing, where the operand is a register
3. Base or displacement addressing, where the operand is at the memory location whose
address is the sum of a register and a constant in the instruction.
4. PC-relative addressing, where the branch address is the sum of the PC and a constant in the
instruction.
5. Pseudodirect addressing, where the jump address is the 26 bits of the instruction
concatenated with the upper bits of the PC.
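The address arithmetic for two of these modes can be sketched directly; the register values and PC below are hypothetical, and the PC-relative rule follows MIPS, where the offset is in words relative to PC + 4.

```python
def base_displacement(registers, base_reg, constant):
    # e.g., lw $t0, 8($s0): operand address = value of $s0 + 8
    return registers[base_reg] + constant

def pc_relative_target(pc, offset_in_words):
    # MIPS branch target = (PC + 4) + offset * 4
    return pc + 4 + offset_in_words * 4

regs = {"$s0": 0x10000000}
addr = base_displacement(regs, "$s0", 8)       # 0x10000008
target = pc_relative_target(0x00400000, 3)     # 0x00400010
```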

8. What is an embedded system?


An embedded system is a microprocessor-based computer hardware system with software that
is designed to perform a dedicated function. Examples: Household appliances, such as
microwave ovens, washing machines and dishwashers.

9. Steps involved in the execution of an instruction by CPU:


1. Fetch instruction
2. Decode instruction
3. Perform ALU operation
4. Access memory
5. Update register file
6. Update the Program Counter (PC)

10. What is SaaS and how it is related to computer architecture?


Software as a service (SaaS): rather than selling software that is installed and run on customers' own computers, the software runs at a remote site and is made available over the Internet, typically via a Web interface, to customers.
Clusters, which are parallel hardware with separate memories, are well suited to task-level parallelism and to applications with little communication, such as Web search and mail servers, that do not require shared addressing to run well. This leads to the idea of SaaS.
SaaS systems have millions of independent users of interactive Internet services. Reads and writes are rarely dependent in SaaS, so SaaS rarely needs to synchronize. We call this type of easy parallelism request-level parallelism, as many independent efforts can proceed in parallel naturally with little need for communication or synchronization.
11. How do we represent a floating-point number?
The floating-point representation has two parts: the first part represents a signed fixed-point number called the mantissa; the second part designates the position of the decimal (or binary) point and is called the exponent. The fixed-point mantissa may be a fraction or an integer.
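For IEEE 754 single precision specifically (one common concrete format, not the only one), the sign, exponent, and mantissa fields can be pulled apart with Python's struct module:

```python
import struct

def decompose_float32(x):
    """Return (sign, biased_exponent, fraction_bits) of the IEEE 754
    single-precision encoding of x."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF      # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF          # 23 mantissa bits (implicit leading 1)
    return sign, exponent, fraction

# -0.75 = -1.1 (binary) x 2^-1: sign 1, exponent -1 + 127 = 126,
# fraction 0b100...0 (the .1 after the implicit leading 1)
parts = decompose_float32(-0.75)
```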
