Reconfigurable Computing

This document provides an overview of reconfigurable computing. It discusses how reconfigurable computing can fill the gap between hardware and software by providing higher performance than software while maintaining a higher level of flexibility than hardware. It describes reconfigurable computing architectures including how a reconfigurable processing fabric can be integrated with a processor. It also covers reconfiguration management challenges and approaches like single-context, multi-context, and partially reconfigurable architectures. Programming reconfigurable systems and compiling C for spatial computing are also summarized.

Uploaded by

Denise Nelson
Copyright
© Attribution Non-Commercial (BY-NC)

Reconfigurable Computing

Sherif Abou Zied Mohammad


[email protected]

OUTLINE
  INTRODUCTION
  RECONFIGURABLE COMPUTING ARCHITECTURES
  RECONFIGURATION MANAGEMENT
  PROGRAMMING RECONFIGURABLE SYSTEMS
  COMPILING C FOR SPATIAL COMPUTING
  HW/SW Partitioning
  BEE2: A High-End Reconfigurable Computing System
  REFERENCES

INTRODUCTION Conventional Computing

Software-programmed microprocessors
  Processors execute a set of instructions.
  Performance can suffer, if not in clock speed then in the amount of work done per cycle.
  Lower performance than ASICs.

Hardwired (ASICs)
  Special purpose.
  Very fast and efficient.
  The circuit cannot be altered after fabrication; any change requires a redesign.

INTRODUCTION Reconfigurable Computing

Fills the gap between hardware and software.
  Much higher performance than software.
  Higher level of flexibility than hardware.

Uses FPGAs or other programmable hardware for compute-intensive calculations.
Usually coupled with a general-purpose microprocessor that is responsible for
  Controlling the reconfigurable logic.
  Executing program code that cannot be efficiently accelerated.

INTRODUCTION Reconfigurable devices

Contain an array of computational elements.
Functionality is determined through configuration bits.

Most current FPGAs and reconfigurable devices are SRAM-programmable. The SRAM configuration bits
  Control routing.
  Control multiplexers and LUT contents.
  Provide control signals for the computational units.

Figure: a 3-input LUT and a D flip-flop with optional bypass.
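
The idea that configuration bits determine functionality can be sketched in a few lines of Python. This is a toy model and not any vendor's logic cell: the class name `LogicCell`, its fields, and the bit ordering of the truth table are assumptions made for illustration. Eight configuration bits fill the truth table of a 3-input LUT, and one more bit selects whether the output goes through the D flip-flop or bypasses it.

```python
class LogicCell:
    """Toy logic cell: 3-input LUT plus D flip-flop with optional bypass.

    Hypothetical structure, for illustration only.
    """

    def __init__(self, lut_bits, use_flip_flop):
        if len(lut_bits) != 8:
            raise ValueError("a 3-input LUT needs 2**3 = 8 configuration bits")
        self.lut_bits = list(lut_bits)      # truth table, index = (a << 2) | (b << 1) | c
        self.use_flip_flop = use_flip_flop  # configuration bit of the bypass multiplexer
        self.ff_state = 0                   # value currently held in the flip-flop

    def lut(self, a, b, c):
        """Combinational LUT output: the inputs simply address the truth table."""
        return self.lut_bits[(a << 2) | (b << 1) | c]

    def clock(self, a, b, c):
        """Advance one clock cycle and return the cell output for that cycle."""
        lut_out = self.lut(a, b, c)
        if not self.use_flip_flop:
            return lut_out                  # bypass: purely combinational output
        out = self.ff_state                 # registered: previous cycle's LUT value
        self.ff_state = lut_out
        return out


# The same cell structure configured as a 3-input XOR and as a registered 3-input AND.
XOR3_BITS = [0, 1, 1, 0, 1, 0, 0, 1]
AND3_BITS = [0, 0, 0, 0, 0, 0, 0, 1]
xor_cell = LogicCell(XOR3_BITS, use_flip_flop=False)
and_cell = LogicCell(AND3_BITS, use_flip_flop=True)
```

Changing only `lut_bits` and `use_flip_flop` changes what the cell computes, which is the sense in which the hardware is "programmed" by its configuration.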

INTRODUCTION Reconfigurable devices

Reconfigurable Processing Fabric (RPF)
  Fine-grained
  Coarse-grained

Fine-grained RPF
  Suited to bit-manipulation tasks.
  Complex calculations require numerous fine-grained processing elements (PEs), which leads to slower clock rates.

Coarse-grained RPF
  Uses bus interconnect and PEs that perform more than just bitwise operations, such as ALUs and multipliers.


RECONFIGURABLE COMPUTING ARCHITECTURES

RPF integration
S. Goldstein, H. Schmit, M. Moe, M. Budiu, S. Cadambi, R. R. Taylor, R. Laufer. PipeRench: A coprocessor for streaming multimedia acceleration.

Separate processor (coprocessor)
  Data communication takes place through main memory.
  Limited bandwidth between CPU and RPF.

Loosely coupled RPF and processor architecture
  RPF with the host processor on the same chip.
  Direct interaction between RPF and processor.
  RPF with direct memory access.
  Figure: Chameleons architecture

Tightly coupled RPF and processor
  The RPF is integrated as a reconfigurable functional unit (RFU), alongside fixed units such as the ALU and multipliers.
  The RFU accesses input data through the register file.
  Figure: the datapath of the processor + RFU architecture
  Virtual Instruction Configurations (VICs) in the RFU typically run during the execute stage (and possibly the memory stage) of the pipeline.
  Figure: an example of a pipeline of a processor with an RFU


RECONFIGURATION MANAGEMENT
Problem Definition

Reconfigurability allows hardware to perform different tasks at different times.
Application configurations can be swapped in and out.
Reconfiguring the hardware at run time is called run-time reconfiguration (RTR).

RTR
  Based on the concept of virtual hardware, which is similar to virtual memory.
    The physical hardware is much smaller than the sum of the resources required.
    Configurations are swapped in and out of the actual hardware.
  Increases hardware utilization.
  Introduces significant reconfiguration overhead.
    Reconfiguration is time consuming and can require hundreds of milliseconds.

Computation and reconfiguration are mutually exclusive.
  Time spent reconfiguring is time lost in terms of application acceleration.
Reconfiguration can occupy approximately 25 to 98 percent of total execution time.
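
Because computation and reconfiguration are mutually exclusive, the reconfiguration time adds directly to the hardware execution time. A minimal sketch of that arithmetic follows; the function names and all of the numbers are made up for illustration and do not come from a measured system.

```python
def reconfiguration_fraction(t_compute, t_reconfig):
    """Fraction of total execution time spent reconfiguring."""
    return t_reconfig / (t_compute + t_reconfig)


def effective_speedup(t_software, t_compute, t_reconfig):
    """Speedup over software once reconfiguration time is charged to the hardware."""
    return t_software / (t_compute + t_reconfig)


# Hypothetical timings in milliseconds, chosen only to illustrate the effect.
T_SOFTWARE = 4000   # pure software run
T_COMPUTE = 100     # time the fabric spends computing
T_RECONFIG = 300    # time the fabric spends being reconfigured

fraction = reconfiguration_fraction(T_COMPUTE, T_RECONFIG)     # 0.75
ideal = effective_speedup(T_SOFTWARE, T_COMPUTE, 0)            # 40.0
actual = effective_speedup(T_SOFTWARE, T_COMPUTE, T_RECONFIG)  # 10.0
```

With these assumed numbers, reconfiguration takes 75 percent of the run and cuts a 40x speedup down to 10x, which is why the configuration architectures that follow concentrate on reducing or hiding this time.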

RECONFIGURATION MANAGEMENT
Configuration Architectures

What is a configuration architecture?
Architectures
  Single-context
  Multi-context
  Partially reconfigurable
  Others

Single-context
  Configurations are grouped into contexts, and each full context is swapped in and out of the FPGA as needed.
  Configuration information is loaded into the programmable array through a serial shift chain.
  Requires few pins for configuration, potentially simplifying board-level design.
  The entire chip must be reprogrammed for any change to the configuration data, because the data cannot be selectively reused on the chip.
  Configuration cycles can be reduced by widening the configuration path.
    Virtex-5 allows a configuration data bus up to 32 bits wide.
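
The effect of widening the configuration path can be estimated with simple arithmetic: the number of load cycles is the configuration size divided by the path width, rounded up. The sketch below assumes a hypothetical bitstream size and configuration clock; neither value is taken from a datasheet.

```python
def config_cycles(config_bits, path_width_bits):
    """Clock cycles needed to load a full configuration (ceiling division)."""
    return -(-config_bits // path_width_bits)


def config_time_ms(config_bits, path_width_bits, clock_hz):
    """Load time in milliseconds at the given configuration clock."""
    return config_cycles(config_bits, path_width_bits) / clock_hz * 1000.0


BITSTREAM_BITS = 20_000_000   # hypothetical full-chip configuration size
CLOCK_HZ = 50_000_000         # hypothetical configuration clock

serial_cycles = config_cycles(BITSTREAM_BITS, 1)    # one bit per cycle
wide_cycles = config_cycles(BITSTREAM_BITS, 32)     # 32-bit configuration bus
serial_ms = config_time_ms(BITSTREAM_BITS, 1, CLOCK_HZ)
wide_ms = config_time_ms(BITSTREAM_BITS, 32, CLOCK_HZ)
```

Under these assumptions the serial load takes about 400 ms and the 32-bit load about 12.5 ms. The whole chip is still rewritten in both cases; only the number of cycles changes.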

RECONFIGURATION MANAGEMENT
Configuration Architectures

Multi-context
  Provides storage for multiple configurations, facilitating configuration prefetching and fast reconfiguration.
  Contains multiple planes (contexts) of configuration data.
  A multiplexer chooses between the context planes.

Multi-context advantages
  Background loading of configuration data.
  Fast switching between stored configurations, some in a single clock cycle.
  Overlapping computation with configuration.

Multi-context drawbacks
  Area overhead
    Additional configuration data storage.
    Multiplexing.
  Single-cycle configuration
    Dynamic power?
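
A multi-context device can be pictured as several stored configuration planes with a multiplexer selecting the active one. The following toy model is only a sketch: the class `MultiContextFabric` is hypothetical, and each configuration is represented by a plain Python function rather than by real configuration bits.

```python
class MultiContextFabric:
    """Toy multi-context device: several configuration planes, one active at a time."""

    def __init__(self, num_contexts):
        self.planes = [None] * num_contexts   # stored configurations (None = empty)
        self.active = 0                       # select input of the context multiplexer

    def load(self, context, configuration):
        """Background load: in this model only an inactive plane may be written."""
        if context == self.active:
            raise ValueError("cannot overwrite the active context in this model")
        self.planes[context] = configuration

    def switch(self, context):
        """Fast switch: only the multiplexer select changes, no data is moved."""
        if self.planes[context] is None:
            raise ValueError("context has not been loaded")
        self.active = context

    def run(self, *inputs):
        """Compute with whichever configuration the multiplexer selects."""
        return self.planes[self.active](*inputs)


fabric = MultiContextFabric(num_contexts=2)
fabric.planes[0] = lambda a, b: a + b     # initial configuration, loaded at start-up
sum_result = fabric.run(3, 4)             # computing with context 0
fabric.load(1, lambda a, b: a * b)        # prefetch context 1 while context 0 is in use
fabric.switch(1)                          # switch without reloading any data
product_result = fabric.run(3, 4)
```

The model shows the advantage (loading and computing overlap, switching is cheap) and hints at the area drawback: every extra plane is storage that sits idle while another plane is active.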

RECONFIGURATION MANAGEMENT
Configuration Architectures

Partially Reconfigurable
  Not all configurations require the entire chip area.
  Only the utilized resources are reconfigured.
  Uses addressable configuration memory.
  Decreases reconfiguration time.
  Decreases the amount of configuration data.
  A configuration that occupies a large area still takes a long time to load.
  Independent configurations with overlapping hardware?
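
With addressable configuration memory, only the locations that differ between the current and the desired configuration need to be written. A minimal sketch, assuming the configuration memory is divided into equally sized addressable units (called "frames" here as an assumption of the sketch) and using made-up frame contents:

```python
def frames_to_rewrite(current, target):
    """Addresses of configuration frames whose contents differ."""
    if len(current) != len(target):
        raise ValueError("both configurations must cover the same frames")
    return [addr for addr, (old, new) in enumerate(zip(current, target)) if old != new]


def partial_reconfigure(current, target):
    """Rewrite only the differing frames in place; return how many were written."""
    addresses = frames_to_rewrite(current, target)
    for addr in addresses:
        current[addr] = target[addr]
    return len(addresses)


# Hypothetical 8-frame device; each string stands for one frame of configuration data.
device = ["fir", "fir", "fir", "idle", "idle", "uart", "uart", "idle"]
wanted = ["fir", "fir", "fir", "fft", "fft", "uart", "uart", "idle"]
written = partial_reconfigure(device, wanted)   # only frames 3 and 4 change
```

Two of eight frames are written instead of all eight. A configuration that touches most of the frames gains little, which is the time issue noted for large configurations.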


PROGRAMMING RECONFIGURABLE SYSTEMS

Application programmers are likely to ignore reconfigurable systems unless they can easily incorporate them into their own systems.
A software design environment that aids in the creation of configurations for the reconfigurable hardware is required.

Software design environment
  Manual
    A powerful method for creating high-quality circuit designs.
    Requires a great deal of background knowledge of the particular reconfigurable system employed.
    Requires a significant amount of design time.
  Fully automatic
    Quick and easy.
    Makes reconfigurable hardware more accessible to general application programmers.
    Quality may suffer.


Compiling C for spatial computing
Why C?

There are many more C programmers than hardware designers.
Writing an algorithm in C is typically faster than writing it in an HDL.
There is a large existing code base.
Allows both hardware (HW) and software (SW) versions to be created.
  The operating system can choose at runtime which is better.
Makes it easy for the designer or compiler to quickly explore the tradeoffs between different hardware/software partitionings.
The code can be easily tested on a conventional microprocessor.

Compiling C for spatial computing
How C runs on spatial hardware (overview)

In a C program, the statements execute in order.
With spatial computation, each operation is implemented as its own function unit.
Memory loads and stores
  Memory access operations must be scheduled to
    allow sharing among memory operations.
    preserve sequential C semantics.
If-then-else
  Implemented using multiplexers.
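
The multiplexer form of if-then-else can be illustrated in software. In the sketch below (the function names are hypothetical), the sequential version picks one arm to execute, while the spatial version models hardware in which both arms exist as separate function units, both results are computed, and a multiplexer driven by the condition selects the output.

```python
def mux(select, when_true, when_false):
    """Two-input multiplexer: forwards one of two already-computed values."""
    return when_true if select else when_false


def abs_diff_sequential(a, b):
    """Sequential semantics: only the taken branch executes."""
    if a > b:
        return a - b
    else:
        return b - a


def abs_diff_spatial(a, b):
    """Spatial semantics: every operation is its own unit and all of them run."""
    cond = a > b            # comparator
    then_value = a - b      # subtractor for the "then" arm
    else_value = b - a      # subtractor for the "else" arm
    return mux(cond, then_value, else_value)
```

The two functions return the same value for every input. Computing both arms is only safe when neither arm has side effects, which is one reason memory operations need the separate scheduling described above.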

Compiling C for spatial computing
How C runs on spatial hardware (overview)

More than just simple if-then-else control flow
  Use sub-circuits.
Optimizing the common path
What about
  Parallelism?
  Pipelining?
  Memory dependencies?
  Operator size?

Compiling C for spatial computing
Automatic Compilation

Overall compiler flow
  C source code -> Control Flow Graph -> Hyperblocks -> Data Flow Graph -> Circuit Generation

Control Flow Graph (CFG)
  The code is broken into basic blocks of simple instructions.
  Blocks are connected by control edges, each indicating a possible branch.
  All instructions inside a given block execute once the block is entered.
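
Splitting code into basic blocks can be sketched with the classic "leaders" approach: a block starts at the first instruction, at every branch target, and at every instruction that follows a branch. The instruction format used here, a list of `(op, target)` tuples with `"op"`, `"branch"` (conditional) and `"jump"` (unconditional), is an assumption of the sketch and not the representation used by any particular compiler.

```python
def split_basic_blocks(instrs):
    """Return basic blocks as (start, end) index pairs, end exclusive."""
    if not instrs:
        return []
    leaders = {0}
    for i, (op, target) in enumerate(instrs):
        if op in ("jump", "branch"):
            leaders.add(target)              # a branch target starts a block
            if i + 1 < len(instrs):
                leaders.add(i + 1)           # so does the instruction after a branch
    starts = sorted(leaders)
    return list(zip(starts, starts[1:] + [len(instrs)]))


def control_edges(instrs, blocks):
    """Return control edges between blocks as sorted (from_block, to_block) pairs."""
    block_of_start = {start: n for n, (start, _) in enumerate(blocks)}
    edges = set()
    for n, (_, end) in enumerate(blocks):
        op, target = instrs[end - 1]         # last instruction decides the successors
        if op in ("jump", "branch"):
            edges.add((n, block_of_start[target]))
        if op != "jump" and end < len(instrs):
            edges.add((n, block_of_start[end]))   # fall through to the next block
    return sorted(edges)


# A small loop: 0: i = 0; 1: if !(i < n) goto 4; 2: body; 3: goto 1; 4: after the loop
program = [("op", None), ("branch", 4), ("op", None), ("jump", 1), ("op", None)]
blocks = split_basic_blocks(program)
edges = control_edges(program, blocks)
```

For this program the blocks are instructions 0, 1, 2 to 3, and 4, and the edge from the loop body back to the loop test is the back edge of the loop.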

Compiling C for spatial computing
Automatic Compilation

Hyperblocks
  CFG basic blocks are quite small and limit the opportunities for parallelism.
  The compiler combines blocks along commonly taken paths.
  Hyperblocks have a single entry point at the top and one or more exits.

Data Flow Graph (DFG)
  The DFG is composed of nodes and edges.
  Nodes: inputs, constants, operations, memory accesses, and exit nodes.
  Edges: data transfer edges, ordering edges, and exit edges.

DFG optimizations
  Strength reduction
    Replaces one operator with one or more other operators that have less overall latency or area.
    Replace x*2 with x+x or x<<1.
    x*7 can be expressed as (x<<2)+(x<<1)+x, but even better as (x<<3)-x.
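
The strength-reduction identities above are easy to check numerically. The helper names below are hypothetical; the point is only that each rewritten form computes the same value as the multiplication it replaces, using shifters and adders (cheap in hardware) instead of a multiplier.

```python
def times2_shift(x):
    """x*2 as a single left shift."""
    return x << 1


def times7_shift_add(x):
    """x*7 as 4x + 2x + x: two shifts and two additions."""
    return (x << 2) + (x << 1) + x


def times7_shift_sub(x):
    """x*7 as 8x - x: one shift and one subtraction."""
    return (x << 3) - x


# Check the identities over a small range of non-negative values.
identities_hold = all(
    times2_shift(x) == x * 2
    and times7_shift_add(x) == x * 7
    and times7_shift_sub(x) == x * 7
    for x in range(0, 1000)
)
```

The second form of x*7 is preferred because it needs one shifter and one subtractor, while the first needs two shifters and two adders. A shift by a constant is just wiring in hardware, so the real cost is in the adders.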

Compiling C for spatial computing
Automatic Compilation

DFG optimizations
  Boolean value identification
    ISO C90 does not contain a Boolean data type, and even in later standards the result of a comparison has type int.
    Although the result of a comparison is defined to be either 0 or 1, its type is a signed integer, typically 32 bits.
    The compiler identifies such values and implements them with only one bit.
  Type-based operator size reduction
    ISO C semantics dictate that arithmetic and logical operations involving char and/or short operands must be performed at the precision of type int.
    However, when the result is stored back into a 16-bit variable, the upper bits are discarded. In that case a 16-bit adder gives the same result as a 32-bit adder.
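
The claim that a narrow adder suffices when the result is stored in a narrow variable can be checked with bit masks. This sketch models unsigned 16-bit and 32-bit adders with Python integers; the function names are hypothetical, and the masks stand in for the fixed width of the hardware.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def add32_then_store16(a, b):
    """Add at 32-bit precision, then keep the low 16 bits (store to a 16-bit variable)."""
    return ((a + b) & MASK32) & MASK16


def add16(a, b):
    """Add with a 16-bit adder: carries out of bit 15 are simply lost."""
    return ((a & MASK16) + (b & MASK16)) & MASK16


SAMPLES = [0, 1, 2, 255, 256, 30000, 40000, 0x7FFF, 0x8000, 0xFFFF]
adders_agree = all(add32_then_store16(a, b) == add16(a, b) for a in SAMPLES for b in SAMPLES)
```

The low 16 bits of a sum depend only on the low 16 bits of the operands, so the upper half of the 32-bit adder does no useful work here and the compiler can leave it out.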

Compiling C for spatial computing
Automatic Compilation

DFG optimizations
  Type-based operator size reduction
    Analyze the number of bits actually required by variables and operators.
    Example: the integer i in the loop for (i = 0; i < 100; i++) needs far fewer bits than a full int.
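
For the loop counter in this example the required width can be computed directly. The counter takes the values 0 through 100 (it reaches 100 when the loop test finally fails), so the question is how many bits hold the value 100. The helper name below is hypothetical.

```python
def unsigned_bits_needed(max_value):
    """Bits needed to represent every integer from 0 to max_value."""
    if max_value < 0:
        raise ValueError("this sketch only handles non-negative ranges")
    return max(1, max_value.bit_length())


# i in `for (i = 0; i < 100; i++)` ranges over 0..100 inclusive.
loop_counter_bits = unsigned_bits_needed(100)
```

Here `loop_counter_bits` is 7, since 2**7 = 128 is the smallest power of two above 100. The counter register, its incrementer, and its comparator can all be built 7 bits wide instead of 32.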

Compiling C for spatial computing
Automatic Compilation

DFG to Reconfigurable Fabric
  DFG nodes are mapped to modules.
  Each module is scheduled to a specific timestep.
  Finally, connections are made between the modules of the different hyperblock sub-circuits to complete the overall circuit.


HW/SW Partitioning

For systems that include both reconfigurable hardware and a traditional microprocessor, the program must first be partitioned into
  Sections to be executed on the reconfigurable hardware
    e.g. fixed datapath operations
  Sections to be executed in software on the microprocessor
    e.g. complex control sequences such as variable-length loops

Partitioning
  Manually
    The resulting program ends up tuned to a specific machine.
    An alternative is to use compiler directives.
      The NAPA C language [Gokhale and Stone 1998] provides pragma statements that let a programmer specify whether a section of code is to be executed in software on the Fixed Instruction Processor (FIP) or in hardware on the Adaptive Logic Processor (ALP).
  Automatically
    The compiler and runtime system take full responsibility for determining the right code and granularity to move to the reconfigurable fabric.
    The reconfigurable hardware is transparent to the designer.
    Cost functions based on the acceleration gained determine whether the cost of configuration is overcome by the benefits of hardware execution.
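
A cost function of the kind described for automatic partitioning can be sketched as a break-even test: moving a code section to hardware pays off only if the time saved over all of its executions exceeds the time needed to configure the fabric. The function names, the single-configuration assumption, and the example timings below are all illustrative and are not taken from NAPA C or any other compiler.

```python
def hardware_is_worthwhile(t_sw_per_call, t_hw_per_call, t_config, calls):
    """True if the total time saved in hardware exceeds the configuration cost.

    All times are integers in the same unit (for example microseconds).
    """
    saved = (t_sw_per_call - t_hw_per_call) * calls
    return saved > t_config


def break_even_calls(t_sw_per_call, t_hw_per_call, t_config):
    """Smallest number of calls for which hardware execution wins, or None."""
    gain = t_sw_per_call - t_hw_per_call
    if gain <= 0:
        return None                 # hardware is not faster per call, so it never wins
    return t_config // gain + 1


# Hypothetical section: 50 us per call in software, 5 us in hardware, 100 ms to configure.
T_SW, T_HW, T_CONFIG = 50, 5, 100_000
needed = break_even_calls(T_SW, T_HW, T_CONFIG)
```

With these assumed numbers the section must run 2223 times before the configuration cost is recovered. A loop body executed only a few hundred times would be left in software despite being ten times faster in hardware.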


BEE2: A High-End Reconfigurable Computing System

BEE: Berkeley Emulation Engine
  BEE2 can provide over 10 times more computing throughput than a DSP-based system with similar power consumption and cost, and over 100 times that of a microprocessor-based system.

Applications
  Emulation and design of novel wireless communications systems.
  High-performance real-time digital signal processing.
  Real-time scientific computation and simulation.
  Acceleration of CAD tools.

Hardware
  The BEE2 system uses Xilinx Virtex-2 Pro FPGAs.
  Virtex-2 Pro embeds PowerPC 405 processor cores in the reconfigurable fabric.
  BEE2 has no hardware-managed caches, hence all data transfers within the system have tightly bounded latency.
    BEE2 is therefore well suited for real-time applications.

Programming environment
  A high-level block-diagram design environment based on MathWorks Simulink and the Xilinx System Generator library.
  Uses automatic compilation tools.

BEE2: A High-End Reconfigurable Computing System

Compute modules
  Each compute module consists of five Xilinx Virtex-2 Pro 70 FPGA chips (four compute FPGAs and one control FPGA), each directly connected to four 240-pin DDR2 (double data rate 2) DRAM DIMMs, with a maximum capacity of 4 Gbytes per FPGA.
  The local mesh connects the four compute FPGAs on a 2D grid.
  Each link between adjacent FPGAs on the grid provides over 40 Gbps of data throughput.
  The four down links from the control FPGA to the compute FPGAs provide up to 20 Gbps per link.

REFERENCES
  Scott Hauck and Andre DeHon, Reconfigurable Computing: The Theory and Practice of FPGA-Based Computing.
  Katherine Compton, Reconfigurable Computing: A Survey of Systems and Software, Northwestern University.
  Chen Chang, John Wawrzynek, and Robert W. Brodersen, Berkeley BEE2: A High-End Reconfigurable Computing System, University of California.

Thank You