Section_1HLS_Overview_Powerpoint

The document outlines a Video Processing Masterclass with FPGA, focusing on High-Level Synthesis (HLS) tools, specifically the VIVADO HLS tool. It covers the design flow, libraries, and practical applications of HLS in converting C/C++ code into hardware description languages like VHDL and Verilog. The course includes lectures on HLS introduction, design flow, and a lab session for counter design and synthesis in VIVADO HLS.

Video Processing with FPGAs

Course Prepared by
Digitronix Nepal
www.digitronixnepal.com

Video Processing Masterclass with FPGA


Section 1. Overview of High Level Synthesis Tool
Lecture 1 : HLS Introduction, VIVADO HLS Overview
Lecture 2 : HLS Design Flow, review of C/C++ on HLS and HLS Libraries
Lecture 3 : Lab1: Counter Design and Synthesizing in VIVADO HLS



Objective of the Section
After completing this section, you will be able to:
• Describe and explain High-Level Synthesis (HLS) tools
• Explain the HLS design flow, HLS constructs, and HLS libraries
• Design a counter (a basic circuit) in HLS



Lecture 1: HLS Introduction, Vivado HLS Overview
• Vivado HLS synthesizes digital hardware directly from a high-level description written in C or C++. From the C/C++ source it generates VHDL, Verilog, and SystemC source.
• The defining aspect of HLS is that the designed functionality and its hardware implementation are kept separate. Unlike an RTL description, the C-based description does not implicitly fix the hardware architecture, and this provides great flexibility.
• The HLS process provides an integrated mechanism for generating and assessing variations on the hardware implementation, making it convenient to find the best architecture.



Need for High-Level Synthesis
• Algorithm-based approaches are becoming popular because they shorten design time and time to market (TTM)
• Larger designs pose challenges in design and verification of hardware at the HDL level
• The industry trend is moving towards hardware acceleration to enhance performance and productivity
• CPU-intensive tasks can be offloaded to a hardware accelerator in the FPGA
• Hardware accelerators require a lot of time to understand and design
• The Vivado HLS tool converts an algorithmic description written in C into a hardware description (RTL)
• Elevates the abstraction level from RTL to algorithms
• High-level synthesis is essential for maintaining design productivity for
large designs
High-Level Synthesis: HLS
➢ High-Level Synthesis
• Converts C/C++/OpenCL source into RTL (VHDL/Verilog) or IP format.
• Extracts control and dataflow from the source code.
• The user can apply directives to guide the implementation of the design, setting pragmas for optimization or control.
➢ HLS:
• Creates a project from modules and integrates the library files (header files).
• Enables design optimization for resource utilization and latency.

[Figure: Vivado HLS flow. Inputs: C, C++, SystemC source plus constraints/directives. Vivado HLS generates VHDL, Verilog, and SystemC. RTL export formats: IP-XACT, Sys Gen, PCore.]
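As a rough illustration of the directive mechanism described above, the sketch below places a pragma inside a loop. The function name scale_add, the array length LEN, and the choice of the PIPELINE directive are assumptions made for this example and do not come from the course material. A standard C++ compiler ignores the unknown pragma, so the same source can also be run as ordinary software.

```cpp
// Hypothetical example: names and sizes are assumptions for illustration.
const int LEN = 8;

// Multiplies each input sample by a gain and adds an offset.
void scale_add(const int in[LEN], int out[LEN], int gain, int offset) {
    Scale: for (int i = 0; i < LEN; i++) {
// Directive asking the tool to start one loop iteration per clock cycle.
#pragma HLS PIPELINE II=1
        out[i] = in[i] * gain + offset;
    }
}
```

Removing or changing the pragma leaves the C++ behaviour unchanged; only the generated hardware would differ.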
Vivado HLS Tool Overview:
• HLS is intended for implementing complex signal processing and mathematical algorithms on FPGAs, which are quite complex to implement directly in HDL.
• HLS converts the C/C++ source into HDL source, i.e. VHDL/Verilog/SystemC.
• HLS provides many libraries and functions for signal processing and mathematical computation.

Invoke Vivado HLS from Windows Menu

The first step is to open or create a project.

Vivado HLS GUI

[Figure: Vivado HLS GUI, showing the Project Explorer pane, Information pane, Auxiliary pane, and Console pane.]

Vivado HLS Projects and Solutions
• Vivado HLS is project based
  • A project specifies the C/C++/OpenCL source code that will be synthesized
  • Each project is based on one set of source code (one main module), and the project can have a user-defined name
• A project can contain multiple solutions
  • Solutions are different implementations of the same source code
  • Solutions are auto-named solution1, solution2, etc., or can have user-defined names
  • Each solution can have a different clock frequency, target board, and set of synthesis directives
• Projects and solutions are stored in a hierarchical directory structure
  • The top level is the project directory
  • The directory structure on disk is identical to the structure shown in the GUI Project Explorer (except for the source code location)

[Figure: Project Explorer, with the source files at project level and the solutions at solution level.]
Reference: Xilinx

Vivado HLS Step 1: Create or Open a Project
• Start a new project
  • The GUI starts the project wizard to guide you through all the steps
  • Optionally, use the toolbar button to open a new project
• Open an existing project
  • All results, reports, and directives are automatically saved and remembered
  • Use the “Recent Project” menu for quick access

Lecture 2: HLS Design Flow, Review of C/C++ in HLS, and HLS Libraries
• The C/C++ standard libraries can be called from the C/C++ sources, although not every library function is synthesizable.
• Aside from the standard C/C++ libraries, there are HLS libraries for:
  • Video processing
  • Signal processing
  • Mathematical calculations
• HLS is well suited to implementing algorithms for signal processing, machine vision, neural networks, etc.
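A minimal sketch of calling a standard math function from C++ source follows. The function name magnitude and its arguments are assumptions for illustration; the only library call used is std::sqrt from the standard cmath header, and sqrt appears in the HLS Math list later in this section.

```cpp
#include <cmath>

// Hypothetical helper: magnitude of a complex sample (re, im).
float magnitude(float re, float im) {
    // std::sqrt is the standard C++ square root.
    return std::sqrt(re * re + im * im);
}
```

For example, magnitude(3.0f, 4.0f) evaluates to 5.0f.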

HLS Design Flow:

[Figure: HLS design flow. Inputs: C, C++, SystemC source plus constraints/directives. Vivado HLS generates VHDL, Verilog, and SystemC. RTL export formats: IP-XACT, Sys Gen, PCore.]

Reference: Xilinx
The Key Attributes of C code

    void fir (
      data_t *y,
      coef_t c[4],
      data_t x
    ) {
      static data_t shift_reg[4];
      acc_t acc;
      int i;
      acc=0;
      loop: for (i=3;i>=0;i--) {
        if (i==0) {
          acc+=x*c[0];
          shift_reg[0]=x;
        } else {
          shift_reg[i]=shift_reg[i-1];
          acc+=shift_reg[i] * c[i];
        }
      }
      *y=acc;
    }

Functions: Functions in the source code represent the design hierarchy, the same as in hardware.
Top-Level IO: The arguments of the top-level function determine the RTL (VHDL/Verilog) interface ports: input, output, or in/out.
Data Types: All variables are of a defined type. The choice of data type influences area and performance, since some types are 8 bits wide while others are 16 bits or more.
Loops: Functions may contain loops. How loops are handled can have a major impact on area and performance, since loops can consume a large number of LUTs and FFs.
Arrays: Arrays in the source code can influence the device IO and become performance bottlenecks. An array must be declared with a specific size; arrays of undefined size are not supported in HLS.
Operators: Operators in the source code may require sharing to control area, or specific hardware implementations to meet performance. Operations consume LUTs, so the choice of operators also affects resource consumption and performance. Resource and control sharing can be planned, and operations can be pipelined, in HLS.

Reference: Xilinx
Functions & RTL Hierarchy
• Each function is translated into an RTL block
  • Verilog module, VHDL entity

Source code (my_code.c):

    void A() { ..body A.. }
    void B() { ..body B.. }
    void C() {
      B();
    }
    void D() {
      B();
    }

    void foo_top() {
      A(…);
      C(…);
      D(…);
    }

RTL hierarchy:

    foo_top
      A
      C
        B
      D
        B

• Functions may be inlined to dissolve their hierarchy
  • Small functions may be automatically inlined
Reference: Xilinx
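The inlining point above can be sketched as follows. The function names scale3 and top_sum are hypothetical, and the INLINE pragma is shown as one way to request inlining explicitly; a standard C++ compiler ignores the pragma.

```cpp
// Hypothetical small helper that would otherwise become its own RTL block.
static int scale3(int v) {
// Directive asking the tool to dissolve this function into its caller.
#pragma HLS INLINE
    return v * 3;
}

// Hypothetical top-level function that calls the helper twice.
int top_sum(int a, int b) {
    return scale3(a) + scale3(b);
}
```

With the helper inlined, the two multiplications become part of top_sum and can be scheduled together with the addition.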
Types = Operator Bit-sizes

Code (from any C code example ...):

    void fir (
      data_t *y,
      coef_t c[4],
      data_t x
    ) {
      static data_t shift_reg[4];
      acc_t acc;
      int i;
      acc=0;
      loop: for (i=3;i>=0;i--) {
        if (i==0) {
          acc+=x*c[0];
          shift_reg[0]=x;
        } else {
          shift_reg[i]=shift_reg[i-1];
          acc+=shift_reg[i]*c[i];
        }
      }
      *y=acc;
    }

Operations (operations are extracted…): RDx, RDc, >=, -, ==, +, *, +, *, WRy

Types (the C types define the size of the hardware used; handled automatically):
• Standard C types
  • long long (64-bit), int (32-bit), short (16-bit), char (8-bit), unsigned types
  • float (32-bit), double (64-bit)
• Arbitrary precision types
  • C: ap(u)int types (1-1024)
  • C++: ap_(u)int types (1-1024), ap_fixed types
  • C++/SystemC: sc_(u)int types (1-1024), sc_fixed types
  • Can be used to define any variable to be a specific bit-width (e.g. 17-bit, 47-bit, etc.)
Reference: Xilinx
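The fir listing leaves data_t, coef_t, and acc_t undefined. The sketch below fills them in with standard fixed-width types so the filter compiles with any C++ compiler; the specific widths (16-bit data and coefficients, 32-bit accumulator) are assumptions for illustration. In the tool, the arbitrary precision types from ap_int.h could be substituted to obtain other widths.

```cpp
#include <cstdint>

// Assumed widths: these typedefs are not given in the course material.
typedef int16_t data_t;  // input/output samples
typedef int16_t coef_t;  // filter coefficients
typedef int32_t acc_t;   // wider accumulator to hold the sum of products

// 4-tap FIR filter: one call consumes one sample x and produces one output *y.
void fir(data_t *y, coef_t c[4], data_t x) {
    static data_t shift_reg[4];  // delay line, kept between calls
    acc_t acc = 0;
    loop: for (int i = 3; i >= 0; i--) {
        if (i == 0) {
            acc += x * c[0];
            shift_reg[0] = x;
        } else {
            shift_reg[i] = shift_reg[i - 1];
            acc += shift_reg[i] * c[i];
        }
    }
    *y = static_cast<data_t>(acc);
}
```

Feeding a single 1 followed by zeros returns the coefficients one per call, which is a quick way to check the delay line.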
Loops
• By default, loops are rolled
  • Each C loop iteration ➔ implemented in the same state
  • Each C loop iteration ➔ implemented with the same resources

    void foo_top (…) {
      ...
      Add: for (i=3;i>=0;i--) {
        b = a[i] + b;
        ...
      }
    }

[Figure: after synthesis, foo_top contains a single adder (+) that reads a[N] and accumulates into b.]

• Loops require labels if they are to be referenced by Tcl directives (the GUI will auto-add labels)
• Loops can be unrolled if their indices are statically determinable at elaboration time
  • Not when the number of iterations is variable
• Unrolled loops result in more elements to schedule but greater operator mobility
Reference: Xilinx
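A small sketch of a labelled loop with a fixed trip count is shown below. The function name accumulate and the constant TAPS are assumptions for this example. The UNROLL pragma requests one adder per iteration instead of a single shared adder; a standard C++ compiler ignores it.

```cpp
// Assumed fixed trip count so the loop bounds are known at compile time.
const int TAPS = 4;

// Hypothetical function: sums the elements of a, mirroring the "Add" loop above.
int accumulate(const int a[TAPS]) {
    int b = 0;
    Add: for (int i = TAPS - 1; i >= 0; i--) {
// Directive asking the tool to replicate the loop body for every iteration.
#pragma HLS UNROLL
        b = a[i] + b;
    }
    return b;
}
```

Without the pragma the loop stays rolled, which matches the default described above.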
Arrays in HLS
• An array in C code is implemented by a memory in the RTL
  • By default, arrays are implemented as RAMs, optionally as a FIFO

    void foo_top(int x, …)
    {
      int A[N];
      L1: for (i = 0; i < N; i++)
        A[i+x] = A[i] + i;
    }

[Figure: after synthesis, array A[N] (elements 0 to N-1) in foo_top maps to a single-port block RAM (SPRAMB) with ports DIN (A_in), DOUT (A_out), ADDR, CE, and WE.]

• The array can be targeted to any memory resource in the library
  • The ports (address, CE active high, etc.) and sequential operation (clocks from address to data out) are defined by the library model
  • All RAMs are listed in the Vivado HLS Library Guide
• Arrays can be merged with other arrays and reconfigured
  • To implement them in the same memory or in one of a different width and size
• Arrays can be partitioned into individual elements
  • Implemented as smaller RAMs or registers
Reference: Xilinx
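The partitioning option can be sketched as below. The function name sum_buffer, the constant DEPTH, and the local array buf are hypothetical. The ARRAY_PARTITION pragma with the complete option asks for the array to be split into individual elements rather than a single RAM; a standard C++ compiler ignores it.

```cpp
// Assumed array size for the example.
const int DEPTH = 8;

// Hypothetical function: copies the input into a local buffer and sums it.
int sum_buffer(const int in[DEPTH]) {
    int buf[DEPTH];
// Directive asking the tool to split buf into DEPTH separate registers.
#pragma HLS ARRAY_PARTITION variable=buf complete dim=1
    Load: for (int i = 0; i < DEPTH; i++) {
        buf[i] = in[i];
    }
    int total = 0;
    Sum: for (int i = 0; i < DEPTH; i++) {
        total += buf[i];
    }
    return total;
}
```

With a single-port RAM only one element of buf can be accessed per cycle, so partitioning trades extra registers for parallel access.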
Top-Level IO Ports
• Top-level function arguments
  • All top-level function arguments have a default hardware port type
• When the array is an argument of the top-level function
  • The array/RAM is “off-chip”
  • The type of memory resource determines the top-level IO ports
  • Arrays on the interface can be mapped and partitioned
    • E.g. partitioned into separate ports for each element in the array

    void foo_top( int A[3*N] , int x)
    {
      L1: for (i = 0; i < N; i++)
        A[i+x] = A[i] + i;
    }

[Figure: after synthesis, foo_top connects to an external dual-port block RAM (DPRAMB) through two port sets: DIN0, DOUT0, ADDR0, CE0, WE0 and DIN1, DOUT1, ADDR1, CE1, WE1. The number of ports is defined by the RAM resource.]

• Default RAM resource
  • Dual-port RAM if performance can be improved, otherwise single-port RAM
Reference: Xilinx
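A compilable version of the foo_top example above might look like the sketch below. The value N = 4 and the explicit INTERFACE pragma are assumptions for illustration; the pragma names the memory-style port that an array argument receives, and a standard C++ compiler ignores it. The caller is assumed to keep x between 0 and 2*N so that every access stays inside the array.

```cpp
// Assumed loop bound; the slide leaves N undefined.
const int N = 4;

// Array argument on the top-level function: the memory sits outside the block.
void foo_top(int A[3 * N], int x) {
// Directive selecting a RAM-style interface (address, data, enable) for A.
#pragma HLS INTERFACE ap_memory port=A
    L1: for (int i = 0; i < N; i++) {
        A[i + x] = A[i] + i;
    }
}
```

Each iteration performs one read and one write on A, which is why a dual-port RAM can improve performance here.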
The following libraries are included with Vivado HLS:

Name | Description
Arbitrary Precision Data Types | Integer and fixed-point (ap_cint.h, ap_int.h and systemc.h)
HLS Stream | Models for streaming data structures. Designed to obtain best performance and area (hls_stream.h)
HLS Math | Extensive support for the synthesis of the standard C (math.h) and C++ (cmath) math libraries. The support includes floating-point and fixed-point functions: abs, atan, atanf, atan2, ceil, ceilf, copysign, copysignf, cos, cosf, coshf, expf, fabs, fabsf, floorf, fmax, fmin, logf, fpclassify, isfinite, isinf, isnan, isnormal, log, log10, modf, modff, recip, recipf, round, rsqrt, rsqrtf, 1/sqrt, signbit, sin, sincos, sincosf, sinf, sinhf, sqrt, tan, tanf, trunc
HLS Video | Video library to implement several aspects of modeling video design in C++ with video functions, specific data types, memory line buffer and memory window (hls_video.h). Through the data type hls::Mat, Vivado HLS is also compatible with existing OpenCV functions: AXIvideo2cvMat, AXIvideo2CvMat, AXIvideo2IplImage, cvMat2AXIvideo, CvMat2AXIvideo, cvMat2hlsMat, CvMat2hlsMat, CvMat2hlsWindow, hlsMat2cvMat, hlsMat2CvMat, hlsMat2IplImage, hlsWindow2CvMat, IplImage2AXIvideo, IplImage2hlsMat, AbsDiff, AddS, AddWeighted, And, Avg, AvgSdv, Cmp, CmpS, CornerHarris, CvtColor, Dilate, Duplicate, EqualizeHist, Erode, FASTX, Filter2D, GaussianBlur, Harris, HoughLines2, Integral, InitUndistortRectifyMap, Max, MaxS, Mean, Merge, Min, MinMaxLoc, MinS, Mul, Not, PaintMask, PyrDown, PyrUp, Range, Remap, Reduce, Resize, Set, Scale, Sobel, Split, SubRS, SubS, Sum, Threshold, Zero
HLS IP | Integrate the LogiCORE IP FFT and FIR Compiler (hls_fft.h, hls_fir.h, ap_shift_reg.h)
HLS Linear Algebra | Support for the following functions: cholesky, cholesky_inverse, matrix_multiply, qrf, qr_inverse, svd (hls_linear_algebra.h)
HLS DSP | Support for the following functions: atan2, awgn, cmpy, convolution_encoder, nco, qam_demod, qam_mod, sqrt, viterbi_decoder (hls_dsp.h)

Reference: Xilinx
Lecture 3: Lab 1: Counter Design and Synthesis in Vivado HLS

Counter C++ Module:

    #include<iostream>
    #include<stdlib.h> //#include<conio.h>
    using namespace std;
    int main()
    {
      int count = 0;
      bool reset=false;
      while(1) {
        cout<<""<<count<<endl;   // print the current count
        if (reset==true)
          count = 0;
        count++;
        if(count > 15)           // wrap around after 15
          count = 0;
        for(int i=0; i<450000000;i++);   // busy-wait delay between counts
      }
      return 0;
    }
Design Steps:
• Open Vivado HLS and create a new project named “counter”
• Add the C++ source and select the ZedBoard as the target FPGA board
• To synthesize the design, go to Run C Synthesis (Active Solution)
• Expand the syn folder; it should contain the generated VHDL, Verilog, and SystemC
• Simulating the C/C++ source requires a separate source (a test bench); the simulation process is covered in the next section (Lab 2).
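The counter module above is written as a software program around main() with an endless loop and console output. HLS tools synthesize a top-level function other than main(), so one possible restructuring is sketched below. The function name counter, the reset argument as a port, and the choice that reset holds the count at 0 are assumptions for this sketch, not the lab's official solution.

```cpp
// Hypothetical top-level function: one call models one count step.
// Returns a value in the range 0..15.
unsigned char counter(bool reset) {
    static unsigned char count = 0;  // state kept between calls (a register in hardware)
    if (reset) {
        count = 0;                   // reset holds the counter at 0
    } else {
        count = (count + 1) & 0xF;   // increment and wrap after 15
    }
    return count;
}
```

A separate test source would call counter() repeatedly and check the returned values, which is the role of the simulation source mentioned in the design steps.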

HLS Design References:

Key Documents
For the most current links to Vivado High-Level Synthesis resources, use the
Design Hub View in Vivado Document Navigator and select "High-Level
Synthesis".

Name | Description
UG1197 UltraFast High-Level Productivity Design Methodology Guide | Methodology guide
WP416 Vivado Design Suite | Vivado Design Suite Backgrounder
UG871 Vivado Design Suite Tutorial | High-Level Synthesis Tutorial
UG902 Vivado Design Suite User Guide | High-Level Synthesis User Guide
UG958 Vivado Design Suite Reference Guide | Model-based DSP Design using System Generator



Application Notes

XAPP599 | Floating Point Design with Vivado HLS
XAPP745 | Processor Control of Vivado HLS Designs
XAPP793 | Implementing Memory Structures for Video Processing
XAPP890 | Zynq All Programmable SoC Sobel Filter Implementation
XAPP1167 | Accelerating OpenCV Applications with Zynq-7000 AP SoC using Vivado HLS Video Libraries



Let’s go to Vivado HLS for the project


Thank You!
