Design and Simulation of 5G Massive MIMO Kernel Algorithm On SIMD Vector Processor
Abstract: In cellular communications, Multiple-Input Multiple-Output (MIMO) and massive MIMO research is receiving attention because of the need for high data rates in Long Term Evolution Advanced (LTE-A) and 5G communications. In MIMO baseband signal processing at the physical layer, both the channel estimation and the detection algorithms play a crucial role. This paper discusses the least squares (LS) and minimum mean square error (MMSE) channel estimation algorithms and the Zero Forcing (ZF) and MMSE detection algorithms. Currently none of the channel estimation algorithms of LTE-A offers twin advantages … requirement of 5G. Massive MIMO with 128 or more antennas is expected to be the norm at 5G base stations. In achieving ultra-low latency, the matrix computations for massive MIMO are a major bottleneck in realizing the channel estimation and massive MIMO detection algorithms. Optimizing the 5G massive MIMO channel estimation and detection algorithms therefore requires fast inversion of large complex matrices. In this paper, a parallel-processing-based coding scheme is proposed that uses the Gauss-Jordan elimination kernel algorithm on a single instruction multiple data (SIMD) stream vector … speed, which is the need of 5G channel estimation and detection.

Keywords: 5G, MIMO, Massive MIMO, LS, MMSE, ZF, SIMD

I. INTRODUCTION

Mobile communication technology has developed from First Generation (1G) mobile phone networks, which carried only analogue voice, to Fourth Generation (4G) networks, which carry digital voice, data, video and IoT traffic [1]. 5G wireless communication systems are under development, and the main challenges in the design and deployment of a 5G cellular system are reducing power consumption, achieving ultra-low latency and ultra-high data rates, and increasing compatibility between IoT devices.

LTE Advanced is one of the major steps in the evolution of LTE networks towards 5G. The key technologies introduced in LTE-A include carrier aggregation and the enhanced use of multiple antenna elements. This paper focuses on multiple-input multiple-output (MIMO) [2][3] for 5G.

Massive MIMO: MIMO systems generally use multiple antennas located at both the source and the destination. Although it involves multiple technologies, MIMO essentially rests on a simple principle: a wireless network allows the transmission and reception of multiple data signals simultaneously over a single radio channel, as shown in figure 1. Standard MIMO networks generally use two or four antennas, whereas massive MIMO uses a much larger number of antennas.

Fig. 1. Massive MIMO system

Standard MIMO principles are already used in numerous Wi-Fi and LTE standards, and massive MIMO is one of the key technologies for the success of 5G cellular communications. The main challenge of MIMO signal processing is the complexity involved in channel estimation and detection.

Single Instruction, Multiple Data (SIMD) processing is one of the most effective forms of parallel processing [4]. The main idea of a SIMD processor is to apply the same sequence of instructions to a large number of distinct data streams. In a SIMD processor, each instruction is executed across a number of processing elements (PEs), as shown in figure 2.

SIMD processors are mainly of two types: array processors and vector processors. An array processor operates on multiple data elements at the same time, whereas a vector processor operates on multiple data elements in successive time steps.

The rest of this paper is organized as follows. Section II describes channel estimation and detection for the MIMO system, Section III presents the implementation of massive MIMO matrix inversion, Section IV discusses the results, and Section V concludes the paper.

SPACES-2018, Dept. of ECE, K L Deemed to be UNIVERSITY

where … is the channel autocorrelation matrix at the pilot symbol position and … is the cross-correlation matrix between the channel at the data symbol position and the pilot symbol position.
Fig. 2. SIMD processor (blocks: Instruction, PEs, Data Pool)

B. MIMO Detection Algorithms

In MIMO detection, the detector estimates the transmitted signal from the received signal and the estimated channel matrix. Once the channel matrix has been estimated, the transmitted signal is recovered from the received signal as the output of the detector [9]. Two algorithms are used to detect the signal: ZF detection and MMSE detection.

ZF detection: This algorithm is the simplest, with the least computational complexity. ZF detection starts with the multiplication of the received symbol vector by the channel matrix [10], [11].
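As an illustration of the ZF detection described in this section, the following is a minimal sketch in plain Python that computes the standard zero-forcing estimate x_hat = (H^H H)^-1 H^H y. The helper names (`hermitian`, `matmul`, `solve`, `zf_detect`) and the 3x2 noise-free channel are assumptions made for this example and do not come from the paper.

```python
def hermitian(H):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[H[r][c].conjugate() for r in range(len(H))]
            for c in range(len(H[0]))]

def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with row pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix [A | b]
    for col in range(n):
        # Pick the row with the largest-magnitude entry in this column.
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        inv_p = 1 / M[col][col]
        M[col] = [v * inv_p for v in M[col]]      # normalize the pivot row
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def zf_detect(H, y):
    """Zero-forcing estimate x_hat = (H^H H)^-1 H^H y."""
    Hh = hermitian(H)
    G = matmul(Hh, H)                                      # Gram matrix H^H H
    z = [row[0] for row in matmul(Hh, [[v] for v in y])]   # H^H y
    return solve(G, z)

# Hypothetical channel: 3 receive antennas, 2 transmit antennas, no noise.
H = [[1 + 1j, 0.5 - 0.2j],
     [0.3 - 1j, 1 + 0.4j],
     [-0.7 + 0.1j, 0.2 + 0.9j]]
x = [1 + 1j, -1 + 1j]                       # transmitted symbols
y = [sum(H[i][j] * x[j] for j in range(2)) for i in range(3)]
x_hat = zf_detect(H, y)                     # should be close to x
```

The matrix inverse inside `solve` is the step whose cost grows quickly with the number of antennas, which is why the matrix inversion speed matters for massive MIMO.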
• Select the pivot and note the row and column in which it is located.
• Interchange the column and row.
• Calculate the reciprocal of the pivot, and then apply the linear transformation to the row/column.
• Exchange the row and column, and resume pivot location selection.

A. SIMD instruction mapping:

After examining the operation of the matrix inversion algorithm and verifying its precision, the algorithm is mapped to SIMD instructions. The SIMD instruction mapping for each computation of the matrix inversion algorithm is given below:

Select pivot: The algorithm selects the complex element with the maximum value as the pivot in each row. The maximum value is obtained by using the TMAC2 instruction repeatedly.

Reciprocal: The reciprocal of the complex number is calculated using the parallel polynomial estimation method, which also uses TMAC2.

Linear transformation of row and column: The CMAC and CMUL instructions are used to transform the row and column by means of parallel complex multiplication and subtraction.

Data access modes: Five data access modes are designed for the matrix inversion algorithm. In mode 1, the processor carries out an ergodic access, which means that every element from the first row to the last row is accessed by the processor, with the aim of selecting the pivot of each row. Mode 2 interchanges the row and column depending on the exact pivot position. In mode 3, each iteration of the outermost loop of the complex matrix inversion algorithm accesses the rows other than the row of the current pivot. Mode 4 performs the same operation as mode 3 (row access), but with column access. Mode 5 handles the allocation of data to the SIMD processor.

B. Overall data allocation

Two vector memories of the SIMD processor are allocated for the calculation of the matrix inversion. The results of some intermediate steps, such as the reciprocal, complex multiplication, and subtraction, need to be stored in vector registers. The data allocation in the SIMD processor is shown in figure 4.

In figure 4, the main memory stores the data of the original input matrix and the output matrix. After permutation, vector memory 1 is used to store the input matrix. For the input matrix, the squares of the complex numbers used to select the pivot and the polynomial coefficients used for the reciprocal values are stored as well. A data buffer is used for the row/column exchange, and a register buffer is used to calculate the reciprocal of the complex number. In vector memory 2, the reference memory is used for Gauss-Jordan elimination and the data buffer stores the elimination results of every row. Finally, the output matrix resides in vector memory 2 of the SIMD processor.
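The pivot selection, reciprocal, and row transformation steps described in this section can be sketched in plain Python as follows. This is a simplified scalar variant under stated assumptions, not the paper's SIMD kernel: it uses row pivoting on an augmented matrix instead of the in-place row/column exchange, and exact division stands in for the polynomial reciprocal estimation. The function names and the 3x3 test matrix are hypothetical.

```python
def sq_mag(z):
    """Squared magnitude re^2 + im^2, enough to compare pivot candidates."""
    return z.real * z.real + z.imag * z.imag

def reciprocal(z):
    """1/z computed as conj(z) / |z|^2.

    Assumption: exact division replaces the polynomial approximation
    mentioned in the text.
    """
    m = sq_mag(z)
    return complex(z.real / m, -z.imag / m)

def gauss_jordan_inverse(A):
    """Invert a square complex matrix given as a list of rows."""
    n = len(A)
    # Augment A with the identity: [A | I] is reduced to [I | A^-1].
    M = [[complex(v) for v in A[i]] + [complex(i == j) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        # Select pivot: the largest squared magnitude in this column.
        piv = max(range(col, n), key=lambda r: sq_mag(M[r][col]))
        if sq_mag(M[piv][col]) == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[piv] = M[piv], M[col]          # row interchange
        inv_p = reciprocal(M[col][col])          # reciprocal of the pivot
        M[col] = [v * inv_p for v in M[col]]     # scale the pivot row
        for r in range(n):
            if r != col:
                # Complex multiply and subtract on every other row.
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

# Hypothetical 3x3 complex test matrix.
A = [[2 + 1j, 1 - 1j, 0],
     [1j, 3, 1 + 2j],
     [1, -1j, 4 - 1j]]
A_inv = gauss_jordan_inverse(A)

# Check: A times A_inv should be close to the identity matrix.
prod = [[sum(A[i][k] * A_inv[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]
```

Each pass of the inner loop applies the same multiply-and-subtract to every element of a row, which is the kind of operation that a vector instruction can apply to many elements at once.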