Design and Testing of Systolic Array Multiplier Using Fault Injecting Schemes
Kurada Verra Bhoga Vasantha Rayudu1, Dhananjay Ramachandra Jahagirdar1, Patri Srihari Rao2
1 Scientist 'G', Head, Reliability Engineering Division, Research Centre Imarat, Kurmalguda, India
2 Department of Electronics and Communication Engineering, NIT Warangal, Hanamkonda, India
Corresponding Author:
Kurada Verra Bhoga Vasantha Rayudu
Scientist 'G', Head, Reliability Engineering Division, Research Centre Imarat
Vignyanakancha Po, Hyderabad-500069, Kurmalguda, India
Email: [email protected]
1. INTRODUCTION
Multipliers are central to low-power, high-speed DSP systems. Most DSP applications are constrained by
power dissipation, and multipliers are among their most heavily used components [1]–[5]. Multiplication
therefore plays a major role in the performance of high-speed designs and is usually implemented at the
structural level. Systolic array multipliers perform multi-dimensional multiplication as a linear, pipelined
sequence of cells. Each cell stores its intermediate result and passes it to the next pipeline stage, and all
cells of the array are identical. Because the cells operate simultaneously, a systolic array increases system
speed and reduces processing time. Systolic array multipliers are also used in sorting and convolution.
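The cell-by-cell behaviour described above can be illustrated with a small software model. The following is a minimal sketch, not the authors' design: it assumes unsigned 4-bit operands, and the names `full_adder` and `array_multiply` are hypothetical. Each inner-loop iteration stands for one identical cell that forms a partial-product bit and adds it to the partial sum handed down from the previous row.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def array_multiply(x, y, n=4):
    """Multiply two n-bit unsigned integers with a grid of identical cells.

    Row i handles multiplier bit y[i]; each cell ANDs one multiplicand bit
    with y[i] and adds it to the running partial sum from the row above.
    """
    xb = [(x >> j) & 1 for j in range(n)]
    yb = [(y >> i) & 1 for i in range(n)]
    acc = [0] * (2 * n)  # running partial sum, least significant bit first
    for i in range(n):
        carry = 0
        for j in range(n):
            pp = xb[j] & yb[i]  # partial-product bit formed in this cell
            acc[i + j], carry = full_adder(acc[i + j], pp, carry)
        acc[i + n] = carry  # carry out of the row becomes the next sum bit
    return sum(bit << k for k, bit in enumerate(acc))
```

In hardware the rows would be separated by pipeline registers so that several multiplications are in flight at once; the sketch models only the arithmetic, not the timing.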
In this paper, we develop a systolic array design built around a new modified gate that decreases delay
and increases operating speed. The multiplicand and multiplier are first arranged in an array structure. One
bit from each is multiplied in a cell, and the result is passed to the next pipeline stage, where partial-product
accumulation and carry generation take place. According to Landauer, each bit of information lost during
computation dissipates at least kT ln 2 of energy, where T is the absolute temperature and k is Boltzmann's
constant. Charles Bennett showed that reversible logic can minimize this heat dissipation [6], [7]. Reversible
design is therefore a promising route to low-power, high-speed circuits. A reversible gate has as many
outputs as inputs, which improves the overall performance of the system [8]–[10]. In this paper the systolic
array multiplier is designed in reversible technology, meaning that every component of the design uses
reversible gates to meet the low-power target. Many system designs are now being developed with reversible
gates, but testing them is more complex, and time to market depends on how the testing is done.
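To give a sense of scale for Landauer's bound, the worked value below evaluates kT ln 2 at room temperature. The temperature T = 300 K is an assumption chosen for illustration and is not stated in the paper; k is the exact SI value of Boltzmann's constant.

```latex
\[
\begin{aligned}
E_{\min} &= k T \ln 2 \\
         &= \left(1.380649\times10^{-23}\,\mathrm{J/K}\right)\left(300\,\mathrm{K}\right)\left(0.6931\right) \\
         &\approx 2.87\times10^{-21}\,\mathrm{J}\ \text{per erased bit}
\end{aligned}
\]
```

Practical CMOS gates dissipate far more than this per operation, so the bound is best read as the motivation for reversible logic rather than as a prediction of measured power.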
The works in [11], [12] developed a systolic array multiplier with reversible gates and proposed a 4x4
design that calculates partial products and passes them to carry-select generation. However, the design was
only simulated with design tools, and only parts of it were verified. The works in [13], [14] proposed a new
level of testing using BILBO logic, which can determine the number of faults, but they tested only
Baugh-Wooley multiplier designs. Baugh-Wooley designs are mostly used for high-speed operations, and as
the operand width grows they require more logic for testing and implementation. The researchers in [15],
[16] addressed fault analysis for multipliers by reviewing methodologies for mapping matrix algorithms onto
predefined systolic array designs, and then introduced the array structure originally designed by Lang and
Moreno. Morghade et al. [17] verified their design by simulation; the implemented logic covered algorithms
for multiplication, division, and direct multiplication. After examining various testing methods, they adopted
an LFSR technique that generates pseudo-random test values, applied it successfully, and then moved to
shift-register designs, which increase chip area. The researchers in [18]–[20] proposed an approach for
reducing the power consumption of an irreversible array multiplier and also used reversible logic for systolic
array multipliers; they reported better efficiency than existing designs and tested in 90 nm CMOS
technology. The researchers in [21]–[24] applied Galois field (GF) multiplication to security applications and
developed a systolic array multiplier over GF with a full pattern generator that uses a six-bit counter to
produce the required test patterns. That approach increases circuit delay, an issue the proposed system
overcomes. The proposed system is an advanced systolic array multiplier built with a new modified gate and
tested by fault injection, using BILBO logic to generate the different test vectors.
2. RESEARCH METHOD
Many low-power applications now use reversible gates to reduce area and power. In a reversible gate
the number of input variables equals the number of output variables [25], so power is distributed evenly
across the fan-outs, which suits low-power designs. Reversible design also reduces quantum cost. Garbage
outputs and constant inputs largely determine power dissipation, so a circuit that uses fewer of them loses
less power. Reversible logic was selected for this project for its low power dissipation. A reversible gate was
modified for use in the full adder circuit; this modified Islam gate (MIG) is shown in Figure 1. The MIG has
4 inputs and 4 outputs, and its outputs implement the full adder functions.
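The two properties discussed above, equal input and output counts with a one-to-one mapping, and the use of constant inputs and garbage outputs, can be checked in a few lines. The sketch below deliberately uses the well-known Peres gate as a stand-in, because the exact equations of the modified Islam gate are not given in this section; the function names are hypothetical and this is not the authors' adder.

```python
from itertools import product


def peres(a, b, c):
    """Peres gate: (A, B, C) -> (A, A xor B, (A and B) xor C)."""
    return a, a ^ b, (a & b) ^ c


def is_reversible(gate, width):
    """A gate is reversible when its truth table is a bijection."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=width)}
    return len(outputs) == 2 ** width


def reversible_full_adder(a, b, cin):
    """Full adder from two Peres gates and one constant-0 input.

    Returns (sum, carry, garbage); the garbage outputs exist only to keep
    the overall mapping reversible.
    """
    g1, p, ab = peres(a, b, 0)       # p = a xor b, ab = a and b
    g2, s, cout = peres(p, cin, ab)  # s = p xor cin, cout = (p and cin) xor ab
    return s, cout, (g1, g2)
```

Here `is_reversible(peres, 3)` holds, whereas an ordinary 2-input AND gate fails the same check because four input combinations collapse onto two output values. The adder uses one constant input and produces two garbage outputs, which are the quantities a reversible design tries to keep small.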
We also use a controlled operational gate (COG) to produce the full adder carry for the carry-select
block. The COG has three inputs and three outputs, and its logic depends entirely on the second and third
input variables: their state determines the function that produces the full adder carry output. COG-based
reversible designs are mostly used in low-power DSP circuits, where many multipliers generate partial
products that in turn feed the carry-generation blocks. In this project the COG is used for the carry output.
Figure 1. Reversible modified Islam gate Figure 2. Reversible controlled operational gate
Integrating the two reversible gates above, MIG and COG, gives a complete adder and subtractor,
which is used in the systolic array multiplication process described in Section 2.2. The reversible full adder
in Figure 3 plays a major role in applications such as video, medical, and many other digital systems.
testing and verification of SoC designs. The reference model (GRM) can be coded by the user in any
language, but it must behave exactly like the DUT. In this project the reference model is written in VHDL so
that the flow of the multiplier can be followed easily at each stage. When the BILBO generates patterns, the
GRM receives the same values and uses them to produce the expected outputs. The main aim is to inject a
set of faults into the design and to compare the resulting BILBO signature with the reference signature, so
that the injected faults can be located.
Figure 5. The proposed system with DUT and all required components
The DUT is placed between the BILBOs, which operate in linear feedback shift register (LFSR) and
multiple-input signature register (MISR) modes. To test the systolic array multiplier (SAM) circuit, a 4-bit
multiplier design and an 8-bit BILBO were used. The YAG gate design [26] generates the sum and product
terms simultaneously. The input signal is always in scan mode. In LFSR mode, the BILBO generates the
patterns required by the multiplier; the multiplier takes them as inputs and returns its outputs to the BILBO,
which compresses them into a signature in MISR mode. The signature produced by the fault-free circuit is
called the good signature. Faults are then injected into the design, the LFSR patterns are applied again, and
the resulting signature is compared with the good signature. If the two match, the BILBO has not detected
the fault (or the test did not run correctly); if they differ, the BILBO has detected the fault. Checkers, also
called scoreboards, are common in verification: they compare the data received from two blocks and report
a match or mismatch, which gives the verdict on the DUT. The checkers are coded in the test environment,
and the SAM circuit was tested both with and without injected faults. In this project the checker compares
the GRM and DUT outputs and stores the result for later use.
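The signature flow described above can be sketched in software. This is an illustrative model under stated assumptions, not the paper's BILBO: the feedback taps (polynomial x^8 + x^6 + x^5 + x^4 + 1), the seed, the split of the 8-bit pattern into two 4-bit operands, the reuse of the same feedback for the MISR, and the placement of the stuck-at fault on a product output bit are all choices made for the example, and every function name is hypothetical.

```python
TAPS = (7, 5, 4, 3)  # assumed feedback taps: x^8 + x^6 + x^5 + x^4 + 1


def lfsr_step(state):
    """Advance an 8-bit Fibonacci LFSR by one clock (pattern generator mode)."""
    fb = 0
    for t in TAPS:
        fb ^= (state >> t) & 1
    return ((state << 1) | fb) & 0xFF


def misr_step(sig, data):
    """MISR mode: shift the signature once and XOR in the 8-bit response."""
    return lfsr_step(sig) ^ (data & 0xFF)


def multiplier(a, b, stuck=None):
    """4-bit x 4-bit product; stuck=(bit, value) forces one product bit."""
    p = (a * b) & 0xFF
    if stuck is not None:
        bit, value = stuck
        p = (p | (1 << bit)) if value else (p & ~(1 << bit))
    return p


def run_bist(seed=0x01, cycles=255, stuck=None):
    """Apply LFSR patterns to the multiplier and compress responses in the MISR."""
    state, sig = seed, 0
    for _ in range(cycles):
        a, b = state & 0xF, state >> 4  # low nibble and high nibble as operands
        sig = misr_step(sig, multiplier(a, b, stuck))
        state = lfsr_step(state)
    return sig


good = run_bist()                # fault-free reference ("good") signature
faulty = run_bist(stuck=(3, 0))  # product bit 3 stuck at 0
fault_detected = faulty != good  # equal signatures would mean the fault escaped
```

Because the signature is only 8 bits wide, two different response streams can occasionally compress to the same value (aliasing), which is one reason a matching signature does not by itself prove the circuit fault-free.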
When the process starts, the BILBO generates patterns in the selected mode and the environment
applies them to the reversible systolic array multiplier. The multiplier processes the patterns as a pipelined
multiplier and sends its results to the checker logic, while the reference model processes the same patterns
in parallel. Faults are injected from the environment, one stuck-at-0/1 fault at a time. If the result then differs
from the expected value and the BILBO signature also differs from the good signature, the fault has been
detected and the design can be corrected. If the BILBO still produces the good signature, it has failed to
detect the fault, and the test logic must be modified accordingly.
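The checker step can be sketched as a scoreboard that runs the same patterns through a reference model and a device under test and records the mismatches. This is a simplified illustration: the DUT here is a plain shift-and-add model rather than the reversible systolic design, the fault is injected on a single partial-product bit, and the names `golden_model`, `dut`, and `checker` are hypothetical.

```python
def golden_model(a, b):
    """Reference model (GRM): behavioural 4-bit multiply."""
    return a * b


def dut(a, b, fault=None):
    """Shift-and-add 4-bit multiplier with an optional stuck-at fault.

    fault=(i, j, value) forces the partial-product bit a[j] & b[i] to value.
    """
    total = 0
    for i in range(4):
        for j in range(4):
            pp = ((a >> j) & 1) & ((b >> i) & 1)
            if fault is not None and fault[:2] == (i, j):
                pp = fault[2]  # injected stuck-at-0 or stuck-at-1
            total += pp << (i + j)
    return total


def checker(patterns, fault=None):
    """Scoreboard: return the patterns on which the DUT and the GRM disagree."""
    return [(a, b) for a, b in patterns if dut(a, b, fault) != golden_model(a, b)]


all_patterns = [(a, b) for a in range(16) for b in range(16)]
mismatches = checker(all_patterns, fault=(0, 0, 1))  # partial product a0&b0 stuck at 1
```

With no fault the mismatch list is empty; with the fault shown, the DUT disagrees with the reference on every pattern where `a` and `b` are not both odd, since only then does the forced bit differ from its correct value.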
Testing continues with further fault injections, and the results are compared by the checker. Many
BIST architectures have been proposed, but BILBO remains widely used. In the SAM project it can be
configured as the input pattern generator in the full environment shown in Figure 6, and also as the output
analyzer. The mode is selected by the inputs b1 and b2. Various fault models are discussed in [27]–[29].
Compared with other techniques, BIST is popular because of its low power and short execution time, so
even complex designs can be tested quickly. BILBO is called logic BIST because BIST is its main component
and provides its operating modes. In this project, a reversible multiplier is tested with reversible BILBO logic
to find three main fault types: stuck-at faults (SAF), multiple stuck-at faults (MSAF), and missing gate faults
(MGF). Stuck-at faults occur rarely in designs, and it can be difficult to determine whether a line is stuck at
zero or one. Multiple stuck-at faults are also rare in conventional designs, and a missing gate fault changes
the output of the design. Finding these types of faults is essential for building fault-free system designs [30],
[31].
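The three fault models can be made concrete on a small reversible cascade. The sketch below is an assumption-laden illustration, not the paper's circuit: it uses a four-line full adder built from Toffoli and CNOT gates, models stuck-at faults only at the primary inputs, and computes fault coverage as the fraction of listed faults that change at least one output for some test. All names are hypothetical.

```python
from itertools import product

# Each gate is (control_lines, target_line): the target is flipped when all controls are 1.
# Lines are [a, b, cin, 0]; after the cascade line 2 holds the sum and line 3 the carry.
FULL_ADDER = [((0, 1), 3), ((0,), 1), ((1, 2), 3), ((1,), 2)]


def simulate(circuit, inputs, missing=None, stuck=()):
    """Run the cascade with optional faults.

    missing: index of a gate to skip (missing gate fault), or None.
    stuck:   (line, value) pairs forced at the primary inputs; one pair
             models a SAF, several pairs model an MSAF.
    """
    lines = list(inputs)
    for line, value in stuck:
        lines[line] = value
    for idx, (controls, target) in enumerate(circuit):
        if idx == missing:
            continue
        if all(lines[c] for c in controls):
            lines[target] ^= 1
    return tuple(lines)


def coverage(circuit, tests, faults):
    """Fraction of faults for which at least one test changes the outputs."""
    detected = 0
    for kwargs in faults:
        if any(simulate(circuit, t, **kwargs) != simulate(circuit, t) for t in tests):
            detected += 1
    return detected / len(faults)


FAULTS = (
    [{"missing": i} for i in range(len(FULL_ADDER))]                   # MGF
    + [{"stuck": ((l, v),)} for l in range(3) for v in (0, 1)]         # SAF
    + [{"stuck": ((0, 1), (1, 1))}]                                    # MSAF
)
EXHAUSTIVE = [bits + (0,) for bits in product((0, 1), repeat=3)]
```

Calling `coverage(FULL_ADDER, EXHAUSTIVE, FAULTS)` shows every listed fault being detected by the exhaustive test set, while a single all-zero test leaves the missing gate faults undetected because no gate is activated. This is why pattern generation quality matters as much as the fault model.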
Figure 6. Full environment and testing with proposed systolic array multiplier using fault injection schemes
Figure 7. Resultant of reversible systolic array multiplier using pattern generator from BILBO logic design
Finding test vector of the resultant at stuck at 0/1 is FAILED----The output is correct at required places
x1=1, x2=0, x3=0, x4=1, x5=0, scan_in=1, out=1, 3100
Finding test vector of the resultant at stuck at 0/1 is PASSED
x1=0, x2=0, x3=1, x4=0, x5=0, scan_in=0, out=0, 3200
Finding test vector of the resultant at stuck at 0/1 is FAILED----The output is correct at required places
x1=1, x2=0, x3=0, x4=0, x5=0, scan_in=0, out=0, 3300
Finding test vector of the resultant at stuck at 0/1 is FAILED----The output is correct at required places
x1=0, x2=1, x3=1, x4=0, x5=0, scan_in=0, out=1, 3400
Finding test vector of the resultant at stuck at 0/1 is PASSED
x1=1, x2=0, x3=0, x4=1, x5=0, scan_in=1, out=0, 3500
Finding test vector of the resultant at stuck at 0/1 is FAILED----The output is correct at required places
x1=0, x2=0, x3=1, x4=0, x5=0, scan_in=0, out=1, 3600
Finding test vector of the resultant at stuck at 0/1 is PASSED
x1=1, x2=0, x3=0, x4=0, x5=0, scan_in=0, out=0, 3700
Finding test vector of the resultant at stuck at 0/1 is FAILED----The output is correct at required places
x1=0, x2=1, x3=1, x4=0, x5=0, scan_in=0, out=0, 3800
Finding test vector of the resultant at stuck at 0/1 is FAILED----The output is correct at required places
Table 1 and Table 2 compare different multipliers: fault analysis of the conventional and proposed
designs, including fault analysis for stuck-at faults. The table values were obtained from synthesis in Xilinx
ISE targeting the Virtex FPGA family, and the proposed design improves the execution time.
Table 2. Comparison of multipliers after synthesizing the design using XILINX ISE 14.7

Local utilization      Conventional multiplier [10]   Proposed multiplier
No. of slices          76.11%                         70.2%
No. of 4-input LUTs    26%                            25%
Time delays            28.24%                         28%
Area covered           75%                            68%
4. CONCLUSION
Compared with existing designs, the systolic array multiplier built with the modified gate works faster
because its reversible gates, which have equal numbers of inputs and outputs, process information faster. It
is therefore suitable for many low-power, high-speed applications. There is considerable scope to optimize
designs further with new reversible gate implementations. The proposed MIG gate reduces the gate count by
10% compared with conventional designs and improves the other parameters as well. The SAM circuit was
also tested efficiently, in particular for SAF and MGF faults, with 100% coverage of the generated patterns.
Moreover, BILBO logic was implemented and can be used to find various faults in various system designs.
Fault coverage with BILBO logic reached 97%, which is higher than for conventional system designs. Future
SoC or subsystem designs can integrate this logic to detect the faulty blocks of a design.
ACKNOWLEDGEMENTS
The authors would like to thank Shri B H V S N Murthy, DS & Director, RCI; Dr. Bheema Rao, HOD,
ECE Dept.; Prof. L Anjaneyulu, present HOD, ECE Dept.; and the DRC members of NITW for their constant
encouragement, valuable suggestions, and support in carrying out this work as a part of Ph.D. work.
REFERENCES
[1] C. Madhulika, V. S. P. Nayak, C. Prasanth, and T. H. S. Praveen, “Design of systolic array multiplier circuit using reversible
logic,” 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology
(RTEICT). IEEE, 2017, doi: 10.1109/rteict.2017.8256883.
[2] V. Shiva Prasad Nayak, G. Prasad, K. Dedeepya Chowdary, and K. Manjunatha Chari, “Design of compact and low power
reversible comparator,” 2015 International Conference on Control, Instrumentation, Communication and Computational
Technologies (ICCICCT). IEEE, 2015, doi: 10.1109/iccicct.2015.7475241.
[3] R. Landauer, “Irreversibility and Heat Generation in the Computing Process,” IBM Journal of Research and Development, vol. 5,
no. 3, pp. 183–191, 1961, doi: 10.1147/rd.53.0183.
[4] C. H. Bennett, “Logical Reversibility of Computation,” IBM Journal of Research and Development, vol. 17, no. 6, pp. 525–532,
1973, doi: 10.1147/rd.176.0525.
[5] M. A. C. K. S. Ganesh Kumar, J. Deva Prasannam, “Analysis of Low Power, Area and High Speed Multipliers for DSP
Applications,” International Journal of Emerging Technology and Advanced Engineering, vol. 4, no. 3, 2014.
[6] S. A. Mozhi and P. Ramya, “Efficient bit-parallel systolic multiplier over GF(2^m),” 2016 International Conference on Electrical,
Electronics, and Optimization Techniques (ICEEOT). IEEE, 2016, doi: 10.1109/iceeot.2016.7755632.
[7] S. E. Mathe and L. Boppana, “Bit-parallel systolic multiplier over GF(2^m) for irreducible trinomials with ASIC and FPGA
implementations,” IET Circuits, Devices & Systems, vol. 12, no. 4, pp. 315–325, 2018, doi: 10.1049/iet-cds.2017.0426.
[8] M. F. Abdulla, C. P. Ravikumar, and A. Kumar, “Hybrid testing schemes based on mutual and signature testing,” Proceedings
Eleventh International Conference on VLSI Design. IEEE Comput. Soc, doi: 10.1109/icvd.1998.646621.
Design and testing of systolic array multiplier using fault … (Kurada Verra Bhoga Vasantha Rayudu)
8 ISSN: 2722-3221
[9] M. F. Abdulla, C. P. Ravikumar, and A. Kumar, “A novel BIST architecture with built-in self check,” Proceedings of 9th
International Conference on VLSI Design. IEEE Comput. Soc. Press, doi: 10.1109/icvd.1996.489455.
[10] Y. R. Babu and Y. Syamala, “Implementation and testing of multipliers using reversible logic,” 3rd International Conference on
Advances in Recent Technologies in Communication and Computing (ARTCom 2011). IET, 2011, doi: 10.1049/ic.2011.0073.
[11] Z. Cai, Y. Wang, S. Liu, K. Lv, and Z. Wang, “A Novel BIST Algorithm for Low-Voltage SRAM,” 2019 IEEE International Test
Conference in Asia (ITC-Asia). IEEE, 2019, doi: 10.1109/itc-asia.2019.00036.
[12] I. Pomeranz, “Storage Based Built-In Test Pattern Generation Method for Close-to-Functional Broadside Tests,” 2020 IEEE 26th
International Symposium on On-Line Testing and Robust System Design (IOLTS). IEEE, 2020, doi:
10.1109/iolts50870.2020.9159705.
[13] A. Vuksic and K. Fuchs, “A new BIST approach for delay fault testing,” Proceedings of European Design and Test Conference
EDAC-ETC-EUROASIC. IEEE Comput. Soc. Press, doi: 10.1109/edtc.1994.326863.
[14] P. M. K. B. K. Saptalakar, Deepak kale, Mahesh Rachannavar, “Design and Implementation of VLSI Systolic Array Multiplier for
DSP Applications,” International Journal of Scientific Engineering and Technology, vol. 2, no. 3, pp. 156–159, 2013.
[15] V. S. P. Nayak, N. Ramchander, T. Marandi, and A. V. Krishna, “Analysis and design of low-power reversible BILBO,” 2016
IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT). IEEE,
2016, doi: 10.1109/rteict.2016.7807794.
[16] C.-I. H. Chen and R. Smith, “A self-testing and self-diagnostic systolic array cell for signal processing,” 1991 Proceedings,
International Conference on Wafer Scale Integration. IEEE Comput. Soc. Press, doi: 10.1109/icwsi.1991.151699.
[17] K. Morghade and P. Dakhole, “Design of fast vedic multiplier with fault diagnostic capabilities,” 2016 International Conference
on Communication and Signal Processing (ICCSP). IEEE, 2016, doi: 10.1109/iccsp.2016.7754169.
[18] L. Wang and I. Hartimo, “Systolic array for binary multiplier,” Proceedings of ICSIPNN ’94. International Conference on Speech,
Image Processing and Neural Networks. IEEE, doi: 10.1109/sipnn.1994.344804.
[19] P. K. Meher and X. Lou, “Low-latency, low-area, and scalable systolic-like modular multipliers for GF(2^m) based on irreducible
all-one polynomials,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 64, no. 2, pp. 399–408, 2017, doi:
10.1109/TCSI.2016.2614309.
[20] S. Talapatra, H. Rahaman, and J. Mathew, “Low Complexity Digit Serial Systolic Montgomery Multipliers for Special Class of
GF(2^m),” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 18, no. 5, pp. 847–852, 2010, doi:
10.1109/tvlsi.2009.2016753.
[21] J.-J. Lee and G.-Y. Song, “Implementation of a bit-level super-systolic FIR filter,” Proceedings of 2004 IEEE Asia-Pacific
Conference on Advanced System Integrated Circuits. IEEE, doi: 10.1109/apasic.2004.1349450.
[22] F. Gang, “Design of Modular Multiplier Based on Improved Montgomery Algorithm and Systolic Array,” First International
Multi-Symposiums on Computer and Computational Sciences (IMSCCS’06). IEEE, 2006, doi: 10.1109/imsccs.2006.209.
[23] J. Xie, J. J. He, and P. K. Meher, “Low latency systolic montgomery multiplier for finite field GF(2^m) based on pentanomials,”
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 2, pp. 385–389, 2013, doi:
10.1109/TVLSI.2012.2185257.
[24] A. Reyhani-Masoleh, “Comments on ‘low-latency digit-serial systolic double basis multiplier over GF(2^m) using subquadratic
toeplitz matrix-vector product approach,’” IEEE Transactions on Computers, vol. 64, no. 4, pp. 1215–1216, 2015, doi:
10.1109/TC.2015.2401024.
[25] M. O. Esonu, A. J. Al-Khalili, and D. Al-Khalili, “Variations on the theme for designing fault-tolerant systolic array architectures,”
[1991] IEEE Pacific Rim Conference on Communications, Computers and Signal Processing Conference Proceedings. IEEE, doi:
10.1109/pacrim.1991.160693.
[26] J. B. Chacko and P. Whig, “Low Delay Based Full Adder/Subtractor by MIG and COG Reversible Logic Gate,” 2016 8th
International Conference on Computational Intelligence and Communication Networks (CICN). IEEE, 2016, doi:
10.1109/cicn.2016.120.
[27] H. Rahaman, J. Mathew, and D. K. Pradhan, “Test generation in systolic architecture for multiplication over GF(2^m),” IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, vol. 18, no. 9, pp. 1366–1371, 2010, doi:
10.1109/TVLSI.2009.2023381.
[28] M. Psarakis, D. Gizopoulos, A. Paschalis, and Y. Zorian, “An effective BIST architecture for sequential fault testing in array
multipliers,” Proceedings 17th IEEE VLSI Test Symposium (Cat. No.PR00146). IEEE Comput. Soc, doi:
10.1109/vtest.1999.766673.
[29] P. Martha, N. Kajal, P. Kumari, and R. Rahul, “An efficient way of implementing high speed 4-Bit advanced multipliers in
FPGA,” 2018 2nd International Conference on Electronics, Materials Engineering & Nano-Technology (IEMENTech). IEEE,
2018, doi: 10.1109/iementech.2018.8465375.
[30] K. Shirane, T. Yamamoto, I. Taniguchi, Y. Hara-Azumi, S. Yamashita, and H. Tomiyama, “Maximum Error-Aware Design of
Approximate Array Multipliers,” 2019 International SoC Design Conference (ISOCC). IEEE, 2019, doi:
10.1109/isocc47750.2019.9078488.
[31] K. Rajesh and G. U. Reddy, “FPGA Implementation of Multiplier-Accumulator Unit using Vedic multiplier and Reversible gates,”
2019 Third International Conference on Inventive Systems and Control (ICISC). IEEE, 2019, doi:
10.1109/icisc44355.2019.9036345.
BIOGRAPHIES OF AUTHORS
Patri Srihari Rao is an Associate Professor in the Department of ECE at NIT Warangal.
His research interests include RFIC design, VLSI testing, fault diagnosis, analog/digital IC
design, DSP architecture, and analog LDOs. He has published numerous technical papers in
reputed international journals and presented at conferences. He has conducted various training
courses, workshops, and seminars at the national and international level and has guided many
PhD students. He can be contacted at email: [email protected]