An Efficient Reconfigurable Encoder for the IEEE 1901 Standard
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 30, NO. 9, SEPTEMBER 2022
Transactions Brief
Yuxing Chen, Hangxuan Cui, and Zhongfeng Wang
Abstract: The IEEE 1901 standard for power line communication (PLC) enables simple connection among Internet of Things devices. The forward error correction (FEC) codes specified in the IEEE 1901 standard include low-density parity-check convolutional codes (LDPC-CCs) and Reed-Solomon convolutional concatenated (RSCC) codes. This work introduces an efficient reconfigurable encoder in full compliance with the IEEE 1901 standard. First, we propose a reconfigurable LDPC-CC encoder to fulfill the multirate requirement and improve the architecture by fine-tuned parallelization, which takes full advantage of the characteristics of the codeword structure. Then, for area reduction, optimizations of the RSCC encoder are extensively exploited. Moreover, the commonality between the encoders is identified, and some circuits are shared to reduce the hardware complexity. Equipped with these techniques, an efficient reconfigurable encoder for the IEEE 1901 standard is developed and implemented in 28-nm technology. Implementation results demonstrate that the proposed encoder meets the throughput requirement of the IEEE 1901 standard and is both power- and area-efficient.

Index Terms: Fine-tuning, IEEE 1901 standard, parallelization, power line communication (PLC), reconfigurable hardware.

Manuscript received 11 January 2022; revised 29 March 2022 and 26 April 2022; accepted 19 May 2022. Date of publication 30 May 2022; date of current version 1 September 2022. This work was supported in part by the National Natural Science Foundation of China under Grant 62174084 and Grant 62104097, in part by the High-Level Personnel Project of Jiangsu Province under Grant JSSCBS20210034, and in part by the Key Research Plan of Jiangsu Province of China under Grant BE2019003-4. (Corresponding author: Zhongfeng Wang.)
The authors are with the School of Electronic Science and Engineering, Nanjing University, Nanjing 210023, China (e-mail: yxing.chen@outlook.com; [email protected]; [email protected]).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TVLSI.2022.3177239.
Digital Object Identifier 10.1109/TVLSI.2022.3177239

I. INTRODUCTION
Standard-compatible forward error correction (FEC) coding implementations have attracted great interest due to their practical value. Existing works focus on FEC designs for wireless communication standards, for example, low-density parity-check encoders/decoders for IEEE 802.11 [1] and 5G [2], and polar encoders/decoders for 5G [3], [4].

For power line communication (PLC), IEEE has released its PLC protocol, the IEEE 1901 standard [5], which has been widely applied in mainstream PLC devices [6]. PLC does not require complicated cabling and features low-cost deployment [7]. Besides, it is a simple way to link a large variety of Internet of Things devices for sensing and control in enterprise or home environments [8].

The FEC codes specified in the IEEE 1901 standard include low-density parity-check convolutional codes (LDPC-CCs) and Reed-Solomon convolutional concatenated (RSCC) codes [5]. The outer and inner codes of the RSCC code are the Reed-Solomon (RS) code and the convolutional code (CC), respectively. Apart from various codeword types, multiple code rates are required. The LDPC-CC needs to support four code rates, and the RSCC is required to support seven code rates [5]. Thus, designing FEC coding implementations that fully support the IEEE 1901 standard is challenging. A multirate decoder for the IEEE 1901 standard LDPC-CC was introduced in [9]. LDPC-CC encoders for a single code rate are presented in [10]–[12]. The LDPC-CC encoder for the 1/2 code rate is described in [10], and improved single-rate encoders with a higher throughput-to-area ratio (TAR) are proposed in [11] and [12]. However, no encoder fully compatible with the IEEE 1901 standard has been proposed. The design of standard-compatible encoders is the topic of ongoing work. Many encoder-only designs compatible with established standards, or potentially targeting next-generation standards, have been proposed recently, for example, [1], [3], [13].

In this work, we design a power- and area-efficient reconfigurable encoder in full compliance with the IEEE 1901 standard. The contributions are summarized as follows.
1) LDPC-CC encoder: A reconfigurable LDPC-CC encoder supporting all required code rates is proposed. By exploiting the features of the IEEE 1901 standard, we improve the encoder by fine-tuned parallelization techniques. The implementation result shows that both the speed and the area are improved.
2) RSCC encoder: For the RS encoder, we utilize common subexpression sharing (CSS) techniques to minimize the number of XOR-gates. The number of XOR-gates is reduced by more than 70%. For the CC encoder, the registers in the puncturer are reduced by 28.6% through step-by-step optimizations.
3) Reconfigurable encoder: The first reconfigurable multirate encoder complying with the IEEE 1901 standard is proposed and optimized. Compared with the combination of individual encoders, the hardware complexity of the proposed reconfigurable encoder is reduced.

The rest of the brief is organized as follows. Section II introduces the background. In Section III, the detailed design of the proposed encoder is described. Section IV presents implementation results. Finally, Section V draws the conclusion.

II. BACKGROUND
A. Overview of the LDPC-CC Encoder
The codeword $u$ of an LDPC-CC is denoted as $[u(0), u(1), \ldots, u(t), \ldots]$, where $t$ is the time index. Each $u(t)$ is a $c$-bit symbol, that is, $u(t) = [u_0(t), u_1(t), \ldots, u_{c-1}(t)]$. $u(t)$ is composed of $c - b$ information bits, $w(t)$, and $b$ parity bits, $p(t)$. The code rate is $R = (c - b)/c$. The memory size of the encoder is denoted as $m_s$. The LDPC-CC has time period $t_p$. The time stamp $t_s$ is calculated by $t_s = t \bmod t_p$, where mod denotes the modulo operation. For the LDPC-CC specified in the IEEE 1901 standard, $t_p = 3$, $c \in \{2, 3, 4, 5\}$, and $b = 1$, that is, $u(t) = [w_0(t), w_1(t), \ldots, w_{c-2}(t), p(t)]$. The polynomial forms of $w_i(t)$ and $p(t)$ are $W_i(D)$ and $P(D)$, respectively. The calculation of $P(D)$ for $c = 2$ defined in the IEEE 1901 standard is

$$P(D) = \frac{Q^{t_s}_{W_0}(D)}{Q^{t_s}_{P}(D)}\, W_0(D) \tag{1}$$

where the delay polynomials $Q^{t_s}_{W_0}(D)$ and $Q^{t_s}_{P}(D)$ are listed in Table I. The calculation of $P(D)$ for $c \in \{3, 4, 5\}$ can be found in [5]. Fig. 1(a)
1063-8210 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
TABLE I
DELAY POLYNOMIALS $Q^{t_s}_{W_0}(D)$, $Q^{t_s}_{P}(D)$ OF THE LDPC-CC WITH $c = 2$ SPECIFIED IN THE IEEE 1901 STANDARD

Fig. 2. Puncturing process in the IEEE 1901 standard. (a) $R_{cc} = 1/2$. (b) $R_{cc} = 2/3$. (c) $R_{cc} = 3/4$.

TABLE III
BINARY VALUES OF CR UNDER DIFFERENT CODE RATES

Fig. 3. (a) Proposed reconfigurable LDPC-CC encoder. (b) DFG of SRw0. (c) Unfolded DFG. (d) Fine-tuned parallelized LDPC-CC encoder.

TABLE IV
COMPARISON BETWEEN THE PROPOSED RECONFIGURABLE LDPC-CC ENCODER WITH AND WITHOUT FINE-TUNED PARALLELIZATION
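The comparison in Table IV relies on the maximum throughput $\theta_m = 5J/T_c$ and the throughput-to-area ratio TAR $= \theta_m/\mathrm{Area}$ used in this section, together with the unit quantities $\mathrm{Area}_u = \mathrm{Area}/m_s$ and $\mathrm{TAR}_u = \theta_m/\mathrm{Area}_u$. The Python sketch below only restates these definitions; the function names are assumptions, and the last helper derives the throughput gain implied by the reported 10.0% area reduction and 323.3% TAR improvement rather than quoting a value from the tables.

```python
def max_throughput(J: int, T_c: float) -> float:
    """theta_m = 5J / T_c, with J the parallelization factor and T_c the critical path delay."""
    return 5 * J / T_c


def tar(theta_m: float, area: float) -> float:
    """Throughput-to-area ratio, TAR = theta_m / Area."""
    return theta_m / area


def unit_area(area: float, m_s: int) -> float:
    """Unit area, Area_u = Area / m_s, with m_s the encoder memory size."""
    return area / m_s


def unit_tar(theta_m: float, area: float, m_s: int) -> float:
    """Unit TAR, TAR_u = theta_m / Area_u."""
    return theta_m / unit_area(area, m_s)


def implied_throughput_gain(area_reduction: float, tar_gain: float) -> float:
    """Throughput ratio implied by TAR = theta_m / Area.

    TAR_new / TAR_old = (theta_new / theta_old) * (Area_old / Area_new),
    so theta_new / theta_old = (1 + tar_gain) * (1 - area_reduction).
    """
    return (1 + tar_gain) * (1 - area_reduction)


# Reported figures: area reduced by 10.0%, TAR enhanced by 323.3%.
gain = implied_throughput_gain(0.100, 3.233)
print(f"implied throughput gain: {gain:.2f}x")  # roughly 3.81x
```

Because TAR divides throughput by area, a design can raise its TAR either by running faster or by shrinking; the helper above separates the two effects for the reported percentages.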
other shift registers. Fig. 3(d) illustrates the fine-tuned parallelized LDPC-CC encoder.

The numbers of XOR-gates, AND-gates, 3-to-1 multiplexers, and D flip-flops are denoted as $N_{XOR}$, $N_{AND}$, $N_{MUX}$, and $N_{DFF}$, respectively. The critical paths are marked with dashed blue lines in Fig. 3, and the critical path delay is denoted as $T_c$. The maximum throughput $\theta_m$ is calculated by $5J/T_c$, where $J$ is the parallelization factor. $\theta_m^{-1}$ is the reciprocal of $\theta_m$ and represents the processing time required per bit. We regard TAR as a figure of merit, calculated by $\theta_m/\mathrm{Area}$. A comparison between the proposed reconfigurable design with and without fine-tuned parallelization is shown in Table IV.¹ The improved reconfigurable LDPC-CC encoder reduces the area by 10.0% and enhances the TAR by 323.3%, compared with the design without fine-tuned parallelization. Table V compares the proposed encoder with previous designs. $\mathrm{Area}_u$ denotes the unit area, calculated by $\mathrm{Area}/m_s$. $\mathrm{TAR}_u$ is the unit TAR, calculated by $\theta_m/\mathrm{Area}_u$. The proposed fine-tuned parallelized encoder supports various code rates, while the other encoders are 1/2-rate. Besides, the proposed encoder improves the TAR by more than 105.3%, and the $\mathrm{TAR}_u$ by more than 21.0%.

TABLE V
COMPARISONS WITH OTHER LDPC-CC ENCODERS

¹ The implementation details will be discussed in Section IV.

B. RSCC Encoder Design
Galois field multiplications account for the majority of the arithmetic operations in the RSCC encoder. We utilize CSS techniques [1] to minimize the number of XOR-gates. Table VI lists the number of XOR-gates before and after CSS. The number of XOR-gates is also
normalized by dividing by the number of the original XOR-gates. It can be seen that more than 70% of the XOR-gates are saved in total, compared with [15].

TABLE VI
NUMBER OF XOR-GATES BEFORE AND AFTER CSS

The puncturer has three parts: the 3-to-7 decoder, the buffer, and the multiplexer, as shown in Fig. 4(a). The inputs of the 3-to-7 decoder are binary, and its outputs are one-hot. If wa = 000, only the topmost output takes the value 1. Thus, only D flip-flops 0 and 1 are enabled. The outputs from the CC encoder in Fig. 1(c), $y_0(t)$ and $y_1(t)$, are taken as inputs to the buffer. The architecture of the puncturer assumes two different clock sources. One clock source is the input data clock (denoted as $\mathrm{clk}_{pi}$), and the other is the output data clock (denoted as $\mathrm{clk}_{po}$). The clock ports of the D flip-flops in the buffer are connected to $\mathrm{clk}_{pi}$. The sel signal varies each $\mathrm{clk}_{po}$ cycle. The clock frequency of $\mathrm{clk}_{po}$ is $1/R_{cc}$ times that of $\mathrm{clk}_{pi}$. The traditional puncturer needs 14 D flip-flops and a 14-to-1 multiplexer for the CC with 7/8 code rate [16].

Fig. 4. (a) Original puncturer. (b) Proposed puncturer.

TABLE VII
ADDRESSING MODES OF THE PUNCTURER

We optimize the puncturer step by step as follows. First, the addressing modes for each code rate are carefully investigated, as listed in Table VII. In the original addressing mode, the data from D flip-flops 2 and 13 are not chosen as the output. Therefore, they

C. Reconfigurable Encoder Design
The IEEE 1901 standard requires both LDPC-CC and RSCC as channel coding schemes. To meet this need, a straightforward implementation is to combine the LDPC-CC and RSCC encoders. In contrast, we propose a codesign method for the individual encoders. The proposed method reduces the hardware complexity through circuit sharing. The RS encoder and the LDPC-CC encoder both contain D flip-flops. However, in the RS encoder, the adders and D flip-flops are over GF($2^8$). To explore the commonality, we investigate the detailed circuits of the shift registers module of the RS encoder, as shown in Fig. 5. It can be seen that the shift registers module consists of 16 8-bit D flip-flops and 15 GF($2^8$) adders in between them. An 8-bit D flip-flop is composed of eight 1-bit D flip-flops, and a GF($2^8$) adder is made up of eight GF(2) adders. To reduce the overall area, the first to 16th 1-bit registers for $w_0(3k)$, $w_0(3k+1)$, $w_0(3k+2)$, $w_1(3k)$, $w_1(3k+1)$, $w_1(3k+2)$, $w_2(3k+1)$, and $w_2(3k+2)$ of the LDPC-CC encoder are reused for the RS encoder, in which the GF(2) adders are inserted. When the reconfigurable encoder is in the LDPC-CC mode, the input of the multiplication module is selected as zero. Therefore, the outputs of the adders are the data from the Q-ports of the D flip-flops. Through the proposed codesign method, all registers in the RS encoder incur no additional hardware overhead. Thus, the total hardware complexity is reduced.

IV. IMPLEMENTATION RESULTS
The hardware architecture is described in RTL and synthesized under the TSMC 28-nm CMOS technology using the Synopsys Design Compiler. The synthesis results of the encoders in compliance with the IEEE 1901 standard are shown in Table VIII, where $f$ denotes the clock frequency, $\theta$ is the encoding throughput, and L&R denotes the combination of the LDPC-CC and RSCC encoders. The average time to process a bit is denoted as $T_b$ and is calculated by $T_b = 1/\theta$. The area–time product (ATP) and power–time product (PTP) are defined as $\mathrm{Area} \times T_b$ and $\mathrm{Power} \times T_b$, respectively. For a fair comparison, and to meet the speed requirement of the IEEE 1901 standard (500 Mbps), the throughput is set to 600 Mbps. The throughputs of
the LDPC-CCs are calculated at the 1/2 code rate to account for the worst case, that is, $\theta = 2Jf$, where $J$ is the parallelization factor. Only the powers of the 4/5-rate LDPC-CCs and 1/2-rate RSCCs are listed since they are the most power-consuming. In the LDPC-CC encoder mode, the ATP is reduced by 10.1% and the PTP is reduced by 68.2% through the proposed optimization techniques. Therefore, the improved LDPC-CC encoder is more efficient in both area and power metrics than the conventional design. As for the RSCC encoders, the frequencies of the RS and CC encoders are 75 and 600 MHz, respectively. Since a byte (bit) is output in each cycle, the throughput of the RS (CC) encoder is calculated by $\theta = 8f$ ($\theta = f$). The improved RSCC encoder mode reduces the ATP by 13.5% and the PTP by 12.1%, compared with the original mode. Thus, the proposed RSCC encoder is favorable in area and power. The total area of the improved L&R is less than the sum of the areas of the individual encoders, thanks to the codesign method. 44.2% of the RSCC encoder is used for hardware sharing. Compared with the original L&R, the reconfigurable encoder reduces the hardware complexity by 20.0%.

V. CONCLUSION
In this work, we propose a reconfigurable multirate encoder compliant with the IEEE 1901 standard. The VLSI optimizations for the proposed encoder are carried out from three perspectives. First, to support all code rates specified by the standard, a reconfigurable LDPC-CC encoder is presented. Besides, the encoder is optimized by the fine-tuned parallelization technique. Second, we carry out step-by-step optimizations on the RSCC encoder to lower the hardware complexity. Third, the first reconfigurable multirate encoder in full compliance with the IEEE 1901 standard is designed and optimized. The implementation results show that not only does our proposed encoder meet the throughput requirement, but it is also area- and power-efficient.

REFERENCES
[1] A. Mahdi, N. Kanistras, and V. Paliouras, “A multirate fully parallel LDPC encoder for the IEEE 802.11n/ac/ax QC-LDPC codes based on reduced complexity XOR trees,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 29, no. 1, pp. 51–64, Jan. 2021.
[2] J. Nadal and A. Baghdadi, “Parallel and flexible 5G LDPC decoder architecture targeting FPGA,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 29, no. 6, pp. 1141–1151, Jun. 2021.
[3] W. Song, Y. Shen, L. Li, K. Niu, and C. Zhang, “A general construction and encoder implementation of polar codes,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 28, no. 7, pp. 1690–1702, Jul. 2020.
[4] C. Ji, Y. Shen, Z. Zhang, X. You, and C. Zhang, “Autogeneration of pipelined belief propagation polar decoders,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 28, no. 7, pp. 1703–1716, Jul. 2020.
[5] IEEE Standard for Broadband over Power Line Networks: Medium Access Control and Physical Layer Specifications, IEEE Standard 1901-2020, 2021.
[6] R. Nieto, R. Mateos, and A. Hernandez, “HW/SW architecture for a broadband power-line communication system with LS channel estimator and ASCET equalizer,” IEEE Trans. Ind. Informat., vol. 16, no. 11, pp. 6740–6749, Nov. 2020.
[7] K. Ali, A. X. Liu, I. Pefkianakis, and K.-H. Kim, “Distributed spectrum sharing for enterprise powerline communication networks,” IEEE/ACM Trans. Netw., vol. 29, no. 3, pp. 1032–1045, Jun. 2021.
[8] M. Li and H. J. Lin, “Design and implementation of smart home control systems based on wireless sensor networks and power line communications,” IEEE Trans. Ind. Electron., vol. 62, no. 7, pp. 4430–4442, Jul. 2015.
[9] I. Yoo and I.-C. Park, “Low-power LDPC-CC decoding architecture based on the integration of memory banks,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 64, no. 9, pp. 1057–1061, Sep. 2017.
[10] R. Swamy et al., “Design and test of a 175-Mb/s, rate-1/2 (128,3,6) low-density parity-check convolutional code encoder and decoder,” IEEE J. Solid-State Circuits, vol. 42, no. 10, pp. 2245–2256, Oct. 2007.
[11] S. Bates and R. Swamy, “Parallel encoders for low-density parity-check convolutional codes,” in Proc. IEEE Int. Symp. Circuits Syst., May 2006, p. 4.
[12] Z. Chen, T. L. Brandon, D. G. Elliott, S. Bates, W. A. Krzymien, and B. F. Cockburn, “Jointly designed architecture-aware LDPC convolutional codes and high-throughput parallel encoders/decoders,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 4, pp. 836–849, Apr. 2010.
[13] S. Li, K. El-Sankary, A. Karami, and D. Truhachev, “Area- and power-efficient staircase encoder implementation for high-throughput fiber-optical communications,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 28, no. 3, pp. 843–847, Mar. 2020.
[14] K. K. Parhi, “A systematic approach for design of digit-serial signal processing architectures,” IEEE Trans. Circuits Syst., vol. 38, no. 4, pp. 358–375, Apr. 1991.
[15] A. Chalil and K. N. Sreehari, “VLSI implementation of Reed Solomon codes,” in Proc. 4th Int. Conf. Comput. Methodol. Commun. (ICCMC), Mar. 2020, pp. 280–284.
[16] E. Garda, M. Guzmán, and D. Torres, “A hardware implementation of punctured convolutional codes to complete a Viterbi decoder core,” J. Appl. Res. Technol., vol. 3, no. 2, pp. 77–88, Aug. 2005.