Table_of_Contents
REGULAR PAPERS
3D Design & Optimization
FSPDA: A Full Sequence Program Data Allocation Scheme for Boosting 3-D NAND Flash Read Performance . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . S. Pang, Y. Deng, Z. Wu, G. Zhang, J. Li, and X. Qin 4336
TREAD-M3D: Temperature-Aware DNN Accelerators for Monolithic 3-D Mobile Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . P. Shukla, V. F. Pavlidis, E. Salman, and A. K. Coskun 4350
PcGC: A Parity-Check Garbage Collection for Boosting 3-D NAND Flash Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . S. Pang, Y. Deng, G. Zhang, Y. Zhou, X. Qin, Z. Wu, and J. Li 4364
Analog, Mixed-Signal, and RF Circuits
Automatic Op-Amp Generation From Specification to Layout . . . . . . . . . . J. Lu, L. Lei, J. Huang, F. Yang, L. Shang, and X. Zeng 4378
Analog RF Circuit Sizing by a Cascade of Shallow Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . P.-O. Beaulieu, E. T. Dumesnil, F. Nabki, and M. Boukadoum 4391
LAYGO2: A Custom Layout Generation Engine Based on Dynamic Templates and Grids for Advanced CMOS Technologies . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . T. Shin, D. Lee, D. Kim, G. Sung, W. Shin, Y. Jo, H. Park, and J. Han 4402
Highly Efficient Automatic Synthesis of a Millimeter-Wave On-Chip Deformable Spiral Inductor Using a Hybrid Knowledge-Guided
and Data-Driven Technique . . . . . . . . . . . . . . . . J. Wei, W. Chen, Y. Gong, Q. Wu, G. Lu, W. Gao, L. Wang, M. Li, and H. Wang 4413
Approximate Computing
REX-SC: Range-Extended Stochastic Computing Accumulation for Neural Network Acceleration . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . T. Li, W. Romaszkan, S. Pamarti, and P. Gupta 4423
Design Automation for Cyber Physical Systems and Internet of Things
Quantitative Robustness for Signal Temporal Logic With Time-Freeze Quantifiers . . . . . . . . . . . . . . . . B. Ghorbel and V. S. Prabhu 4436
Embedded Security
Scalable Detection of Hardware Trojans Using ATPG-Based Activation of Rare Events . . . . . . . . . . . . . A. Jayasena and P. Mishra 4450
Analytical Side Channel EM Models, Extending Simulation Abilities for ICs, and Linking Physical Models to Cryptographic
Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E. Katz, M. Avital, Y. Weizman, and I. Levi 4463
SpecWands: An Efficient Priority-Based Scheduler Against Speculation Contention Attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . B. Tang, C. Wu, P.-C. Yew, Y. Zhang, M. Xie, Y. Lai, Y. Kang, W. Wang, Q. Wei, and Z. Wang 4477
Mex+Sync: Software Covert Channels Exploiting Mutual Exclusion and Synchronization . . . . . . . . . J. Zhang, C. Shen, and G. Qu 4491
CoTree: A Side-Channel Collision Tool to Push the Limits of Conquerable Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C. Ou, D. He, K. Qiao, S. Zheng, S.-K. Lam, and F. Zhang 4505
Joint Protection Scheme for Deep Neural Network Hardware Accelerators and Models . . . . . . . . . . . . . . . . . . J. Zhou and X. Zhang 4518
Security-Aware Resource Binding to Enhance Logic Obfuscation . . . . . . . . . . . . . . . . . . . . . . . . . . M. Zuzak, Y. Liu, and A. Srivastava 4528
iPROBE: Internal Shielding Approach for Protecting Against Front-Side and Back-Side Probing Attacks . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . M. Gao, M. S. Rahman, N. Varshney, M. Tehranipoor, and D. Forte 4541
MAFIA: Protecting the Microarchitecture of Embedded Systems Against Fault Injection Attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . T. Chamelot, D. Couroussé, and K. Heydemann 4555
Old School, New Primitive: Toward Scalable PUF-Based Authenticated Encryption Scheme in IoT . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . X. Zhang, D. Gu, T. Wang, and Y. Huang 4569
Improved EM Side-Channel Analysis Attack Probe Detection Range Utilizing Coplanar Capacitive Asymmetry Sensing . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D.-H. Seo, M. Nath, D. Das, S. Ghosh, and S. Sen 4583
Embedded Systems
Retention-Aware Read Acceleration Strategy for LDPC-Based NAND Flash Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . T.-Y. Wang, C.-W. Tsao, Y.-H. Chang, and T.-W. Kuo 4597
Efficient FPGA-Based Sparse Matrix–Vector Multiplication With Data Reuse-Aware Compression . . . . . S. Li, D. Liu, and W. Liu 4606
Resource- and Workload-Aware Model Parallelism-Inspired Novel Malware Detection for IoT Devices . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . S. Kasarapu, S. Shukla, and S. M. P. Dinakarrao 4618
NPRC-I/O: An NoC-Based Real-Time I/O System With Reduced Contention and Enhanced Predictability . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Z. Jiang, X. Dai, R. Wei, I. Gray, Z. Gu, Q. Zhao, and S. Zhao 4629
Access Characteristic Guided Partition for NAND Flash-Based High-Density SSDs . . . . . . . . . Y. Lv, L. Shi, Y. Song, and C. J. Xue 4643
EdgeCompress: Coupling Multidimensional Model Compression and Dynamic Inference for EdgeAI . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . H. Kong, D. Liu, S. Huai, X. Luo, R. Subramaniam, C. Makaya, Q. Lin, and W. Liu 4657
An Efficient Gustavson-Based Sparse Matrix–Matrix Multiplication Accelerator on Embedded FPGAs . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . S. Li, S. Huai, and W. Liu 4671
Defense Against On-Chip Trojans Enabling Traffic Analysis Attacks Based on Machine Learning and Data Augmentation . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A. Dhavlle, M. M. Ahmed, N. Mansoor, K. Basu, A. Ganguly, and S. M. P. Dinakarrao 4681
Emerging Technologies and Applications
IMGA: Efficient In-Memory Graph Convolution Network Aggregation With Data Flow Optimizations . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Y. Wei, X. Wang, S. Zhang, J. Yang, X. Jia, Z. Wang, G. Qu, and W. Zhao 4695
A Novel Implementation Methodology for Error Correction Codes on a Neuromorphic Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . S. Hassan, P. Dattilo, and A. Akoglu 4706
Pipette: Efficient Fine-Grained Reads for SSDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . S. Bai, H. Wan, Y. Huang, X. Sun, F. Wu, C. Xie, H.-C. Hsieh, T.-W. Kuo, and C. J. Xue 4721
Cryptensor: A Resource-Shared Co-Processor to Accelerate Convolutional Neural Network and Polynomial Convolution . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . J.-C. See, H.-F. Ng, H.-K. Tan, J.-J. Chang, K.-M. Mok, W.-K. Lee, and C.-Y. Lin 4735
FPGAs and Reconfigurable Systems
Poseidon-NDP: Practical Fully Homomorphic Encryption Accelerator Based on Near Data Processing Architecture . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Y. Yang, H. Lu, and X. Li 4749
Algorithm/Hardware Co-Optimization for Sparsity-Aware SpMM Acceleration of GNNs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Y. Gao, L. Gong, C. Wang, T. Wang, X. Li, and X. Zhou 4763
MAPD: An FPGA-Based Real-Time Video Haze Removal Accelerator Using Mixed Atmosphere Prior . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Y. Tan, Y. Zhu, Z. Huang, H. Tan, and K. Li 4777
Logic Synthesis
Quantum Circuit Design for Integer Multiplication Based on Schönhage–Strassen Algorithm . . . J. Nie, Q. Zhu, M. Li, and X. Sun 4791
Machine Learning/AI Automation
SCV-GNN: Sparse Compressed Vector-Based Graph Neural Network Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. K. Unnikrishnan, J. Gould, and K. K. Parhi 4803
TransCODE: Co-Design of Transformers and Accelerators for Efficient Training and Inference . . . . . . . . . . . S. Tuli and N. K. Jha 4817
Distributed Deep Learning Optimization of Heat Equation Inverse Problem Solvers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Z. Wang, L. Yang, H. Lin, G. Zhao, Z. Liu, and X. Song 4831
A Unified Engine for Accelerating GNN Weighting/Aggregation Operations, With Efficient Load Balancing and Graph-Specific
Caching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . S. Mondal, S. D. Manasi, K. Kunal, Ramprasath S., Z. Zeng, and S. S. Sapatnekar 4844
QANS: Toward Quantized Neural Network Adversarial Noise Suppression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C. Lin, F. Cheng, P. Yang, T. He, J. Liang, and Q. Wang 4858
Self-Supervised On-Device Federated Learning From Unlabeled Streams . . . . . . . J. Shi, Y. Wu, D. Zeng, J. Tao, J. Hu, and Y. Shi 4871
CoGNN: An Algorithm-Hardware Co-Design Approach to Accelerate GNN Inference With Minibatch Sampling . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . K. Zhong, S. Zeng, W. Hou, G. Dai, Z. Zhu, X. Zhang, S. Xiao, H. Yang, and Y. Wang 4883
Partial Sum Quantization for Reducing ADC Size in ReRAM-Based Neural Network Accelerators . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A. Azamat, F. Asim, J. Kim, and J. Lee 4897
LoCoExNet: Low-Cost Early Exit Network for Energy Efficient CNN Accelerator Design . . . J. Jo, G. Kim, S. Kim, and J. Park 4909
SecureVolt: Enhancing Deep Neural Networks Security via Undervolting . . . . . . . . M. S. Islam, I. Alouani, and K. N. Khasawneh 4922
Modeling and Simulation
Routing and Wavelength Assignment for Multiple Multicasts in Optical Network-on-Chip (ONoC) . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . W. Yang, Y. Chen, Z. Huang, H. Zhang, and H. Gu 4934
A Triple-Memristor Hopfield Neural Network With Space Multistructure Attractors and Space Initial-Offset Behaviors . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . H. Lin, C. Wang, F. Yu, Q. Hong, C. Xu, and Y. Sun 4948
Analytical Post-Voiding Modeling and Efficient Characterization of EM Failure Effects Under Time-Dependent Current Stressing
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . T. Hou, N. Wong, Q. Chen, Z. Ji, and H.-B. Chen 4959
Accelerating Static Timing Analysis Using CPU–GPU Heterogeneous Parallelism . . . . . . . . . . . . . Z. Guo, T.-W. Huang, and Y. Lin 4973
Accelerating Loop-Oriented RTL Simulation With Code Instrumentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F. Mao, Y. Guo, X. Liao, H. Jin, W. Zhang, H. Liu, L. Zheng, X. Liu, Z. Jiang, and X. Zheng 4985
A Combined N/PFET CFET-Based Design and Logic Technology Framework for CMOS Applications . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . X. Zhu, R. Ding, O. Tao, Y. Zhao, P. Tang, D. W. Zhang, Y. Lu, and S. Yu 4999
Physical Design
Aging-Aware Critical Path Selection via Graph Attention Networks . . . . . . . . . . . Y. Ye, T. Chen, Y. Gao, H. Yan, B. Yu, and L. Shi 5006
DevelSet: Deep Neural Level Set for Instant Mask Optimization . . . . . . . . . . . . . . . . . . . . . G. Chen, Z. Yu, H. Liu, Y. Ma, and B. Yu 5020
CircuitNet: An Open-Source Dataset for Machine Learning in VLSI CAD Applications With Improved Domain-Specific Evaluation
Metric and Learning Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Z. Chai, Y. Zhao, W. Liu, Y. Lin, R. Wang, and R. Huang 5034
System-Level Design
EBIO: An Efficient Block I/O Stack for NVMe SSDs With Mixed Workloads . . . . . J. Zhu, L. Wang, L. Xiao, L. Liu, and G. Qin 5048
In-Memory Set Operations on Memristor Crossbar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . K. Kishori and S. Pyne 5061
A High-Flexibility CAN-TSN Gateway With a Low-Congestion TSN-to-CAN Scheduler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G. Xie, Y. Zhang, N. Chen, and W. Chang 5072
MCM-GPU Voltage Noise Characterization and Architecture-Level Mitigation . . . J. Tan, K. Chen, W. Wang, K. Yan, and X. Wei 5084
Test
DRAM Bender: An Extensible and Versatile FPGA-Based Infrastructure to Easily Test State-of-the-Art DRAM Chips . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . A. Olgun, H. Hassan, A. G. Yağlıkçı, Y. C. Tuğrul, L. Orosa, H. Luo, M. Patel, O. Ergin, and O. Mutlu 5098
Verification
r-map: Relating Implementation and Specification in Hardware Refinement Checking . . . . . . . . . . W. Fang, G. Hu, and H. Zhang 5113
Automated Synthesis of Safe Timing Behaviors for Requirements Models Using CCSL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . M. Hu, J. Xia, M. Zhang, X. Chen, F. Mallet, and M. Chen 5127
Realization and Hardware Implementation of Gating Units for Long Short-Term Memory Network Using Hyperbolic Sine Functions
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . T. Joseph and T. S. Bindiya 5141
Exploiting the Single-Symbol LLR Variation to Accelerate LDPC Decoding for 3-D NAND Flash Memory . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Y. Li, G. Han, C. Liu, M. Zhang, and F. Wu 5146
A Novel Read Scheme Using GIDL Current to Suppress Read Disturbance in 3-D NAND Flash Memories . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . H. Jo, J. Kim, and H. Shin 5151
Statistical Compact Modeling With Artificial Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . W. Dai, Y. Li, Z. Rong, B. Peng, L. Zhang, R. Wang, and R. Huang 5156
Unified Wear-Leveling Technique for NVM-Based Buffer of SSD . . . . . . . . . . . . . . . Y. M. Park, J. Yeom, D. Kim, and E.-Y. Chung 5161
2023 Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Available online at http://ieeexplore.ieee.org