
DeepGate: Learning Neural Representations of Logic Gates

Min Li∗, Sadaf Khan∗, Zhengyuan Shi (The Chinese University of Hong Kong)
Naixing Wang, Huang Yu (Huawei Technologies Co., Ltd.)
Qiang Xu (The Chinese University of Hong Kong)

∗Both authors contributed equally to this research.

ABSTRACT

Applying deep learning (DL) techniques in the electronic design automation (EDA) field has become a trending topic. Most solutions apply well-developed DL models to solve specific EDA problems. While demonstrating promising results, they require careful model tuning for every problem. The fundamental question of "How to obtain a general and effective neural representation of circuits?" has not been answered yet. In this work, we take the first step towards solving this problem. We propose DeepGate, a novel representation learning solution that effectively embeds both logic function and structural information of a circuit as vectors on each gate. Specifically, we propose transforming circuits into a unified and-inverter graph format for learning and using signal probabilities as the supervision task in DeepGate. We then introduce a novel graph neural network that uses strong inductive biases in practical circuits as learning priors for signal probability prediction. Our experimental results show the efficacy and generalization capability of DeepGate.

KEYWORDS

Representation Learning, Graph Neural Networks, Logic Gates

ACM Reference Format:
Min Li, Sadaf Khan, Zhengyuan Shi, Naixing Wang, Huang Yu, and Qiang Xu. 2022. DeepGate: Learning Neural Representations of Logic Gates. In Proceedings of the 59th ACM/IEEE Design Automation Conference (DAC '22), July 10–14, 2022, San Francisco, CA, USA. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3489517.3530497

1 INTRODUCTION

The rise of deep learning (DL) has aroused much interest in applying it to solve various electronic design automation (EDA) problems [13]. The most natural representation of circuits and netlists is a graph. With the recent success of graph neural networks (GNNs) [10, 14] in modeling non-structured data, various works have explored their potential on EDA problems such as congestion prediction [15] and testability analysis [16]. These works focus on learning a particular function that takes the circuit graph as input and directly maps it to the output for the desired EDA task, without considering the internal computational process in the circuits.

Recently, a notable trend in the deep learning community is to employ pre-trained models for many downstream tasks rather than learning a specific model for each task from scratch [11]. For example, a series of convolutional neural networks (CNNs) are pre-trained on the ImageNet dataset. They perform well on other computer vision (CV) tasks such as image segmentation and object detection by fine-tuning with a small amount of task-specific data. Similarly, pre-trained Transformer-based language models (e.g., GPT [6] and BERT [8]) have achieved unparalleled performance on various natural language processing (NLP) tasks.

Whereas in the EDA domain, despite all the recent efforts in learning-based solutions [13], obtaining a general and effective circuit representation that serves as the basis for solving various EDA tasks has not been addressed yet. In this work, we take the first step towards this direction by introducing a novel GNN-based solution for the representation learning of logic gates, namely DeepGate, which is aware of the logic computation procedure and the structural information of combinational circuits.

Naturally, logic circuits can be modeled as directed acyclic graphs (DAGs), in which logic gates appear in a specific topological order. Therefore, one could collect many logic circuits and resort to existing DAG-GNN architectures [21, 27] to learn the node embedding for each logic gate with some supervision tasks (e.g., Boolean satisfiability [3]). However, we argue that such straightforward solutions cannot effectively extract information from circuit graphs.

Firstly, logic circuits could follow different design styles and use diverse technology libraries containing various logic gate types, leading to heterogeneous circuit graphs with mixed distributions that are challenging to learn. In DeepGate, we propose to conduct learning on a general intermediate representation of logic circuits, i.e., the and-inverter graph (AIG), with the help of logic synthesis tools [5]. The benefits are twofold: (i) such a unified format constrains the circuit graph distribution without changing circuit functionalities, and all the transformed circuits only feature two types of logic gates (i.e., the 2-input AND gate and the inverter); (ii) the logic synthesis procedure naturally introduces a strong inductive bias of practical circuits for effective learning with GNN models.

Secondly, the effectiveness of representation learning heavily relies on supervision tasks. For example, when pre-training CV models, the image class labels of the ImageNet dataset serve as the cornerstone. In contrast, self-supervised tasks are used instead in NLP pre-trained model development. For effective circuit representation learning, we propose to use the signal probability (i.e., the probability of being logic '1') for every node as rich supervision because it embeds the genuine logic relationship of each node in the circuits. To be specific, we perform logic simulations with a large amount of random patterns to obtain faithful probability values for supervision.

Figure 1: The Circuit Representation as DAG

Last but most important, existing GNN models are general solutions designed to extract information from all kinds of graphs, while circuit graphs are a unique type of graph with logic relationships between nodes. In this work, we design a dedicated GNN model for circuit graphs, significantly enhancing the learning effectiveness.

We summarize the contributions of this work as follows:

• To the best of our knowledge, DeepGate is the first work on the general and effective circuit representation learning problem. Specifically, we propose a novel design flow to tackle this problem: (i) circuit transformation into AIG form; (ii) supervision with logic-simulated probabilities; (iii) representation learning with a dedicated GNN model for circuit graphs.

• We propose a novel GNN model for circuit graphs that exploits unique circuit properties, including attention mechanisms that mimic the logic computation procedure and reversed propagation layers that consider logic implication effects.

• Reconvergence structures are inevitable due to logic sharing in multi-level logic networks, and they are the main challenges for logic analysis [17]. We treat them as first-class citizens and introduce novel solutions in our GNN model.

We learn the representations of logic gates with many small sub-circuits extracted from benchmark circuits. Experimental results on large circuits show the efficacy and generalization capability of DeepGate. We organize the remainder of this paper as follows. We review related works in Section 2. Section 3 introduces the DeepGate architecture, while in Section 4, we present the experimental results on various circuits. Finally, Section 5 concludes this paper.

2 RELATED WORKS

2.1 Graph Neural Networks

Graph neural networks [10, 14] have received a lot of attention for their effectiveness in modeling non-structured data. By learning vectorial representations on graphs via feature propagation and aggregation, GNNs show convincing results in various domains [9, 12, 24]. The most popular GNN model employs a message-passing neural network architecture, which computes a representation (hidden state) h_v^ℓ for each node v in a graph G at every layer ℓ and a final graph representation h_G, as in [9]:

h_v^ℓ = COMBINE^ℓ(h_v^{ℓ−1}, AGGREGATE^ℓ({h_u^{ℓ−1} | u ∈ N(v)})),  ℓ = 1, …, L    (1)

h_G = READOUT({h_v^L, v ∈ V})    (2)

wherein N(v) denotes the neighboring nodes of node v and L is the number of layers. The parameterized function AGGREGATE^ℓ aggregates messages from the neighboring nodes N(v), and COMBINE^ℓ obtains an updated hidden state after aggregation. Finally, the function READOUT retrieves the states of all nodes V and produces the graph neural representation. A notable GNN architecture is the graph attention network (GAT) [23], which considers the importance of different neighbors during aggregation.

Directed acyclic graphs (DAGs) are a special type of graphs, yet broadly seen across many domains, including circuit modeling (see Fig. 1). Recently, a few studies have been dedicated to DAG-GNN designs [21, 27], which propagate messages following the topological ordering between nodes and only consider the predecessors in the AGGREGATE^ℓ function, as demonstrated in Equation (3).

h_v^ℓ = COMBINE^ℓ(h_v^{ℓ−1}, AGGREGATE^ℓ({h_u^ℓ | u ∈ P(v)})),  ℓ = 1, …, L    (3)

The major difference between Eq. (3) and Eq. (1) is that in a DAG-GNN, the aggregation function for v will only be executed after the hidden states of all of its predecessors have already been computed.

Besides stacking L layers to increase the depth of the network, one can also apply the same model T times in a recurrent fashion to generate the final embedding [3]:

h_v^t = COMBINE(h_v^{t−1}, AGGREGATE({h_u^t | u ∈ P(v)})),  t = 1, …, T    (4)

Using the taxonomy defined in [25], we name the two variants of DAG-GNNs described in Equations (3)–(4) DAG-ConvGNNs and DAG-RecGNNs, respectively.
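As a concrete point of reference, the following is a minimal sketch of the DAG-RecGNN propagation scheme of Equations (3)–(4), assuming a PyTorch implementation; the class name DagRecGNN and the helpers preds and topo_order are illustrative and not taken from any cited library.

    import torch
    import torch.nn as nn

    class DagRecGNN(nn.Module):
        """Sketch of Eq. (3)-(4): aggregate predecessor states, combine with a GRU."""
        def __init__(self, dim=64):
            super().__init__()
            self.aggregate = nn.Linear(dim, dim)   # stand-in for a learned AGGREGATE
            self.combine = nn.GRUCell(dim, dim)    # COMBINE realized as a GRU update

        def forward(self, h, preds, topo_order, T=10):
            # h: list of [dim] hidden-state tensors, one per node
            # preds[v]: predecessor ids of node v; topo_order: predecessors come first
            for _ in range(T):                     # T recurrent iterations (Eq. 4)
                for v in topo_order:               # topological order enforces Eq. (3)
                    if preds[v]:
                        msg = self.aggregate(torch.stack([h[u] for u in preds[v]]).sum(0))
                        h[v] = self.combine(msg.unsqueeze(0), h[v].unsqueeze(0)).squeeze(0)
            return torch.stack(h)

Because every node waits for all of its predecessors, the inner loop realizes exactly the ordering constraint that distinguishes Equation (3) from Equation (1); reusing the same cell for T iterations, instead of stacking L distinct layers, gives the recurrent variant of Equation (4).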
2.2 GNN-Based Solutions for EDA Problems

Existing GNN-based EDA solutions use an end-to-end flow for specific EDA tasks, wherein the labels are usually extracted from commercial EDA tools.

To the best of our knowledge, the first GNN-based EDA technique is applied to the test point insertion (TPI) problem, which is formulated as a node binary classification problem and solved with a graph convolutional network [16]. The ground-truth labels are collected from commercial TPI tools, revealing whether a particular node is "easy to observe" or not. CongestionNet [15] models the circuit as an undirected graph and trains a GAT model to predict the congestion of the final physical design on a per-cell basis. GRANNITE [28] conducts power estimation using a DAG-GNN model: gate netlists are mapped onto graphs with per-node (gate) and per-edge (net) features, and it achieves good accuracy (less than 5.5% error across a diverse set of benchmarks) for fast (<1 second) average power estimation on designs with up to 50k gates. Recently, [26] proposes a GAT-based model named Net2 for pre-placement net length estimation.

To solve a particular EDA problem, the above techniques typically pre-compute many node/edge features (e.g., SCOAP testability measures in [16]) and use existing GNN models to aggregate these features for solution finding. Consequently, the learned node features cannot be transferred among related tasks, despite using the same circuit graphs as inputs. More importantly, an effective representation for circuits should be aware of their logic functions. However, existing solutions ignore it and only consider the structural information in their learning procedure.

Figure 2: The Overview of DeepGate. (a) Circuit Data Preparation; (b) Probabilities Prediction with DeepGate.

Motivated by the above, we propose to learn a general and effective circuit representation without pre-computing any specific features, as detailed in the following section.

3 PROPOSED SOLUTION

3.1 Overview of DeepGate

Figure 2 presents the overview of the proposed DeepGate solution, consisting of two stages for the neural representation learning of logic gates:

• Circuit Data Preparation: Given a pool of circuit designs, we use logic synthesis tools to transform them into a unified AIG format. We then perform logic simulations on the circuits with sufficient random patterns to obtain the signal probability (i.e., the probability of a node being logic '1') on every node as supervision. We elaborate on the details in Section 3.2.

• Probabilities Prediction with DeepGate: Given a circuit dataset and the logic-simulated probabilities as the supervision task, we introduce a novel GNN model dedicated to circuit graph analysis to learn the neural representations of logic gates, as detailed in Section 3.3.

3.2 Circuit Data Preparation

Some circuits are at the register-transfer level, while others are gate-level netlists mapped with various libraries. Such heterogeneity across circuits is a challenge for GNN model development. To tackle this problem, we resort to the logic synthesis tool ABC [5] and transform all circuits into the unified AIG format. If the original circuit is too large, we extract small sub-circuits with circuit sizes ranging from 30 to 3k gates. Note that we test the effectiveness of DeepGate on much larger circuits for its generalization capabilities. The benefits of such a circuit pre-processing flow include: (i) only two logic gate types (i.e., the 2-input AND gate and the 1-input NOT gate) are considered, which dramatically reduces the representation learning difficulty; (ii) applying logic synthesis introduces a strong relational inductive bias into the resulting circuit graphs; (iii) the constraint on circuit size facilitates efficient GNN training, with both reduced sizes of circuit graphs and less time for preparing supervision labels.

There are many possibilities to annotate a circuit, e.g., the satisfiability of the circuit [3]. However, a good supervision task should satisfy the following condition: the labels should be easily obtained while retaining rich information about both the logic function and the structural information of the circuits. In DeepGate, we propose to use the signal probability of every node as supervision, which satisfies the above requirements: (i) it is relatively easy to obtain highly accurate probability values by running logic simulations on many random input patterns, especially when the circuit size is limited; (ii) a unique yet important property of logic circuits that makes circuit analysis challenging is the reconvergence structure, and logic simulation is arguably the only way to obtain the actual values for such structures; (iii) the logic probability of each gate itself plays an essential role in many EDA tasks.
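To make this preparation flow concrete, a small sketch is given below; it assumes that ABC is available on the command line and that the resulting AIG has been parsed into a topologically sorted node list. The file names, command strings and function names here are illustrative, not prescribed by the paper.

    import subprocess
    import numpy as np

    # (1) Map an arbitrary netlist into AIG form with a logic synthesis tool (here: ABC).
    #     The exact read/write commands depend on the input format.
    subprocess.run(['abc', '-c', 'read design.bench; strash; write_aiger design.aig'], check=True)

    # (2) Estimate the signal probability of every node with random-pattern simulation.
    def simulate_probabilities(nodes, num_patterns=100_000, seed=0):
        # nodes: topologically sorted list of (node_id, gate_type, fanins),
        #        with gate_type in {'PI', 'AND', 'NOT'} as in an AIG.
        rng = np.random.default_rng(seed)
        values = {}
        for node_id, gate_type, fanins in nodes:
            if gate_type == 'PI':
                values[node_id] = rng.integers(0, 2, size=num_patterns, dtype=np.uint8)
            elif gate_type == 'AND':
                values[node_id] = values[fanins[0]] & values[fanins[1]]
            else:  # 'NOT'
                values[node_id] = 1 - values[fanins[0]]
        # Fraction of patterns in which each node evaluates to logic '1' (the supervision label).
        return {node_id: float(values[node_id].mean()) for node_id, _, _ in nodes}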
3.3 GNN Model in DeepGate

Given circuit graphs in AIG form, the objective of our GNN model is to estimate the probability of every node such that it is as close to the genuine signal probability as possible. Different from existing DAG-ConvGNN [21, 27] and DAG-RecGNN [3] models that focus on learning the topological information in the graph, DeepGate is designed to learn both the circuit structural information and the computational behaviour of logic circuits, and to embed them as vectors on every logic gate.

We now elaborate on the detailed GNN model design in DeepGate. Given a circuit graph G, we embed the gate type of each node v with a one-hot encoding in x_v. To be specific, as only primary inputs (PIs), AND gates and NOT gates are present in AIGs, we assign a 3-d vector to each node according to its gate type. It should be noted that, instead of relying on the probability-based measurements in previous works [16, 28], our model only requires gate type information for the representation learning. We also have hidden states h_v for every node, which are initialized randomly. Given these, DeepGate resorts to an attention-based aggregation design [21, 23] and the gated recurrent unit (GRU) [27] as the update function.

Aggregation. We use the attention mechanism in the additive form to instantiate the AGGREGATE function in Equation (3), wherein the aggregated message m_v^t for a node v at the t-th iteration is computed by:

m_v^t = Σ_{u∈P(v)} α_{uv}^t h_u^t,  where α_{uv}^t = softmax_{u∈P(v)}(w_1^T h_v^{t−1} + w_2^T h_u^t)    (5)

where α_{uv}^t is a weighting coefficient that is computed following the query-key design of usual attention mechanisms. To be specific, h_v^{t−1} serves as the query, and the representations of the predecessors from the current iteration t, h_u^t, serve as the keys. The intuition behind using the attention mechanism for aggregation is that, when we do logic computation in digital circuits, the controlling value of a logic gate determines the output of that gate. Therefore, controlling values are far more important than non-controlling values. To mimic this behaviour, the attention mechanism can learn to assign high weights to the controlling inputs of gates and give less importance to the rest of the inputs.

Combine. We then use the GRU to instantiate the COMBINE function for updating the hidden state of the target node v:

h_v^t = GRU([m_v^t, x_v], h_v^{t−1})    (6)

wherein m_v^t and x_v are concatenated together and treated as the input, while h_v^{t−1} is the past state of the GRU.

On the one hand, DeepGate adopts the recursive DAG-GNN functional defined in Equation (4). The reasons for using the recurrent architecture are two-fold: (i) it is unrealistic for GNNs to capture the circuit's functional and structural information with a single forward propagation; (ii) the recurrent learning procedure facilitates reaching stabilized node embeddings quickly.

On the other hand, our proposed GNN model differs from previous DAG-GNNs [3, 21, 27] that initialize h_v^0 as x_v and treat the aggregated message as the state of the recurrent function. In contrast, we fix the gate type information of the nodes, x_v, as the input for all iterations. Such a design avoids the gate type information vanishing during the long-term recursive propagation.

Reversed Propagation Layer. In DeepGate, we also consider backward information propagation, i.e., processing the graph in reversed topological order. One of the main reasons to introduce the backward layers in our framework is that logic implication and backtracking in the reversed order can be highly useful for predicting the states of nodes. It also helps stabilize training, as proved in sequence-to-sequence learning tasks [19].

Regressor. After T iterations, we pass the hidden states of nodes h_v^T into a multi-layer perceptron (MLP), which computes a single scalar for every node to regress the simulated probabilities. The weights of the MLP are shared among nodes with the same gate type. We train the network to minimize the L1 loss between the prediction ŷ_v and the true probability y_v.
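Putting Equations (5) and (6) together, a minimal sketch of one forward node update is given below, assuming a PyTorch implementation; the class name DeepGateLayer, the tensor layout and the helper arguments are illustrative rather than taken from the released code. The reversed propagation layer applies the same kind of update with the edge directions flipped, and a shared MLP regressor maps h_v^T to the predicted probability.

    import torch
    import torch.nn as nn

    class DeepGateLayer(nn.Module):
        """Sketch of the attention-based AGGREGATE (Eq. 5) and GRU COMBINE (Eq. 6)."""
        def __init__(self, dim=64, num_gate_types=3):
            super().__init__()
            self.w1 = nn.Linear(dim, 1, bias=False)           # scores the query h_v^{t-1}
            self.w2 = nn.Linear(dim, 1, bias=False)           # scores the keys h_u^t
            self.gru = nn.GRUCell(dim + num_gate_types, dim)  # COMBINE over [m_v^t, x_v]

        def update_node(self, h_v_prev, h_preds, x_v):
            # h_v_prev: [dim], h_preds: [num_preds, dim], x_v: [num_gate_types] one-hot gate type
            scores = self.w1(h_v_prev) + self.w2(h_preds)     # additive attention, broadcast to [num_preds, 1]
            alpha = torch.softmax(scores, dim=0)              # weights over the predecessors (Eq. 5)
            m_v = (alpha * h_preds).sum(dim=0)                # aggregated message m_v^t
            return self.gru(torch.cat([m_v, x_v]).unsqueeze(0),   # GRU update (Eq. 6)
                            h_v_prev.unsqueeze(0)).squeeze(0)

If one predecessor carries a controlling value, the softmax can give it a much larger coefficient than the remaining inputs, which is exactly the behaviour the aggregation is meant to mimic.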
3.4 Skip Connection for Reconvergence Structure with Positional Encoding

In the previous section we have described the core components of DeepGate necessary for predicting the logic probabilities of nodes. However, the logic inference on reconvergence nodes is different from that on normal nodes, and such structures are inevitable due to logic sharing in multi-level logic networks. Hence, they are the main challenge for logic probability analysis. To accommodate their impact, we introduce an improvement into DeepGate to enable special processing for reconvergence nodes, as shown in Figure 3.

Figure 3: Information Propagation at Reconvergence Node

Firstly, we maintain the information of the reconvergence nodes during circuit data preparation, including the corresponding source fan-out node and the logic level difference between the source node and the reconvergence node. Secondly, we add direct edges between the fan-out node and the reconvergence node, named skip connections here. The new edges facilitate the information exchange from fan-out nodes to reconvergence nodes. Last but not least, we leverage the positional encoding technique [22] to differentiate the skip connections from the normal connections. To be specific, the function γ(D) maps the logic level difference D between the source fan-out node and the reconvergence node into a higher-dimensional space R^{2L}:

γ(D) = (sin(2^0 πD), cos(2^0 πD), …, sin(2^{L−1} πD), cos(2^{L−1} πD))    (7)

The impact of the fan-out node on the reconvergence node depends upon the distance between them. Generally speaking, the longer the distance, the less impact it has on the reconvergence node. The above function can induce into the model the knowledge of how much the fan-out node can impact the result of the reconvergence node. We assign the encoded vector as the edge attribute of the skip connection and incorporate it into the coefficient calculation described in Equation (5) as a third input.
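A direct sketch of the encoding in Equation (7) is shown below, with L = 8 as used in Section 4.2; the function name is illustrative.

    import math

    def skip_connection_encoding(level_diff, L=8):
        """Map the logic-level difference D between a fan-out node and a reconvergence
        node into a 2L-dimensional vector used as the skip-connection edge attribute (Eq. 7)."""
        enc = []
        for i in range(L):
            enc.append(math.sin((2 ** i) * math.pi * level_diff))
            enc.append(math.cos((2 ** i) * math.pi * level_diff))
        return enc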
4 EXPERIMENTS

4.1 Datasets

We extract many sub-circuits from four circuit benchmark suites: ITC'99 [7], IWLS'05 [1], EPFL [2] and OpenCores [20], and follow the circuit data preparation flow described in Section 3.2 to transform all circuits into a unified AIG format. We conduct logic simulations with up to 100k random input patterns to obtain an accurate signal probability on every node.

Table 1 presents the statistics of the circuit dataset. #Subcircuits shows the total number of subcircuits extracted from each benchmark. As shown in the table, the constructed circuit dataset covers circuit sizes ranging from tens to thousands of nodes with different logic levels. In total, there are 10,824 circuits, and we create 90/10 training/test splits for model training and evaluation.

Table 1: The Statistics of the Circuit Training Dataset

Benchmark     #Subcircuits    #Node         #Level
EPFL          828             [52–341]      [4–17]
ITC99         7,560           [36–1,947]    [3–23]
IWLS          1,281           [41–2,268]    [5–24]
Opencores     1,155           [51–3,214]    [4–18]
Total         10,824          [36–3,214]    [3–24]

4.2 Evaluation Metric and Baselines

To evaluate the performance of different GNN models, we calculate the average of the absolute differences between the simulated probability y_v and the predicted ŷ_v from DeepGate over all nodes V in the circuits, as shown in Equation (8). The smaller the value, the better the model performs.
Avg. Prediction Error = (1/N) Σ_{v∈V} |y_v − ŷ_v|    (8)

We consider three baseline GNN models: GCN, DAG-ConvGNN, and DAG-RecGNN. The GCN model treats the circuit graphs as undirected graphs in representation learning. The DAG-ConvGNN model follows the settings defined in Equation (3). For the DAG-RecGNN model, we adopt the same COMBINE function and reversed propagation layer design as in DeepGate, as depicted in Section 3.3. As for the GNN model in DeepGate, it additionally contains the attention mechanism and the skip connection (SC). Under every setting, we evaluate 4 different aggregator designs, which include representative works for DAG learning, i.e., Convolutional Sum (abbreviated as Conv. Sum) [18], Attention [21, 23], GatedSum [27] and DeepSet [3].

In order to make the comparison fair, we instantiate all models with d = 64 for the node hidden states h_v and design the other parameterized functions to have a similar number of tunable parameters. For DAG-RecGNNs and our DeepGate model, a forward layer is followed by a reversed layer, and T = 10 iterations of message passing are performed to obtain the final embeddings. We choose L = 8 in Equation (7) for the skip connection setting. For training, all models are optimized for 60 epochs using the ADAM optimizer with a learning rate of 1 × 10^−4. We use the topological batching technique introduced in [21] to accelerate the training.
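As a worked illustration of the metric in Equation (8), the snippet below computes the average prediction error from per-node labels and predictions; it assumes NumPy arrays of simulated and predicted probabilities and is not tied to any particular model.

    import numpy as np

    def avg_prediction_error(y_true, y_pred):
        """Equation (8): mean absolute difference between simulated and predicted probabilities."""
        y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
        return float(np.mean(np.abs(y_true - y_pred)))

    # Example: |0.5-0.48|, |0.25-0.30| and |1.0-0.97| average to roughly 0.033.
    print(avg_prediction_error([0.5, 0.25, 1.0], [0.48, 0.30, 0.97]))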
4.3 Probability Prediction

4.3.1 Comparison of DeepGate with Baseline Solutions.

Table 2: The Performance Comparison of DeepGate with Other GNN Models for Logic Probability Prediction

Model               Aggregator          Avg. Prediction Error
GCN                 Conv. Sum           0.1386
                    Attention           0.1840
                    DeepSet             0.2541
                    GatedSum            0.1995
DAG-ConvGNN         Conv. Sum           0.2215
                    Attention           0.2398
                    DeepSet             0.2431
                    GatedSum            0.2333
DAG-RecGNN (T=10)   Conv. Sum           0.0328
                    DeepSet             0.0302
                    GatedSum            0.0329
DeepGate (T=10)     Attention w/o SC    0.0234
                    Attention w/ SC     0.0204

Table 2 compares DeepGate with the other baseline solutions in terms of prediction error. From this table, we have several observations. First, both GCN and DAG-ConvGNN suffer from poor performance for probability prediction, mainly due to their lack of ability to model the computational behaviours of circuits. For instance, the best GCN model, equipped with Conv. Sum, gives a prediction error of 0.1386, which in turn is even higher than that of the worst-performing DAG-RecGNN model. Therefore, only by incorporating the logical ordering into the model design and conducting the propagation recurrently will the model perform well. This shows the advantage of the DAG-RecGNN implementation, with the dedicated recurrent scheme and reversed layer design discussed in Section 3.3, over simpler GNN architectures. Second, among all models, DeepGate with attention alone achieves a significant prediction error reduction. It brings a 22.76% relative improvement compared with the best baseline solution, which is the DAG-RecGNN model equipped with the DeepSet aggregator. Hence, using the dedicated attention mechanism benefits logic representation learning. Third, equipped with the skip connection design, DeepGate further reduces the prediction error from 0.0234 to 0.0204, which reveals the efficacy of introducing the reconvergence knowledge into the model design. To summarize, with only the gate type information and the connectivity between gates, DeepGate learns to predict highly accurate probabilities for logic gates.

As we observe that DAG-RecGNN with the DeepSet aggregator (abbreviated as DeepSet for the following discussion) performs better than the other baselines, in later results we only compare DeepGate (w/ skip connection) with it.

4.3.2 Results on Large Circuits.

Table 3: The Performance Comparison of DeepGate and DeepSet on Five Large Circuits

Design             #Nodes    Levels    DeepSet    DeepGate    Reduction
Arbiter            23.7K     173       0.0277     0.0073      73.56%
Squarer            36.0K     373       0.0495     0.0346      30.16%
Multiplier         47.3K     521       0.0220     0.0159      27.94%
80386 Processor    13.2K     122       0.0534     0.0387      27.56%
Viper Processor    40.5K     133       0.0520     0.0389      25.18%

Furthermore, we evaluate DeepGate on five circuit designs that are substantially larger than the circuits it saw during training. The circuit statistics and the prediction errors of both DeepGate and DeepSet are shown in Table 3. The number of gates in these designs is two orders of magnitude larger than that of the training circuits. We can observe that DeepGate achieves prediction accuracy similar to that on the small circuits, and it outperforms DeepSet on these large circuits by a large margin. Such results clearly demonstrate the generalization capability of DeepGate. In particular, DeepGate achieves a 73.56% prediction error reduction on Arbiter. This is because the Arbiter circuit is designed to accommodate access from multiple requests, and it contains repetitive logic units with many reconvergence structures. As DeepGate treats such structures as a first-class citizen in the GNN model, it can generate much more accurate predictions.
4.4 Discussion

4.4.1 Effectiveness of Circuit Transformation.

DeepGate uses the logic synthesis tool to transform circuits from different sources into unified AIG forms. One may wonder about the performance of DeepGate if the network is directly trained on the original circuits, wherein other gate types (e.g., XOR, NAND, NOR, and OR) are also included. To investigate the effectiveness of the circuit transformation in DeepGate, we conduct controlled experiments on the EPFL and IWLS benchmarks, as shown in Table 4. Taking EPFL as an example, we extract 375 sub-circuits from the original designs and develop two versions: one with the original 6 gate types and the other with the AIG transformation. For each version, we train the DeepGate model from scratch. The only difference is that, for the former version of the dataset, we assign a 7-d one-hot encoding for the node feature x_v. As can be observed from Table 4, DeepGate trained on AIGs performs better than the one trained on the original circuits by a large margin (a 33.94% relative prediction error reduction on EPFL). The same observation can be obtained from the results on the IWLS circuits.

Table 4: The Performance of DeepGate with and without Circuit Transformation

          w/o Tran.    w/ Tran.    Pre-trained
EPFL      0.0442       0.0292      0.0142
IWLS      0.0447       0.0342      0.0209

Such improvements originate from the benefit of circuit transformation, because when only two logic gate types are considered, the representation learning difficulty is reduced dramatically without any impact on circuit functionalities. Also, we manually check the usage frequency of different gate types in the original formats, and observe that some gate types (e.g., XOR and NAND) are used much less frequently. Such imbalanced gate distributions may lead to insufficient training, causing higher prediction errors.

Additionally, we directly apply the DeepGate model pre-trained on the merged AIG dataset described in Section 4.1 for comparison. As can be observed, DeepGate trained on the dataset consisting of different benchmarks can further reduce the prediction errors by 51.37%. This supports the claim that unifying different circuit designs into a common intermediate representation can help the model learn a better representation of logic gates.

4.4.2 Impact of Recurrence Iterations.

During inference, the number of iterations T can be set to different values. The higher the value, the higher the computational cost. We enumerate T from 1 to 50, and we observe that our GNN model is able to decrease the prediction loss as T increases. However, the prediction error converges quickly at around T = 10, regardless of the circuit size. Such experimental results further demonstrate the scalability of the proposed DeepGate solution.

5 CONCLUSION AND FUTURE WORK

This paper proposes DeepGate, a novel representation learning solution that effectively embeds both the logic function and the structural information of a circuit as vectors on each gate. In DeepGate, we construct easy-to-learn circuit graphs by transforming circuits into the unified AIG format and introduce a novel GNN model with circuit knowledge as priors for effective representation learning. Using the informative signal probability as the supervision task on small sub-circuits, we show that DeepGate generalizes to large circuits with accurate predictions without any pre-computed features.

While showing promising results, the current DeepGate model is still in its infancy. For example, we could introduce other informative supervision tasks (e.g., logic inference and Boolean satisfiability) to achieve better representations for logic gates. We could also add more circuits for training to build a large-scale foundation model for logic circuits [4]. Moreover, in our future work, we plan to apply the representations learned in DeepGate to many downstream EDA tasks (e.g., power estimation, logic reduction, and equivalence checking). These tasks are directly related to signal probability analysis, and we believe DeepGate can achieve satisfactory results without much effort in fine-tuning the model.

6 ACKNOWLEDGEMENT

This work was supported in part by Huawei Technologies Co. Ltd.

REFERENCES
[1] Christoph Albrecht. 2005. IWLS 2005 benchmarks. In IWLS.
[2] Luca Amarú, Pierre-Emmanuel Gaillardon, and Giovanni De Micheli. 2015. The EPFL combinational benchmark suite. In IWLS.
[3] Saeed Amizadeh, Sergiy Matusevych, and Markus Weimer. 2019. Learning To Solve Circuit-SAT: An Unsupervised Differentiable Approach. In ICLR.
[4] Rishi Bommasani et al. 2021. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258 (2021).
[5] Robert Brayton and Alan Mishchenko. 2010. ABC: An academic industrial-strength verification tool. In CAV. Springer, 24–40.
[6] Tom B. Brown et al. 2020. Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020).
[7] Scott Davidson. 1999. Characteristics of the ITC'99 benchmark circuits. In ITSW.
[8] Jacob Devlin et al. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).
[9] J. Gilmer et al. 2017. Neural message passing for quantum chemistry. In ICML.
[10] William L. Hamilton, Rex Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In NIPS.
[11] Xu Han et al. 2021. Pre-trained models: Past, present and future. AI Open (2021).
[12] Weihua Hu et al. 2020. Open graph benchmark: Datasets for machine learning on graphs. arXiv preprint arXiv:2005.00687 (2020).
[13] Guyue Huang et al. 2021. Machine learning for electronic design automation: A survey. TODAES (2021).
[14] Thomas N. Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016).
[15] Robert Kirby et al. 2019. CongestionNet: Routing congestion prediction using deep graph neural networks. In VLSI-SoC. IEEE.
[16] Yuzhe Ma et al. 2019. High performance graph convolutional networks with applications in testability analysis. In DAC.
[17] M. W. Roberts and P. K. Lala. 1987. Algorithm to detect reconvergent fanouts in logic circuits. IEEE Proceedings Computers and Digital Techniques (1987).
[18] Daniel Selsam et al. 2018. Learning a SAT Solver from Single-Bit Supervision. In ICLR.
[19] Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to Sequence Learning with Neural Networks. arXiv:1409.3215.
[20] Opencores Team. [n. d.]. Opencores. https://opencores.org/.
[21] V. Thost and J. Chen. 2021. Directed Acyclic Graph Neural Networks. In ICLR.
[22] Ashish Vaswani et al. 2017. Attention is all you need. In NIPS.
[23] Petar Veličković et al. 2017. Graph Attention Networks. ICLR (2017).
[24] Le Wu et al. 2018. SocialGCN: An efficient graph convolutional network based model for social recommendation. arXiv preprint arXiv:1811.02815 (2018).
[25] Zonghan Wu et al. 2020. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems (2020).
[26] Zhiyao Xie et al. 2021. Net2: A Graph Attention Network Method Customized for Pre-Placement Net Length Estimation. In ASP-DAC. IEEE.
[27] Muhan Zhang et al. 2019. D-VAE: A Variational Autoencoder for Directed Acyclic Graphs. arXiv:1904.11088.
[28] Yanqing Zhang, Haoxing Ren, and Brucek Khailany. 2020. GRANNITE: Graph neural network inference for transferable power estimation. In DAC. IEEE.

